Many projects rely heavily on CI jobs, e.g. based on github actions. We already had ideas for tight integration of openQA into such workflows for years, e.g. in https://progress.opensuse.org/issues/48641
- G1: A pull request in github triggers actual openQA tests
- Review what had been done already, e.g. in https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/master/Makefile#L167 and also see https://progress.opensuse.org/issues/77698
- Extend openQA-in-openQA tests to trigger tests in o3 cloning an existing test as baseline or trigger a new one
Unfortunately I could not invest any reasonable effort in this project on the first two days of HackWeek, so I could only quickly gather what we have done for related work in the past: In https://progress.opensuse.org/issues/63712 with https://github.com/os-autoinst/openQA/pull/2618 we extended openqa-clone-custom-git-refspec to parse github pull request descriptions for special commands. We never came to use it in automatically triggered CI jobs and likely many have forgotten about this feature or never knew it existed in the first place.
Brainstormed with szarate about related work. Found https://github.com/marketplace/actions/pull-request-comment-trigger and https://github.com/orgs/community/discussions/25389. Put it into os-autoinst-distri-example github action workflow. Created an openqa-mon container build OBS project with:
osc meta prj home:okurz:container:openqa-mon -e
osc meta pkg home:okurz:container:openqa-mon openqa-mon -e
cd ~/local/osc/
osc co home:okurz:container:openqa-mon
vim Dockerfile
osc add Dockerfile
osc ci -m "Update Dockerfile"
After a short wait we can now do
podman pull registry.opensuse.org/home/okurz/container/openqa-mon/containers/tumbleweed:openqa-mon
or for a complete command
podman run --rm -it registry.opensuse.org/home/okurz/container/openqa-mon/containers/tumbleweed:openqa-mon openqa-mon --continuous 10 --exit --follow https://openqa.opensuse.org/t3088179
Found a typo in openqa-mon, fixed with https://github.com/grisu48/openqa-mon/pull/86
I have yet to see whether the command also works in a non-interactive github action environment.
Then I found https://build.opensuse.org/package/show/home:okurz:container/curl-openqa-mon so likely I don't need the other project anymore. I can use
podman run --rm -it registry.opensuse.org/home/okurz/container/containers/tumbleweed:curl-openqa-mon openqa-mon --continuous 10 --exit --follow https://openqa.opensuse.org/t3088179
I added three secrets into https://github.com/organizations/os-autoinst/settings/secrets/actions/new: "OPENQA_API_USER", "OPENQA_API_KEY", "OPENQA_API_SECRET", which I read out from the o3 database. Following http://open.qa/docs/#_personal_access_token I can now trigger openQA jobs from github actions with
curl -u $OPENQA_API_USER...
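A minimal sketch of how such a trigger call could be wrapped in a shell function. The function name `trigger_job` is hypothetical, and `OPENQA_TOKEN` is assumed to hold whatever credential string `curl -u` expects for the personal access token (the same variable name appears in the one-liner further down). The endpoint `/api/v1/jobs` and the `jq` extraction of the job ID are taken from that later command.

```shell
# trigger_job HOST KEY=VALUE...
# Posts one openQA job with the given settings and prints the new job ID.
# Assumption: OPENQA_TOKEN is exported and valid for "curl -u".
trigger_job() {
    local host="$1"
    shift
    local setting
    # Rewrite the argument list from "K1=V1 K2=V2" into "-d K1=V1 -d K2=V2".
    # The for loop iterates over the original arguments while we append
    # the "-d" pairs and drop the original entries one by one.
    for setting in "$@"; do
        set -- "$@" -d "$setting"
        shift
    done
    curl -sS -u "$OPENQA_TOKEN" -X POST "$@" "$host/api/v1/jobs" | jq -r .id
}
```

Usage could look like `job_id=$(trigger_job https://openqa.opensuse.org DISTRI=example TEST=my-test)`.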
To be able to test in my fork without creating and updating pull requests I need those variables in my own scope as well so I added the variables to https://github.com/okurz/os-autoinst-distri-example/settings/secrets/actions/new as well.
My development workflow looks like this: In vim I edit the workflow file, save and call
!git amend -a && git rp using my personal git aliases, e.g. see https://github.com/okurz/dotfiles/blob/master/.gitconfig#L49, and then monitor https://github.com/okurz/os-autoinst-distri-example/actions/workflows/openqa.yml which updates without reloading. I just need to open a browser tab for details on each run, e.g. https://github.com/okurz/os-autoinst-distri-example/actions/runs/4065078531
https://github.com/okurz/os-autoinst-distri-example/actions/runs/4065138648/jobs/6999439038 shows a successful trigger of an openQA job and outputting the URL. https://openqa.opensuse.org/t3088418 is the result on o3. The github variables that should be used to construct a proper test name and build could apparently not be read. https://github.com/okurz/os-autoinst-distri-example/actions/runs/4065167928/jobs/6999505509 shows the triggering&monitoring of a job.
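For illustration, a test name and build in the style used in the later commands ("test-push-okurz/os-autoinst-distri-example#feature/…") could be assembled from the default environment variables that GitHub sets inside a runner. This is only a sketch: `GITHUB_EVENT_NAME`, `GITHUB_REPOSITORY` and `GITHUB_REF_NAME` are standard GitHub defaults, but which variables the actual workflow used is an assumption, and the function name `openqa_job_names` is made up.

```shell
# openqa_job_names: print TEST and BUILD settings for an openQA job,
# derived from GitHub's default environment variables.
# Assumption: called inside a GitHub Actions "run" step where
# GITHUB_EVENT_NAME, GITHUB_REPOSITORY and GITHUB_REF_NAME are set.
openqa_job_names() {
    local ref="${GITHUB_REPOSITORY}#${GITHUB_REF_NAME}"
    printf 'TEST=test-%s-%s\n' "$GITHUB_EVENT_NAME" "$ref"
    printf 'BUILD=%s\n' "$ref"
}
```

Note that for pull request events `GITHUB_REF_NAME` has the form `<number>/merge` rather than a branch name, so a real workflow might want to treat that case separately.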
In https://github.com/grisu48/openqa-mon/issues/39 I asked already about two years ago to have a nice way to use openqa-mon to monitor and output the aggregated result of jobs. It seems there is a problem though as
openqa-mon --continuous 10 --exit --follow https://openqa.opensuse.org/t3088419 returns an exit code of zero for that incomplete openQA job when it should be non-zero according to https://github.com/grisu48/openqa-mon/issues/39#issue-856674318. Reported that in https://github.com/grisu48/openqa-mon/issues/39#issuecomment-1412135651 . I can consider using https://github.com/os-autoinst/scripts/blob/master/monitor-openqa_job as alternative.
Regarding the default github variables, it seems I got it wrong. https://adamtheautomator.com/github-actions-environment-variables/ shows that I need to use
https://github.com/okurz/os-autoinst-distri-example/actions/runs/4065927433/jobs/7001301775 is showing good success now with a custom job state&result polling using:
while :; do read state result < <(echo $(curl -sS $job | jq -r .job.state,.job.result)) && [[ $state = 'done' ]] && break || echo "job state of job ID $job_id : $state, waiting…" && sleep 10; done
This waits correctly for the job to finish and returns an exit code that reflects whether the job passed:
job state of job ID 3088453 : scheduled, waiting…
job state of job ID 3088453 : running, waiting…
job ID 3088453 : state: done, result: incomplete
Error: Process completed with exit code 1.
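The same polling idea can be written as a small reusable function, which is easier to test than the one-liner. This is a sketch, not the code from the workflow: the names `fetch_state_result` and `wait_for_job` are hypothetical, and the optional third argument (a replacement fetch command) exists only so the loop can be exercised without a real openQA instance.

```shell
# fetch_state_result URL
# Prints "<state> <result>" for an openQA job API URL such as
# https://openqa.opensuse.org/api/v1/jobs/<id>.
fetch_state_result() {
    curl -sS "$1" | jq -r '"\(.job.state) \(.job.result)"'
}

# wait_for_job URL [INTERVAL_SECONDS] [FETCH_COMMAND]
# Polls until the job state is "done", then returns 0 only if the
# result is "passed", so a CI step fails for any other result.
wait_for_job() {
    local url="$1" interval="${2:-10}" fetch="${3:-fetch_state_result}"
    local state result
    while :; do
        # Word-split the "<state> <result>" line into positional parameters.
        set -- $("$fetch" "$url")
        state="${1:-}" result="${2:-}"
        [ "$state" = done ] && break
        echo "job $url: state '$state', waiting…"
        sleep "$interval"
    done
    echo "job $url: state: $state, result: $result"
    [ "$result" = passed ]
}
```

In a workflow step this could be called as `wait_for_job "$OPENQA_HOST/api/v1/jobs/$job_id"`, with the step failing whenever the job does not pass.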
Now the next step is to fix the openQA job so that it actually does something more useful, either for this repository or immediately in https://github.com/os-autoinst/os-autoinst-distri-openQA/
Jobs like https://openqa.opensuse.org/tests/3089732 end up incomplete with the message
Reason: setup failure: The source directory /var/lib/openqa/cache/openqa1-opensuse/tests/example/needles does not exist
which also aborts the openQA job without even saving the vars.json which could have helped the user to debug.
Maybe with os-autoinst-distri-openQA I am more successful.
isotovideo -d CASEDIR=https://github.com/os-autoinst/os-autoinst-distri-openQA.git#use_podman_everywhere DISTRI=openqa NEEDLES_DIR=https://github.com/os-autoinst/os-autoinst-needles-openQA SCHEDULE=tests/install/boot
locally at least looks fine.
Small improvement PR https://github.com/os-autoinst/openQA/pull/4991
But back to the example distri.
isotovideo -d CASEDIR=https://github.com/okurz/os-autoinst-distri-example.git#feature/hackweek22_trigger_openqa_in_ci DISTRI=example NEEDLES_DIR=needles SCHEDULE=tests/boot
looks fine as well. As expected starting the module "boot" and failing after 30s timeout as there are no needles.
So triggered again from github action as yesterday but with
DISTRI=example NEEDLES_DIR=needles and then monitoring
but this still shows
Reason: setup failure: The source directory /var/lib/openqa/cache/openqa1-opensuse/tests/example/needles does not exist
To debug and test I am disabling normal worker classes on openqaworker20 in /etc/openqa/workers.ini and trying again with explicitly triggering on
WORKER_CLASS=openqaworker20. I am still struggling with the needles dir as there is no copy of os-autoinst-distri-example on our o3 infra in the cache. http://open.qa/docs/#_triggering_tests_based_on_an_any_remote_git_refspec_or_open_github_pull_request and https://github.com/os-autoinst/openQA/pull/4851/ explain that I should be able to use
NEEDLES_DIR=%CASEDIR%/needles. With curl
%CASEDIR% turns into broken characters. Likely I need to escape the
%CA… but did not succeed so far. But also
openqa-cli api --o3 -X POST jobs CASEDIR=https://github.com/okurz/os-autoinst-distri-example#feature/hackweek22_trigger_openqa_in_ci TEST=test-push-okurz/os-autoinst-distri-example#feature/hackweek22_trigger_openqa_in_ci BUILD=okurz/os-autoinst-distri-example#feature/hackweek22_trigger_openqa_in_ci DISTRI=example WORKER_CLASS=openqaworker20 NEEDLES_DIR=%CASEDIR%/needles
is not helpful ending up in the same error and a weird replacement leading to URL + "/needles" referencing a local directory. Might work in isotovideo but I did not get that far.
It looks like we have the following problems here:
1. Test distributions that have an in-repo needle link like os-autoinst-distri-example, in combination with caching openQA workers and a CASEDIR pointing to a git repo, currently cannot be triggered without running into "setup failure: The source directory … does not exist". NEEDLES_DIR=%CASEDIR%/needles should work here, but the variable expansion happens before Worker/Engines/isotovideo.pm looks for %CASEDIR% in the variable value, which is no longer there at that time
2. Even if the URL pointing to the same test distribution to be used for needles came out fine, os-autoinst cannot parse url+branch+relative folder
3. curl messes up the %CASEDIR% so I would need to find proper escaping
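Regarding the curl escaping problem, one thing that might help: with plain `-d`, curl sends the value unchanged in a form-encoded body, so the server would read `%CA` as a percent escape, which matches the broken characters seen above. Encoding each literal `%` as `%25` avoids that, and curl's `--data-urlencode` option does this encoding itself. The sketch below is untested against o3, the helper name `escape_percent` is hypothetical, and it does not address the variable expansion order described in the first problem.

```shell
# escape_percent VALUE
# Prints VALUE with every literal "%" replaced by "%25", i.e. how the
# value has to look inside a form-encoded request body sent with "-d".
escape_percent() {
    printf '%s\n' "$1" | sed 's/%/%25/g'
}

# Two ways to send the setting (not executed here, shown for reference):
#   curl ... -d "NEEDLES_DIR=$(escape_percent '%CASEDIR%/needles')" ...
#   curl ... --data-urlencode 'NEEDLES_DIR=%CASEDIR%/needles' ...
```

The second variant is probably preferable because curl then also encodes any other special characters in the value.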
Maybe it's good if I continue to find a way to define a custom github action that I can reuse in other test distributions, e.g. openQA-in-openQA
Added according secrets in https://github.com/okurz/os-autoinst-distri-openQA/settings/secrets/actions .
So runs can now be found in https://github.com/okurz/os-autoinst-distri-openQA/actions/workflows/openqa.yml
Oh, and of course finally I find this one git branch where I already did more or less the same experiment but 2 years ago: https://github.com/okurz/os-autoinst-distri-openQA/actions/workflows/openqa.yml facepalm. At least this includes the "trigger something on openQA from github actions" but of course there is more to it.
Found https://github.com/nektos/act to ease testing of github actions. Thinking to move the workflow definition to a github action repo, similar to what we did with https://github.com/openSUSE/backlogger/blob/main/action.yaml. I tried to install act as "gh extension" following https://github.com/nektos/act#installation-as-github-cli-extension but that failed with "failed to run extension: fork/exec /home/okurz/.local/share/gh/extensions/gh-act/gh-act: no such file or directory". I reported https://github.com/nektos/gh-act/issues/2, maybe I am doing it wrong.
So I downloaded the binary into /usr/local/bin and execute from there. Now executing
read -s OPENQA_TOKEN
export OPENQA_TOKEN
while_inotifywait timeout 30 act -s OPENQA_TOKEN
but somehow this never comes to the point that the curl-call would return, kinda hangs there. Well, at least it's usable for a dry-run. Just to get openQA trigger variables right I can use the shell commands by themselves:
OPENQA_HOST=https://openqa.opensuse.org && job_id=$(curl -sS -u $OPENQA_TOKEN -X POST -d CASEDIR=https://github.com/okurz/os-autoinst-distri-openQA#feature/hackweek22_trigger_openqa_in_ci -d TEST=test-push-os-autoinst/os-autoinst-distri-openQA#feature/hackweek22_trigger_openqa_in_ci -d BUILD=os-autoinst/os-autoinst-distri-openQA#feature/hackweek22_trigger_openqa_in_ci -d DISTRI=openqa -d NEEDLES_DIR=openqa/needles https://openqa.opensuse.org/api/v1/jobs | jq .id) || echo "Failed: $job_id" && job=$OPENQA_HOST/api/v1/jobs/$job_id && while :; do read state result < <(echo $(curl -sS $job | jq -r .job.state,.job.result)) && [[ $state = 'done' ]] && break || echo "job state of job ID $job_id : $state, waiting…" && sleep 10; done && echo "job ID $job_id : state: $state, result: $result" && [[ $result = 'passed' ]]
This works to trigger and start a test which then later fails because we never specified an HDD. Cloning an existing job would work for this case but then the tricky part is to find the most recent job as template. For openQA we have https://github.com/os-autoinst/scripts/blob/master/trigger-openqa_in_openqa but this is also not the most straightforward approach to generalize. We could try to teach openqa-clone-job to work with the "latest" links.
I can use
HDD_1_URL=http://download.opensuse.org/tumbleweed/appliances/openSUSE-Tumbleweed-Minimal-VM.x86_64-kvm-and-xen.qcow2 but this is missing a gnome environment which so far in the openQA-in-openQA tests is required. We could work around that by setting up an openQA job, e.g. as part of the usual Tumbleweed snapshot tests, that triggers the upload of a fixed-named HDD image that we can use then.
By the way the above only works because there is a directory /var/lib/openqa/share/tests/openqa hence the
NEEDLES_DIR=openqa/needles. That wouldn't work for a test distribution that is not already provided within that folder of openQA.
I managed to add GitHub action workflows that trigger and monitor openQA jobs on every push or pull request. Theoretically they would also trigger on new or edited comments in a pull request, a concept that could be explored further. I did not manage to come up with a generic way to trigger openQA jobs from an arbitrary repository due to limits in what openQA currently supports. So I can state that I reached the goal as originally defined although that was not reaching very far. I could identify limits and hence points where we could continue the work for openQA development as this HackWeek project was rather near to what we usually do within the SUSE QE Tools team anyway. Suggestions for features or bug fixes to look into:
- Support openQA test distributions with a simple needle subfolder included within the test distribution, e.g. os-autoinst-distri-example. Ideally those repositories should support simple job POST calls with custom git repos like CASEDIR=https://github.com/okurz/os-autoinst-distri-example.git#feature/hackweek22_trigger_openqa_in_ci with no need to specify variables just to "fix" the use of needles
- Learn how to escape %CASEDIR% to be able to use curl
- Extend openqa-clone-job to understand links like
openqa-clone-job --within-instance "https://openqa.opensuse.org/tests/latest?arch=x86_64&distri=openqa&flavor=dev&machine=64bit-2G&test=openqa_install%2Bpublish&version=Tumbleweed"
- In openQA support triggering test distributions with CASEDIR pointing to a git repo when the test distribution and according needles are not provided already on that openQA instance
- Generalize the action into a reusable action in its own repo, extend with instructions how to use
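As a small aid for the "latest" link idea in the list above, such a URL could be assembled from job settings like this. It is a sketch only: `latest_job_url` is a hypothetical helper, it escapes just a handful of characters (`%`, `+`, `&`, space) rather than doing full URL encoding, and whether openqa-clone-job will accept these links is exactly the open feature request.

```shell
# latest_job_url HOST KEY=VALUE...
# Prints a "tests/latest" URL for the given job settings.
# Only "%", "+", "&" and spaces in values are percent-encoded here.
latest_job_url() {
    local host="$1"
    shift
    local pair key value query=""
    for pair in "$@"; do
        key="${pair%%=*}"
        value=$(printf '%s\n' "${pair#*=}" |
            sed -e 's/%/%25/g' -e 's/+/%2B/g' -e 's/&/%26/g' -e 's/ /%20/g')
        # Join the pairs with "&", without a leading separator.
        query="${query:+$query&}$key=$value"
    done
    printf '%s/tests/latest?%s\n' "$host" "$query"
}
```

For example, `latest_job_url https://openqa.opensuse.org arch=x86_64 distri=openqa flavor=dev machine=64bit-2G test=openqa_install+publish version=Tumbleweed` produces the link quoted in the list.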
This project is part of:
Hack Week 22