-
Notifications
You must be signed in to change notification settings - Fork 8
ci: Run QEMU tox integration tests in GitHub workflow #46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This comment was marked as resolved.
This comment was marked as resolved.
|
Hmm, given how broken linux-system-roles/cockpit#200 is -- ISTM that we should either spend some time to actually fix them and get them into CI (I'm happy to help there of course), or drop them. I'm not yet sure what the primary upstream way of testing the roles is -- is that through tmt/TF? Update: At least some of the failures also happen in existing tests, see linux-system-roles/cockpit#200 (comment) -- they just lag behind a bit, they don't test F42 Update 2: Discussed in chat, tox tests are current, valid, and desired. Let's fix them. |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #46 +/- ##
=======================================
Coverage ? 52.31%
=======================================
Files ? 1
Lines ? 346
Branches ? 0
=======================================
Hits ? 181
Misses ? 165
Partials ? 0 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
.github/workflows/ansible-tox.yml
Outdated
|
|
||
| - name: Run tox integration tests | ||
| run: | | ||
| if ! tox -e ${{ env.TOX_ENV }} -- --image-name ${{ matrix.image }} --make-batch --log-level=debug -- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The challenge here will be to publish the logs for each test in a way so that you can easily see from a high level which test failed, then examine that log for that particular test to find the error that caused the test to fail.
Looking at the artifacts/ansible.log is much more difficult and confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you do --make-batch, by default, the log for each test is written to tests/tests_name_of_test.yml.log
You can control the location of the log file, but then you cannot use --make-batch - you will need to create your own batch file https://github.com/linux-system-roles/tox-lsr?tab=readme-ov-file#batch-file-batch-report-batch-id and add a --log-file .... for each test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point -- we can parse the results and only cat the failing ones, with some easy-to-search separator like ====== , WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to think about this some more.
Maybe there is a way to upload logs to https://dl.fedoraproject.org/pub/alt/linuxsystemroles/logs/ like we do for the testing farm logs. If not, then maybe there is a way to have separate log files in some sort of github action result artifact. https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/storing-and-sharing-data-from-a-workflow If not, then we can use github action collapseable sections https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/workflow-commands-for-github-actions#grouping-log-lines
and at the top, have a summary e.g.
- tests_foo.yml - PASS
- tests_bar.yml - FAILED
- tests_xxx.yml - PASS
- ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gh action artifacts are a great thing to transport files from one workflow/job to another, but they are unwieldy for human consumption: they always need a token, and are hard to find.
Uploading logs to dl.f.o would be possible, but makes it harder for contributors as that wouldn't work on forks -- you'd have to change the workflow.
Summary and grouping sounds nice! I re-enabled Fedora-41 for the time being to get a failed test run (which is more interesting for the logging). Of course I'll re-disable this for the final version.
You can see a successful c10s and a failing f41 run here.
7124555 to
bacf7c6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
.github/workflows/ansible-tox.yml:57
- The command line for running tox includes an extra trailing '--' which might lead to unintended parameter parsing. Consider reviewing and removing the extra '--' if it is not required.
tox -e ${{ env.TOX_ENV }} -- --image-name ${{ matrix.image }} --make-batch --log-level=debug --
|
[citest] |
|
Ack, thanks @richm for review and your suggestions! I put back the previous Fedora 41 skipping so that the PR goes green. I updated linux-system-roles/cockpit#200 with the exact same commit/workflow to prove it's not specific to sudo, and also sent it to linux-system-roles/.github#94 |
|
Renamed workflow as per linux-system-roles/.github#94 (review) |
This comment was marked as outdated.
This comment was marked as outdated.
|
... which fails in CI with
This must be a regression in tox-lsr itself, investigating. Running tox locally works fine, but I do have There are only a handful of commits between 3.4.0 and 3.5.0. With the topmost commit 51d7adc7b8f7c9b8b84f982a3e7b63d1d21f5264 (which is tagged 3.5.0) it still works, i.e. with but with pip3 install "git+https://github.com/linux-system-roles/tox-lsr@3.5.0" it fails on this fmf error. WTH?? That is the very commit referenced by the release: https://github.com/linux-system-roles/tox-lsr/releases/tag/3.5.0 . I could at most imagine that the Going back to 3.4.0 for the time being, but I also pushed some optimizations which reduced the installation of test deps from 1.5 to 2 mins to ~ 20 seconds. |
5bcebba to
da6f8de
Compare
GitHub workflows recently grew the ability to set up nested QEMU kvm, but without much fanfare (see [1]). It just needs to create the `/dev/kvm` device node, but the host has it enabled and the kernel supports it. With that we can run the QEMU tox tests. They are not covered anywhere else, and currently have some bugs, so let's cover them in CI and keep them green. [1] https://github.com/orgs/community/discussions/8305
|
Bumped tox-lsr to 3.5.1. |
|
linux-system-roles/.github#94 landed, so this can land as well. |
GitHub workflows recently grew the ability to set up nested QEMU kvm, but without much fanfare (see [1]). It just needs to create the
/dev/kvmdevice node, but the host has it enabled and the kernel supports it.With that we can run the QEMU tox tests. They are not covered anywhere else, and currently have some bugs, so let's cover them in CI and keep them green.
[1] https://github.com/orgs/community/discussions/8305
I know that this should eventually go to https://github.com/linux-system-roles/.github , but I want to test the run against an actual project. What is the official workflow for that? In recent .github project PRs there are zero test runs -- is there some magic to say "Run this PR against two roles, to prove it works"?
I developed this in my fork, and you can see the results in a recent run. Fedora 42 and CentOS 9 work. CentOS 10 has a broken image ref, I sent linux-system-roles/linux-system-roles.github.io#120 to fix that. Fedora 41 shows an ansible bug where it tries to use libdnf5 to query for an installed package instead of dnf4. (This reproduces fine locally, and I confirmed that 40 and 42 are ok -- should we just ignore that? This is tracked in https://issues.redhat.com/browse/RHELMISC-10110)
This was an initial step towards https://issues.redhat.com/browse/RHEL-85668 to figure out /dev/kvm in a GitHub workflow, but I believe it is useful in its own right. I don't see any CI coverage of the tox tests, neither in workflows nor in tmt. Of course I may have missed something, but given the above two errors there is clearly some coverage missing.
@richm @spetrosi Do you think this is useful, or does it duplicate something?