chore: add UI tests with mock Deadline server and CI workflow #1102
crowecawcaw wants to merge 10 commits into aws-deadline:mainline
- Add mock Deadline server (test/mock_server/) for local testing
- Add UI tests for config gui, CLI operations, and bundle gui-submit
- Add hatch ui environment with PySide6 and xa11y dependencies
- Add GitHub Actions workflow for UI tests on Linux, macOS, and Windows
- Fix mock server GetFarm response to include costScaleFactor
- Add list_queue_environments endpoint to mock server

Signed-off-by: Stephen Crowe <6042774+crowecawcaw@users.noreply.github.com>
    def _find_app(pid: int) -> xa11y.App:
        """Wait for the xa11y app to appear for the given PID."""
        end = time.monotonic() + STARTUP_TIMEOUT
        while time.monotonic() < end:
This code seems to be duplicated in config_gui. Could we reuse it?
Maybe we can abstract some of these repeated test util functions into a unit test helper class?
Good call. I added some helpers and classes, similar to how I've seen frontend e2e tests set up, where there's a class representing the page/window/widget under test.
        return json.loads(result.stdout) if result.stdout.strip() else {}


    def _deadline(*args: str, env: dict) -> str:
Are we testing a test resource here? Is that necessary?
Sort of. These tests verify that basic CLI commands like deadline farm list work with the mock HTTP server. I think they give us a little more confidence in CLI behavior than unit tests, since we're not patching any CLI internals. I also like that they verify the mock server is working correctly, since it's not a trivial test helper.
This is a huge improvement over what we have now, but thinking big picture here, how would we implement some of the more complex test cases such as DCC integration points? Do the GitHub runners have GPUs, and could we install Blender, etc.? Or are you imagining that those more complex test cases would not run on CI and would instead be part of the release workflow or some other pipeline?
    jobs:
      ui-test-linux:
        name: UI Tests (Linux)
        runs-on: ubuntu-latest
Is there any value in testing on an older image instead of latest, to validate support on older machines?
Another thing to think about is that the underlying machines these aliases point to will sometimes change, which could cause difficult-to-debug errors from dependencies not being available. Maybe not a huge deal because the system dependencies below seem pretty light, but calling it out as something to think about.
Good point about older systems. I think that would be worth testing, and GitHub offers Ubuntu 22.04 (ubuntu-22.04) as a test runner option. I think OS version coverage might make sense in some sort of end-to-end test that also exercises the installer and DCM? I imagined these particular UI tests being aimed more at making sure the UI code is correct. I don't feel strongly and am open to ideas.
Also good point that the image will change over time and maybe the list of dependencies will not be correct. I don't think it'll be too painful since these tests run when the PR is raised so they're quick to update and validate (as opposed to e.g. CDK managed pipelines with canaries).
Also, on your comment about DCC testing: IMO it'd be great to run larger e2e or DCC tests in GitHub actions too since actions are easy to set up, easy to fork, and support all major OSes. It looks like GitHub has some test runners with GPUs but maybe they require opting in and paying. Or alternatively, there might be ways to get a soft GPU that's sufficient for opening Blender. I think tests like this would belong in a separate suite. But they would be amazing to add.
- Add test/ui/helpers.py with DeadlineApp base class, ConfigDialog and SubmitterDialog page objects that encapsulate process lifecycle and xa11y interaction
- Remove test/ui/conftest.py (config_gui fixture inlined)
- Deduplicate find_app, close/kill, and dialog constants across test files
- close() waits for clean process exit before force-killing
- Always use python -m deadline for subprocess launch consistency
- Run all UI tests on Linux under xvfb (not just CLI tests)
- Pin xa11y to 0.6.*
- Add test_button_exits_cleanly to verify Ok/Cancel exit the process

Signed-off-by: Stephen Crowe <6042774+crowecawcaw@users.noreply.github.com>
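The "wait for a clean exit, then force-kill" behavior described for close() can be sketched roughly as below. This is a minimal illustration, not the PR's helpers.py: the class name `AppProcess` and its methods are hypothetical, and the xa11y interaction is left out entirely. In the PR, the launch command would presumably be something like `python -m deadline config gui`; here the command is a parameter so the sketch runs without the Deadline client installed.

```python
import subprocess
import sys
from typing import List, Optional


class AppProcess:
    """Hypothetical base class that owns the lifecycle of one GUI subprocess."""

    def __init__(self, command: List[str]) -> None:
        self.command = command
        self.process: Optional[subprocess.Popen] = None

    def launch(self) -> "AppProcess":
        # Start the application under test as a child process.
        self.process = subprocess.Popen(self.command)
        return self

    def close(self, timeout: float = 5.0) -> int:
        """Wait for a clean exit; force-kill only if the timeout expires."""
        if self.process is None:
            raise RuntimeError("launch() was never called")
        try:
            return self.process.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            # The process did not exit on its own, so kill it and reap it.
            self.process.kill()
            return self.process.wait()


# Usage: a process that exits by itself returns its real exit code.
app = AppProcess([sys.executable, "-c", "pass"]).launch()
exit_code = app.close(timeout=10)
```

Returning the exit code from close() lets a test assert that clicking Ok or Cancel ended the process cleanly rather than through the force-kill path.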
Signed-off-by: Stephen Crowe <6042774+crowecawcaw@users.noreply.github.com>

# Conflicts:
#	DEVELOPMENT.md
xa11y accessibility APIs don't work under xvfb on Linux. GUI tests only run on macOS (and Windows once supported). Signed-off-by: Stephen Crowe <6042774+crowecawcaw@users.noreply.github.com>
Set up dbus-run-session, Xvfb, at-spi-bus-launcher, and at-spi2-registryd so xa11y can interact with Qt accessibility tree on Linux. Based on xa11y's own CI setup. Signed-off-by: Stephen Crowe <6042774+crowecawcaw@users.noreply.github.com>
The env vars must be exported inside the dbus-run-session bash block, not in the outer workflow env. Also add QT_LINUX_ACCESSIBILITY_ALWAYS_ON and AT_SPI_CLIENT to match xa11y's CI setup. Signed-off-by: Stephen Crowe <6042774+crowecawcaw@users.noreply.github.com>
Include list of visible accessibility apps in the TimeoutError message to help diagnose AT-SPI registration issues. Signed-off-by: Stephen Crowe <6042774+crowecawcaw@users.noreply.github.com>
Linux AT-SPI reports PID 1 for all apps instead of the actual process PID. Capture a baseline of visible apps before launching, then match by finding the newly appeared app name. Signed-off-by: Stephen Crowe <6042774+crowecawcaw@users.noreply.github.com>
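The baseline approach from this commit, combined with the richer TimeoutError message from the previous one, might look something like the sketch below. It is an assumption-laden illustration: `list_app_names` is a hypothetical callable standing in for whatever xa11y call enumerates visible apps, and `wait_for_new_app` is not a name from the PR.

```python
import time
from typing import Callable, Iterable, Set


def wait_for_new_app(
    list_app_names: Callable[[], Iterable[str]],
    baseline: Set[str],
    timeout: float = 10.0,
    poll_interval: float = 0.1,
) -> str:
    """Return the name of the first app that was not in the baseline.

    `list_app_names` is a hypothetical callable returning the names of apps
    currently visible to the accessibility layer.
    """
    end = time.monotonic() + timeout
    visible: Set[str] = set()
    while time.monotonic() < end:
        visible = set(list_app_names())
        new_apps = sorted(visible - baseline)
        if new_apps:
            return new_apps[0]
        time.sleep(poll_interval)
    # Include what *was* visible to help diagnose registration issues.
    raise TimeoutError(
        f"No new app appeared within {timeout}s; visible apps: {sorted(visible)}"
    )
```

A caller would capture `baseline = set(list_app_names())` before launching the process, then call `wait_for_new_app(list_app_names, baseline)` afterwards. Matching by name difference sidesteps the PID entirely, at the cost of being ambiguous if two new apps appear at once.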
Qt exposes tab widgets as radio_button on macOS but page_tab on Linux AT-SPI. Signed-off-by: Stephen Crowe <6042774+crowecawcaw@users.noreply.github.com>
Qt may expose tabs as radio_button, page_tab, or tab depending on platform. Try all three and dump children roles on failure. Signed-off-by: Stephen Crowe <6042774+crowecawcaw@users.noreply.github.com>
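A rough sketch of that fallback, using a plain dataclass in place of real accessibility elements, is shown below. `Element` and `find_tab` are hypothetical names for illustration only; the three role strings come from the commit message, and the real lookup would go through xa11y instead.

```python
from dataclasses import dataclass
from typing import Sequence

# Roles Qt may use for a tab, per the commit message above.
TAB_ROLES = ("radio_button", "page_tab", "tab")


@dataclass
class Element:
    """Hypothetical stand-in for an accessibility tree node."""

    role: str
    name: str


def find_tab(children: Sequence[Element], name: str) -> Element:
    """Try each candidate role in turn; dump the children's roles on failure."""
    for role in TAB_ROLES:
        for child in children:
            if child.role == role and child.name == name:
                return child
    seen = [(child.role, child.name) for child in children]
    raise LookupError(
        f"No tab named {name!r} with any role in {TAB_ROLES}; children: {seen}"
    )
```

Listing every child's role in the error means a failure on a new platform shows which role Qt actually exposed, so the role tuple can be extended without a debugging session.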



What was the problem/requirement? (What/Why)
What was the solution? (How)
This PR introduces 3 major test components:
localhost:8000, we can run the client against our mock server. Using the mock endpoint gives us higher-fidelity testing than unit tests because the full client runs without any code being stubbed out. The fidelity is not as high as integ tests, which hit the actual Deadline endpoint, but we don't need credentials or real AWS resources, so the tests are fast and easier to run (especially in GitHub Actions).

The initial tests in this PR cover some basic config reading and writing and exporting job bundles. They function more like examples for now. If we merge them, we can expand coverage.
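To illustrate the general idea of a local mock endpoint, here is a small stdlib-only sketch. It is not the PR's test/mock_server/ code: the handler, the `/2023-10-12/farms` route, and the response fields are assumptions modeled loosely on a ListFarms-style call, and the client is plain urllib rather than the real Deadline client.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Canned response data; the field names are assumptions for illustration.
FARMS = {"farms": [{"farmId": "farm-0123", "displayName": "Mock Farm"}]}


class MockHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        # Ignore any query string when matching the route.
        if self.path.split("?")[0] == "/2023-10-12/farms":
            status, payload = 200, FARMS
        else:
            status, payload = 404, {"message": "not found"}
        body = json.dumps(payload).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args) -> None:
        pass  # Keep test output quiet.


def start_mock_server(port: int = 0) -> HTTPServer:
    """Serve on 127.0.0.1 in a daemon thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), MockHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def list_farms(base_url: str) -> dict:
    with urllib.request.urlopen(f"{base_url}/2023-10-12/farms") as resp:
        return json.load(resp)
```

A test would start the server, point the client at `http://127.0.0.1:<port>`, and shut the server down afterwards. Because the request travels over real HTTP, the client's serialization and parsing code runs unpatched, which is the fidelity gain described above.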
What is the impact of this change?
We can test the UI.
How was this change tested?
CI actions!
Was this change documented?
n/a
Does this PR introduce new dependencies?
Yes - adds new test dependencies.
Is this a breaking change?
No
Does this change impact security?
No
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.