ITM 1100: Automated Integration Testing Harness For Dummy ADM Runs #149
NeilDaniel07 wants to merge 25 commits into development from
Conversation
nextcen-dgemoets
left a comment
Some initial comments:
- First of all, I'm sorry it took so long for me to look at this!
- Overall, this is pretty slick! It might even make it easier to set up testing for code reviewers-- I can just configure some groups and tell them to use the automated tester. Maybe that makes it too easy on reviewers? Hmm...
- Given that `automated_testing_paths.json` is user-specific, maybe it shouldn't be in GitHub and should be added to `.gitignore`. Thoughts?
- I wonder if maybe the `GROUPS` configuration should be defined in `automated_testing_paths.json` (which would maybe then be renamed to `automated_testing_config.json`). `GROUPS` is something you'll want to change locally and not check in changes, which makes it seem like configuration.
  - On the other hand, it could be that `GROUPS` stays in `automated_tester.py`, but can be overridden in `automated_testing_config.json`. When updating `config.ini.template`, `GROUPS` (and the comment that refers to it) would be updated to include current, viable configurations, but users can override that in their config file.
  - Thoughts on any of this?
- In `automated_testing_paths.json`, you say the paths may be relative to the server repo root, but then `runner_path` looks absolute, but relative to the client root. The `client_python` path you checked in also looks odd. I set mine to `"../itm-evaluation-client/venv/Scripts/python.exe"`, and that worked for me with a default setup.
- It might keep things cleaner to use the name of the branch under test (via `--branch`) as a directory name, instead of a file prefix. WDYT?
- I'm not sure we needed to support phase 1. I didn't test it (yet). I'm not asking you to remove it, though, because it might be useful when there's a phase 3...
I might have more later.
Oh, one more thing-- I merged in the latest from development. You should update
Please remove the
Sorry for the oversight, the existing
I removed the checked-in user-specific config file and replaced that approach with a sample `automated_testing_config.template.json`; the local copy (`automated_testing_config.json`) is now ignored by Git.
I took the latter approach here: the tester keeps checked-in default groups in `automated_tester.py`, which can be overridden locally via a `groups` object in `automated_testing_config.json`.
I changed the path semantics so all explicit configured paths are either absolute or relative to the server repo root.
I changed output layout to use the branch name as a directory instead of a filename prefix. Results are now generated under NEW FEATURES:
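As a rough sketch of the defaults-plus-override behavior described in these responses (the function and key names below are illustrative, not necessarily those used in `automated_tester.py`):

```python
import json
from pathlib import Path

# Checked-in defaults; a local automated_testing_config.json may override them.
DEFAULT_GROUPS = {
    "1": {"cfgs": ["DEFAULT"], "testing": True, "phase": 2},
}

def load_config(repo_root: Path, config_name: str = "automated_testing_config.json"):
    """Return (groups, paths): defaults overridden by the local, git-ignored config."""
    groups = dict(DEFAULT_GROUPS)
    paths = {}
    cfg_file = repo_root / config_name
    if cfg_file.is_file():
        data = json.loads(cfg_file.read_text())
        groups = data.get("groups", groups)  # a local "groups" object wins if present
        paths = {k: v for k, v in data.items() if k != "groups"}
    return groups, paths

def resolve_path(repo_root: Path, raw: str) -> Path:
    """Configured paths are either absolute or relative to the server repo root."""
    p = Path(raw)
    return p if p.is_absolute() else (repo_root / p).resolve()
```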
nextcen-dgemoets
left a comment
I could probably approve it as it is now. Great work.
What would be the LOE (level of effort) of saving the server output (stdout+stderr), in addition to the client output, in a separate file in `automated_test_results/<branch>`?
If we did that, then maybe the filename format would be `client_JUNE_OPENWORLD_4.txt` and `server_JUNE_OPENWORLD_4.txt`.
Finally, I guess that `automated_tester.py` should be updated in any PR that changes the relevant server configurations. To that end, could you update it for the current state of development so that there are two groups:
- "FEB_OPENWORLD", "JUNE_OPENWORLD", "APRIL_OPENWORLD", testing True; and
- "FEB_OPENWORLD", "JUNE_OPENWORLD", "APRIL_OPENWORLD", testing False
We are in a strange case where the DEFAULT configuration isn't currently supported.
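For reference, the validation the harness performs per the PR description (required keys, correct value types, non-empty `cfgs`, `phase` ∈ {1, 2}, duplicate detection, and every `cfg` matching a section in `config.ini`, including `DEFAULT`) could look roughly like the sketch below; the names are illustrative, not necessarily those in `automated_tester.py`:

```python
import configparser

REQUIRED_KEYS = {"cfgs", "testing", "phase"}

def validate_groups(groups: dict, config: configparser.ConfigParser) -> list[str]:
    """Return a list of validation errors (an empty list means the schema is valid)."""
    errors = []
    # Sections a cfg may legally reference; configparser excludes DEFAULT from sections().
    valid_sections = set(config.sections()) | {"DEFAULT"}
    for name, entry in groups.items():
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            errors.append(f"group {name}: missing keys {sorted(missing)}")
            continue
        if not isinstance(entry["cfgs"], list) or not entry["cfgs"]:
            errors.append(f"group {name}: 'cfgs' must be a non-empty list")
            continue
        if not isinstance(entry["testing"], bool):
            errors.append(f"group {name}: 'testing' must be a boolean")
        if entry["phase"] not in (1, 2):
            errors.append(f"group {name}: 'phase' must be 1 or 2")
        if len(set(entry["cfgs"])) != len(entry["cfgs"]):
            errors.append(f"group {name}: duplicate entries in 'cfgs'")
        for cfg in entry["cfgs"]:
            if cfg not in valid_sections:
                errors.append(f"group {name}: '{cfg}' not found in config.ini")
    return errors
```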
```python
'1': {
    'cfgs': ["DEFAULT", "FEB_OPENWORLD", "JUNE_OPENWORLD"],
    'testing': True,
    'phase': 2
},
'2': {
    'cfgs': ["DEFAULT"],
    'testing': False,
    'phase': 2
},
'3': {
    'cfgs': ["DEFAULT"],
    'testing': True,
    'phase': 1
},
```
Could you use double quotes here instead of single quotes so that it can be pasted into the `"groups"` section of `automated_testing_config.json`? We'll still have to change `True` to `true`, but it gets us most of the way there.
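For example, the double-quoted, JSON-ready version of the groups quoted above might look like this once pasted into the `"groups"` section of `automated_testing_config.json` (with Python's `True`/`False` lowercased):

```json
"groups": {
    "1": {
        "cfgs": ["DEFAULT", "FEB_OPENWORLD", "JUNE_OPENWORLD"],
        "testing": true,
        "phase": 2
    },
    "2": {
        "cfgs": ["DEFAULT"],
        "testing": false,
        "phase": 2
    },
    "3": {
        "cfgs": ["DEFAULT"],
        "testing": true,
        "phase": 1
    }
}
```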
Original Ticket: Here
Description:
This PR introduces a robust automated integration testing harness that captures dummy ADM run information across various server configurations. The idea was first introduced in #147, when a large number of configs had to otherwise be tested manually after key changes to server structure. In general, when reviewers were testing new functionality or changes due to a new data collection period, the server had to be manually started and stopped for each possible configuration. Additionally, the client `itm_minimal_runner` had to be manually invoked with a terminal command in the `itm-evaluation-client` repository. Finally, the output of the ADM run was not captured in an outfile and was instead just logged to the terminal, which can be lossy and difficult to compare across runs.

This harness significantly reduces reviewer effort and ensures more consistent checks across server configurations. First, the script validates the test matrix `GROUPS` against the active config file (`swagger_server/config.ini`). `GROUPS` contains entries that each define which config options to run the dummy ADM on, whether to do so in testing mode, and what phase the exercised scenarios belong to. The validation protocol ensures that the required keys (`cfgs`, `testing`, `phase`) are present and that the values associated with these keys have the correct types. It checks for edge cases such as an empty `cfgs`, duplicate entries in `GROUPS`, and `phase` not being ∈ {1, 2}, and it ensures that every `cfg` corresponds to a section in `config.ini` (including `DEFAULT`).

The testing harness then runs the TA3 server for each `cfg` in a chosen group (testing or normal mode), waits for the server to be ready by pinging `/ui/` in a loop, and invokes the client dummy ADM runner once the server has spun up. Port selection works as follows: a default port of 8080 is used, and users who do not want to select a port manually can enable `--auto-port` to pick a free local port and re-pick if a late conflict occurs before an ADM run. As an extra safeguard, port availability is re-validated before each `cfg` is run.

The file paths of the client repository root, client Python executable, and dummy ADM runner may be supplied as command line arguments. Absolute and repo-relative paths are both accepted, with relative paths resolved from the server repo root. If the CLI arguments are omitted or invalid, the script falls back to a local `automated_testing_config.json`, which is intended to be reviewer-specific and ignored by Git. A tracked `automated_testing_config.template.json` is provided as the starting point. In addition, default `GROUPS` remain in `automated_tester.py`, but they may be overridden locally via a `groups` object in `automated_testing_config.json`.

Once the file paths have been verified, the script starts the server for each `cfg`, runs a dummy ADM through the scenario, and records the combined stdout/stderr into branch-scoped output files under `automated_test_results/<branch>/`, named `<cfg>_GROUP_<group>.txt`. This keeps results isolated by branch while still making them easy to compare between runs, helping ensure that the dummy ADM was able to exercise scenarios in the desired way consistently across several environments.

Usage:
`GROUP` Validation Only: `python automated_tester.py --validate-only`

Full Execution Pipeline: `python automated_tester.py --group <N> --branch <NAME> [--client-root PATH] [--client-python PATH] [--runner-path PATH]`
CLI Flags:

- `--group {…}`: Which test group to run (required unless `--validate-only`)
- `--branch NAME`: Branch label used in output directory naming (required unless `--validate-only`)
- `--port N`: Port the server should use (default 8080; must be 1-65535 and available)
- `--auto-port`: Ask the OS for a free port; overrides `--port`
- `--client-root PATH`: Path to the evaluation client repo (absolute and relative paths both supported)
- `--client-python PATH`: Path to the client venv Python (absolute and relative paths both supported)
- `--runner-path PATH`: Path to the runner script; defaults to `<client_root>/itm_minimal_runner.py`
- `--validate-only`: Validate `GROUPS` against `swagger_server/config.ini` and exit (no path/port checks or execution)

Disclaimers:
- The chosen port may become unavailable between `cfgs`. With `--auto-port`, a new port is picked transparently to mitigate this race condition.
- The readiness check expects that `/ui/` returns HTTP 200 and that the response contains the words "Swagger UI." This is how the program determines whether the server is ready for the dummy ADM to run. If this endpoint changes or its contents are updated, the logic here must be modified to reflect that as well.

How To Test:
Preparing For Testing
- Review `swagger_server/config.ini`. Make sure that the `SCENARIO_DIRECTORY` variable points to the location of the relevant scenario YAML files.
- Copy `automated_testing_config.template.json` to a local `automated_testing_config.json` and update it to reflect these correct locations. If needed, `groups` may also be overridden there for local testing.

Validate Only
- Run `python automated_tester.py --validate-only`. This will check that the `GROUPS` object is valid given the domain defined by your active `config.ini` file. It should log `Validation Passed` to the terminal and then exit without any errors.
- Run `python automated_tester.py --validate-only --group 1 --branch test`. Ensure that the extra, irrelevant arguments have no adverse effect on script functionality.

Path Precedence and Validity
- Run `python automated_tester.py --group 1 --branch ITM-1100`. Ensure that the ADM runs were successful and that output files were generated under `automated_test_results/ITM-1100/`. This tests the local config-based path option.
- Keep a `groups` override in place if your active `swagger_server/config.ini` differs from the default checked-in groups. Then temporarily remove the path entries from `automated_testing_config.json` and pass all paths via command line arguments with `python automated_tester.py --group 1 --branch ITM-1100 [--client-root PATH] [--client-python PATH] [--runner-path PATH]`. Check that there are no issues with the dummy ADMs exercising the relevant scenario files.
- Pass an invalid path via `--client-python`. A warning should be logged, yet the script should recover by falling back to the local config and proceeding.
- Make the paths invalid in `automated_testing_config.json` and pass no path arguments on the command line. Run `python automated_tester.py --group 1 --branch ITM-1100`. Expect a list of aggregated errors logged to the terminal.

Port Handling Logic
- Run `python automated_tester.py --group 1 --branch ITM-1100 --port 70000`. Expect an error message indicating this issue logged to the terminal, followed by an exit.
- Occupy port `8080` using the command `python -m http.server 8080`. Then run `python automated_tester.py --group 1 --branch ITM-1100 --port 8080`. You should see an error message logged to the terminal and a graceful failure.
- With `8080` still busy, test auto port selection using `python automated_tester.py --group 1 --branch ITM-1100 --auto-port`. Another available port should be picked and the pipeline should proceed without errors.

GROUPS Schema Changes
After each of the following changes to either the default `GROUPS` in `automated_tester.py` or a local `groups` override in `automated_testing_config.json`, run `python automated_tester.py --validate-only`. Ensure that an error is logged to the terminal each time. This makes sure an invalid schema is caught and the pipeline is killed before the dummy ADM is run on any scenarios.

- Add a duplicate entry to `GROUPS`.
- Remove the `testing` or `phase` parameter from one of the entries.
- Change the type of a value (e.g., `"testing": "true"`).
- Set `cfgs` to be an empty list `[]`.
- Add a config name that has no section in `config.ini` to the `cfgs` list.
- Add a duplicate entry within a `cfgs` array.

Functional Runs
- Run `python automated_tester.py --group 1 --branch ITM-1100`. Manually inspect the files under `automated_test_results/ITM-1100/` to ensure that all scenario IDs/alignment targets were properly exercised for each configuration option and that no runtime errors occurred.
- Switch domains with `gradlew -Pdomain=triage`. Run this command in both your server repo directory and your client root directory. Next, run `python automated_tester.py --group 3 --branch ITM-1100`. Inspect the output file(s) under `automated_test_results/ITM-1100/` to ensure scenarios were exercised correctly. This confirms that the runner invocation includes `--domain triage` when required.
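As a final illustration, the port selection and readiness checks described above could be sketched as follows. These are illustrative helpers, not the exact code in `automated_tester.py`; the `/ui/` + "Swagger UI" contract comes from the Disclaimers section.

```python
import socket
import urllib.error
import urllib.request

def pick_free_port() -> int:
    """Ask the OS for a free local TCP port (the --auto-port behavior)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))  # port 0 lets the OS assign an ephemeral port
        return s.getsockname()[1]

def port_available(port: int) -> bool:
    """Re-validate availability right before launching the server for the next cfg."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind(("127.0.0.1", port))
            return True
        except OSError:
            return False

def server_ready(port: int, timeout: float = 2.0) -> bool:
    """Ready when GET /ui/ returns HTTP 200 and the body mentions 'Swagger UI'."""
    try:
        url = f"http://127.0.0.1:{port}/ui/"
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200 and b"Swagger UI" in resp.read()
    except (urllib.error.URLError, OSError):
        return False
```

In the harness's loop this would be polled repeatedly until `server_ready` returns `True`, and `port_available` rechecked before each `cfg` to trigger a transparent re-pick under `--auto-port`.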