-
Notifications
You must be signed in to change notification settings - Fork 29
[CI/mock_uss] Add optionnal 'modification of activated flight' behaviour #1290
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Indeed, that would probably be too much. Especially given that this specific different behavior only concerns F3548-21, and out of the 13 CI jobs running scenarios, only 3 or 4 actually concerns testing of this suite. What about a dedicated configuration instead, aimed at testing this alternative behavior? That way we can cherry-pick the scenarios impacted by this different behavior. |
| import_environment_variable(KEY_BEHAVIOR_LOCALITY, default="US.IndustryCollaboration") | ||
| import_environment_variable(KEY_CODE_VERSION, default="Unknown") | ||
|
|
||
| import_environment_variable( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@BenjaminPelletier IIRC during the InterUSS weekly several weeks ago, when we talked about #1276, you mentioned something about having some structure to alter the behavior of the mock USS, and I believe you suggested an approach. If so, is that it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can think of a few ways to select different behavior for mock_uss:
- Have a different mock_uss instance for each behavior
- Change mock_uss behavior for each test (or particular tests) at runtime/test time
- mock_uss selects a different behavior automatically based on some characteristic of the interactions
Currently, I think we have some combination of 1 and 2. Some configuration parameters are set per instance; see all MOCK_USS_* values in docker-compose.yaml. Some configuration is set according to locality. Locality is essentially a collection of behaviors/configuration settings. It's called a locality because the largest normal source of USS behavior variations is typically between jurisdictions/localities (e.g., U-space in Switzerland versus UTM Implementation in US). We currently set locality via "dynamic configuration" (API call to change locality) which is why it's in bucket 2, but I think it would be far better to automatically select a locality based on the geographic area in which actions were occurring (bucket 3) since this is more stateless and more similar to how real USSs would behave.
I don't think bucket 1 is a very good idea because it's difficult to manage and keep track of more mock_uss instances. So, I'm not a huge fan of configuring via environment variable since that's effectively configuring via mock_uss instance.
"Locality" is perhaps not an optimal word for "bundle of configurable behaviors", but I think the idea of a single place where configurable behaviors are selected together is still a good one. And, since the strongest candidate for bucket 3 is selecting via geography (and that's the place configuration selection will most likely be done by real USSs), I think "locality" is probably still a decent name for a bundle of configurable behaviors.
In the long term, I think the ideal outcome would be that a mock_uss is brought up with a configuration that specifies multiple configuration bundles (renamed from "localities") and their geographic bounds. Then, tests are conducted in the appropriate geographic areas for the expected outcomes. There would be no need for changing locality at runtime and probably some (but not all) of the MOCK_USS_* parameters would be moved into configuration bundles (especially auth spec, DSS URL, and public key). In this case, we would have a variant locality that looks like another standard locality, but with the different editable-flight behavior.
I'm open to what should be done in the meantime, but I'm not a huge fan of going down the "configuration per instance" route as configuration via environment variable implies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using locality seems a good option, I do have a few questions:
I think we want to be able to also have different behaviors in a same 'geographical location', e.g. to validate that interactions with two different USS is producing expected results. Would that be doable with locality settings or this is something that need to be adapted?
I also think that having a 'bad uss', that don't produce expected result, allowing a (specific) test suite to verify that tests fails as expected to would be a good idea, would you also use a locality in that case?
|
|
||
| if ( | ||
| not webapp.config[KEY_BEHAVIOUR_ACTIVATED_FLIGHTS_EDITABLES] | ||
| and old_status == FlightPlanStatus.OkToFly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that rather we want to check whether the area (existing_flight.flight_info.basic_information.area) has not changed if the old status is OkToFly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we? Because if that the case, I would rename the flag to ACTIVATED_FLIGHTS_AREA_EDITABLE then, would it be ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, because a state change is still OK when the status is OkToFly (e.g. going contingent). From what I've seen, the USSs not supporting modifications don't support modification of the definition of the flight plan when it is in flight. But the status can still change.
I would rename the flag to ACTIVATED_FLIGHTS_AREA_EDITABLE then, would it be ok?
Sounds reasonable
I would be more in favor of running alternates behaviors 'blindly', to be able to catch side effects / errors without having to think about affected tests and potentially missing some / having to maintain that list on each tests change. Assuming we can afford Seems there is a limit of 256 jobs / workflow run, so that may not be viable depending on how many alternate behavior we need. |
Makes sense.
Yeah and the load has to be realistic. If that works out though, sure. Have you given it a try? |
Not yet, I'm not sure how many different behavior we need / will have to create. Do you have some idea already that may have occurred in the past? Notice we probably also want 'bad' uss, to test that test are failed when they are supposed to fail. |
|
CI size is a big concern for us. From a strict standpoint, thorough testing would involve taking each major path (e.g., whether a check succeeds or fails) in each scenario. But, that's O[2^n] and therefore completely infeasible. So, I think we have to be fairly smart and judicious about how and what we test. Certainly "things we thought worked but didn't actually" are always strong candidates to be tested under representative conditions. But, I don't think we can simply add all the combinations that could be easily tested as that would produce a combinatorial explosion that is too large to actually handle. We certainly shouldn't ~double our CI load simply to verify behavior with and without this particular single rather-esoteric behavioral option turned on or off. |
Yes that not an option and it's definitely going to be too big. However, having as most path tested as possible, as long reasonable, would still be great no? In term of balance, having the "full" set ran only in specific cases (eg. only on master branch of before a release), with failures only considered as warnings would be an option? |
Proposition for #1276 for start to have multiple mock_uss behaviour.
A few things to notice: