fix(tests): make integration tests idempotent against live models [LNDENG-4308]#119
Conversation
…DENG-4308] - Replace all juju.ssh() calls with juju.exec() — JAAS does not proxy SSH to units, so juju.ssh is broken on live PS7 models. juju.exec runs the command directly over the Juju API. Affected: wait_for_service (helpers.py), test_landscape_schema_migrated, test_ubuntu_installer_attach_toggle_no_maintenance - Wrap test_modern_database_relation and test_legacy_db_relation in try/finally so restore_db_relations() is guaranteed to run even if the assertion fails, leaving the model in a known-good state. - Skip test_upgrade_action_updates_ppa when USE_HOST_JUJU_MODEL=True. This test mutates PPA state on the live unit and is unsuitable for live model runs; it belongs in a dedicated upgrade pipeline. - Fix USE_HOST_JUJU_MODEL and USE_HOST_LBAAS_MODEL to use bool(os.getenv(...)) instead of os.getenv(..., False) — the old form returned None or str, which made the type misleading even though truthiness was correct.
There was a problem hiding this comment.
Pull request overview
This PR updates the Landscape Server charm integration tests to be safe and repeatable when executed against live Juju models (notably JAAS/PS7), avoiding non-idempotent mutations and replacing SSH-dependent checks with API-driven execution.
Changes:
- Replace
juju.sshusages withjuju.execfor service/psql checks to work on JAAS-backed live models. - Make DB-relation mutation tests idempotent by ensuring relations are restored via
try/finally. - Skip the PPA-mutating upgrade test when running against a host/live model; fix host-model env flags to be actual booleans.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
tests/integration/test_bundle.py |
Switches unit command execution to juju.exec, adds try/finally restoration for DB relation tests, and skips a non-idempotent upgrade test on live models. |
tests/integration/helpers.py |
Updates wait_for_service to use juju.exec instead of SSH for JAAS compatibility. |
tests/integration/conftest.py |
Converts host-model env flags to bool for correct typing/usage in test logic. |
| elif "database" not in initial_relations: | ||
| juju.integrate("landscape-server:database", "postgresql:database") | ||
| juju.wait(lambda status: has_modern_pg(juju), timeout=120) | ||
| juju.integrate("landscape-server:database", "postgresql:database") |
There was a problem hiding this comment.
After switching from the legacy db relation to database, this branch calls juju.integrate(...) but doesn’t wait for the modern relation to actually appear before reading juju.status() and asserting. This is inconsistent with the elif branch (which waits for has_modern_pg) and can lead to flaky failures if relation establishment is not immediate. Consider waiting for has_modern_pg(juju) (or jubilant.all_active) after the integrate in this branch as well.
| juju.integrate("landscape-server:database", "postgresql:database") | |
| juju.integrate("landscape-server:database", "postgresql:database") | |
| juju.wait(lambda status: has_modern_pg(juju), timeout=120) |
| juju.remove_relation( | ||
| "landscape-server:database", "postgresql:database", force=True | ||
| ) | ||
| juju.wait(lambda status: not has_modern_pg(juju), timeout=120) | ||
| juju.integrate("landscape-server:db", "postgresql:db-admin") | ||
|
|
||
| relations = set(juju.status().apps["landscape-server"].relations) | ||
| elif "db" not in initial_relations: | ||
| juju.integrate("landscape-server:db", "postgresql:db-admin") | ||
| juju.wait(lambda status: has_legacy_pg(juju), timeout=120) | ||
|
|
||
| assert "db" in relations | ||
| relations = set(juju.status().apps["landscape-server"].relations) |
There was a problem hiding this comment.
In the branch that removes the modern database relation, the test immediately calls juju.integrate("landscape-server:db", ...) and then proceeds to read status/assert without waiting for has_legacy_pg(juju) (unlike the elif branch). This can be racey/flaky if the relation doesn’t show up in juju.status() immediately. Consider adding a wait for has_legacy_pg(juju) after the integrate in this branch too.
| with pytest.raises(Exception): | ||
| juju.ssh( | ||
| name, | ||
| juju.exec( | ||
| f"systemctl is-active {LANDSCAPE_UBUNTU_INSTALLER_ATTACH}.service", | ||
| unit=name, | ||
| ) |
There was a problem hiding this comment.
pytest.raises(Exception) is very broad here and can mask unexpected failures (e.g., command transport errors) that aren’t actually “service is inactive”. Since wait_for_service() already treats jubilant.CLIError as the expected failure mode for non-zero exits, it would be safer to assert on jubilant.CLIError (and optionally check the output/return code if available) so the test only passes for the intended reason.
…4308] - test_modern_database_relation: add juju.wait(has_modern_pg) after integrating database relation when switching from legacy, so status is polled until the relation actually appears before asserting - test_legacy_db_relation: same — add juju.wait(has_legacy_pg) after integrating db relation when switching from modern - test_ubuntu_installer_attach_toggle_no_maintenance: narrow pytest.raises(Exception) to pytest.raises(jubilant.CLIError) so only the expected non-zero exit failure is caught and transport errors are not
…ailover tests [LNDENG-4308] - test_leader_services_not_on_non_leaders: assert LEADER_SERVICES are NOT active on non-leader units (explicit negative; safe on live models) - test_service_ports_listening: verify all-unit ports and leader-only ports via ss; guards against HAProxy charm config regressions (safe on live models) - test_leadership_change_service_follows: remove leader unit, verify LEADER_SERVICES migrate to new leader and DEFAULT_SERVICES stay up (skip on live model) - test_scale_out_and_in: add/remove a unit, verify service placement throughout (skip on live model) - test_postgresql_failover_services_recover: stop pg leader, verify landscape-server recovers automatically and appserver stays healthy (skip on live model)
Implements LNDENG-4308: Make landscape-server-operator integration tests fully idempotent against live models.
Required before
integration-tests.yamlincanonical/landscape-saas(#141) can run safely against PS7.Changes
juju.ssh→juju.exec(JAAS incompatibility)JAAS does not proxy SSH to units —
juju.sshsimply doesn't work on PS7 live models.juju.execruns the command directly over the Juju API and works correctly.Affected:
helpers.py—wait_for_servicetest_landscape_schema_migrated— psql schema checktest_ubuntu_installer_attach_toggle_no_maintenance— check service inactivetest_modern_database_relation+test_legacy_db_relation— try/finallyrestore_db_relations()was only called on the happy path. If the assertion failed the model was left with a broken DB relation. Wrapped intry/finally.test_upgrade_action_updates_ppa— skip on live modelThis test mutates PPA state (
add-apt-repository) on the live unit and runsjuju.sshinternally. Not safe against a live PS7 model. Skips whenUSE_HOST_JUJU_MODEL=True; belongs in a dedicated upgrade pipeline.USE_HOST_JUJU_MODEL/USE_HOST_LBAAS_MODELtype fixChanged from
os.getenv(..., False)(returnsstr | None) tobool(os.getenv(...))(returnsbool). Behaviour is identical but the type annotation is now correct.