feat(scripts): deterministic state cleanup for repeated cosim runs #12
Introduce run-ID namespacing, resource manifest, failure taxonomy, diagnostic artifact capture, and pre-launch health checks to ensure each cosim test session starts in a clean environment.
Key changes:
- cosim_lib.sh: shared library with run-ID generation, 12-category failure taxonomy, resource manifest (runtime/artifact), manifest-driven cleanup with post-teardown verification, force-clean orphan handling (dry-run default), pre-launch health check, preflight audit
- cosim_launch.sh: COSIM_RUN_ID namespaces all resources (container, socket, VRAM shmem, guest-RAM shmem); cleanup handler rewritten to classify the failure, then capture artifacts, before cleanup; health check between gem5 readiness and QEMU launch
- run_cosim_tests.sh: --repeat N flag with a fresh session per iteration, per-run artifact dirs, pass/fail matrix with failure category; the combinations --keep-alive + --repeat and --all + --repeat are rejected; namespace-aware cleanup_session with launcher category handoff
- cosim_preflight.sh: standalone preflight resource audit
Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6a25b9807d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
- Replace readonly with export for taxonomy constants (SC2034)
- Add SC2317 disable for trap-invoked functions (cleanup, on_exit, etc.)
- Add SC1091 disable for sourced cosim_lib.sh
- Remove unused sub_log variable in repeat mode (SC2034)
- Use pgrep instead of grepping ps output (SC2009)
- Use if-block instead of single-iteration for-loop (SC2043)
- Use grouped redirects for metadata writes (SC2129)
Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
P1: Only create test-done sentinel on test pass, not on failure,
preventing failed tests from being reclassified as pass during
launcher cleanup.
P2: Extend parse_size_bytes to handle GB/MB/KB/TB suffixes (SI units)
in addition to GiB/MiB/KiB/TiB, so --host-mem 8GB and similar
formats pass the readiness health check.
Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
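For illustration, a suffix-aware size parser along the lines of the P2 fix could look like the sketch below. This is not the PR's `parse_size_bytes`; it assumes that the SI suffixes (KB/MB/GB/TB) mean powers of 1000 and the IEC suffixes (KiB/MiB/GiB/TiB) mean powers of 1024, which the commit message does not spell out.

```shell
# Hypothetical sketch, not the code from cosim_lib.sh.
# Assumption: KB/MB/GB/TB are decimal (SI), KiB/MiB/GiB/TiB are binary (IEC).
parse_size_bytes() {
    local input="$1" num unit
    # Split the argument into its leading digits and the trailing unit suffix.
    num="${input%%[!0-9]*}"
    unit="${input#"$num"}"
    # Reject inputs that do not start with a number.
    [ -n "$num" ] || return 1
    case "$unit" in
        ""|B) echo "$num" ;;
        KiB)  echo $(( num * 1024 )) ;;
        MiB)  echo $(( num * 1024 * 1024 )) ;;
        GiB)  echo $(( num * 1024 * 1024 * 1024 )) ;;
        TiB)  echo $(( num * 1024 * 1024 * 1024 * 1024 )) ;;
        KB)   echo $(( num * 1000 )) ;;
        MB)   echo $(( num * 1000 * 1000 )) ;;
        GB)   echo $(( num * 1000 * 1000 * 1000 )) ;;
        TB)   echo $(( num * 1000 * 1000 * 1000 * 1000 )) ;;
        *)    return 1 ;;   # unknown suffix: let the caller fail the health check
    esac
}

parse_size_bytes 8GB    # prints 8000000000
parse_size_bytes 8GiB   # prints 8589934592
```

Returning a non-zero status for unknown suffixes keeps the decision about how to report the failure with the caller.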
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 778974611c
Filter force_clean_orphans to match only stopped/dead/created containers, not running ones. This prevents --force-clean --confirm from terminating live cosim sessions.
Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b5cce938fa
xxd is not available on minimal hosts. Use od (POSIX standard) to convert /dev/urandom bytes to hex instead.
Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
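As a rough sketch of the od-based conversion, the helpers below read random bytes and print them as lowercase hex, then combine that with a timestamp in the YYYYMMDD-HHMMSS-hex shape mentioned elsewhere in this PR. The function names and the 4-byte default are assumptions made for illustration, not the names used in cosim_lib.sh.

```shell
# Hypothetical helpers; names and the 4-byte default are assumptions.
random_hex() {
    local nbytes="${1:-4}"
    # -An drops the address column, -tx1 prints one hex byte per field,
    # -N limits how many bytes are read from /dev/urandom.
    od -An -tx1 -N "$nbytes" /dev/urandom | tr -d ' \n'
}

generate_run_id() {
    # Timestamp plus random suffix, e.g. YYYYMMDD-HHMMSS-<8 hex digits>.
    printf '%s-%s\n' "$(date +%Y%m%d-%H%M%S)" "$(random_hex 4)"
}

generate_run_id
```

The `tr -d ' \n'` step strips the padding and line breaks that od inserts, so the result is a single token that is safe to embed in container and file names.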
💡 Codex Review
cosim-gpu/scripts/run_cosim_tests.sh
Line 74 in 96d31c0
The fallback parser assumes every unknown --* option has a value ("$1" "$2"; shift 2), which now breaks valid launcher flags that take no argument (for example --confirm/--force-clean). With set -u, passing such a flag at the end causes an immediate unbound variable error, and passing it before the test filter consumes the filter as a fake value, so passthrough behavior no longer works for newly added launcher flags.
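One possible way to address this, sketched under the assumption that the set of valueless launcher flags is known in advance, is to match those flags before the generic `--*` branch and to guard the two-argument branch against a missing value. The `parse_args` function and the `LAUNCH_ARGS` and `TEST_FILTER` variables below are hypothetical names, not taken from run_cosim_tests.sh.

```shell
# Hypothetical sketch; function and variable names are illustrative only.
LAUNCH_ARGS=()
TEST_FILTER=""

parse_args() {
    LAUNCH_ARGS=()
    TEST_FILTER=""
    while [ "$#" -gt 0 ]; do
        case "$1" in
            --force-clean|--confirm)
                # Known valueless flags: forward exactly one word.
                LAUNCH_ARGS+=("$1"); shift ;;
            --*=*)
                # --key=value form carries its own value.
                LAUNCH_ARGS+=("$1"); shift ;;
            --*)
                # Generic passthrough option: require a value before reading $2.
                if [ "$#" -lt 2 ]; then
                    echo "error: option $1 requires a value" >&2
                    return 1
                fi
                LAUNCH_ARGS+=("$1" "$2"); shift 2 ;;
            *)
                TEST_FILTER="$1"; shift ;;
        esac
    done
}

parse_args --force-clean vector_add --confirm
echo "filter=$TEST_FILTER passthrough=${LAUNCH_ARGS[*]}"
# prints: filter=vector_add passthrough=--force-clean --confirm
```

The explicit `$#` check means a trailing option produces a readable error instead of an unbound-variable abort under `set -u`.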
Add run-ID liveness check to force_clean_orphans for sockets and shmem files. Extract the run-ID from resource filenames and verify that the corresponding container is not running before listing or deleting. Active resources are reported as "skipped".
Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fcee0ab274
Replace broken ${rid%%-[0-9]*} pattern (which truncated
YYYYMMDD-HHMMSS-hex IDs to just the date) with a regex that extracts
the complete YYYYMMDD-HHMMSS-hex run-ID from resource filenames.
Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
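To show why the old pattern truncated the ID: `%%` removes the longest suffix matching `-[0-9]*`, and that match starts at the first hyphen followed by a digit, which is the one right after the date. The sketch below contrasts it with a regex-based extraction; the filename and the `extract_run_id` helper are hypothetical, and the regex assumes a lowercase hex suffix.

```shell
# Hypothetical resource filename and helper, for illustration only.
rid="20250101-123456-a1b2c3d4.sock"

# Longest-suffix removal starts at the first "-<digit>", so only the date survives.
echo "${rid%%-[0-9]*}"    # prints 20250101

extract_run_id() {
    # Match the full YYYYMMDD-HHMMSS-hex shape anywhere in the name.
    local re='([0-9]{8}-[0-9]{6}-[0-9a-f]+)'
    [[ "$1" =~ $re ]] || return 1
    printf '%s\n' "${BASH_REMATCH[1]}"
}

extract_run_id "cosim-20250101-123456-a1b2c3d4.sock"   # prints 20250101-123456-a1b2c3d4
```

A regex like this only recognizes the default timestamp-hex format, so a user-supplied COSIM_RUN_ID in another shape would not match.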
💡 Codex Review
cosim-gpu/scripts/run_cosim_tests.sh
Line 74 in f219341
The fallback --* parser path always consumes two argv entries ("$1" "$2" + shift 2), so a valueless launch flag now fails under set -u with an unbound $2. This became user-visible after adding new no-argument launcher flags like --force-clean/--confirm: invoking run_cosim_tests.sh --force-clean ... aborts before any work starts instead of forwarding options as documented.
Replace regex-based run-ID extraction with a running-container set approach: collect all active gem5-cosim-* container names, extract their run-IDs, and check whether any active run-ID appears in the resource filename. This works with any COSIM_RUN_ID format, not just the default timestamp-hex pattern.
Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
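A minimal sketch of the running-container set check is shown below. It takes the list of active container names as arguments so that it can be exercised without a container runtime; the `resource_is_active` name and the sample filenames are assumptions, and only the gem5-cosim- prefix comes from the commit message.

```shell
# Hypothetical helper. Arguments: a resource filename, then the names of
# currently running containers. Returns 0 if the resource belongs to one.
resource_is_active() {
    local resource="$1" name rid
    shift
    for name in "$@"; do
        # Strip the container prefix to recover the run-ID in whatever format it has.
        rid="${name#gem5-cosim-}"
        # Skip names without the prefix and names with an empty run-ID.
        [ "$rid" != "$name" ] && [ -n "$rid" ] || continue
        case "$resource" in
            *"$rid"*) return 0 ;;   # quoted, so the run-ID is matched literally
        esac
    done
    return 1
}

if resource_is_active "cosim-myrun.sock" gem5-cosim-other gem5-cosim-myrun; then
    echo "skipped (active)"
else
    echo "orphan"
fi
# prints: skipped (active)
```

In a real script the name list could come from something like `docker ps --filter name=gem5-cosim- --format '{{.Names}}'`, assuming Docker is the container runtime in use. Because this is a substring match, a short run-ID that happens to be part of a longer one would also count as active, which errs on the side of not deleting.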
Summary
- `COSIM_RUN_ID` namespace to isolate all per-session resources (container, socket, VRAM shmem, guest-RAM shmem)
- `scripts/cosim_lib.sh` shared library: 12-category failure taxonomy, resource manifest (runtime/artifact), manifest-driven cleanup, post-teardown verification (10s timeout), pre-launch health check, preflight audit, diagnostic artifact capture
- `cosim_launch.sh` cleanup handler: classify failure → capture artifacts → cleanup → verify
- `--repeat N` flag to `run_cosim_tests.sh`: fresh session per iteration, pass/fail matrix with failure category
- `--force-clean` dry-run mode for orphan resource management

Test plan
- `bash -n scripts/cosim_lib.sh scripts/cosim_launch.sh scripts/run_cosim_tests.sh scripts/cosim_preflight.sh` syntax check passes
- `./scripts/cosim_launch.sh --force-clean` lists orphaned resources (dry-run)
- `./scripts/cosim_launch.sh` starts normally with Run-ID shown in banner
- `./scripts/run_cosim_tests.sh --repeat 20 vector_add` passes 20 consecutive sessions with zero infrastructure failures
- `./scripts/run_cosim_tests.sh --keep-alive --repeat 3 vector_add` rejects the combination and exits with error
- `artifacts/` directory contains gem5 logs, QEMU console log, and process snapshot

🤖 Generated with Claude Code