feat(scripts): deterministic state cleanup for repeated cosim runs #12

Merged: zevorn merged 8 commits into main from fix/cosim-repeat-reliability on Apr 29, 2026

Conversation

@zevorn (Owner) commented Apr 29, 2026

Summary

  • Introduce COSIM_RUN_ID namespace to isolate all per-session resources (container, socket, VRAM shmem, guest-RAM shmem)
  • Add scripts/cosim_lib.sh shared library: 12-category failure taxonomy, resource manifest (runtime/artifact), manifest-driven cleanup, post-teardown verification (10s timeout), pre-launch health check, preflight audit, diagnostic artifact capture
  • Rewrite cosim_launch.sh cleanup handler: classify failure → capture artifacts → cleanup → verify
  • Add --repeat N flag to run_cosim_tests.sh: fresh session per iteration, pass/fail matrix with failure category
  • Add --force-clean dry-run mode for orphan resource management
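
The namespacing idea in the first bullet can be sketched as follows. This is a minimal illustration, not the code from cosim_launch.sh: only the `gem5-cosim-` container prefix appears elsewhere in this PR, and the socket and shmem paths are hypothetical placeholders.

```shell
#!/usr/bin/env bash
# Sketch: derive every per-session resource name from one run ID.
# Only the "gem5-cosim-" container prefix is taken from this PR; the
# socket and shmem paths are hypothetical placeholders.
set -u

cosim_resource_names() {
    local run_id="$1"
    printf 'container=%s\n'       "gem5-cosim-${run_id}"
    printf 'socket=%s\n'          "/tmp/cosim-${run_id}.sock"      # hypothetical
    printf 'vram_shmem=%s\n'      "/dev/shm/cosim-vram-${run_id}"  # hypothetical
    printf 'guest_ram_shmem=%s\n' "/dev/shm/cosim-ram-${run_id}"   # hypothetical
}

# Fall back to an example ID when COSIM_RUN_ID is not set.
cosim_resource_names "${COSIM_RUN_ID:-20260429-120000-deadbeef}"
```

Because every name embeds the same ID, two sessions with different run IDs cannot collide on any of the four resources.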

Test plan

  • bash -n scripts/cosim_lib.sh scripts/cosim_launch.sh scripts/run_cosim_tests.sh scripts/cosim_preflight.sh syntax check passes
  • ./scripts/cosim_launch.sh --force-clean lists orphaned resources (dry-run)
  • ./scripts/cosim_launch.sh starts normally with Run-ID shown in banner
  • ./scripts/run_cosim_tests.sh --repeat 20 vector_add passes 20 consecutive sessions with zero infrastructure failures
  • ./scripts/run_cosim_tests.sh --keep-alive --repeat 3 vector_add rejects the combination and exits with error
  • On failure, artifacts/ directory contains gem5 logs, QEMU console log, and process snapshot
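
The option-conflict check exercised by the `--keep-alive --repeat 3` test case might look roughly like this. The function name, argument order, error text, and exit code are all assumptions for illustration; the PR does not show the actual check.

```shell
#!/usr/bin/env bash
# Sketch of rejecting flag combinations that conflict with --repeat.
# Arguments: repeat count (0 = flag not given), keep-alive (0/1), all (0/1).
# Names and the exit code 2 are hypothetical.
check_repeat_conflicts() {
    local repeat="$1" keep_alive="$2" run_all="$3"
    [ "$repeat" -gt 0 ] || return 0   # --repeat absent: nothing to validate
    if [ "$keep_alive" = 1 ]; then
        echo "error: --keep-alive cannot be combined with --repeat" >&2
        return 2
    fi
    if [ "$run_all" = 1 ]; then
        echo "error: --all cannot be combined with --repeat" >&2
        return 2
    fi
    return 0
}
```

Rejecting the combination up front keeps the "fresh session per iteration" guarantee: a kept-alive session would otherwise leak state into the next iteration.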

🤖 Generated with Claude Code

Introduce run-ID namespacing, resource manifest, failure taxonomy,
diagnostic artifact capture, and pre-launch health checks to ensure
each cosim test session starts in a clean environment.

Key changes:
- cosim_lib.sh: shared library with run-ID generation, 12-category
  failure taxonomy, resource manifest (runtime/artifact), manifest-
  driven cleanup with post-teardown verification, force-clean orphan
  handling (dry-run default), pre-launch health check, preflight audit
- cosim_launch.sh: COSIM_RUN_ID namespaces all resources (container,
  socket, VRAM shmem, guest-RAM shmem), cleanup handler rewritten to
  classify failure then capture artifacts before cleanup, health check
  between gem5 readiness and QEMU launch
- run_cosim_tests.sh: --repeat N flag with fresh session per iteration,
  per-run artifact dirs, pass/fail matrix with failure category,
  --keep-alive+--repeat and --all+--repeat rejected, namespace-aware
  cleanup_session with launcher category handoff
- cosim_preflight.sh: standalone preflight resource audit

Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
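
The cleanup ordering described for cosim_launch.sh (classify, then capture artifacts, then clean up, then verify) can be sketched with an EXIT trap. The four step functions here are hypothetical stand-ins that only record the order in which they run; the real handler's names and bodies are not shown in this PR.

```shell
#!/usr/bin/env bash
# Sketch of the handler ordering: classify -> capture -> cleanup -> verify.
# The step functions are placeholders that just log their invocation.
COSIM_STEPS=()

classify_failure()  { COSIM_STEPS+=("classify:$1"); }
capture_artifacts() { COSIM_STEPS+=("capture"); }
cleanup_resources() { COSIM_STEPS+=("cleanup"); }
verify_teardown()   { COSIM_STEPS+=("verify"); }

on_exit() {
    local rc="$1"
    trap - EXIT              # make sure the handler runs only once
    classify_failure "$rc"   # decide the category while state is intact
    capture_artifacts        # logs must be copied before anything is deleted
    cleanup_resources
    verify_teardown
    return "$rc"
}

trap 'on_exit $?' EXIT
```

Capturing artifacts before cleanup matters because cleanup removes the very logs and process state a diagnosis would need.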

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6a25b9807d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread scripts/run_cosim_tests.sh Outdated
Comment thread scripts/cosim_launch.sh Outdated
zevorn added 2 commits April 30, 2026 00:03
- Replace readonly with export for taxonomy constants (SC2034)
- Add SC2317 disable for trap-invoked functions (cleanup, on_exit, etc.)
- Add SC1091 disable for sourced cosim_lib.sh
- Remove unused sub_log variable in repeat mode (SC2034)
- Use pgrep instead of grepping ps output (SC2009)
- Use if-block instead of single-iteration for-loop (SC2043)
- Use grouped redirects for metadata writes (SC2129)

Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
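
As an illustration of the SC2129 item above, several appends to one file can be grouped under a single redirect. The function and field names below are hypothetical, not taken from the scripts.

```shell
#!/usr/bin/env bash
# Sketch of the SC2129 fix: one grouped redirect instead of repeated ">>".
# Function name and metadata fields are illustrative only.
write_metadata() {
    local out="$1" run_id="$2" category="$3"
    {
        printf 'run_id=%s\n' "$run_id"
        printf 'category=%s\n' "$category"
    } > "$out"
}
```

The file is opened once for the whole group, which is both what ShellCheck suggests and slightly cheaper than reopening it per line.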
P1: Only create test-done sentinel on test pass, not on failure,
    preventing failed tests from being reclassified as pass during
    launcher cleanup.

P2: Extend parse_size_bytes to handle GB/MB/KB/TB suffixes (SI units)
    in addition to GiB/MiB/KiB/TiB, so --host-mem 8GB and similar
    formats pass the readiness health check.

Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
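
A size parser that accepts both suffix families could be sketched as below. This is not the `parse_size_bytes` from cosim_lib.sh: it assumes SI suffixes (KB, MB, GB, TB) are decimal powers of 1000 and the "i" suffixes are powers of 1024, which the real function may or may not do.

```shell
#!/usr/bin/env bash
# Sketch of a size parser for KiB..TiB and KB..TB suffixes.
# Assumption: SI suffixes are powers of 1000, binary suffixes powers of 1024.
parse_size_bytes() {
    local input="$1" num unit mult
    local re='^([0-9]+)([A-Za-z]*)$'
    if [[ "$input" =~ $re ]]; then
        num="${BASH_REMATCH[1]}"
        unit="${BASH_REMATCH[2]}"
    else
        echo "invalid size: $input" >&2
        return 1
    fi
    case "$unit" in
        ""|B) mult=1 ;;
        KiB)  mult=$((1024)) ;;
        MiB)  mult=$((1024 ** 2)) ;;
        GiB)  mult=$((1024 ** 3)) ;;
        TiB)  mult=$((1024 ** 4)) ;;
        KB)   mult=$((1000)) ;;
        MB)   mult=$((1000 ** 2)) ;;
        GB)   mult=$((1000 ** 3)) ;;
        TB)   mult=$((1000 ** 4)) ;;
        *)    echo "unknown unit: $unit" >&2; return 1 ;;
    esac
    # 10# forces base 10 so a value like "08" is not read as octal.
    echo $((10#$num * mult))
}
```

With these assumptions, `parse_size_bytes 8GB` prints 8000000000 and `parse_size_bytes 8GiB` prints 8589934592.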

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 778974611c

Comment thread scripts/cosim_lib.sh Outdated
Filter force_clean_orphans to only match stopped/dead/created
containers, not running ones. Prevents --force-clean --confirm
from terminating live cosim sessions.

Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
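
The state filter can be illustrated with a small function that reads `name<TAB>state` lines and keeps only non-running containers. How the real `force_clean_orphans` obtains and filters the list is not shown in this PR; the input format and the accepted state names below are assumptions (a listing such as `docker ps -a --format '{{.Names}}\t{{.State}}'` could produce this input on Docker versions that support `.State`).

```shell
#!/usr/bin/env bash
# Sketch: keep only containers that are safe orphan candidates.
# Input on stdin: "<name><TAB><state>" per line (format is an assumption).
filter_orphan_containers() {
    local name state
    while IFS=$'\t' read -r name state; do
        case "$state" in
            exited|stopped|dead|created) printf '%s\n' "$name" ;;
            *) ;;  # running, paused, restarting, ...: never touch
        esac
    done
}
```

Using an allow-list of states means any state the script does not recognize is treated as live, which errs on the side of not killing a session.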

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b5cce938fa

Comment thread scripts/cosim_lib.sh Outdated
xxd is not available on minimal hosts. Use od (POSIX standard)
to convert /dev/urandom bytes to hex instead.

Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
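
A minimal sketch of the `od`-based replacement, producing an ID in the YYYYMMDD-HHMMSS-hex shape mentioned elsewhere in this PR. The function names and the 4-byte suffix length are assumptions.

```shell
#!/usr/bin/env bash
# Sketch: hex from /dev/urandom using od (POSIX) instead of xxd.
random_hex() {
    local nbytes="${1:-4}"
    # -An: no address column, -N: byte count, -tx1: one hex byte per field
    od -An -N"$nbytes" -tx1 /dev/urandom | tr -d ' \n'
}

generate_run_id() {
    printf '%s-%s\n' "$(date +%Y%m%d-%H%M%S)" "$(random_hex 4)"
}
```

`tr` strips the padding spaces and trailing newline that `od` emits, leaving a bare lowercase hex string.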

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

--*) PASSTHROUGH_ARGS+=("$1" "$2"); shift 2 ;;

P2: Handle flag-only passthrough options in test runner

The fallback parser assumes every unknown --* option has a value ("$1" "$2"; shift 2), which now breaks valid launcher flags that take no argument (for example --confirm/--force-clean). With set -u, passing such a flag at the end causes an immediate unbound variable error, and passing it before the test filter consumes the filter as a fake value, so passthrough behavior no longer works for newly added launcher flags.
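
One way to address this, sketched under assumptions: list the known valueless flags explicitly and guard the two-argument fallback. The flag list, variable names, and error handling are illustrative; this is not the fix that landed in run_cosim_tests.sh.

```shell
#!/usr/bin/env bash
# Sketch: forward unknown --flags, treating a known set as valueless.
set -u
PASSTHROUGH_ARGS=()
TEST_FILTER=""

parse_args() {
    PASSTHROUGH_ARGS=()
    TEST_FILTER=""
    while [ $# -gt 0 ]; do
        case "$1" in
            --force-clean|--confirm)        # flags that take no value
                PASSTHROUGH_ARGS+=("$1"); shift ;;
            --*)
                if [ $# -lt 2 ]; then       # avoid an unbound $2 under set -u
                    echo "error: $1 requires a value" >&2
                    return 2
                fi
                PASSTHROUGH_ARGS+=("$1" "$2"); shift 2 ;;
            *)
                TEST_FILTER="$1"; shift ;;
        esac
    done
}
```

With this shape, `parse_args --force-clean vector_add` forwards one argument and still sees `vector_add` as the test filter, while `parse_args --host-mem 8GB --confirm` forwards three.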

Comment thread scripts/cosim_lib.sh
Add run-ID liveness check to force_clean_orphans for sockets and
shmem files. Extract run-ID from resource filenames and verify
the corresponding container is not running before listing or
deleting. Active resources are reported as "skipped".

Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fcee0ab274

Comment thread scripts/cosim_lib.sh Outdated
Replace broken ${rid%%-[0-9]*} pattern (which truncated YYYYMMDD-
HHMMSS-hex IDs to just the date) with a regex that extracts the
complete YYYYMMDD-HHMMSS-hex run-ID from resource filenames.

Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
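
The truncation is easy to reproduce. In the snippet below, `%%` strips the longest suffix matching the glob `-[0-9]*` (a hyphen, one digit, then anything), which starts at the first hyphen of a timestamp-hex ID. The example ID, the file name, and the `extract_run_id` helper are hypothetical; the regex is one possible way to match the full token, not necessarily the one used in the commit.

```shell
#!/usr/bin/env bash
# Demonstrates the truncation and a regex-based extraction.
rid="20260429-235959-a1b2c3d4"
truncated="${rid%%-[0-9]*}"   # longest suffix "-235959-a1b2c3d4" is removed
echo "$truncated"             # prints 20260429

extract_run_id() {
    # Match a full YYYYMMDD-HHMMSS-hex token anywhere in the name.
    local re='[0-9]{8}-[0-9]{6}-[0-9a-f]+'
    if [[ "$1" =~ $re ]]; then
        printf '%s\n' "${BASH_REMATCH[0]}"
    else
        return 1
    fi
}

extract_run_id "cosim-vram-${rid}"   # prints 20260429-235959-a1b2c3d4
```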

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

--*) PASSTHROUGH_ARGS+=("$1" "$2"); shift 2 ;;

P2: Handle valueless passthrough flags in test-runner parsing

The fallback --* parser path always consumes two argv entries ("$1" "$2" + shift 2), so a valueless launch flag now fails under set -u with an unbound $2. This became user-visible after adding new no-argument launcher flags like --force-clean/--confirm: invoking run_cosim_tests.sh --force-clean ... aborts before any work starts instead of forwarding options as documented.

Comment thread scripts/cosim_lib.sh Outdated
Replace regex-based run-ID extraction with a running-container set
approach: collect all active gem5-cosim-* container names, extract
their run-IDs, and check if any active run-ID appears in the
resource filename. Works with any COSIM_RUN_ID format, not just
the default timestamp-hex pattern.

Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
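
The running-container set approach can be sketched as two small helpers. Container names are passed in as arguments here so the example is self-contained; in the real script they would come from a listing of running `gem5-cosim-*` containers. Function names and the example paths are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: a resource is "active" if any live run ID occurs in its name.
active_run_ids() {
    local name
    for name in "$@"; do
        case "$name" in
            gem5-cosim-*) printf '%s\n' "${name#gem5-cosim-}" ;;
        esac
    done
}

resource_is_active() {
    local resource="$1"; shift
    local rid
    for rid in "$@"; do
        [ -n "$rid" ] || continue          # an empty ID would match everything
        case "$resource" in
            *"$rid"*) return 0 ;;          # quoted: matched literally, not as a glob
        esac
    done
    return 1
}
```

Because the check is a plain substring match against IDs taken from live containers, it makes no assumption about what a run ID looks like, so a custom `COSIM_RUN_ID` such as `nightly` works the same as the default timestamp-hex form.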
@zevorn merged commit de596d2 into main on Apr 29, 2026
5 checks passed