improve: Stale container cleanup, fail-fast parent selection, post-compilation smoke test #10

Open
Ryuketsukami wants to merge 1 commit into facebookresearch:main from Ryuketsukami:improve/generate-loop-robustness

@Ryuketsukami

Problem

Three robustness gaps in generate_loop.py:

1. Orphaned containers accumulate from crashed runs

If the process is killed after a container is created but before the finally block runs, orphaned hyperagents-* containers persist. Over multiple crashed runs, these consume Docker resources (memory, storage layers, network namespaces).

2. Parent selection retries 10 times on deterministic errors

select_next_parent_container() catches all exceptions uniformly and retries up to 10 times. If the evolved select_next_parent.py has a syntax error, every retry fails identically, burning 10 Docker container lifecycles (each with container creation, archive copy, patch application, and teardown).

3. Compilation check misses runtime errors

The sole pre-evaluation gate is python -c "from task_agent import TaskAgent", which is only an import check. A mutation that imports cleanly but crashes in forward() wastes expensive evaluation compute. Research (CMU S3D-25-101) confirms: "non-syntactic mistakes manifest only in specific test cases."
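
To illustrate the gap, here is a minimal sketch in which an import-only check passes while instantiation fails. The TaskAgent class below is a hypothetical stand-in, not the repository's class, and the expected forward(self, inputs) signature is an assumption made for the example. The helper name quick_check is also hypothetical.

```python
import inspect


class TaskAgent:
    """Hypothetical mutated agent: imports cleanly but is broken at runtime."""

    def __init__(self):
        # Bug introduced by a mutation: the name is undefined, but Python
        # only discovers that when __init__ actually runs.
        self.model = undefined_model_name  # noqa: F821

    def forward(self, inputs):
        return inputs


def quick_check(agent_cls, expected_params=("self", "inputs")):
    """Compare forward()'s parameter names, then instantiate the class.

    Returns (ok, reason). No LLM call is made. The expected parameter
    names are an assumption for this sketch.
    """
    params = tuple(inspect.signature(agent_cls.forward).parameters)
    if params != expected_params:
        return False, f"unexpected forward() parameters: {params}"
    try:
        agent_cls()
    except Exception as exc:  # any constructor failure counts as a failed check
        return False, f"{type(exc).__name__}: {exc}"
    return True, "ok"
```

Defining the class succeeds, which is all an import check exercises; quick_check(TaskAgent) reports the NameError raised by the constructor.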

Solution

  1. Container cleanup: At the start of generate_loop(), the code scans for and force-removes any containers matching hyperagents-*. The cleanup is wrapped in try/except so that a cleanup failure never blocks the loop.

  2. Fail-fast: After exec_run in parent selection, the code checks whether the error indicates a Python script failure (SyntaxError, ImportError, NameError, AttributeError, TypeError, IndentationError, traceback). Deterministic errors fail immediately; infrastructure errors still retry.

  3. Smoke test: New run_smoke_test() in gl_utils.py runs a check with a 60-second timeout inside the container: it imports TaskAgent, instantiates it (no LLM calls), and verifies the forward() signature. It is called after the compilation check. If it fails, evaluation is skipped.
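
The cleanup step in item 1 could look roughly like the sketch below. It is not the code from this PR: the function name cleanup_stale_containers and the prefix parameter are hypothetical, and client is assumed to be a Docker SDK for Python client (for example, the object returned by docker.from_env()). Docker's name filter matches substrings, so the sketch re-checks the prefix before removing anything.

```python
def cleanup_stale_containers(client, prefix="hyperagents-"):
    """Force-remove leftover containers whose name starts with `prefix`.

    `client` is assumed to expose the Docker SDK for Python interface
    (client.containers.list and Container.remove). Returns the names that
    were removed. Every failure is swallowed so that cleanup can never
    block the main loop.
    """
    removed = []
    try:
        # all=True includes stopped containers as well as running ones.
        candidates = client.containers.list(all=True, filters={"name": prefix})
    except Exception:
        return removed
    for container in candidates:
        if not container.name.startswith(prefix):
            continue
        try:
            container.remove(force=True)
            removed.append(container.name)
        except Exception:
            # Best effort: a container that cannot be removed is skipped.
            pass
    return removed
```

Because the match is by name prefix only, a call like this would also remove containers that belong to another run still active on the same host.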

Tests

Added tests/test_smoke_test.py (6 tests):

  • Returns True on success (sentinel present)
  • Returns False on non-zero exit, missing sentinel, empty output, exceptions
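
The pass/fail cases listed above can be sketched as a small decision function. This is an illustration rather than the implementation in gl_utils.py: the sentinel string, the function name smoke_test_passed, and the run_check callable (standing in for the in-container command, returning an exit code and its output) are all assumptions.

```python
SENTINEL = "SMOKE_TEST_OK"  # hypothetical marker printed by the in-container script


def smoke_test_passed(run_check, sentinel=SENTINEL):
    """Return True only if the check exits with 0 and prints the sentinel.

    `run_check` is any zero-argument callable returning (exit_code, output),
    where output is bytes or str. Any exception counts as a failure.
    """
    try:
        exit_code, output = run_check()
    except Exception:
        return False
    if exit_code != 0 or not output:
        return False
    if isinstance(output, bytes):
        output = output.decode("utf-8", errors="replace")
    return sentinel in output
```

Requiring both a zero exit code and the sentinel guards against a script that exits cleanly without ever reaching the end of the check.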

Question for maintainers

For the fail-fast logic: is there a scenario where a Python SyntaxError in select_next_parent.py should be retried? The current implementation treats all script-level errors as deterministic (non-retryable).
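
To make the question concrete, the classification could be sketched as follows. This is an illustration, not the code in this PR: is_deterministic_failure, run_with_retries, and the marker tuple are hypothetical names, and attempt stands in for one container lifecycle that returns an exit code and its text output.

```python
# Substrings treated as evidence of a script-level (deterministic) failure.
# The exact marker list is an assumption based on the error names above.
DETERMINISTIC_MARKERS = (
    "SyntaxError",
    "IndentationError",
    "ImportError",
    "NameError",
    "AttributeError",
    "TypeError",
    "Traceback (most recent call last)",
)


def is_deterministic_failure(output):
    """Return True if the output looks like a Python script error."""
    return any(marker in output for marker in DETERMINISTIC_MARKERS)


def run_with_retries(attempt, max_attempts=10):
    """Call `attempt` until it succeeds, failing fast on script errors.

    `attempt` is a zero-argument callable returning (exit_code, output).
    """
    last_output = ""
    for attempt_number in range(1, max_attempts + 1):
        exit_code, output = attempt()
        if exit_code == 0:
            return output
        last_output = output
        if is_deterministic_failure(output):
            # Retrying would run the same broken script again.
            raise RuntimeError(
                f"deterministic failure on attempt {attempt_number}: {output}"
            )
    raise RuntimeError(f"gave up after {max_attempts} attempts: {last_output}")
```

A substring match of this kind can misclassify an infrastructure failure whose message happens to contain one of these names, which is one case where a retry might still be worthwhile.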

Three robustness improvements to generate_loop.py:

1. Stale container cleanup: At loop start, scans for and
   removes orphaned hyperagents-* containers from previous
   crashed runs. Prevents resource accumulation.

2. Fail-fast for deterministic errors: In parent selection
   retry logic, distinguishes Python script errors
   (SyntaxError, ImportError, etc.) from transient container
   failures. Deterministic errors fail immediately instead
   of burning 10 container lifecycles.

3. Lightweight smoke test: After compilation check passes,
   verifies TaskAgent is instantiable and forward() has the
   right signature. Catches runtime errors that import-only
   checks miss, at negligible cost (no LLM calls).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@meta-cla

meta-cla bot commented Mar 25, 2026

Hi @Ryuketsukami!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g., your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

meta-cla bot added the CLA Signed label Mar 25, 2026
@meta-cla

meta-cla bot commented Mar 25, 2026

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!
