Add EnvironmentCheckpoint and CheckpointableEnvironment protocol#96
Add EnvironmentCheckpoint and CheckpointableEnvironment protocol#96joshgreaves wants to merge 1 commit intomainfrom
Conversation
📝 WalkthroughWalkthroughAdds EnvironmentCheckpoint and CheckpointableEnvironment types, a _CheckpointJanitor with atexit cleanup, exposes the new types at package root, adds tests for checkpoint lifecycle and janitor behavior, and prevents duplicate Harbor dataset registrations in presets. Changes
Sequence DiagramsequenceDiagram
participant User
participant Env as Environment
participant CP as EnvironmentCheckpoint
participant Janitor as _CheckpointJanitor
User->>Env: await checkpoint()
Env->>CP: create(restore_fn, release_fn, timestep, step_count)
CP->>Janitor: register(self)
User->>CP: await restore()
CP->>CP: call restore_fn -> new_env
CP-->>User: return (new_env, timestep)
User->>CP: await release()
CP->>CP: call release_fn (optional)
CP->>Janitor: unregister(self)
Note over Janitor: at exit -> _cleanup()
Janitor->>CP: call release_fn_sync for remaining checkpoints
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (3)
src/ares/environments/checkpoint_test.py (1)
25-33: Consider releasing checkpoints in tests to avoid polluting the global janitor.Each
EnvironmentCheckpointcreated in tests auto-registers with_CHECKPOINT_JANITOR. While this doesn't cause test failures (entries are cleared at interpreter exit), it could affect test isolation if the janitor's state is inspected or if tests run in the same process accumulate many entries.📝 Example cleanup pattern
checkpoint = base.EnvironmentCheckpoint(...) try: env, ts = await checkpoint.restore() assert env is mock_env assert ts is simple_timestep finally: await checkpoint.release()Alternatively, use a pytest fixture that auto-releases checkpoints after tests.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ares/environments/checkpoint_test.py` around lines 25 - 33, The test creates an EnvironmentCheckpoint which auto-registers with _CHECKPOINT_JANITOR; update the test to explicitly release that checkpoint to avoid polluting global janitor state by ensuring you call await checkpoint.release() after using checkpoint.restore(). Wrap the restore/assert block in a try/finally (or use a pytest fixture) so that await checkpoint.release() runs even on assertion failures; reference the EnvironmentCheckpoint instance named checkpoint and its restore() and release() methods.src/ares/environments/base.py (2)
270-271: Minor: Consider usingEnvironmentCheckpoint[Any]for the type annotation.The bare
EnvironmentCheckpointwithout a type parameter will work at runtime but loses type information. UsingEnvironmentCheckpoint[Any]orEnvironmentCheckpoint[object]would be more explicit.📝 Suggested annotation
+from typing import Any + class _CheckpointJanitor: ... def __init__(self): - self._checkpoints: dict[int, EnvironmentCheckpoint] = {} + self._checkpoints: dict[int, EnvironmentCheckpoint[Any]] = {}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ares/environments/base.py` around lines 270 - 271, The type annotation for the _checkpoints attribute in __init__ should be made explicit by using a parameterized EnvironmentCheckpoint to preserve generic type info; change the annotation from dict[int, EnvironmentCheckpoint] to dict[int, EnvironmentCheckpoint[Any]] (or EnvironmentCheckpoint[object]) and add the necessary typing import (Any) so the _checkpoints: dict[int, EnvironmentCheckpoint[Any]] declaration in __init__ reflects the generic parameter.
215-227: Consider ensuringunregisteris called even ifrelease_fnraises.If
_release_fn()raises an exception, the checkpoint is marked as released (_released = True) butunregister()is never called. This leaves a stale entry in the janitor that:
- Won't be cleaned up at exit (since
_release_fn_syncmay not match the failed state)- Holds a reference to the checkpoint preventing garbage collection
🛠️ Proposed fix using try/finally
async def release(self) -> None: """Release resources held by this checkpoint. For environments backed by containers, this deletes the snapshot image. For simple environments, this is a no-op. Safe to call multiple times. After release(), restore() will raise RuntimeError. """ if self._released: return self._released = True - if self._release_fn is not None: - await self._release_fn() - _CHECKPOINT_JANITOR.unregister(self) + try: + if self._release_fn is not None: + await self._release_fn() + finally: + _CHECKPOINT_JANITOR.unregister(self)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ares/environments/base.py` around lines 215 - 227, The release() method must always call _CHECKPOINT_JANITOR.unregister(self) even if await self._release_fn() raises; wrap the await in a try/finally so unregister is executed in the finally block, and re-raise any exception from _release_fn so behavior stays the same. Concretely, keep setting self._released = True, then if self._release_fn is not None call it inside try: await self._release_fn() finally: _CHECKPOINT_JANITOR.unregister(self); if there is no _release_fn still call _CHECKPOINT_JANITOR.unregister(self) as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/ares/environments/base.py`:
- Around line 270-271: The type annotation for the _checkpoints attribute in
__init__ should be made explicit by using a parameterized EnvironmentCheckpoint
to preserve generic type info; change the annotation from dict[int,
EnvironmentCheckpoint] to dict[int, EnvironmentCheckpoint[Any]] (or
EnvironmentCheckpoint[object]) and add the necessary typing import (Any) so the
_checkpoints: dict[int, EnvironmentCheckpoint[Any]] declaration in __init__
reflects the generic parameter.
- Around line 215-227: The release() method must always call
_CHECKPOINT_JANITOR.unregister(self) even if await self._release_fn() raises;
wrap the await in a try/finally so unregister is executed in the finally block,
and re-raise any exception from _release_fn so behavior stays the same.
Concretely, keep setting self._released = True, then if self._release_fn is not
None call it inside try: await self._release_fn() finally:
_CHECKPOINT_JANITOR.unregister(self); if there is no _release_fn still call
_CHECKPOINT_JANITOR.unregister(self) as before.
In `@src/ares/environments/checkpoint_test.py`:
- Around line 25-33: The test creates an EnvironmentCheckpoint which
auto-registers with _CHECKPOINT_JANITOR; update the test to explicitly release
that checkpoint to avoid polluting global janitor state by ensuring you call
await checkpoint.release() after using checkpoint.restore(). Wrap the
restore/assert block in a try/finally (or use a pytest fixture) so that await
checkpoint.release() runs even on assertion failures; reference the
EnvironmentCheckpoint instance named checkpoint and its restore() and release()
methods.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b1614895-b5bf-48fe-833f-0269681b9e56
📒 Files selected for processing (4)
src/ares/__init__.pysrc/ares/environments/base.pysrc/ares/environments/checkpoint_test.pysrc/ares/presets.py
| restore_fn: Callable[[], Awaitable[EnvType]], | ||
| release_fn: Callable[[], Awaitable[None]] | None = None, | ||
| release_fn_sync: Callable[[], None] | None = None, | ||
| step_count: int, | ||
| timestep: TimeStep, | ||
| ): | ||
| self._restore_fn = restore_fn | ||
| self._release_fn = release_fn | ||
| self._release_fn_sync = release_fn_sync | ||
| self.step_count = step_count | ||
| self.timestep = timestep | ||
| self._released = False |
There was a problem hiding this comment.
The atexit janitor only calls checkpoint._release_fn_sync and never awaits the async release_fn, so async-only cleanup can be skipped on exit — should we run the async release_fn in a temporary event loop or provide a sync fallback?
Finding type: Logical Bugs | Severity: 🔴 High
Want Baz to fix this for you? Activate Fixer
Other fix methods
Prompt for AI Agents:
In src/ares/environments/base.py around lines 181 to 226, the
_CheckpointJanitor._cleanup currently only calls checkpoint._release_fn_sync and
therefore skips async _release_fn, leaving resources unreleased at process exit. Modify
_cleanup so that for each checkpoint: if _release_fn_sync is present, call it as before;
otherwise if _release_fn is present, run it synchronously by creating a temporary
asyncio event loop and calling loop.run_until_complete(_release_fn()), catching and
logging exceptions. If creating/running a loop fails because an event loop is already
running, fall back to running the coroutine in a separate Thread with its own event loop
and join it, also catching and logging exceptions. Keep the existing warning logging and
ensure the _checkpoints dict is cleared afterward.
| async def release(self) -> None: | ||
| """Release resources held by this checkpoint. | ||
|
|
||
| For environments backed by containers, this deletes the snapshot image. | ||
| For simple environments, this is a no-op. Safe to call multiple times. | ||
| After release(), restore() will raise RuntimeError. | ||
| """ | ||
| if self._released: | ||
| return | ||
| self._released = True | ||
| if self._release_fn is not None: | ||
| await self._release_fn() |
There was a problem hiding this comment.
release() flips _released to True before awaiting _release_fn so failures leave the checkpoint marked and prevent retries, should we postpone marking _released and unregistering the janitor until after _release_fn succeeds?
self._released = True; await self._release_fn() => await self._release_fn(); self._released = True
Finding type: Logical Bugs | Severity: 🔴 High
Want Baz to fix this for you? Activate Fixer
Other fix methods
Prompt for AI Agents:
In src/ares/environments/base.py around lines 215-228, the release() method currently
sets self._released = True before awaiting self._release_fn, which causes the checkpoint
to be marked released and unregistered incorrectly if the async release raises. Change
release() so that it (1) does not set self._released until after await
self._release_fn() completes successfully, (2) only calls
_CHECKPOINT_JANITOR.unregister(self) after a successful release, and (3) keeps the
checkpoint registered and _released False if the release raises (propagate the exception
so callers can retry). Also preserve the no-op path: if _release_fn is None, still set
_released = True and unregister. Ensure no broad exception swallowing; let failures
surface so cleanup can be retried.
| if self._checkpoints: | ||
| _LOGGER.info("Cleaning up %d unreleased checkpoints at exit.", len(self._checkpoints)) | ||
| for cp in list(self._checkpoints.values()): | ||
| if cp._release_fn_sync is not None: | ||
| try: | ||
| cp._release_fn_sync() |
There was a problem hiding this comment.
Cleanup logs lack checkpoint IDs despite CLAUDE.md requiring object IDs for async lifecycle events — should we include id(cp) in those log lines?
Finding type: AI Coding Guidelines | Severity: 🟢 Low
Want Baz to fix this for you? Activate Fixer
Other fix methods
Prompt for AI Agents:
In src/ares/environments/base.py around lines 284 to 289, the
_CheckpointJanitor._cleanup method currently logs only the count and a generic warning
but does not include checkpoint identifiers. Update the info log at line ~285 to include
the IDs of the registered checkpoints (e.g., build a list of ids from self._checkpoints
values and log it alongside the count), and change the warning at line ~291 to include
the specific checkpoint id (id(cp)) when a release fails while keeping exc_info=True.
Keep formatting consistent with existing _LOGGER calls and avoid exposing sensitive data
beyond Python object ids.
|
@coderabbitai full review |
1 similar comment
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
src/ares/environments/base.py (2)
270-271: Consider thread safety for_checkpointsdict.If checkpoints are created or released concurrently from multiple threads, the
_checkpointsdict could face race conditions. While the GIL provides some protection and typical async usage is single-threaded, consider using athreading.Lockif multi-threaded checkpoint management is anticipated.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ares/environments/base.py` around lines 270 - 271, The _checkpoints dict initialized in __init__ can be mutated from multiple threads; add a threading.Lock (e.g., self._checkpoints_lock = threading.RLock()) in __init__ and use it to guard all accesses and mutations of self._checkpoints (wrap reads/writes in a with self._checkpoints_lock: block) inside methods that create, release, or read checkpoints (e.g., create_checkpoint, release_checkpoint, get_checkpoint or any other method referencing _checkpoints) to prevent race conditions; ensure the lock is used consistently and prefer RLock if reentrant locking may occur.
222-227: Exception inrelease_fnleaves checkpoint registered with janitor.If
release_fnraises an exception, the checkpoint is marked as released (_released = Trueat line 224) butunregister()at line 227 is never called. This leaves a stale entry in the janitor, though it's benign since_cleanup()only callsrelease_fn_sync(notrelease_fn).Consider whether to wrap the
release_fncall in try/finally to ensure unregistration:♻️ Optional: Ensure unregistration even if release_fn fails
async def release(self) -> None: if self._released: return self._released = True - if self._release_fn is not None: - await self._release_fn() - _CHECKPOINT_JANITOR.unregister(self) + try: + if self._release_fn is not None: + await self._release_fn() + finally: + _CHECKPOINT_JANITOR.unregister(self)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ares/environments/base.py` around lines 222 - 227, The release path should always unregister the checkpoint from _CHECKPOINT_JANITOR even if the async release function raises: in the method containing self._released, wrap the await self._release_fn() call in a try/finally (or equivalent) so that _CHECKPOINT_JANITOR.unregister(self) is executed in the finally block; keep setting self._released = True as appropriate (before or after calling _release_fn per desired semantics) but ensure unregister is unconditional to avoid leaving stale janitor entries (refer to the attributes _released, _release_fn and the janitor call _CHECKPOINT_JANITOR.unregister(self)).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/ares/environments/base.py`:
- Around line 270-271: The _checkpoints dict initialized in __init__ can be
mutated from multiple threads; add a threading.Lock (e.g.,
self._checkpoints_lock = threading.RLock()) in __init__ and use it to guard all
accesses and mutations of self._checkpoints (wrap reads/writes in a with
self._checkpoints_lock: block) inside methods that create, release, or read
checkpoints (e.g., create_checkpoint, release_checkpoint, get_checkpoint or any
other method referencing _checkpoints) to prevent race conditions; ensure the
lock is used consistently and prefer RLock if reentrant locking may occur.
- Around line 222-227: The release path should always unregister the checkpoint
from _CHECKPOINT_JANITOR even if the async release function raises: in the
method containing self._released, wrap the await self._release_fn() call in a
try/finally (or equivalent) so that _CHECKPOINT_JANITOR.unregister(self) is
executed in the finally block; keep setting self._released = True as appropriate
(before or after calling _release_fn per desired semantics) but ensure
unregister is unconditional to avoid leaving stale janitor entries (refer to the
attributes _released, _release_fn and the janitor call
_CHECKPOINT_JANITOR.unregister(self)).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ecd827c3-9bdb-46dc-a578-6cd520f3d0a7
📒 Files selected for processing (4)
src/ares/__init__.pysrc/ares/environments/base.pysrc/ares/environments/checkpoint_test.pysrc/ares/presets.py
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
src/ares/environments/base.py (1)
282-293: Keep checkpoint state consistent after janitor cleanup.
_cleanup()can runrelease_fn_sync, but it does not flip_released. After manual cleanup, a checkpoint may still allowrestore()even though underlying resources were already cleaned.Proposed fix
def _cleanup(self) -> None: """Clean up all registered checkpoints at exit.""" @@ for cp in list(self._checkpoints.values()): if cp._release_fn_sync is not None: try: cp._release_fn_sync() except Exception: _LOGGER.warning("Failed to clean up checkpoint.", exc_info=True) + cp._released = True self._checkpoints.clear()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ares/environments/base.py` around lines 282 - 293, _cleanup currently calls each checkpoint's cp._release_fn_sync but never flips cp._released, leaving checkpoints usable via restore() after resources were cleaned; update _cleanup to set cp._released = True for each checkpoint after invoking cp._release_fn_sync (do this in a finally block so the flag is set even if the release function raises) and keep the existing logging/exception handling around cp._release_fn_sync and the subsequent self._checkpoints.clear() call.src/ares/environments/checkpoint_test.py (1)
56-117: Add a regression test forrelease_fnfailure semantics.There is no test asserting state after
release_fnraises. Adding one will lock lifecycle behavior and catch release-ordering regressions.Suggested test case
class TestEnvironmentCheckpoint: @@ async def test_release_is_idempotent(self, simple_timestep): @@ assert call_count == 1 + + `@pytest.mark.asyncio` + async def test_release_failure_does_not_mark_released(self, simple_timestep): + async def restore_fn(): + return object() + + async def release_fn(): + raise RuntimeError("release failed") + + checkpoint = base.EnvironmentCheckpoint( + restore_fn=restore_fn, + release_fn=release_fn, + step_count=0, + timestep=simple_timestep, + ) + + with pytest.raises(RuntimeError, match="release failed"): + await checkpoint.release() + + assert not checkpoint.is_released🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ares/environments/checkpoint_test.py` around lines 56 - 117, Add an async test (e.g., test_release_when_release_fn_raises) that uses EnvironmentCheckpoint with a release_fn that raises an exception; await checkpoint.release() expecting that exception to propagate, then assert checkpoint.is_released is True, assert await checkpoint.restore() raises RuntimeError (or raises when called) and assert calling checkpoint.release() again does not re-run the failing release_fn or raise (idempotence). Reference EnvironmentCheckpoint, release(), release_fn, restore(), and is_released when adding the test.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/ares/environments/base.py`:
- Around line 222-227: The method currently sets self._released before awaiting
the potentially-failing async cleanup, which prevents retries if release fails;
change the order so you await self._release_fn() first (if _release_fn is not
None), only set self._released = True and call
_CHECKPOINT_JANITOR.unregister(self) after the await completes successfully, and
let exceptions propagate (or re-raise) so the checkpoint remains not released on
failure; update the release() method accordingly to move the _released
assignment and unregister call to after the awaited _release_fn() completes.
In `@src/ares/environments/checkpoint_test.py`:
- Around line 25-229: Tests leak state because EnvironmentCheckpoint
auto-registers with the module-level janitor; add an autouse pytest fixture that
resets the global janitor before each test by creating a fresh
base._CheckpointJanitor() and assigning it to the module-level janitor used by
EnvironmentCheckpoint (then restore/cleanup after the test) so each test runs
with an empty janitor and no shared _checkpoints; reference
EnvironmentCheckpoint and _CheckpointJanitor to locate where to wire the
fixture.
---
Nitpick comments:
In `@src/ares/environments/base.py`:
- Around line 282-293: _cleanup currently calls each checkpoint's
cp._release_fn_sync but never flips cp._released, leaving checkpoints usable via
restore() after resources were cleaned; update _cleanup to set cp._released =
True for each checkpoint after invoking cp._release_fn_sync (do this in a
finally block so the flag is set even if the release function raises) and keep
the existing logging/exception handling around cp._release_fn_sync and the
subsequent self._checkpoints.clear() call.
In `@src/ares/environments/checkpoint_test.py`:
- Around line 56-117: Add an async test (e.g.,
test_release_when_release_fn_raises) that uses EnvironmentCheckpoint with a
release_fn that raises an exception; await checkpoint.release() expecting that
exception to propagate, then assert checkpoint.is_released is True, assert await
checkpoint.restore() raises RuntimeError (or raises when called) and assert
calling checkpoint.release() again does not re-run the failing release_fn or
raise (idempotence). Reference EnvironmentCheckpoint, release(), release_fn,
restore(), and is_released when adding the test.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d9be1ea3-1ed9-483f-9488-417827581c61
📒 Files selected for processing (4)
src/ares/__init__.pysrc/ares/environments/base.pysrc/ares/environments/checkpoint_test.pysrc/ares/presets.py
| if self._released: | ||
| return | ||
| self._released = True | ||
| if self._release_fn is not None: | ||
| await self._release_fn() | ||
| _CHECKPOINT_JANITOR.unregister(self) |
There was a problem hiding this comment.
Do not mark checkpoints released before async release succeeds.
On Line 224, _released is set before awaiting release_fn (Line 226). If release_fn raises, the checkpoint becomes permanently “released” even though cleanup failed, so release() cannot be retried and restore() is blocked.
Proposed fix
async def release(self) -> None:
"""Release resources held by this checkpoint.
@@
if self._released:
return
- self._released = True
- if self._release_fn is not None:
- await self._release_fn()
- _CHECKPOINT_JANITOR.unregister(self)
+ if self._release_fn is not None:
+ await self._release_fn()
+ self._released = True
+ _CHECKPOINT_JANITOR.unregister(self)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if self._released: | |
| return | |
| self._released = True | |
| if self._release_fn is not None: | |
| await self._release_fn() | |
| _CHECKPOINT_JANITOR.unregister(self) | |
| if self._released: | |
| return | |
| if self._release_fn is not None: | |
| await self._release_fn() | |
| self._released = True | |
| _CHECKPOINT_JANITOR.unregister(self) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ares/environments/base.py` around lines 222 - 227, The method currently
sets self._released before awaiting the potentially-failing async cleanup, which
prevents retries if release fails; change the order so you await
self._release_fn() first (if _release_fn is not None), only set self._released =
True and call _CHECKPOINT_JANITOR.unregister(self) after the await completes
successfully, and let exceptions propagate (or re-raise) so the checkpoint
remains not released on failure; update the release() method accordingly to move
the _released assignment and unregister call to after the awaited _release_fn()
completes.
| checkpoint = base.EnvironmentCheckpoint( | ||
| restore_fn=restore_fn, | ||
| step_count=5, | ||
| timestep=simple_timestep, | ||
| ) | ||
|
|
||
| env, ts = await checkpoint.restore() | ||
| assert env is mock_env | ||
| assert ts is simple_timestep | ||
|
|
||
| @pytest.mark.asyncio | ||
| async def test_restore_creates_independent_envs(self, simple_timestep): | ||
| """Test that multiple restores create independent environments.""" | ||
| call_count = 0 | ||
|
|
||
| async def restore_fn(): | ||
| nonlocal call_count | ||
| call_count += 1 | ||
| return {"id": call_count} | ||
|
|
||
| checkpoint = base.EnvironmentCheckpoint( | ||
| restore_fn=restore_fn, | ||
| step_count=0, | ||
| timestep=simple_timestep, | ||
| ) | ||
|
|
||
| env_a, _ = await checkpoint.restore() | ||
| env_b, _ = await checkpoint.restore() | ||
| assert env_a != env_b | ||
| assert call_count == 2 | ||
|
|
||
| @pytest.mark.asyncio | ||
| async def test_release_calls_release_fn(self, simple_timestep): | ||
| """Test that release() calls the release function.""" | ||
| released = False | ||
|
|
||
| async def release_fn(): | ||
| nonlocal released | ||
| released = True | ||
|
|
||
| async def restore_fn(): | ||
| return object() | ||
|
|
||
| checkpoint = base.EnvironmentCheckpoint( | ||
| restore_fn=restore_fn, | ||
| release_fn=release_fn, | ||
| step_count=0, | ||
| timestep=simple_timestep, | ||
| ) | ||
|
|
||
| await checkpoint.release() | ||
| assert released | ||
| assert checkpoint.is_released | ||
|
|
||
| @pytest.mark.asyncio | ||
| async def test_release_is_idempotent(self, simple_timestep): | ||
| """Test that release() can be called multiple times safely.""" | ||
| call_count = 0 | ||
|
|
||
| async def release_fn(): | ||
| nonlocal call_count | ||
| call_count += 1 | ||
|
|
||
| async def restore_fn(): | ||
| return object() | ||
|
|
||
| checkpoint = base.EnvironmentCheckpoint( | ||
| restore_fn=restore_fn, | ||
| release_fn=release_fn, | ||
| step_count=0, | ||
| timestep=simple_timestep, | ||
| ) | ||
|
|
||
| await checkpoint.release() | ||
| await checkpoint.release() | ||
| assert call_count == 1 | ||
|
|
||
| @pytest.mark.asyncio | ||
| async def test_restore_after_release_raises(self, simple_timestep): | ||
| """Test that restore() raises after release().""" | ||
|
|
||
| async def restore_fn(): | ||
| return object() | ||
|
|
||
| checkpoint = base.EnvironmentCheckpoint( | ||
| restore_fn=restore_fn, | ||
| step_count=0, | ||
| timestep=simple_timestep, | ||
| ) | ||
|
|
||
| await checkpoint.release() | ||
| with pytest.raises(RuntimeError, match="released"): | ||
| await checkpoint.restore() | ||
|
|
||
| @pytest.mark.asyncio | ||
| async def test_no_release_fn_is_ok(self, simple_timestep): | ||
| """Test that checkpoints without release_fn work fine.""" | ||
|
|
||
| async def restore_fn(): | ||
| return object() | ||
|
|
||
| checkpoint = base.EnvironmentCheckpoint( | ||
| restore_fn=restore_fn, | ||
| step_count=0, | ||
| timestep=simple_timestep, | ||
| ) | ||
|
|
||
| # Release should be a no-op | ||
| await checkpoint.release() | ||
| assert checkpoint.is_released | ||
|
|
||
| def test_step_count_accessible(self, simple_timestep): | ||
| """Test that step_count is publicly accessible.""" | ||
|
|
||
| async def restore_fn(): | ||
| return object() | ||
|
|
||
| checkpoint = base.EnvironmentCheckpoint( | ||
| restore_fn=restore_fn, | ||
| step_count=42, | ||
| timestep=simple_timestep, | ||
| ) | ||
|
|
||
| assert checkpoint.step_count == 42 | ||
|
|
||
| def test_timestep_accessible(self, simple_timestep): | ||
| """Test that timestep is publicly accessible.""" | ||
|
|
||
| async def restore_fn(): | ||
| return object() | ||
|
|
||
| checkpoint = base.EnvironmentCheckpoint( | ||
| restore_fn=restore_fn, | ||
| step_count=0, | ||
| timestep=simple_timestep, | ||
| ) | ||
|
|
||
| assert checkpoint.timestep is simple_timestep | ||
|
|
||
|
|
||
| class TestCheckpointJanitor: | ||
| """Tests for _CheckpointJanitor.""" | ||
|
|
||
| def test_register_and_unregister(self, simple_timestep): | ||
| """Test that register/unregister tracks checkpoints.""" | ||
| janitor = base._CheckpointJanitor() | ||
|
|
||
| async def restore_fn(): | ||
| return object() | ||
|
|
||
| checkpoint = base.EnvironmentCheckpoint( | ||
| restore_fn=restore_fn, | ||
| step_count=0, | ||
| timestep=simple_timestep, | ||
| ) | ||
|
|
||
| # Checkpoint auto-registers with the global janitor, but | ||
| # we test the local janitor instance here. | ||
| janitor.register(checkpoint) | ||
| assert id(checkpoint) in janitor._checkpoints | ||
|
|
||
| janitor.unregister(checkpoint) | ||
| assert id(checkpoint) not in janitor._checkpoints | ||
|
|
||
| def test_cleanup_calls_sync_release(self, simple_timestep): | ||
| """Test that janitor cleanup calls release_fn_sync.""" | ||
| janitor = base._CheckpointJanitor() | ||
| released = False | ||
|
|
||
| def release_sync(): | ||
| nonlocal released | ||
| released = True | ||
|
|
||
| async def restore_fn(): | ||
| return object() | ||
|
|
||
| checkpoint = base.EnvironmentCheckpoint( | ||
| restore_fn=restore_fn, | ||
| release_fn_sync=release_sync, | ||
| step_count=0, | ||
| timestep=simple_timestep, | ||
| ) | ||
|
|
||
| janitor.register(checkpoint) | ||
| janitor._cleanup() | ||
|
|
||
| assert released | ||
| assert len(janitor._checkpoints) == 0 | ||
|
|
||
| def test_cleanup_handles_exceptions(self, simple_timestep): | ||
| """Test that janitor cleanup handles exceptions gracefully.""" | ||
| janitor = base._CheckpointJanitor() | ||
|
|
||
| def release_sync(): | ||
| raise RuntimeError("cleanup failed") | ||
|
|
||
| async def restore_fn(): | ||
| return object() | ||
|
|
||
| checkpoint = base.EnvironmentCheckpoint( | ||
| restore_fn=restore_fn, | ||
| release_fn_sync=release_sync, | ||
| step_count=0, | ||
| timestep=simple_timestep, | ||
| ) |
There was a problem hiding this comment.
Tests currently leak checkpoint state into the global janitor.
Each EnvironmentCheckpoint auto-registers globally, and many tests leave those entries behind. That introduces shared mutable state across test cases.
Proposed fixture to isolate janitor state
import pytest
from ares.environments import base
+@pytest.fixture(autouse=True)
+def _isolate_global_checkpoint_janitor():
+ base._CHECKPOINT_JANITOR._checkpoints.clear()
+ yield
+ base._CHECKPOINT_JANITOR._checkpoints.clear()
+
`@pytest.fixture`
def simple_timestep() -> base.TimeStep:🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ares/environments/checkpoint_test.py` around lines 25 - 229, Tests leak
state because EnvironmentCheckpoint auto-registers with the module-level
janitor; add an autouse pytest fixture that resets the global janitor before
each test by creating a fresh base._CheckpointJanitor() and assigning it to the
module-level janitor used by EnvironmentCheckpoint (then restore/cleanup after
the test) so each test runs with an empty janitor and no shared _checkpoints;
reference EnvironmentCheckpoint and _CheckpointJanitor to locate where to wire
the fixture.
User description
Summary
EnvironmentCheckpoint[EnvType]-- a generic checkpoint object that holds closures for restore and release.restore()creates a brand new, independent environment instance at the checkpointed state.CheckpointableEnvironmentprotocol extendingEnvironmentwith a singlecheckpoint()method returningEnvironmentCheckpoint[Self]_CheckpointJanitorfor atexit cleanup of unreleased checkpoints (mirrors the existing_Janitorpattern incode_env.py)CheckpointableEnvironmentandEnvironmentCheckpointfrom the top-levelarespackagepresets.pyDesign
The environment protocol gains one method:
checkpoint(). Everything else lives on the checkpoint object:The checkpoint takes closures from the environment -- no internal protocol methods are imposed. The environment is free to implement restore/release however it wants. See
local/analyses/go-explore/04-go-explore-sketch.pyfor the full design rationale.Context
Part of Go-Explore support (Wave 1). This PR is independent of the other two Wave 1 PRs and can be reviewed/merged in parallel:
go-explore/snapshotable-container)go-explore/agent-state-serialization)Tests
11 new tests covering restore, independent env creation, release idempotency, restore-after-release errors, and janitor register/unregister/cleanup/exception handling.
Generated description
Below is a concise technical summary of the changes proposed in this PR:
Introduce
EnvironmentCheckpointto capture restorable snapshots and extendCheckpointableEnvironmentso environments can expose checkpoint/restore/release flows along with a_CheckpointJanitorfor cleanup; export these new APIs at the package level. Add focused tests covering restore, release, idempotency, and janitor cleanup to ensure the new checkpoint semantics work and register with the existing infrastructure.EnvironmentCheckpoint,CheckpointableEnvironment, and_CheckpointJanitorcapture state, restore new environments, release resources, and are covered by new async tests for checkpoint behavior and cleanup.Modified files (3)
Latest Contributors(2)
Modified files (1)
Latest Contributors(2)
Summary by CodeRabbit
New Features
Chores
Tests