feat: ContextGC tool — explicit garbage collection of tool results and scratchpad notes#55
Draft
Simon-Free wants to merge 11 commits intoSafeRL-Lab:mainfrom
Draft
feat: ContextGC tool — explicit garbage collection of tool results and scratchpad notes#55Simon-Free wants to merge 11 commits intoSafeRL-Lab:mainfrom
Simon-Free wants to merge 11 commits intoSafeRL-Lab:mainfrom
Conversation
Remove compact_assistant_xml, compact_assistant_xml_selective, _xml_replacer,
_build_tc_lookup and _TOOL_USE_RE. These functions compact inline
<tool_use name=... id=...>...</tool_use> XML blocks inside assistant message
content, which only exist on providers that don't natively support
tool_use blocks (e.g. AWS Bedrock socle in bouzecode). Upstream cheetahclaws
uses the native Anthropic content: [{"type":"tool_use", ...}] format, so
these functions early-returned on every call and the compact_tool_history
branch that invoked compact_assistant_xml was a no-op.
Also fix _apply_context_gc which was wrapped in a double try/except where
the outer pass was unreachable, and which imported only apply_gc while
referencing inject_notes and prepend_verbatim_audit (NameError when
gc_state had entries). Replaced with a single try that imports all three
names and cleanly returns on ImportError if PR SafeRL-Lab#55 isn't deployed
alongside.
Test file drops the TestCompactAssistantXml / TestCompactAssistantXmlSelective
classes that exercised the removed functions.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add gc_state (trashed_ids, snippets, notes) as a real field on AgentState, serialize it in _build_session_data, and rehydrate it via a new helper _restore_state_from_data that is shared by cmd_load / cmd_resume / cmd_cloudsave load. Without this, any /save followed by /load silently drops trashed_ids: the tool_results previously elided by ContextGC re-materialize in the next turn's context window, leaking tens of thousands of tokens on long sessions. Tests cover save/load roundtrip and independence of gc_state instances across AgentState instances. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Remove the compact_xml field from GCState, the compact_xml parameter from the ContextGC tool schema, and the XML-compaction branch of apply_gc. The branch used to dynamically import compact_assistant_xml / compact_assistant_xml_selective from followup_compaction, but those only match <tool_use name="X" id="Y">...</tool_use> strings inside assistant text content -- a shape that only appears on providers without native tool_use support (e.g. AWS Bedrock socle in bouzecode). Upstream cheetahclaws emits native Anthropic content blocks, so the XML branch was an unreachable no-op. The branch also had a latent NameError (compact_assistant_xml_selective was imported under the wrong name), which is why 2 existing tests were red against this branch. apply_gc is now a 3-line list comprehension dispatching to _apply_gc_to_message, which in turn delegates to _stub_trashed_tool_result and _apply_snippet_to_message. Each helper fits on one screen, names its intent, and no longer hides behavior behind a dead early-return chain. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduce a generic config['disabled_tools'] list honoured by the tool registry in two places: - get_tool_schemas(disabled=...) filters disabled names out of the schema list sent to the LLM; the model never learns the tool exists. - execute_tool(...) defense-in-depth: any tool_call whose name is disabled returns an explicit error tool_result instead of running. agent.py passes config['disabled_tools'] to get_tool_schemas per turn. Callers that set disabled_tools=['ContextGC'] now get pre-SafeRL-Lab#55 behaviour with the rest of this PR in place -- which is what makes the ContextGC tool truly opt-out rather than an implicit behaviour change for every existing integration. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
agent.run builds a fresh per-turn config dict at the top of the function. It adds _depth and _system_prompt so tools like Agent can read them, but forgot to add _gc_state. As a result every ContextGC invocation returned "Error: no GC state available" and trashed_ids was never mutated in production. Add "_gc_state": state.gc_state to the merge. Because state.gc_state is the same object across turns and is persisted in /save, ContextGC can now read and mutate it, and its effects carry over a /load. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three scenarios, each driving agent.run through a multi-turn conversation where only providers.stream is replaced by a scripted generator. All tools (echo + ContextGC) execute for real through the registry. - test_llm_trashes_tool_result_via_contextgc_end_to_end: LLM issues echo, then ContextGC(trash=[echo_id]); asserts state.gc_state.trashed_ids. - test_gc_state_survives_save_and_reload_via_session_helpers: same setup + _build_session_data → JSON → _restore_state_from_data roundtrip, asserts trashed_ids still present after restore. - test_disabled_tools_hides_contextgc_schema_from_llm: confirms that config['disabled_tools']=['ContextGC'] removes the schema from the list sent to the stream, proving backwards-compatibility without touching the registry. These cover the integration points unit tests can't see: tool registration, config injection of _gc_state in agent.run, and the schema-filter path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
b9e4213 to
b49064d
Compare
This was referenced Apr 20, 2026
… methodology protection, stub detection, audit improvements
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Changes (updated)
Tests
Port of bouzecode context_gc package (flat-file adaptation).