feat: tool_call_alias and depends_on for intra-turn tool scheduling by Simon-Free · Pull Request #50 · SafeRL-Lab/cheetahclaws

Simon-Free · 2026-04-17T19:22:46Z

Summary

Expose two scheduling hints to the LLM on every tool schema, and strip them before they reach tool handlers:

tool_call_alias: string - optional alias the model can use to refer to a tool call by a short name later in the same turn.
depends_on: string[] - list of prior tool_call_ids or aliases. The model uses this to express sequential dependencies between tools that it wants executed one after the other rather than in parallel.
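
To make the semantics of the two fields concrete, here is a minimal sketch of how aliases and `depends_on` references could be resolved to canonical tool_call_ids. This PR does not do this at runtime; the function name `resolve_depends_on` and the `id` / `name` / `input` shape of a tool call are assumptions for illustration only.

```python
from typing import Any


def resolve_depends_on(calls: list[dict[str, Any]]) -> dict[str, list[str]]:
    """Map each tool_call id to the ids it depends on, resolving aliases.

    Hypothetical helper: only calls that appear earlier in the list can be
    referenced, either by their id or by their tool_call_alias.
    """
    known: dict[str, str] = {}  # id or alias -> canonical tool_call id
    resolved: dict[str, list[str]] = {}
    for call in calls:
        call_id = call["id"]
        args = call.get("input", {})
        deps = []
        for ref in args.get("depends_on", []):
            if ref not in known:
                raise ValueError(f"{call_id}: unknown dependency {ref!r}")
            deps.append(known[ref])
        resolved[call_id] = deps
        # Register this call only after its own deps are resolved, so a call
        # cannot depend on itself or on a later call.
        known[call_id] = call_id
        alias = args.get("tool_call_alias")
        if alias:
            known[alias] = call_id
    return resolved


calls = [
    {"id": "call_1", "name": "Read",
     "input": {"path": "a.txt", "tool_call_alias": "read_a"}},
    {"id": "call_2", "name": "Write",
     "input": {"path": "b.txt", "depends_on": ["read_a"]}},
]
print(resolve_depends_on(calls))  # {'call_1': [], 'call_2': ['call_1']}
```

Restricting references to earlier calls matches the "prior tool_call_ids or aliases" wording above and rules out cycles by construction.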

Also coerces string-typed params (sent by LLMs that flatten JSON) into their schema-declared types: "42" -> 42 for an integer property, "true" -> True for boolean, '[{...}]' -> list/dict for array/object.
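
A minimal sketch of the coercion described above, using a per-type dispatch table. The names `_coerce_int`, `_coerce_float`, `_coerce_bool`, `_coerce_json` and `_COERCERS` come from this PR; the function bodies, the `coerce_params` name and its signature are assumptions, not the code in `tool_registry.py`.

```python
import json
from typing import Any, Callable


def _coerce_int(value: str) -> Any:
    try:
        return int(value)
    except ValueError:
        return value  # tool handler reports the real type mismatch


def _coerce_float(value: str) -> Any:
    try:
        return float(value)
    except ValueError:
        return value  # tool handler reports the real type mismatch


def _coerce_bool(value: str) -> Any:
    lowered = value.strip().lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    return value  # anything else is passed through unchanged


def _coerce_json(value: str) -> Any:
    try:
        parsed = json.loads(value)
    except json.JSONDecodeError:
        return value  # tool handler reports the real type mismatch
    # Only accept containers; a bare JSON scalar is not an array/object.
    return parsed if isinstance(parsed, (list, dict)) else value


_COERCERS: dict[str, Callable[[str], Any]] = {
    "integer": _coerce_int,
    "number": _coerce_float,
    "boolean": _coerce_bool,
    "array": _coerce_json,
    "object": _coerce_json,
}


def coerce_params(params: dict[str, Any], properties: dict[str, Any]) -> dict[str, Any]:
    """Coerce string values to the JSON Schema type declared for each param."""
    out = {}
    for key, value in params.items():
        coercer = _COERCERS.get(properties.get(key, {}).get("type"))
        out[key] = coercer(value) if coercer and isinstance(value, str) else value
    return out


props = {"count": {"type": "integer"}, "flag": {"type": "boolean"},
         "items": {"type": "array"}, "name": {"type": "string"}}
print(coerce_params({"count": "42", "flag": "true", "items": '[{"a": 1}]'}, props))
```

Values that are already typed, or that fail to parse, pass through untouched, so the handler sees the original input and can report the mismatch itself.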

What's in scope here

This PR only injects the schema fields and performs param coercion + stripping. Runtime enforcement of depends_on ordering in the agent loop is not yet implemented - the model gets a hint and can call tools in order manually, but the registry does not re-order parallel executions based on depends_on. Deferred to a follow-up PR so this one stays small and review-friendly.
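
The stripping step can be sketched as follows. This is an illustration under assumptions: `strip_scheduling_props` and `dispatch` are hypothetical names, and the real `execute_tool` wrapper takes additional arguments (such as `config`) that are omitted here.

```python
from typing import Any, Callable

# The two scheduling hints exposed to the LLM; handlers must never see them.
_SCHEDULING_KEYS = ("tool_call_alias", "depends_on")


def strip_scheduling_props(params: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of params without the scheduling hints."""
    return {k: v for k, v in params.items() if k not in _SCHEDULING_KEYS}


def dispatch(handler: Callable[[dict[str, Any]], Any], params: dict[str, Any]) -> Any:
    """Hypothetical stand-in for the execute_tool wrapper."""
    return handler(strip_scheduling_props(params))


received: list[dict[str, Any]] = []


def receiver(params: dict[str, Any]) -> str:
    received.append(params)  # record what actually reached the handler
    return "ok"


dispatch(receiver, {"path": "a.txt", "tool_call_alias": "r1", "depends_on": ["r0"]})
print(received)  # [{'path': 'a.txt'}]
```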

Changes

| File | +/- | What |
| --- | --- | --- |
| `tool_registry.py` | +86 | `_SCHEDULING_PROPS` constant, `_coerce_params` split into per-type coercers dispatched through `_COERCERS`, wrapper over `get_tool_schemas` that injects scheduling props, wrapper over `execute_tool` that strips scheduling props and coerces types |
| `tests/test_tool_scheduling.py` | +104 | Unit coverage of schema injection, coercion per type, pass-through on invalid input, and stripping at dispatch |
| `tests/test_tool_scheduling_e2e.py` | +96 | Two e2e scenarios via `agent.run` + mocked `stream`: (1) every schema the LLM sees carries the scheduling props, (2) when the LLM emits a tool_call that includes `tool_call_alias` + `depends_on`, those are gone by the time the tool handler runs |

Cleanups folded in

Replaced the silent except (ValueError, json.JSONDecodeError): pass in _coerce_params with a dispatch table in which each coercer explicitly returns the original value on failure, with a comment explaining the intent ("tool handler reports the real type mismatch").
Fixed test_scheduling_params_stripped, which called execute_tool(name, params) without the required config arg; it had been failing since the branch landed.

Ref #43

…Lab#43

…rge file skip)

The three TestTokenSnapshotExtendedFields cases asserted cache_read / cache_creation fields that were removed in 620bbb2 ("fix: remove dead cache_read/cache_creation fields per review"). They have been failing ever since.

Delete test_checkpoint_extras.py -- its remaining cases were either trivial (test_store_imports_sys checks 'import sys' exists) or file-source text scans (TestCheckpointPrintsToStderr) which don't test user behavior.

Add tests/test_checkpoint_e2e.py with two real e2e scenarios:
- Drive agent.run with a mocked LLM that emits a Write tool_call; assert the checkpoint hook created a pre-edit backup of the original content.
- Same path but the file exceeds _MAX_FILE_SIZE -- assert the skip message lands on stderr only, not stdout.

This is the actual user-visible contract of PR SafeRL-Lab#47 and covers the full wiring agent.run -> Write hook -> checkpoint.store.track_file_edit.

The three behavior tests in test_checkpoint_store.py stay -- they cover the store function directly via capsys.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Simon-Free · 2026-04-21T11:11:07Z

Changes in this update

### Removed: type coercion (split to separate PR)
Coercion is an independent concern -- moved to its own PR with bug fixes.

### Added: ID uniqueness enforcement
Ported `id_uniquify.py` from bouzecode. Without this, when the LLM reuses IDs like `r1` across turns, you get duplicate `tool_call_id` errors from the API.

### Fixed: `input_schema`-style schema injection
Scheduling props (`tool_call_alias`, `depends_on`) were injected at the wrong level for Anthropic-style schemas. Now correctly targets `input_schema.properties`.

### Tests
- 7 new ID uniquify unit tests
- 1 new e2e test (ID reuse across turns)
- Removed coercion tests (moved to coercion PR)

Simon-Free · 2026-04-21T11:12:50Z

Changes in this update

Removed: type coercion (split to separate PR #62)

Coercion is an independent concern - moved to its own PR with bug fixes.

Added: ID uniqueness enforcement

Ported id_uniquify.py from bouzecode. Without this, when the LLM reuses IDs like r1 across turns, you get duplicate tool_call_id errors from the API.
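
For readers unfamiliar with the problem, here is a minimal sketch of one way to keep tool_call_ids unique across turns. It is not the ported `id_uniquify.py`; the `IdUniquifier` class and its suffixing scheme are assumptions for illustration.

```python
class IdUniquifier:
    """Hypothetical helper: hands out IDs that never repeat within a session."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def uniquify(self, tool_call_id: str) -> str:
        candidate = tool_call_id
        suffix = 2
        # Append _2, _3, ... until the candidate has not been used before.
        while candidate in self._seen:
            candidate = f"{tool_call_id}_{suffix}"
            suffix += 1
        self._seen.add(candidate)
        return candidate


ids = IdUniquifier()
print(ids.uniquify("r1"))  # r1    (turn 1)
print(ids.uniquify("r1"))  # r1_2  (turn 2 reuses the same ID)
print(ids.uniquify("r1"))  # r1_3
```

Whatever scheme is used, the matching tool result has to carry the rewritten ID as well, otherwise the call and its result no longer pair up.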

Fixed: input_schema-style schema injection

Scheduling props were injected at the wrong level for Anthropic-style schemas. Now correctly targets input_schema.properties.
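
The fix can be illustrated with a small sketch that picks the injection target by schema style. `_SCHEDULING_PROPS` is the constant named in this PR, but its contents here, the `inject_scheduling_props` name, and the assumption that non-Anthropic schemas keep their JSON Schema under `parameters` are all illustrative.

```python
import copy
from typing import Any

# Assumed contents; only the field names and types come from the PR description.
_SCHEDULING_PROPS: dict[str, Any] = {
    "tool_call_alias": {
        "type": "string",
        "description": "Optional short name for this tool call.",
    },
    "depends_on": {
        "type": "array",
        "items": {"type": "string"},
        "description": "Prior tool_call_ids or aliases this call depends on.",
    },
}


def inject_scheduling_props(schema: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of schema with the scheduling props added to its properties."""
    out = copy.deepcopy(schema)
    # Anthropic-style schemas nest the JSON Schema under "input_schema";
    # otherwise fall back to a "parameters" object.
    key = "input_schema" if "input_schema" in out else "parameters"
    target = out.setdefault(key, {"type": "object"})
    props = target.setdefault("properties", {})
    for name, spec in _SCHEDULING_PROPS.items():
        props.setdefault(name, copy.deepcopy(spec))  # never clobber a real param
    return out


anthropic_style = {
    "name": "Read",
    "description": "Read a file",
    "input_schema": {"type": "object", "properties": {"path": {"type": "string"}}},
}
injected = inject_scheduling_props(anthropic_style)
print(sorted(injected["input_schema"]["properties"]))
# ['depends_on', 'path', 'tool_call_alias']
```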

Tests

7 new ID uniquify unit tests
1 new e2e test (ID reuse across turns)
Removed coercion tests (moved to PR #62, "feat: type coercion for string-typed tool params")

… Python versions)

…RL-Lab#43

Split _coerce_params (20 lines, nested try/except chain) into:
- a small orchestrator that walks params and delegates,
- four single-purpose coercers (_coerce_int / _coerce_float / _coerce_bool / _coerce_json) dispatched through a _COERCERS map.

Each catching coercer still returns the original string on failure -- but the intent is now explicit via a comment ("tool handler reports the real type mismatch"), and the bare `except: pass` silent-pass pattern is gone.

Also fix test_scheduling_params_stripped which called execute_tool without the required config arg; it has been failing since the pr4 branch landed.

Add tests/test_tool_scheduling_e2e.py that drives agent.run with a mocked LLM:
- assert every schema sent to the stream carries tool_call_alias + depends_on (proof the schema injection path is wired through the full agent loop, not just a unit helper);
- register a "receiver" tool, let the LLM emit a tool_call with scheduling params + one real param, assert the scheduling params are gone and the real param reaches the handler.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Simon-Free · 2026-04-21T13:11:21Z

Actually depends on #47 (needs conftest.py)

Simon-Free · 2026-04-21T13:18:16Z

Dependency: This PR depends on #47 (pr3-checkpoint-stderr-tokens). Please merge #47 first.

The tests/conftest.py shared fixtures (_no_quota) are introduced in #47. This PR extends conftest with scripted_stream + receiver_tool for scheduling-specific e2e tests.

Simon-Free marked this pull request as draft April 17, 2026 19:45

chauncygu marked this pull request as ready for review April 18, 2026 07:51

bot and others added 4 commits April 21, 2026 11:03

feat: capture stderr and token metadata in checkpoints. Ref SafeRL-…

164e6c1

…Lab#43

fix: remove test for non-existent distinct_base field

cbb1367

test: add integration tests for checkpoint store (stderr capture + la…

de52523

…rge file skip)

Simon-Free mentioned this pull request Apr 21, 2026

feat: type coercion for string-typed tool params #62

Open

Simon-Free force-pushed the pr4-tool-scheduling-depends-on branch 3 times, most recently from e6db1fc to e9a96f7, April 21, 2026 12:45

Simon FREYBURGER and others added 9 commits April 21, 2026 14:46

test: add shared conftest.py with scripted_stream and _no_quota fixtures

222b7e2

refactor: move scripted_stream to tests/helpers.py (importable on all…

7bdf2cc

… Python versions)

feat: tool scheduling with depends_on and tool_call_alias. Ref Safe…

11c02dc

…RL-Lab#43

feat: implement tool scheduling (depends_on, coerce_params)

e4f4f4b

fix: register tools before testing scheduling schema injection

edcbf60

fix: strip coercion, add ID uniquify, fix input_schema injection

4e347b8

fix(tests): monkeypatch quota.record_usage in scheduling e2e tests

757b9a4

refactor(tests): mutualise e2e helpers into tests/conftest.py

805b024

Simon-Free force-pushed the pr4-tool-scheduling-depends-on branch from e9a96f7 to 805b024, April 21, 2026 12:54

fix: use scripted_stream as pytest fixture, remove direct import

362a04b

Simon-Free wants to merge 14 commits into SafeRL-Lab:main from Simon-Free:pr4-tool-scheduling-depends-on
