fix: route /codex:rescue through the Agent tool to stop Skill recursion (#234)#235
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: eb24951c89
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Address openai#235 review comment (Codex bot, P2): the new character-for-character model rule in `agents/codex-rescue.md` conflicts with the `spark` alias line immediately above it. For a user request like "use spark", line 31 says to emit `--model gpt-5.3-codex-spark`, while line 33 says never to emit a model name the user did not literally type. That is nondeterministic and undermines the no-duplicate-call goal. Rephrase the literal-copy rule to explicitly exempt the documented `spark` alias, and mirror the fix in `skills/codex-cli-runtime/SKILL.md`. Pin the exemption with a regression assertion so a future edit can't silently reintroduce the contradiction.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c582eda025
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
…on (openai#234) `/codex:rescue` previously combined two things that together caused a hang: - `context: fork` in the frontmatter, which spawns a `general-purpose` subagent for the command body. - Body prose "Route this request to the `codex:codex-rescue` subagent." without naming the transport. When the main agent called `Skill(codex:rescue)` programmatically, the fork resolved the ambiguous prose by trying `Skill(codex:codex-rescue)` (unknown skill) and then falling back to `Skill(codex:rescue)`, which re-entered this command and hung the session until the user cancelled. No Codex job was ever created. Naming the transport as `Agent(codex:codex-rescue)` alone is not enough: forked general-purpose subagents do not expose the `Agent` tool, so the forked runner cannot reach the subagent that way either. The minimal fix is therefore two coordinated changes: - Drop `context: fork` so the command body runs inline in the calling agent's context, where `Agent` is in scope. - Say explicitly "use the `Agent` tool with `subagent_type: "codex:codex-rescue"`", and call out that `Skill(codex:codex-rescue)` and `Skill(codex:rescue)` are not valid routing paths. Add `Agent` to `allowed-tools` so the call does not prompt for permission. Everything else in rescue.md (resume-candidate check, flag handling, background/foreground semantics, operating rules) is unchanged. The `codex:codex-rescue` subagent itself is unchanged. Tests pin the new allow-list, the explicit `subagent_type`, the ban on `Skill(codex:codex-rescue)`, and the absence of `context: fork`. The existing "run the `codex:codex-rescue` subagent in the background" assertion continues to hold since that sentence still reads correctly with the Agent-tool transport. Fixes openai#234
c582eda to
cd4cb60
Compare
|
A note on the Original topology (with The bug: forked After the fix: the command body runs inline in the caller, which does have Consistency check: every other command in Trade-off acknowledged: the calling agent now sees the rescue.md body and the Alternative considered: keeping the fork and having it call |
Fixes #234.
What was broken
/codex:rescuewas markedcontext: fork, so the command body ran inside a forkedgeneral-purposesubagent. The body said:When the main Claude Code agent called
Skill(codex:rescue)programmatically (as opposed to the user typing the slash command), the forked runner read that ambiguous prose and guessed the transport. It triedSkill(codex:codex-rescue)— unknown skill — then fell back toSkill(codex:rescue), which re-entered this same command and recursed until the user cancelled. No Codex job was ever created.The user-facing symptom, quoted from the issue:
Why the fix is two coordinated changes and not one
The issue's "Suggested fix" section lists two options:
Agenttool).allowed-toolsto excludeSkill.In practice, neither alone is sufficient. Forked
general-purposesubagents do not expose theAgenttool, so instructing the fork to "useAgent" just trades the hang for a clean failure — the main agent then retries on its own. The minimum fix is therefore two coordinated changes:context: fork. The command body now runs inline in the calling agent's context, where theAgenttool is in scope.codex:codex-rescueviaAgent(subagent_type: "codex:codex-rescue"), addAgenttoallowed-tools, and call out thatSkill(codex:codex-rescue)/Skill(codex:rescue)are not valid paths.Everything else in
rescue.md— resume-candidate check,--background/--wait/--resume/--freshhandling, operating rules, model/effort pass-through — is unchanged.agents/codex-rescue.mdandskills/codex-cli-runtime/SKILL.mdare not touched.Diff
plugins/codex/commands/rescue.md: 4 lines changed (one frontmatter swap, two body sentences rewritten).tests/commands.test.mjs: pin the newallowed-tools, pin the explicitsubagent_type: "codex:codex-rescue"prose, pin theSkill(codex:codex-rescue)ban, and assertcontext: forkis absent. Total +9 / −1.Not in scope
During iteration on this fix I observed adjacent duplicate-
task-call patterns insidecodex:codex-rescue(model-name drift retried on Codex 4xx, long prompts failing on zsh shell-quoting, main-agent retry on upstream failure). Those are separate bugs and deserve separate issues and separate PRs where they can be discussed on their own merits. I deliberately kept this PR scoped to #234's routing hang to keep the diff small, reviewable, and directly tied to the filed issue.Test plan
node --test tests/commands.test.mjs -t "rescue command absorbs continue semantics"passes.rescue.mdrestored, the new assertions fail onallowed-tools→Agentmissing, then would also fail on thesubagent_typeand no-context: forkassertions.node --test tests/commands.test.mjsas a whole: 7/8 — the one failure is a pre-existingresult "$ARGUMENTS"quoting assertion unrelated to rescue (introduced in fix: quote $ARGUMENTS in cancel, result, and status commands #168, also fails on pristine upstream/main).Commits
One commit:
cd4cb60 fix: route /codex:rescue through the Agent tool to stop Skill recursion (#234).