refactor: robust prompt handling in subagent dispatcher and build orchestrator#142
refactor: robust prompt handling in subagent dispatcher and build orchestrator#142RinZ27 wants to merge 8 commits intoSynkraAI:mainfrom
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughReplaces shell-echo piping with direct Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested labels: 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @.aios-core/core/execution/subagent-dispatcher.js:
- Around line 653-661: Add an 'error' event handler on the spawned process stdin
to prevent unhandled stream errors: after creating the child via spawn(...) and
before/around child.stdin.write(prompt) attach child.stdin.on('error', (err) =>
{ /* log with context and safely ignore or handle EPIPE/stream errors */ });
also guard the write by checking child.stdin.writable (or wrap the write in
try/catch) so that the code using child, spawn, and child.stdin.write cannot
throw an unhandled exception if the pipe breaks or the child exits early.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @.aios-core/core/execution/subagent-dispatcher.js:
- Around line 651-653: In executeClaude, validate the prompt before any
childProcess.stdin.write occurs: inside the Promise wrapper at the top of
executeClaude check if prompt === null || prompt === undefined and immediately
reject the Promise (or throw a TypeError) with a clear message instead of
allowing "null"/"undefined" to be written to stdin; update callers only if
necessary, and ensure the check is placed before the code that builds args and
calls child.stdin.write so the guard prevents downstream issues.
🧹 Nitpick comments (1)
tests/core/claude-execution-robustness.test.js (1)
35-63: Only happy-path tests — consider adding error and edge-case coverage.Both test blocks verify the success scenario exclusively. For a robustness-focused test suite, consider adding cases for:
- Non-zero exit code from the spawned process (verifies reject path)
spawnemitting an'error'event (e.g.,claudebinary not found —ENOENT)- Prompts containing edge-case content (empty string, very large input)
This is especially relevant since the PR title emphasizes robustness.
Also applies to: 66-88
33578b1 to
dce23fb
Compare
dce23fb to
229b80b
Compare
nikolasdehor
left a comment
There was a problem hiding this comment.
Excellent security improvement. Approving.
What this fixes
The previous implementation used shell pipe (echo '${escapedPrompt}' | claude) which is vulnerable to:
- Command injection via crafted prompts containing shell metacharacters
- Quote escaping failures — the
replace(/'/g, "'\\\\''")approach is fragile and breaks with certain Unicode characters or nested quotes
The new approach uses child.spawn('claude', args) with stdin.write(prompt) which completely bypasses the shell, eliminating the injection surface.
Code quality observations
- ✅ Input validation added (
if (!prompt || typeof prompt !== 'string')) - ✅ Graceful stdin error handling (
child.stdin.on('error', ...)) - ✅ Guard for writable check (
if (child.stdin.writable)) - ✅ Both
SubagentDispatcher.executeClaudeandBuildOrchestrator.runClaudeCLIare updated consistently - ✅ Comprehensive test suite covering success, validation, error, and spawn failure scenarios
- ✅ entity-registry.yaml checksums updated to match the new file hashes
Minor note
The child.stdin.on('error') callback is empty — this is fine since errors are handled via the child.on('error') / child.on('close') events. A debug log there could help troubleshooting in the future, but not blocking.
Well done — this is exactly the kind of security hardening the codebase needs. 🔒
|
@RinZ27 is attempting to deploy a commit to the Pedro Valério Lopez's projects Team on Vercel. A member of the Team first needs to authorize it. |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
.aios-core/core/execution/subagent-dispatcher.js (1)
665-668: Silent stdin error handler loses observability context.The no-op handler prevents the unhandled-exception crash (past comment addressed ✅), but an EPIPE or write error on stdin is completely invisible — no log entry, no metric. In the EPIPE case the close handler will ultimately reject, but you lose the root-cause signal. A single structured log call here would make debugging much easier in production.
🔍 Suggested improvement
- child.stdin.on('error', (err) => { - // Handled via child 'error' or 'close' events - }); + child.stdin.on('error', (err) => { + // EPIPE/ECONNRESET are expected if the child exits early; + // the 'close' handler below will reject the promise with the exit code. + this.log('stdin_write_error', { error: err.message, code: err.code }); + });🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.aios-core/core/execution/subagent-dispatcher.js around lines 665 - 668, The current no-op error handler on child.stdin ('child.stdin.on("error", (err) => { ... })') swallows EPIPE/write errors and loses observability; update that handler to emit a single structured log (e.g., processLogger.error or logger.error) including the error object and contextual identifiers (child.pid, command/args or task id available in the surrounding scope) and then return without rethrowing so the unhandled crash is still avoided; ensure the handler does not alter control flow (no throw) and keeps the existing behavior of letting the child's 'error' or 'close' handlers handle lifecycle.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.aios-core/core/execution/build-orchestrator.js:
- Around line 536-544: The current stdin error handler for child.stdin
(child.stdin.on('error', ...)) silently swallows errors and write/end are
conditional on child.stdin.writable, which can both mask EPIPE failures and
leave the child hanging; update the logic so that (1) the child.stdin 'error'
handler captures the error into a scoped variable (e.g., stdinError) and does
not ignore it, (2) the Promise resolution in the child's 'close' handler checks
for that stdinError and rejects with it (or includes it in the
rejection/returned error), and (3) always signal EOF to the child (call
child.stdin.end()) even if the writable flag is false/unreliable—ensure end() is
invoked in a safe pathway (e.g., try/catch/finally or a fallback setImmediate)
after attempting child.stdin.write(prompt) so the child cannot block waiting for
input; reference child.stdin.on('error'), child.stdin.write(prompt),
child.stdin.end(), and the child's 'close' handler to implement these changes.
- Around line 528-533: The child process 'child' created via spawn('claude',
args, ...) has a 'close' event handler that only inspects 'code' and logs
"Claude CLI exited with code null", which hides timeout signals; update the
'close' listener to accept both (code, signal) parameters and include the
'signal' value in the error/log output (e.g., when code === null or non-zero,
report both code and signal and special-case SIGTERM/SIGKILL to indicate a
timeout) so timeout diagnostics are clear.
In @.aios-core/core/execution/subagent-dispatcher.js:
- Around line 671-674: The current branch that writes to the child's stdin
checks child.stdin.writable but lacks an else path, which can leave the Promise
hanging; modify the block around child.stdin.writable so that if it's false you
immediately reject the surrounding Promise (include an Error with context) and
ensure any necessary cleanup is performed; specifically update the logic that
calls child.stdin.write(prompt) / child.stdin.end() to add an else that calls
reject(new Error(...)) (reference child.stdin.writable, child.stdin.write,
child.stdin.end, and the child 'close' event) so the caller is not left waiting
indefinitely.
---
Duplicate comments:
In @.aios-core/core/execution/subagent-dispatcher.js:
- Around line 652-654: The prompt validation in executeClaude (the guard `if
(!prompt || typeof prompt !== 'string') { return Promise.reject(new
Error('executeClaude requires a non-empty string prompt')); }`) is correct and
requires no changes—leave the check as-is since it safely handles
null/undefined/empty/non-string inputs and returning Promise.reject from this
non-async function is valid.
---
Nitpick comments:
In @.aios-core/core/execution/subagent-dispatcher.js:
- Around line 665-668: The current no-op error handler on child.stdin
('child.stdin.on("error", (err) => { ... })') swallows EPIPE/write errors and
loses observability; update that handler to emit a single structured log (e.g.,
processLogger.error or logger.error) including the error object and contextual
identifiers (child.pid, command/args or task id available in the surrounding
scope) and then return without rethrowing so the unhandled crash is still
avoided; ensure the handler does not alter control flow (no throw) and keeps the
existing behavior of letting the child's 'error' or 'close' handlers handle
lifecycle.
There was a problem hiding this comment.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In @.aios-core/core/execution/build-orchestrator.js:
- Around line 564-570: Update the child.on('close', ...) handler in
build-orchestrator.js to accept both parameters (code, signal) and use signal
when code is null so timeout/kill cases are reported clearly; if signal is
present, reject with a message like "Claude CLI terminated by signal SIGTERM"
(instead of "code null"), and also include any existing stdinError variable in
the rejection message or log to surface IO errors—modify the resolve/reject
logic inside the close callback to prefer a human-readable signal description
when code === null and to append stderr and stdinError details for complete
diagnostics.
- Around line 535-545: Replace the current stdin error-swallowing logic: install
a variable (e.g., stdinError) outside the handlers, keep the
child.stdin.on('error', ...) handler but make it assign the error to stdinError
(not just console.debug) so the Promise/close handler can see it; always attempt
to close the stream (call child.stdin.end()) in a safe try/catch/finally path
even if child.stdin.writable is false (so the child won't hang), and wrap
child.stdin.write(prompt) in try/catch to capture write failures and set
stdinError; finally, update the process close handler to fail/reject or
propagate an error when stdinError is set (rather than treating a zero exit with
empty stdout as success). Ensure references to child.stdin.on('error'),
child.stdin.write(prompt), child.stdin.end(), the stdinError variable, and the
close handler are updated accordingly.
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.aios-core/core/execution/build-orchestrator.js:
- Around line 535-550: Attach the child process 'error' listener before you
check stdin.writable so the listener (child.on('error', reject)) is always
registered even if you take the early-return path, and replace the direct
console.debug call in the stdin error handler with the BuildOrchestrator logging
method (e.g., this.log.debug or this.log with appropriate verbosity) so debug
output respects this.config.verbose; keep the existing stdinError capture and
the reject call in the non-writable branch but ensure the error listener is
already attached prior to potentially returning.
In @.aios-core/core/execution/subagent-dispatcher.js:
- Around line 659-663: executeClaude currently spawns Claude without a timeout
causing hung processes; add a configurable timeout (e.g., claudeTimeout or
subtaskTimeout) to SubagentDispatcher (initialize in the constructor with a
sensible default like 10 minutes), expose it on the instance so callers can
override, and pass that value into the spawn options inside executeClaude
(include timeout and keep signal handling intact). Also ensure the existing
close handler in executeClaude (the branch that checks for signal) will be
reachable on timeout by confirming the spawn timeout triggers the child
termination and that any try/catch around the spawn/stream handling wraps and
rethrows errors with context (including when timeout occurs). Ensure dispatch
callers that create SubagentDispatcher can opt in via the new constructor
parameter or config field.
- Around line 665-680: Move the child process "error" listener to immediately
after the spawn so it is registered in all paths (e.g., add child.on('error',
reject) right after creating child) instead of after the stdin writable check;
ensure the listener uses the same reject handler used elsewhere. Also replace
the console.debug call that logs stdin errors with the class logging mechanism
(this.log or dispatchLog) so diagnostics follow existing logging conventions
(use stdinError variable and the same message text when calling this.log).
nikolasdehor
left a comment
There was a problem hiding this comment.
Re-review: Novos commits desde 18/02
Revisitando apos 5 novos commits adicionados desde meu review original.
O que mudou
Os commits 70933742, 6b0e158b, e 352027e0 endereçam diretamente o ponto que levantei no review anterior sobre o callback vazio no child.stdin.on('error'). Especificamente:
-
Debug logging no stdin -- Agora ambos os arquivos logam o erro com mensagem e code:
build-orchestrator.js:this.log(\Claude stdin stream error: ${err.message}`, 'debug')`subagent-dispatcher.js:this.log('stdin_write_error', { error: err.message, code: err.code })
-
Diagnostico de exit por signal -- O handler
closeagora recebe(code, signal)e diferencia 3 cenarios:stdinError--> rejeita com mensagem clara de que o prompt nao foi entreguesignal--> rejeita com mensagem indicando timeout ou kill externocode !== 0--> comportamento anterior mantido
-
Timeout configuravel --
SubagentDispatcheragora aceitaconfig.claudeTimeout(default 10min) e passa para ospawn(). Isso e uma melhoria pratica -- antes o timeout era herdado apenas doconfig.subtaskTimeoutnoBuildOrchestrator. -
Registro de erro antecipado --
child.on('error')foi movido para antes dostdin.write(), garantindo que erros de spawn (ENOENT) sejam capturados mesmo se ocorrerem antes da escrita. -
Merge conflicts -- Os dois commits de merge (
273bed5b,6746e5e8) sao apenas resolucoes de conflito noentity-registry.yaml, sem impacto funcional.
Verificacao
- O padrao stdin e consistente entre ambos os arquivos (SubagentDispatcher e BuildOrchestrator)
- Os testes em
claude-execution-robustness.test.jscobrem os cenarios de sucesso, validacao de input, exit code nao-zero, e erro de spawn - A prioridade correta de erros no
closehandler: stdinError > signal > exit code
Todas as sugestoes do review anterior foram endereçadas. Aprovando.
re-reviewed by @nikolasdehor
|
This pull request has been automatically marked as stale because it has not had |
|
Merged the latest main to resolve the path conflicts from the project rename. While updating the test suite, I refined the stdin write logic to handle spawn failures more robustly using a dedicated error flag. Everything is passing locally now that the paths are corrected to .aiox-core. @oalanicolas @Pedrovaleriolopez, please take another look. |
Refactored the Claude execution logic in
SubagentDispatcherandBuildOrchestratorto usestdininstead of shell interpolation.During a review of the execution layer, I noticed prompts were being passed via
echoand shell pipes. This approach is prone to failure if the LLM output or task description contains shell-sensitive characters (like complex quotes or pipes), even with basic escaping.This update switches to a direct
stdinwrite, which:GeminiProvider).Verified with existing test suites.
Summary by CodeRabbit
Bug Fixes
Tests
Chores