-
Notifications
You must be signed in to change notification settings - Fork 2.1k
refactor(test): wire E2E security tests to real production code paths #1107
Description
Context
PR #1092 added E2E tests for credential sanitization and Telegram injection, but the tests reimplement production logic inline rather than calling the real code. Carlos flagged this in review — the tests can pass even if production regresses.
Problems
test/e2e/test-credential-sanitization.sh
-
stripCredentials()+isCredentialField()reimplemented 3x (C1-C5, C12, C13) — copy-pastesCREDENTIAL_FIELDS,CREDENTIAL_FIELD_PATTERN, and both functions intonode -eheredocs instead of importing frommigration-state.ts. -
walkAndRemoveFile()reimplemented 2x (C1-C5, C8) — production usescopyDirectory()with aCREDENTIAL_SENSITIVE_BASENAMESfilter. Test implements its own recursive walk. -
sanitizeConfigFile()behavior has drifted — production doesdelete config.gatewaythenstripCredentials(). The test does NOT deletegateway, it strips fields inside it. C4b (gateway.modepreserved) tests wrong behavior. -
verifyBlueprintDigest/verifyDigestreimplemented (C9-C11) — self-fulfilling tests that define their own verification logic, never calling production'scomputeFileDigest(). -
Python dependency for JSON parsing (C3-C4) — uses
python3 -c "import json..."instead ofnode -e.
test/e2e/test-telegram-injection.sh
-
send_message_to_sandbox()is dead code — defined but never called. All tests use inline SSH. -
Tests bypass
runAgentInSandbox()/shellQuote()— T1-T4, T8 useMSG=$(cat) && echo "$MSG"over SSH, a different code path than production'sshellQuote()frombin/lib/runner.js. A regression inshellQuote()wouldn't be caught. -
sandbox_exec()still fails open — this copy wasn't updated with the fail-closed fix applied to the credential test.
Required changes
Production code
nemoclaw/src/commands/migration-state.ts: exportisCredentialField,stripCredentials,sanitizeConfigFile,computeFileDigest,CREDENTIAL_FIELDS,CREDENTIAL_FIELD_PATTERN(or extract to a shared module).scripts/telegram-bridge.js: exportrunAgentInSandboxfor testability.
Test code
- Replace all inline
stripCredentials/walkAndRemoveFile/verifyDigestwithrequire()of real code. - Fix
sanitizeConfigFiledrift (gateway deletion vs stripping). - Replace python3 JSON parsing with node.
- Wire telegram injection tests through
shellQuote()/runAgentInSandbox(). - Remove or use
send_message_to_sandbox()dead code. - Fix
sandbox_exec()fail-closed in telegram test.
References
- PR test(security): add Brev E2E tests for command injection and credential sanitization #1092 (merged) — original E2E tests
- Carlos's review: test(security): add Brev E2E tests for command injection and credential sanitization #1092 (review)
shellQuoteandvalidateNameare already exported frombin/lib/runner.js✅