fix(security): eliminate API key leakage via ps aux across all three execution layers#1
Open
fix(security): eliminate API key leakage via ps aux across all three execution layers#1
Conversation
2afa1f4 to
47512fa
Compare
Add three argv-safe helpers to bin/lib/runner.js:
runArgv(prog, args, opts) -- spawnSync without shell
runCaptureArgv(prog, args, opts) -- execFileSync without shell; returns stdout
assertSafeName(name, label) -- validates against [a-zA-Z0-9][a-zA-Z0-9_-]{0,62}
Fix pre-existing opts.env overwrite: old spread { ...opts } after the merged env
silently clobbered it. mergeEnv(opts) destructures opts.env first.
test/runner.test.js: 22 new assertions (assertSafeName rejections, injection
PoC, opts.env preservation).
Closes shell-injection attack surface in the legacy CJS layer by replacing
all user-controlled run() / runCapture() shell strings with the new argv-safe
runArgv() / runCaptureArgv() helpers. assertSafeName() guards every
user-supplied sandbox/instance/preset name before it enters any command.
bin/lib/onboard.js -- all openshell/bash/brew calls -> runArgv;
file copies -> fs.cpSync/fs.rmSync (no cp shell)
bin/lib/nim.js -- docker pull/rm/run/stop/inspect -> runArgv/runCaptureArgv;
assertSafeName guard on sandboxName
bin/lib/policies.js -- openshell policy get/set -> runCaptureArgv/runArgv;
assertSafeName on sandboxName and presetName;
temp policy file written with mode 0o600
bin/nemoclaw.js -- setupSpark: remove inline NVIDIA_API_KEY=VALUE from
sudo argv (sudo -E already inherits env);
deploy: assertSafeName on instanceName;
sandbox connect/status/logs/destroy -> runArgv
Supersedes PRs: NVIDIA#148 (shell injection), part of NVIDIA#330 (credential leak).
… in argv Fixes: NVIDIA#325 (API key exposed in process list via ps aux) Supersedes: PRs NVIDIA#191, NVIDIA#330 The root cause: all three execution layers passed the actual credential VALUE as --credential KEY=VALUE, making it visible to any local user via `ps aux` or /proc/<pid>/cmdline. Safe pattern: set the secret in the child's inherited env, then pass only the env-var NAME to --credential (openshell env-lookup form). nemoclaw/src/commands/onboard.ts - process.env[credentialEnv] = apiKey before execOpenShell - --credential arg: credentialEnv (name only, not KEY=VALUE) - applies to both provider create and provider update paths nemoclaw-blueprint/orchestrator/runner.py - Rename credential_env -> target_cred_env with type-based fallback (nvidia -> NVIDIA_API_KEY, openai -> OPENAI_API_KEY) when not set in the blueprint profile. Supersedes PR NVIDIA#191's partial fix. - os.environ[target_cred_env] = credential before run_cmd - --credential arg: target_cred_env (name only) nemoclaw-blueprint/blueprint.yaml - Add credential_env: NVIDIA_API_KEY to the default profile. Without this field the type-based fallback would silently use OPENAI_API_KEY for the nvidia provider_type, causing auth failure. nemoclaw/src/onboard/config.ts - writeFileSync for config.json now passes mode: 0o600 so the file containing endpoint/model/credentialEnv metadata is not world-readable. test/credential-exposure.test.js (new file) - Static source scan: asserts no --credential KEY=VALUE pattern in any of the 3 execution layer files (allowlists dummy/ollama stubs) - Layer-specific structural checks (process.env set, os.environ set, blueprint default profile has credential_env) - Runtime injection PoC: proves old bash -c IS vulnerable; new runCaptureArgv IS NOT All 84 tests pass.
47512fa to
892bc25
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes NVIDIA#325 — API key exposed in process list via
ps aux.This PR is a comprehensive rollup that supersedes open PRs NVIDIA#148, NVIDIA#191, NVIDIA#225, and NVIDIA#330. It fixes every instance of secret leakage and shell injection that was identified during a first-principles audit of all three execution layers.
Root Cause
openshell provider create --credential KEY=VALUEpasses the secret as a command-line argument. Every user on the machine can read it viaps auxor/proc/<pid>/cmdline. The same pattern existed in:bin/lib/onboard.jsrun(\openshell provider create --credential "NVIDIA_API_KEY=${process.env.NVIDIA_API_KEY}"`)`nemoclaw/src/commands/onboard.tsexecOpenShell(["--credential", \${credentialEnv}=${apiKey}`])`nemoclaw-blueprint/orchestrator/runner.pyprovider_args.extend(["--credential", f"OPENAI_API_KEY={credential}"])Fix — Three Commits
Commit 1 —
fix(runner): safe argv primitives + opts.env overwrite fixrunArgv(prog, args, opts)—spawnSyncwithout shell; no metacharacter expansionrunCaptureArgv(prog, args, opts)—execFileSyncwithout shell; returns stdout stringassertSafeName(name, label)— validates user-supplied names against[a-zA-Z0-9][a-zA-Z0-9_-]{0,62}; callsprocess.exit(1)on rejectionopts.envoverwrite bug:mergeEnv()destructuresopts.envbefore the rest spread soPATH/HOME/DOCKER_HOSTare always preservedCommit 2 —
fix(cli): shell-string → argv arrays (injection prevention)Converts every
run(\...`)` call that accepted user-controlled values across:bin/lib/onboard.js— all openshell/bash/brew calls,fs.cpSync/fs.rmSyncreplacing shellcp/rmbin/lib/nim.js— docker pull/rm/run/stop/inspect +assertSafeNameon sandboxNamebin/lib/policies.js— openshell policy get/set +assertSafeNameon both sandboxName and presetName + temp file written withmode: 0o600bin/nemoclaw.js— removes inlineNVIDIA_API_KEY=VALUEfromsudoargv (superseded bysudo -Eenv inherit);assertSafeNameon deploy instanceName; sandbox connect/status/logs/destroy →runArgvCommit 3 —
fix(credentials): env-lookup form for--credential; secrets never in argvSafe pattern: set the secret in
process.env/os.environbefore the call, then pass only the env-var name to--credential— openshell reads the value from the environment, never from argv.nemoclaw/src/commands/onboard.tsprocess.env[credentialEnv] = apiKeybeforeexecOpenShell; bothprovider createandupdatepaths changednemoclaw-blueprint/orchestrator/runner.pytarget_cred_envwith type-based fallback (supersedes NVIDIA#191);os.environ[target_cred_env] = credential;--credential target_cred_envnemoclaw-blueprint/blueprint.yamlcredential_env: NVIDIA_API_KEYtodefaultprofile — without it the fallback would pickOPENAI_API_KEYfor thenvidiaprovider typenemoclaw/src/onboard/config.tswriteFileSyncuses{ mode: 0o600 }so config.json is owner-readable onlyVerification
Key tests added:
test/runner.test.js(22 assertions) — assertSafeName rejects;,$(),|,../, spaces;runCaptureArgvdoes not expand shell metacharacters;opts.envpreservesPATHtest/credential-exposure.test.js(9 assertions) — static scan of all 3 layers for--credential KEY=VALUEpatterns; structural checks forprocess.env[credentialEnv]andos.environ[target_cred_env]; runtime injection PoCps aux before/after
Before (vulnerable):
After (safe):
The secret is in the process's environment variables, not in its argv.
PRs Superseded
isNonInteractive()inonboard.tsalready implements this--credentialarg