Summary
OPENAI_API_KEY is referenced at runtime by both the App API and the Inference API (AgentCore Runtime), but it is not passed through to the deployed containers. Today it only works when a developer sets it in their local backend/src/.env. Deployed environments return a 500 from /admin/models/openai and any OpenAI-provider chat request fails in AgentFactory.
Current State
Consumer call sites (already implemented)
- os.getenv(EnvVars.OPENAI_API_KEY); raises ValueError if unset. Runs inside the AgentCore Runtime container.
- os.environ.get('OPENAI_API_KEY'); returns 500 if unset. Runs inside the ECS App API container.
- Both key names trace back to the EnvVars.OPENAI_API_KEY constant.
Infrastructure gap
Neither CDK stack passes the value through:
- The environment: block on the AppApiContainer does not include OPENAI_API_KEY.
- environmentVariables: on CfnRuntime does not include OPENAI_API_KEY.
- No CI configuration references OPENAI, so the value is not sourced from CI secrets either.
Proposed Solution — Follow the existing OAuth/Auth-Provider Secrets pattern
The repo already has a clean pattern for runtime secrets (see OAUTH_CLIENT_SECRETS_ARN and AUTH_PROVIDER_SECRETS_ARN): create a Secrets Manager secret in InfrastructureStack, publish its ARN via SSM, import the ARN in consumer stacks, inject the ARN as an env var, grant secretsmanager:GetSecretValue on the task role, and fetch the plaintext at runtime.
Why ARN-at-runtime rather than raw env var injection:
- Consistent with existing secrets in this repo.
- AgentCore CfnRuntime.environmentVariables only accepts plain string values (no ValueFrom/secrets block like ECS has), so we'd need a fetch-at-runtime helper for the Inference API regardless. Using the same approach in both services keeps things symmetric.
- The plaintext key never lands in CloudFormation templates, task definitions, or logs.
Instructions for the implementing agent
Scope boundaries: Do not change the two consumer call sites to use a different config key name. They already read OPENAI_API_KEY from the environment — keep it that way. Your job is to ensure that env var is populated (from Secrets Manager) inside both deployed containers, plus add a small helper that hydrates os.environ['OPENAI_API_KEY'] at startup from the secret ARN when running in AWS.
1. CDK — InfrastructureStack
In infrastructure/lib/infrastructure-stack.ts, near the existing OAuthClientSecretsSecret (around line 590):
- Create new secretsmanager.Secret(this, "OpenAiApiKeySecret", { secretName: getResourceName(config, "openai-api-key"), description: "OpenAI API key for OpenAI provider models" }) with removalPolicy: getRemovalPolicy(config).
- Do not generate the value via generateSecretString — this is a user-supplied key. The secret is created empty and populated out-of-band (documented in step 6).
- Publish the ARN to SSM at /${config.projectPrefix}/llm/openai-api-key-secret-arn (mirror the OAuthClientSecretsArnParameter block exactly).
2. CDK — AppApiStack
In infrastructure/lib/app-api-stack.ts:
- Import the ARN via ssm.StringParameter.valueForStringParameter(this, `/${config.projectPrefix}/llm/openai-api-key-secret-arn`) near the other SSM imports (note the template literal — single quotes would not interpolate config.projectPrefix).
- Add OPENAI_API_KEY_SECRET_ARN: openAiApiKeySecretArn to the container environment: block (~line 403).
- Add an IAM policy statement to taskDefinition.taskRole granting secretsmanager:GetSecretValue and secretsmanager:DescribeSecret on ${openAiApiKeySecretArn}* (wildcard to cover the random suffix), following the exact shape of the existing OAuthClientSecretsAccess statement (~line 912).
3. CDK — InferenceApiStack
In infrastructure/lib/inference-api-stack.ts:
- Import the same SSM ARN alongside the other parameter imports.
- Add OPENAI_API_KEY_SECRET_ARN: openAiApiKeySecretArn to the CfnRuntime.environmentVariables block (~line 904).
- Add a secretsmanager:GetSecretValue / secretsmanager:DescribeSecret statement to the runtime execution role (not the task role — AgentCore uses runtimeExecutionRole; see around line 195 where existing SSM permissions are granted).
4. Backend — secret hydration helper
The two call sites read the OPENAI_API_KEY environment variable directly. Add a small bootstrap helper that populates it from Secrets Manager before the consumers run:
- Create backend/src/apis/shared/secrets/openai_key.py (new file) with one function hydrate_openai_api_key() -> None that:
  - Short-circuits if OPENAI_API_KEY is already set (local dev path — .env wins).
  - Reads OPENAI_API_KEY_SECRET_ARN from env; returns silently if empty (OpenAI is optional).
  - Uses boto3.client('secretsmanager').get_secret_value(SecretId=arn) and sets os.environ['OPENAI_API_KEY'] from SecretString (treat as plaintext — the secret stores the raw key, not JSON).
  - Catches ClientError with ResourceNotFoundException / empty SecretString and logs a warning rather than raising — OpenAI is optional and the rest of the system must still boot.
  - Uses a module-level flag so it runs at most once per process.
- Call it from both service entrypoints at startup.
- Leave the consumer call sites untouched.
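The bullets above could be realized with a sketch like the following. The module path and function name come from this issue; the logging setup and the lazy boto3 import are choices of this sketch, not repo conventions:

```python
# backend/src/apis/shared/secrets/openai_key.py (sketch only; adjust to repo style)
import logging
import os

logger = logging.getLogger(__name__)

_hydrated = False  # module-level flag: run the fetch at most once per process


def hydrate_openai_api_key() -> None:
    """Populate os.environ['OPENAI_API_KEY'] from Secrets Manager when in AWS."""
    global _hydrated
    if _hydrated:
        return
    _hydrated = True

    # Local dev path: an explicitly set key (e.g. from backend/src/.env) wins.
    if os.environ.get("OPENAI_API_KEY"):
        return

    # No ARN means OpenAI is not configured for this environment; nothing to do.
    arn = os.environ.get("OPENAI_API_KEY_SECRET_ARN")
    if not arn:
        return

    # Imported lazily so environments without boto3 can still import this module.
    import boto3
    from botocore.exceptions import ClientError

    try:
        response = boto3.client("secretsmanager").get_secret_value(SecretId=arn)
        secret = response.get("SecretString")  # plaintext key, not JSON
        if secret:
            os.environ["OPENAI_API_KEY"] = secret
        else:
            logger.warning("OpenAI API key secret is empty; OpenAI models disabled")
    except ClientError as exc:
        # Covers ResourceNotFoundException and friends: log, don't raise.
        logger.warning("Could not fetch OpenAI API key secret: %s", exc)
```

If boto3 is already an unconditional backend dependency, a top-level import works equally well; the lazy import just keeps the no-op paths free of AWS SDK requirements.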
5. Tests
- Add backend/tests/apis/shared/secrets/test_openai_key.py covering: (a) already-set env var is preserved, (b) missing ARN is a no-op, (c) successful fetch populates env, (d) ResourceNotFoundException logs and does not raise, (e) idempotent on second call.
- Use moto or unittest.mock — match whatever pattern is already used in backend/tests/ (check the existing secrets-related tests first).
- Do not modify the existing test_agent_factory.py tests around OPENAI_API_KEY — they correctly test the factory's behavior assuming the env var is set.
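A minimal sketch of cases (a) and (b) using unittest.mock's patch.dict for env-var isolation. In the real test file, hydrate_openai_api_key would be imported from the new module; a trivial stand-in with the same contract keeps this sketch self-contained:

```python
import os
from unittest import mock


def hydrate_openai_api_key() -> None:
    # Stand-in for the real helper, replicating only the short-circuit contract.
    if os.environ.get("OPENAI_API_KEY"):
        return
    if not os.environ.get("OPENAI_API_KEY_SECRET_ARN"):
        return
    os.environ["OPENAI_API_KEY"] = "fetched-from-secrets-manager"


def test_already_set_env_var_is_preserved():
    # patch.dict restores the real environment when the context exits.
    with mock.patch.dict(os.environ, {"OPENAI_API_KEY": "sk-local"}, clear=True):
        hydrate_openai_api_key()
        assert os.environ["OPENAI_API_KEY"] == "sk-local"


def test_missing_arn_is_noop():
    with mock.patch.dict(os.environ, {}, clear=True):
        hydrate_openai_api_key()
        assert "OPENAI_API_KEY" not in os.environ
```

Cases (c) and (d) additionally need the Secrets Manager client stubbed (moto, or mock.patch on boto3.client); the env-isolation pattern stays the same.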
6. Documentation
- Update backend/src/.env.example:418 to add a new OPENAI_API_KEY_SECRET_ARN= entry below OPENAI_API_KEY= with a comment explaining: local dev uses OPENAI_API_KEY, deployed environments use OPENAI_API_KEY_SECRET_ARN which CDK populates from SSM.
- Add a short section to the appropriate deploy doc under .github/docs/deploy/ (check what exists for OAuth client secrets — likely step-02-aws-setup.md or similar) documenting the post-deploy step: aws secretsmanager put-secret-value --secret-id <name> --secret-string <your-openai-key>. Note that until this is populated, OpenAI models will not work but the rest of the system is unaffected.
7. Verification
- cd infrastructure && npm run build && npx cdk synth must succeed.
- cd backend && uv run python -m pytest tests/apis/shared/secrets/ tests/agents/main_agent/core/test_agent_factory.py -v must pass.
- Manually trace the flow: a fresh cdk deploy + put-secret-value + redeploy of the App API service → GET /admin/models/openai returns the OpenAI model list instead of 500.
Non-Goals
- Do not add GOOGLE_GEMINI_API_KEY plumbing in this issue — same problem exists but track it separately to keep the diff reviewable.
- Do not change the EnvVars.OPENAI_API_KEY constant or rename anything at the consumer call sites.
- Do not make OpenAI a required dependency — the system must still boot when the secret is empty.