Harden /ready checks + production CORS guardrails #79

Merged
alanmaizon merged 2 commits into main from feat/readyz-cors-lockdown
Feb 18, 2026

Conversation

Owner

@alanmaizon alanmaizon commented Feb 17, 2026

What

  • /ready now verifies DB connectivity + storage (local or S3 write/read), with a small in-memory cache so the probes are not re-run on every request.
  • Production CORS guardrails: backend fails fast at startup if CORS_ORIGINS contains http:// or localhost (including 127.0.0.1) when APP_ENV=production.
  • Deploy readiness script now fails a production deploy when CORS_ORIGINS is insecure (http://, *, or localhost/127.0.0.1).
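
The caching idea in the first bullet can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the `ReadyCache` class, the `ttl_seconds` default, and the injectable `clock` are hypothetical names chosen for the example. It shows one way to reuse a recent probe result (success or failure) until a time-to-live expires.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class ReadyCache:
    """Hypothetical TTL cache for readiness results (not the PR's actual code)."""

    def __init__(
        self,
        ttl_seconds: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the expiry logic can be tested
        self._entry: Optional[Tuple[float, Dict[str, Any]]] = None

    def get_or_run(self, probe: Callable[[], Dict[str, Any]]) -> Dict[str, Any]:
        """Return the cached result if it is still fresh, else run the probe."""
        now = self._clock()
        if self._entry is not None and now - self._entry[0] < self._ttl:
            return self._entry[1]
        result = probe()  # failures are cached too, so a broken backend is not hammered
        self._entry = (now, result)
        return result
```

A monotonic clock is used here so that wall-clock adjustments cannot extend or shorten the cache lifetime.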

Why

Avoid silent partial outages (DB/S3 miswiring) and prevent drifting back to insecure CORS configuration.

Required AWS change BEFORE deploying this PR

Your current backend task definition has CORS_ORIGINS=http://..., which will make the backend fail at startup once this PR deploys.
Update it to your CloudFront HTTPS origin first.

CloudFront domain (current): https://d1isah9ifwe7fg.cloudfront.net

CLI patch (registers a new task definition revision and forces a new deployment):

AWS_PROFILE=nebula
AWS_REGION=eu-central-1
ECS_CLUSTER=nebula-cluster
ECS_SERVICE=nebula-backend
ECS_CONTAINER=nebula-backend
NEW_CORS_ORIGINS="https://d1isah9ifwe7fg.cloudfront.net"

TD_ARN="$(aws ecs describe-services --profile "$AWS_PROFILE" --region "$AWS_REGION" --cluster "$ECS_CLUSTER" --services "$ECS_SERVICE" --query 'services[0].taskDefinition' --output text)"
aws ecs describe-task-definition --profile "$AWS_PROFILE" --region "$AWS_REGION" --task-definition "$TD_ARN" --query taskDefinition --output json > /tmp/nebula-backend-td.json

jq --arg c "$ECS_CONTAINER" --arg cors "$NEW_CORS_ORIGINS" '
  del(.taskDefinitionArn,.revision,.status,.requiresAttributes,.compatibilities,.registeredAt,.registeredBy)
  | .containerDefinitions |= map(
      if .name == $c then
        .environment = ((.environment // [])
          | map(select(.name != "CORS_ORIGINS"))
          + [{"name":"CORS_ORIGINS","value":$cors}])
      else . end
    )
' /tmp/nebula-backend-td.json > /tmp/nebula-backend-td.patched.json

NEW_TD_ARN="$(aws ecs register-task-definition --profile "$AWS_PROFILE" --region "$AWS_REGION" --cli-input-json file:///tmp/nebula-backend-td.patched.json --query 'taskDefinition.taskDefinitionArn' --output text)"
aws ecs update-service --profile "$AWS_PROFILE" --region "$AWS_REGION" --cluster "$ECS_CLUSTER" --service "$ECS_SERVICE" --task-definition "$NEW_TD_ARN" --force-new-deployment
aws ecs wait services-stable --profile "$AWS_PROFILE" --region "$AWS_REGION" --cluster "$ECS_CLUSTER" --services "$ECS_SERVICE"

Test

cd backend
PYTHONPATH=. .venv/bin/python3.14 -m pytest -q

Summary by Sourcery

Strengthen readiness and deployment safety by making /ready perform real DB and storage checks and enforcing secure CORS configuration in production.

New Features:

  • Add a readiness endpoint that validates database connectivity and storage (local or S3) with lightweight caching of results.

Enhancements:

  • Enforce stricter CORS validation in production, rejecting insecure http:// and localhost origins at application startup.
  • Extend the deploy readiness script to fail production deployments that use insecure CORS_ORIGINS values, including http://, wildcard, or localhost.
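
The deploy-time check described in the last bullet can be illustrated in Python. The actual script in this PR is a shell script that uses jq, so the following is only a restatement of the idea under stated assumptions: the function names `extract_cors_origins` and `insecure_cors_reasons` are hypothetical, and CORS_ORIGINS is assumed to be a comma-separated list. The JSON shape (`containerDefinitions[].environment[]` with `name`/`value` pairs) matches the ECS task definition format used by the CLI patch above.

```python
from typing import Any, Dict, List


def extract_cors_origins(task_definition: Dict[str, Any], container_name: str) -> str:
    """Return the CORS_ORIGINS value for one container, or "" if it is unset."""
    for container in task_definition.get("containerDefinitions", []):
        if container.get("name") != container_name:
            continue
        for env in container.get("environment") or []:
            if env.get("name") == "CORS_ORIGINS":
                return env.get("value", "")
    return ""


def insecure_cors_reasons(cors_origins: str) -> List[str]:
    """List every reason a production deploy should be rejected (empty means OK)."""
    reasons: List[str] = []
    # Assumption: origins are comma-separated; surrounding whitespace is ignored.
    origins = [o.strip().lower() for o in cors_origins.split(",") if o.strip()]
    for origin in origins:
        if origin.startswith("http://"):
            reasons.append(f"http:// origin: {origin}")
        if origin == "*":
            reasons.append("bare wildcard origin")
        if "localhost" in origin or "127.0.0.1" in origin:
            reasons.append(f"localhost origin: {origin}")
    return reasons
```

A caller would exit non-zero when the returned list is non-empty, printing each reason so the operator can see exactly which origin blocked the deploy.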

Tests:

  • Update readiness endpoint test to assert successful DB and storage checks in the response payload.


sourcery-ai bot commented Feb 17, 2026

Reviewer's Guide

/ready now performs real DB + storage readiness checks with caching, production CORS configuration is hardened to reject insecure origins both at app startup and in the deploy readiness script, and tests are updated accordingly.

Flow diagram for production CORS validation in app startup

flowchart TD
    Start([Start create_app])
    Load["Load cors_origins from settings"]
    CheckWildcard["credentials enabled AND cors_origins contains '*'"]
    WildcardError[["Raise RuntimeError: wildcard not allowed with credentials"]]

    CheckEnv["app_env == production?"]

    CheckHttp["Any origin starts with 'http://'? (case-insensitive)"]
    HttpError[["Raise RuntimeError: http:// origins not allowed in production"]]

    CheckLocal["Any origin contains localhost or 127.0.0.1?"]
    LocalError[["Raise RuntimeError: localhost origins not allowed in production"]]

    Success(["Proceed to FastAPI app creation and CORS middleware"])

    Start --> Load --> CheckWildcard
    CheckWildcard -- Yes --> WildcardError
    CheckWildcard -- No --> CheckEnv

    CheckEnv -- No --> Success
    CheckEnv -- Yes --> CheckHttp

    CheckHttp -- Yes --> HttpError
    CheckHttp -- No --> CheckLocal

    CheckLocal -- Yes --> LocalError
    CheckLocal -- No --> Success
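
The flowchart above translates fairly directly into a small validation function. The sketch below follows the same order of checks, but it is not the code in backend/app/main.py: the function name `validate_cors_origins` and its parameters are hypothetical, and the error messages simply mirror the diagram's labels.

```python
from typing import Iterable, List


def validate_cors_origins(
    cors_origins: Iterable[str],
    app_env: str,
    allow_credentials: bool = True,
) -> List[str]:
    """Raise RuntimeError on an unsafe CORS config; return the cleaned origins."""
    origins = [o.strip() for o in cors_origins if o.strip()]

    # Step 1: a wildcard cannot be combined with credentials, in any environment.
    if allow_credentials and "*" in origins:
        raise RuntimeError("wildcard not allowed with credentials")

    # Step 2: the remaining checks only apply to production.
    if app_env.strip().lower() != "production":
        return origins

    # Step 3: reject plain-HTTP origins (case-insensitive scheme match).
    if any(o.lower().startswith("http://") for o in origins):
        raise RuntimeError("http:// origins not allowed in production")

    # Step 4: reject loopback origins even when they use https://.
    if any("localhost" in o.lower() or "127.0.0.1" in o for o in origins):
        raise RuntimeError("localhost origins not allowed in production")

    return origins
```

Calling this before the app object is built means a bad configuration stops the process at startup rather than surfacing later as browser-side CORS errors.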

File-Level Changes

Change: /ready endpoint now performs cached DB + storage readiness checks and returns structured JSON with 200/503 based on results.
Details:
  • Introduce a small in-memory cache with TTL to avoid repeated expensive readiness checks.
  • Add helpers to infer database backend type from DATABASE_URL and normalize STORAGE_BACKEND values.
  • Add DB connectivity probe using get_conn() and SELECT 1, failing fast and caching failures.
  • Add local filesystem readiness probe that writes/reads/deletes a token file under storage_root.
  • Add S3 readiness probe using boto3 to put/get an object in the configured bucket/prefix, with detailed error reporting and failure caching.
  • Change /ready response to JSONResponse with a payload including environment, per-subsystem checks, and overall status, returning 503 if any check fails or a backend is unsupported.
Files: backend/app/api/routers/system.py

Change: Harden production CORS configuration in the FastAPI app startup.
Details:
  • Extend startup-time CORS validation to, in production, reject any http:// origins even if otherwise syntactically valid.
  • Reject localhost and 127.0.0.1 origins when APP_ENV=production, so production allows only HTTPS origins or no CORS origins at all.
Files: backend/app/main.py

Change: Extend deploy readiness shell script to enforce secure CORS in production ECS services.
Details:
  • Extract CORS_ORIGINS from the ECS task definition JSON using jq.
  • In production, fail deploy readiness if CORS_ORIGINS contains any http:// origin, a bare wildcard '*', or localhost/127.0.0.1 entries.
  • Keep existing checks on DB backend and storage backend for production (e.g., disallow sqlite/local storage).
Files: scripts/aws/check_deploy_readiness.sh

Change: Update health test to assert new /ready payload shape and readiness checks.
Details:
  • Adjust /ready test to verify status is 'ready' and that both db and storage checks report ok=True.
Files: backend/tests/test_health.py
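
To make the local filesystem probe and the 200/503 decision concrete, here is a rough sketch using only the standard library. It is an illustration under assumptions, not the contents of system.py: the function names, the probe file name, the payload keys, and the "not_ready" status string are all hypothetical (the source only confirms a status of 'ready' and per-check ok=True on success).

```python
import uuid
from pathlib import Path
from typing import Any, Dict, Tuple


def probe_local_storage(storage_root: str) -> Dict[str, Any]:
    """Write, read back, and delete a token file under storage_root."""
    token = uuid.uuid4().hex
    probe_path = Path(storage_root) / f".readyz-{token}.txt"  # hypothetical name
    try:
        probe_path.write_text(token, encoding="utf-8")
        if probe_path.read_text(encoding="utf-8") != token:
            raise RuntimeError("local readiness probe mismatch")
        return {"ok": True, "backend": "local"}
    except (OSError, RuntimeError) as exc:
        # A missing or read-only directory surfaces here as an OSError.
        return {"ok": False, "backend": "local", "error": str(exc)}
    finally:
        try:
            probe_path.unlink(missing_ok=True)  # never leave probe files behind
        except OSError:
            pass


def build_ready_response(
    environment: str, checks: Dict[str, Dict[str, Any]]
) -> Tuple[int, Dict[str, Any]]:
    """Return (HTTP status, payload): 200 only when every check reports ok=True."""
    all_ok = all(check.get("ok") is True for check in checks.values())
    payload = {
        "status": "ready" if all_ok else "not_ready",  # "not_ready" is an assumption
        "environment": environment,
        "checks": checks,
    }
    return (200 if all_ok else 503), payload
```

In a FastAPI handler, the returned status code and payload would typically be passed to a JSONResponse; that wiring is omitted here to keep the example dependency-free.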

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.


@sourcery-ai sourcery-ai bot left a comment


Hey - I've found 1 issue, and left some high level feedback:

  • The in-memory _ready_cache dict is mutated from the request handler without any locking; in a multi-threaded or multi-worker deployment you may want to guard it with a lock or use a simpler atomic structure to avoid rare race conditions when multiple /ready calls happen concurrently.
  • In the S3 readiness check, you always write to the same readyz/{env}/backend.txt key; if multiple backends or versions share a bucket/prefix this can cause them to overwrite each other’s probes, so consider incorporating app name or instance ID into the key to avoid collisions.
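
Both points could be addressed along the following lines. This is a sketch under assumptions rather than a patch for the PR: the name `_ready_cache` comes from the review comment, but its dict layout, the `cached_ready` and `probe_key` helpers, and the key format with an app name and instance ID are hypothetical.

```python
import threading
import time
from typing import Any, Callable, Dict

_ready_lock = threading.Lock()
# Assumed layout: the cached result plus a monotonic expiry timestamp.
_ready_cache: Dict[str, Any] = {"expires_at": 0.0, "result": None}


def cached_ready(
    probe: Callable[[], Dict[str, Any]], ttl_seconds: float = 5.0
) -> Dict[str, Any]:
    """Run the probe at most once per TTL window, even under concurrent calls."""
    with _ready_lock:
        now = time.monotonic()
        if _ready_cache["result"] is not None and now < _ready_cache["expires_at"]:
            return _ready_cache["result"]
        # The lock is held while probing, so concurrent callers wait and then
        # reuse this result instead of each hitting the DB and storage.
        result = probe()
        _ready_cache["result"] = result
        _ready_cache["expires_at"] = now + ttl_seconds
        return result


def probe_key(env: str, app_name: str, instance_id: str) -> str:
    """Build a per-instance S3 key so probes sharing a bucket do not collide."""
    return f"readyz/{env}/{app_name}/{instance_id}.txt"
```

Holding the lock during the probe trades a little latency for fewer backend calls. Note that a thread lock only protects a single process; separate worker processes each keep their own cache and do not share this dict.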
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The in-memory `_ready_cache` dict is mutated from the request handler without any locking; in a multi-threaded or multi-worker deployment you may want to guard it with a lock or use a simpler atomic structure to avoid rare race conditions when multiple `/ready` calls happen concurrently.
- In the S3 readiness check, you always write to the same `readyz/{env}/backend.txt` key; if multiple backends or versions share a bucket/prefix this can cause them to overwrite each other’s probes, so consider incorporating app name or instance ID into the key to avoid collisions.

## Individual Comments

### Comment 1
<location> `backend/app/api/routers/system.py:141-147` </location>
<code_context>
+                Body=token.encode("utf-8"),
+                ContentType="text/plain",
+            )
+            response = client.get_object(Bucket=bucket, Key=key)
+            body = response.get("Body")
+            if body is None:
+                raise RuntimeError("S3 get_object returned no Body")
+            read_back = body.read().decode("utf-8", errors="replace")
+            if read_back != token:
+                raise RuntimeError("S3 readiness probe mismatch")
</code_context>

<issue_to_address>
**suggestion (bug_risk):** S3 response body stream is not closed after reading.

`Body` is a streaming object; if it’s not closed after `read()`, connections can be leaked. Please either call `body.close()` after reading or use it as a context manager (where supported by your boto3 version), which is especially important since this path is hit repeatedly by readiness probes.

```suggestion
            response = client.get_object(Bucket=bucket, Key=key)
            body = response.get("Body")
            if body is None:
                raise RuntimeError("S3 get_object returned no Body")
            try:
                read_back = body.read().decode("utf-8", errors="replace")
            finally:
                # Ensure the streaming body is closed to avoid leaking connections
                body.close()
            if read_back != token:
                raise RuntimeError("S3 readiness probe mismatch")
```
</issue_to_address>


Comment on lines +141 to +147
    response = client.get_object(Bucket=bucket, Key=key)
    body = response.get("Body")
    if body is None:
        raise RuntimeError("S3 get_object returned no Body")
    read_back = body.read().decode("utf-8", errors="replace")
    if read_back != token:
        raise RuntimeError("S3 readiness probe mismatch")

suggestion (bug_risk): S3 response body stream is not closed after reading.

Body is a streaming object; if it’s not closed after read(), connections can be leaked. Please either call body.close() after reading or use it as a context manager (where supported by your boto3 version), which is especially important since this path is hit repeatedly by readiness probes.

Suggested change
- response = client.get_object(Bucket=bucket, Key=key)
- body = response.get("Body")
- if body is None:
-     raise RuntimeError("S3 get_object returned no Body")
- read_back = body.read().decode("utf-8", errors="replace")
- if read_back != token:
-     raise RuntimeError("S3 readiness probe mismatch")
+ response = client.get_object(Bucket=bucket, Key=key)
+ body = response.get("Body")
+ if body is None:
+     raise RuntimeError("S3 get_object returned no Body")
+ try:
+     read_back = body.read().decode("utf-8", errors="replace")
+ finally:
+     # Ensure the streaming body is closed to avoid leaking connections
+     body.close()
+ if read_back != token:
+     raise RuntimeError("S3 readiness probe mismatch")

@alanmaizon alanmaizon merged commit 7f2633e into main Feb 18, 2026
12 checks passed
@alanmaizon alanmaizon deleted the feat/readyz-cors-lockdown branch February 18, 2026 00:07