diff --git a/aws-infra/README.md b/aws-infra/README.md new file mode 100644 index 000000000..e30297aa9 --- /dev/null +++ b/aws-infra/README.md @@ -0,0 +1,637 @@ +## ⚠️ Experimental AWS Deployment + +> **Note:** The AWS deployment described in this document is **experimental and not supported at this time**. +> It is intended for development, testing, and architectural exploration only and **should not be used in production**. + +# Antenna Platform - Deployment & Infrastructure Guide + +This document describes the AWS infrastructure and deployment pipeline for the Antenna platform. +It is intended for maintainers and contributors who need to understand, update, or reproduce the deployed environment. + +## 1. Overview + +Antenna consists of two major parts: + +1. **Backend (Django API + Celery Worker + Celery Beat + Flower + ML processing services + AWS CLI)** running as multiple Docker containers. +2. **Frontend (React + Vite)** built into static files, hosted on **S3**, and delivered globally via **CloudFront**. + + +## 1.1 Backend components (AWS) + +The backend is deployed as a **multi-container** service on AWS: + +- **Elastic Beanstalk (Docker on ECS)**: runs all backend containers (Django, Celery Worker, Celery Beat, Flower, ML services, AWS CLI containers as needed). +- **Amazon ECR**: stores Docker images that Elastic Beanstalk pulls at deploy/runtime. +- **Amazon RDS (PostgreSQL)**: primary application database. +- **Amazon ElastiCache (Redis with TLS)**: Celery broker and Django cache. +- **Amazon S3**: object storage (e.g., uploaded files/static/media). +- **Amazon CloudWatch**: logs, health monitoring, and Elastic Beanstalk instance metrics. + +## 1.2 Frontend components (AWS) + +- **S3:** Hosts the compiled **React (Vite)** static assets (HTML, JS, CSS, images). +- **CloudFront:** Delivers the UI globally from S3 and forwards **`/api/*`** requests to the backend so the UI and API are served from a single domain. + +--- + +## 2. 
High-level AWS architecture
+
+This section provides a visual, end-to-end view of how Antenna is deployed and how requests flow at runtime.
+Details for each component are documented in the sections that follow.
+
+### 2.1 Backend: build + deploy + runtime
+
+AWS deployment and runtime backend architecture diagram
+
+**Figure:** AWS deployment + runtime architecture for the Antenna backend.
+Docker images for each service are built locally and pushed to Amazon ECR; Elastic Beanstalk is deployed using a ZIP bundle that includes `Dockerrun.aws.json` (pointing to the ECR image URIs). At runtime, a single Elastic Beanstalk environment (Docker on ECS, single EC2 instance) pulls those images from ECR and runs seven containers: Django (API), Celery Worker, Celery Beat (scheduler), Flower (monitoring), an AWS CLI helper container, and two ML processing services (**ml-backend-minimal** and **ml-backend-example**).
+
+
+
+### 2.2 Frontend: global delivery + API proxy
+
+AWS deployment and runtime frontend architecture diagram
+
+
+**Figure:** Antenna frontend web app deployment flow.
+The React frontend is built into static website files and stored in Amazon S3, then delivered globally via CloudFront. CloudFront serves the UI for normal page requests and forwards `/api/*` requests to the Elastic Beanstalk backend (Django + Celery), which connects privately to RDS (PostgreSQL) and ElastiCache (Redis).
+
+
+---
+
+## 3. Elastic Beanstalk (EB)
+
+### 3.1. Elastic Beanstalk environment
+
+- Platform: ECS on Amazon Linux 2 (Multicontainer Docker)
+- Deployment bundle includes:
+  - `Dockerrun.aws.json`
+  - `.ebextensions/00_setup.config`
+- Environment type:
+  - Single-instance environment (used for development/testing to reduce cost).
+  - Can be upgraded later to a load-balanced environment for production.
+- **Instance Configuration**
+  - Architecture: `x86_64`
+  - Instance types (preferred order):
+    - `t3.large`
+    - `t3.small`
+  - Capacity type: **On-Demand instances**
+
+- **Auto Scaling Group**
+  - Uses a **single-instance ASG** (managed automatically by Elastic Beanstalk)
+  - EB performs health checks on the instance
+
+- **Security Groups**
+  - EB-managed instance security group (default inbound + outbound rules)
+  - Additional outbound egress security group
+
+- **Enhanced health reporting**
+  - Real-time system + application monitoring
+  - Free custom metric: `EnvironmentHealth`
+
+- **Health Event Streaming**
+  - Log streaming to CloudWatch Logs: Enabled
+  - Retention: 7 days
+  - Lifecycle: Keep logs after terminating environment
+
+- **Managed Platform Updates**
+  - Enabled
+  - Weekly maintenance window: Thursday @ 22:40 UTC
+  - Update level: Apply **minor and patch** updates
+  - Instance replacement enabled: EB replaces the instance if no other updates apply.
+
+- **Rolling Updates & Deployments**
+  - Deployment policy: All at once
+  - Batch size type: Percentage
+  - Rolling updates: Disabled (not needed for single instance)
+  - **Deployment preferences:**
+    - Ignore health check: `False`
+    - Health threshold: `OK`
+    - Command timeout: `600 seconds`
+
+
+---
+
+### 3.2. Docker Containers
+
+EB ECS runs the following containers:
+
+1. **django** - web application (the container listens on port 5000, which is exposed as port 80 on the Elastic Beanstalk host)
+2. **celeryworker** - asynchronous task worker
+3. **celerybeat** - scheduled task runner
+4. **flower** - Celery monitoring UI (port 5555)
+5. **awscli** - lightweight helper container for internal AWS commands
+6. **ml-backend-minimal** - A lightweight, minimal ML processing service used to test Antenna’s processing-service API and verify ML integration. It provides basic pipelines only.
+7. 
**ml-backend-example** - A demonstration ML backend that includes more advanced pipelines and shows how to build custom algorithms and processing flows. Intended as an example/template for extending ML capabilities.
+
+---
+
+### 3.3. ECR Repositories Used
+
+All application containers pull from:
+
+- **antenna-backend**
+  `<ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/antenna-backend`
+
+The AWS CLI helper container pulls from:
+
+- **antenna-awscli**
+  `<ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/antenna-awscli`
+
+The ML processing services pull from:
+
+- **antenna-ml-minimal**
+  `<ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/antenna-ml-minimal`
+
+- **antenna-ml-example**
+  `<ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/antenna-ml-example`
+
+
+All repositories are **mutable**, support versioned tags, and are **AES-256 encrypted**.
+
+---
+
+### 3.4. Environment Variables
+
+In this setup, **all required environment variables, including secrets, are defined inside**
+`.ebextensions/00_setup.config`.
+
+Elastic Beanstalk automatically reads the values from this file and writes them into its
+**Environment Properties** at deployment time.
+This ensures a fully automated bootstrap with no manual EB console entry.
+
+The deployment uses the following environment variables across these categories:
+
+### Django
+- `DJANGO_SETTINGS_MODULE`
+- `DJANGO_SECRET_KEY`
+- `DJANGO_ALLOWED_HOSTS`
+- `DJANGO_SECURE_SSL_REDIRECT`
+- `DJANGO_ADMIN_URL`
+- `DJANGO_DEBUG`
+- `EB_HEALTHCHECK`
+
+### AWS / S3
+- `DJANGO_AWS_ACCESS_KEY_ID`
+- `DJANGO_AWS_SECRET_ACCESS_KEY`
+- `DJANGO_AWS_STORAGE_BUCKET_NAME`
+- `DJANGO_AWS_S3_REGION_NAME`
+
+### Database (RDS)
+- `POSTGRES_DB`
+- `POSTGRES_USER`
+- `POSTGRES_PASSWORD`
+- `POSTGRES_HOST`
+- `POSTGRES_PORT`
+- `DATABASE_URL`
+
+### Redis / Celery
+- `REDIS_URL`
+- `CELERY_BROKER_URL`
+
+### ML Processing Service
+- `DEFAULT_PROCESSING_SERVICE_ENDPOINT`
+
+### Third-Party Integrations
+- `SENDGRID_API_KEY`
+- `SENTRY_DSN`
+
+
+
+
+### 3.5. 
Elastic Beanstalk EC2 Instance + +- **Instance type:** `t3.large` +- **Instance profile:** `aws-elasticbeanstalk-ec2-role` +- **Service role:** `aws-elasticbeanstalk-service-role` +- Create an EC2 key pair in your AWS account and attach it to the EB environment when launching the backend. (Each developer should use their own key pair.) +- **Public IP:** Assigned +- **Security groups:** + - EB default instance SG + - Outbound-only egress SG (`antenna-eb-sg`) + + +### 3.6. IAM Roles and Policies + +**1. EC2 Instance Profile – `aws-elasticbeanstalk-ec2-role`** +Attached AWS-managed policies (default from EB): +- `AWSElasticBeanstalkWebTier` +- `AWSElasticBeanstalkWorkerTier` +- `AmazonEC2ContainerRegistryReadOnly` (ECR pull) +- `CloudWatchAgentServerPolicy` (log streaming) +- S3 read/write access granted through `AWSElasticBeanstalkWebTier` + (used for EB deployment bundles, log archives, temp artifacts) + +This role is used **by the EC2 instance itself**. +It allows the instance to: +- Pull container images from ECR +- Upload logs to CloudWatch +- Read/write to the EB S3 bucket +- Communicate with ECS agent inside the EB environment + +--- + +**2. Service Role – `aws-elasticbeanstalk-service-role`** +Attached AWS-managed policies (default from EB): +- `AWSElasticBeanstalkEnhancedHealth` +- `AWSElasticBeanstalkService` + +This role is used **by the Elastic Beanstalk service**, not the EC2 instance. +It allows EB to: +- Manage environment health monitoring +- Launch/update/terminate EC2 instances +- Interact with Auto Scaling +- Register container tasks and update ECS configuration + + +--- + +### 3.7. 
Networking (EB Environment)
+
+- **VPC:** default VPC
+- **Subnets:**
+  - EB instance runs in a **public subnet**
+  - RDS + Redis run in **private subnets** (via their subnet groups)
+- **Public access:**
+  - EB EC2 instance receives a public IP
+  - No load balancer (single-instance environment)
+- **Connectivity:**
+  - EB instance can reach RDS & Redis via SG rules
+  - Internet connectivity available through AWS default routing
+
+---
+
+## 4. Backend Deployment Workflows
+
+### 4.0. Configuration Files
+
+- `Dockerrun.aws.json`
+  Multi-container definition for Elastic Beanstalk (images + ports + container config)
+
+- `.ebextensions/00_setup.config`
+  Elastic Beanstalk environment settings + environment variables
+
+
+### 4.1 Backend deploy (Docker -> ECR -> EB)
+
+#### Step 1 - Create the ECR repository, then build + push (Linux / AMD64)
+
+Create an ECR repository named `antenna-backend` (one-time), and authenticate Docker to the registry first (e.g. `aws ecr get-login-password --region <REGION> | docker login --username AWS --password-stdin <ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com`). Then build the Linux AMD64 image, tag it with the repo URI, and push:
+
+```bash
+# Build a Linux AMD64 image locally (creates a local image tag: antenna-backend:latest)
+docker buildx build --platform linux/amd64 -t antenna-backend:latest --load .
+
+# Tag the same image for the ECR repository (creates: <ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/antenna-backend:v10)
+docker tag antenna-backend:latest <ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/antenna-backend:v10
+
+# Push the image to Amazon ECR (uploads: <ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/antenna-backend:v10)
+docker push <ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/antenna-backend:v10
+```
+
+#### Step 2 - Update Dockerrun.aws.json
+
+Update the tag:
+
+"image": "<ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/antenna-backend:v10"
+
+```md
+### About Docker Image Versions
+`v10` is only an example placeholder.
+
+You can use any version tag (v1, v2, v3, etc.).
+
+How to choose a version:
+- Increase the number with every build, OR
+- Use semantic versions (1.0.0, 1.1.0)
+
+Use the same tag in both:
+1. `docker tag` and `docker push` commands
+2. 
`Dockerrun.aws.json`
+
+```
+
+#### Step 3 - Create EB bundle
+
+`zip -r deploy.zip Dockerrun.aws.json .ebextensions .ebignore`
+
+
+#### Step 4 - Deploy to Elastic Beanstalk
+
+- EB Console → Environment → Upload & Deploy
+- Upload `deploy.zip`
+- Wait for ECS tasks to start
+
+#### Step 5 - Validate Deployment
+
+- `/api/v2/` returns `200`
+- Django container remains healthy
+- Celery worker connects to Redis successfully
+- Celery Beat schedules run successfully
+- Flower UI loads on port 5555 (if security groups permit)
+
+>
+> ### How to Validate This Deployment
+> The points listed above describe the expected state of a successful AWS deployment.
+>
+> ---
+>
+> ### 1. Confirming that `/api/v2/` returns a successful response
+> When you open:
+> ```
+> https://<EB_ENV_URL>/api/v2/
+> ```
+> the browser shows the JSON content returned by Django.
+> Opening the URL therefore visually confirms that the API is working, but not the status code.
+>
+> To check the actual HTTP status code, use:
+> ```bash
+> curl -I https://<EB_ENV_URL>/api/v2/
+> ```
+> This command returns the HTTP headers. A successful response looks like:
+> ```
+> HTTP/1.1 200 OK
+> ```
+> This confirms:
+> - Django is reachable
+> - Routing is working
+> - The container is serving requests normally
+>
+> ---
+>
+> ### 2. Verifying that the Django container is healthy inside Elastic Beanstalk
+> Elastic Beanstalk uses several internal checks to determine whether a container is “Healthy.”
+> EB marks the Django container as Healthy only if:
+> - the Docker container starts without crashing. If Django crashes on startup, health becomes "Severe".
+> - the app listens on the correct port (port 5000 in this project, exposed as port 80 on the host). If nothing is listening, health becomes "Warning".
+> - the configured application URL returns a successful response
+>
+> In this deployment, we used the `/api/v2/` endpoint for testing. 
+> Once this endpoint consistently returned **HTTP 200 OK**, Elastic Beanstalk updated the
+> environment status to **Healthy**.
+>
+> You can confirm this in the EB Console under:
+> **Elastic Beanstalk -> Environments -> Health**
+>
+> ---
+>
+> ### 3. Confirming Celery Worker successfully connects to Redis
+> Celery Worker logs show whether the process booted correctly and whether it could authenticate with Redis over TLS.
+>
+> We retrieved logs using:
+> ```bash
+> eb logs --all
+> ```
+> In the worker logs, we verified the connection with entries such as:
+> ```
+> Connected to rediss://<REDIS_ENDPOINT>:6379/0
+> ```
+> This confirms:
+> - the worker process started
+> - Redis TLS (`rediss://`) worked
+> - certificate verification settings were correct
+> - no connection retries or SSL errors occurred
+>
+> ---
+>
+> ### 4. Confirming Celery Beat is running and scheduling tasks
+> Celery Beat prints a log line every time a scheduled task becomes due.
+>
+> In the logs retrieved above, we saw messages like:
+> ```
+> celery.beat: Scheduler: Sending due task
+> ```
+> This indicates:
+> - Beat is running continuously
+> - it is reading the schedule correctly
+> - scheduled tasks are being dispatched to the worker
+>
+> ---
+>
+> ### 5. Confirming Flower UI loads on port 5555
+> Flower exposes a monitoring dashboard on port **5555**, but it will only load if the EC2 instance’s Security Group allows inbound traffic on that port.
+>
+> After enabling access, visiting:
+> ```
+> http://<EC2_PUBLIC_IP>:5555
+> ```
+> displayed the Flower dashboard, confirming that:
+> - the Flower container is running
+> - it can communicate with Redis
+> - it can display worker and task activity
+>
+> ---
+>
+> Together, these checks confirm that the full Django + Celery + Redis deployment is healthy and functioning as expected on AWS.
+---
+
+
+## 5. Other Backend AWS Infrastructure Units
+
+### 5.1. 
RDS (PostgreSQL)
+
+- **Engine:** PostgreSQL 17.6
+- **Instance class:** `db.t4g.small`
+- **Availability:** Single-AZ deployment (`us-west-2b`)
+
+- **Networking**
+  - Runs inside the **default VPC**
+  - Uses the **private DB subnet group**: `antenna-private-db-subnet-group`
+  - **Public accessibility:** Disabled (RDS is fully private)
+  - Accessible only from resources inside the VPC
+
+- **Security Group (`antenna-rds-sg`)**
+  - **Inbound:**
+    - Port **5432** allowed **only** from the Elastic Beanstalk security group (`antenna-eb-sg`)
+  - **Outbound:**
+    - Allowed to `0.0.0.0/0` (default outbound rule)
+
+---
+
+### 5.2. ElastiCache (Redis)
+
+- **Engine:** Redis 7.1
+- **Node type:** `cache.t4g.micro`
+- **Cluster mode:** Disabled (single node)
+- **Multi-AZ:** Disabled
+- **Auto-failover:** Disabled
+
+- **Security**
+  - **Encryption in transit:** Enabled
+  - **Encryption at rest:** Enabled
+  - Redis connections must use:
+    - `rediss://` (TLS endpoint)
+    - `ssl_cert_reqs=none` (required for Celery / Django Redis clients)
+  - **Security Group (`antenna-redis-sg`)**
+    - **Inbound:**
+      - Port **6379** allowed **only from the Elastic Beanstalk SG** (`antenna-eb-sg`)
+        *(Only the Django app can talk to Redis — fully private.)*
+    - **Outbound:**
+      - Default allow to `0.0.0.0/0` (standard for ElastiCache)
+
+- **Networking:**
+  - Deployed into private subnets (via its subnet group)
+  - Runs within the same VPC as EB and RDS
+
+---
+
+## 6. Frontend deployment workflow (S3 + CloudFront + EB API proxy)
+
+This section documents how the Antenna **React + Vite** frontend is deployed to AWS using:
+
+- **S3** to host compiled static files
+- **CloudFront** to serve the UI globally and proxy API calls
+- **Elastic Beanstalk (EB)** as the backend API origin for **`/api/*`**
+
+---
+
+### 6.1 Build the UI locally
+
+```bash
+cd ui
+nvm use
+yarn install
+yarn build
+```
+
+This generates optimized static web assets (HTML, JS, CSS, images) that do not require a Node server. 
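+
+The console steps in the next subsections (upload to S3, then CloudFront invalidation) can also be scripted with the AWS CLI. A minimal sketch: the bucket name matches section 6.2, while the distribution ID and the build output directory are placeholders to substitute for your environment:
+
+```bash
+# Sync the build output to the UI bucket (removes files deleted by the new build)
+aws s3 sync dist/ s3://antenna-prod-ssec/ --delete
+
+# Invalidate cached objects so CloudFront serves the new build (see section 6.4)
+aws cloudfront create-invalidation --distribution-id <DISTRIBUTION_ID> --paths "/*"
+```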
+
+The build output directory depends on the UI build configuration (commonly `dist/` for Vite-based projects).
+
+### 6.2 Upload the build output to S3
+
+- **S3 bucket:** `antenna-prod-ssec`
+- Upload the **contents** of the build output directory (not the directory itself)
+
+Expected structure in the bucket root:
+
+- `index.html`
+- `assets/` (and other static files)
+
+### 6.3 Create / configure the CloudFront distribution
+
+- **Distribution name:** `antenna-ui-prod`
+- **Default root object:** `index.html`
+
+#### Origins
+
+- **S3 origin:** `antenna-prod-ssec` (serves UI static assets)
+- **EB origin:** Elastic Beanstalk environment URL (serves API requests)
+
+#### Behaviors
+
+- **Path pattern:** `/api/*`
+  - Origin: EB origin
+  - Caching: Disabled
+  - Viewer protocol policy: Redirect HTTP to HTTPS (or HTTPS only)
+
+- **Default behavior:** `*`
+  - Origin: S3 origin
+  - Caching: Optimized
+  - Viewer protocol policy: Redirect HTTP to HTTPS (or HTTPS only)
+
+### 6.4 Invalidate CloudFront after every frontend deployment
+
+After uploading a new UI build to S3, create a CloudFront invalidation:
+
+- **Invalidation path:** `/*`
+
+This forces CloudFront to fetch the latest files from S3.
+
+### 6.5 Test production
+
+- **UI:** `https://<CLOUDFRONT_DOMAIN>`
+- **API (via CloudFront):** `https://<CLOUDFRONT_DOMAIN>/api/projects`
+
+Request flow:
+
+- UI requests → CloudFront → S3
+- API requests (`/api/*`) → CloudFront → Elastic Beanstalk
+
+---
+
+## 7. Common Issues & Fixes
+
+### Redis SSL Errors
+
+ElastiCache requires TLS. Missing SSL arguments cause:
+
+`ssl.SSLCertVerificationError`
+
+**Fix:**
+Use `rediss://` and `ssl_cert_reqs=none`.
+
+```md
+### Where the Redis TLS Fix Is Applied
+
+This project reads Redis and Celery connection settings from **Elastic Beanstalk environment variables**,
+not from Django code. 
+
+The TLS configuration (`rediss://` + `ssl_cert_reqs=none`) is defined in:
+
+**`.ebextensions/00_setup.config`**, for example:
+
+
+REDIS_URL: "rediss://<REDIS_ENDPOINT>:6379/0?ssl_cert_reqs=none"
+CELERY_BROKER_URL: "rediss://<REDIS_ENDPOINT>:6379/0?ssl_cert_reqs=none"
+
+
+Elastic Beanstalk automatically loads these values and sets them as environment variables
+inside the running containers. Django and Celery then read:
+
+- `REDIS_URL`
+- `CELERY_BROKER_URL`
+
+directly from the EB environment.
+```
+
+
+### Health Check Redirect Loops
+
+EB health checks are sent over plain HTTP, so a forced HTTPS redirect sends them into a redirect loop.
+
+**Fix:**
+Set `EB_HEALTHCHECK=1` and temporarily disable SSL redirect for health checks.
+
+### Early Migrations Failure
+
+EB sometimes runs migrations before services are ready.
+
+**Fix:**
+The `.ebextensions` migration command is set to ignore failures and retry.
+
+---
+
+## 8. Future Improvements
+
+To harden the deployment and move toward a production-grade architecture, the following enhancements are recommended:
+
+- **Move secrets to AWS Secrets Manager**
+  Centralize all sensitive variables (DB password, Redis URL, Django secret key, Sentry key, SendGrid, etc.) and replace `.ebextensions` injection with runtime retrieval.
+
+- **Enable ElastiCache Multi-AZ + Auto-Failover**
+  Improves high availability for Celery and Django caching; eliminates single-node Redis failure risks.
+
+- **IAM hardening and least-privilege review**
+  Replace broad EB-managed policies with reduced IAM policies scoped only to required S3, ECR, CloudWatch, and ECS resources.
+
+- **Add CI/CD pipeline (GitHub Actions -> ECR -> EB)**
+  Automate build, tag, push of images and deployments to Elastic Beanstalk for consistent, reproducible releases.
+
+- **Add staging environment**
+  Separate EB environment (staging) for testing migrations, image builds, and infrastructure changes before production.
+
+- **Migrate to load-balanced EB environment (optional)**
+  Enables rolling deployments, zero-downtime updates, and better scalability. 
+
+- **Enable RDS Multi-AZ + automated backups**
+  Ensures database failover and improves disaster recovery readiness.
+
+- **Add health checks for Celery worker & beat**
+  Custom EB or CloudWatch alarms to alert on worker failures, broker connectivity issues, or long task queues.
+
+- The S3 bucket policy currently includes a public read rule (`Principal: "*"`, `s3:GetObject`), so objects are accessible directly from S3. There is also a CloudFront-specific allow rule. If we want stricter security later, we can remove the public rule and allow reads **only via CloudFront (OAC)**.
+
+
+
+
diff --git a/aws-infra/__main__.py b/aws-infra/__main__.py
new file mode 100644
index 000000000..2da9dd4cd
--- /dev/null
+++ b/aws-infra/__main__.py
@@ -0,0 +1,74 @@
+import pulumi
+
+# ---------------------------------------------------------
+# Optional deploy flags
+# ---------------------------------------------------------
+config = pulumi.Config()
+DEPLOY_EB = config.get_bool("deployEb") or False
+BUILD_IMAGES = config.get_bool("buildImages") or False
+DEPLOY_FRONTEND = config.get_bool("deployFrontend") or False
+
+
+# =========================================================
+# CREATE BASE INFRA
+# (networking, storage, IAM, ECR, Redis, RDS)
+# =========================================================
+
+# --- Networking ---
+from networking.vpc import default_vpc
+from networking import subnets
+from networking import routes
+from networking.security_group import *
+
+# --- Storage (S3 + policies) ---
+from storage import *
+
+# --- IAM Roles ---
+import iam_roles
+
+# --- ECR repos (needed by images build + EB dockerrun) ---
+import ecr
+
+# --- Redis + RDS ---
+import redis
+import rds
+
+
+# =========================================================
+# 1) BUILD DOCKER IMAGES (ONLY IF REQUESTED)
+# =========================================================
+if BUILD_IMAGES:
+    import images
+
+
+# =========================================================
+# 2) DEPLOY EB (ONLY 
IF REQUESTED) +# ========================================================= +if DEPLOY_EB: + # EB requires Dockerrun/zip generation code, + # so always import images when deploying EB + if not BUILD_IMAGES: + import images + + # Secrets Manager: + # - creates manual secrets (django key, sendgrid, sentry) + # - exports AWS-generated RDS master secret arn + # - constructs EB_ENV map + import secrets_manager + + # Elastic Beanstalk environment + import eb + + # ========================================================= + # 3) DEPLOY FRONTEND (ONLY IF REQUESTED) + # IMPORTANT: must come AFTER EB import, because cloudfront.py + # imports env_pulumi from eb.py for backend origin. + # ========================================================= + if DEPLOY_FRONTEND: + import cloudfront + +else: + # If EB is not deployed in this run, we cannot deploy frontend proxying to EB + # because cloudfront.py expects env_pulumi from eb.py. + if DEPLOY_FRONTEND: + raise Exception("deployFrontend=true requires deployEb=true in the same run (CloudFront backend origin uses EB env URL).") diff --git a/aws-infra/cloudfront.py b/aws-infra/cloudfront.py new file mode 100644 index 000000000..02b74beae --- /dev/null +++ b/aws-infra/cloudfront.py @@ -0,0 +1,411 @@ +import os +import mimetypes +import subprocess +import hashlib +import atexit +import time + +import pulumi +import pulumi_aws as aws + +# NOTE: +# We want the backend origin to be the EB environment URL. +# This requires EB to be deployed in the same update *before* importing this file. +from eb import env_pulumi + + +# ========================================================= +# CONFIG +# ========================================================= + +config = pulumi.Config() +project_config = pulumi.Config(pulumi.get_project()) + +# S3 bucket where the compiled UI will live +ui_bucket_name = ( + config.get("ui_bucket_name") + or project_config.get("ui_bucket_name") + or config.require("ui_bucket_name") +) # e.g. 
antenna-prod-ui-pulumi + +# --------------------------------------------------------- +# UI build paths +# --------------------------------------------------------- +repo_root = os.path.abspath(os.path.join(os.path.dirname(__file__), "..")) + +ui_dir = ( + config.get("ui_dir") + or project_config.get("ui_dir") + or os.path.join(repo_root, "ui") +) + + +ui_dist_dir = ( + config.get("ui_dist_dir") + or project_config.get("ui_dist_dir") + or os.path.join(ui_dir, "build") +) + +# If True, Pulumi will run: +# cd ui +# nvm use +# yarn install +# yarn build +build_ui_in_pulumi = ( + config.get_bool("build_ui_in_pulumi") + or project_config.get_bool("build_ui_in_pulumi") + or False +) + + +# ========================================================= +# HELPERS +# ========================================================= +def normalize_origin_domain(d: str) -> str: + return d.replace("https://", "").replace("http://", "").strip("/") + + +def guess_content_type(path: str) -> str: + ctype, _ = mimetypes.guess_type(path) + return ctype or "application/octet-stream" + + +def cache_control_for_key(key: str) -> str: + # Vite SPA: index.html should not be aggressively cached + if key == "index.html": + return "no-cache" + # Fingerprinted assets can be cached forever + if key.startswith("assets/"): + return "public, max-age=31536000, immutable" + return "public, max-age=3600" + + +def run_ui_build_if_enabled() -> None: + """ + Incorporates: + cd ui + nvm use + yarn install + yarn build + + Guardrails: + - Only run during an actual update (not preview) + - Use bash -lc so nvm works (nvm is a shell function) + """ + if not build_ui_in_pulumi: + pulumi.log.info("UI build disabled (build_ui_in_pulumi=false).") + return + + if pulumi.runtime.is_dry_run(): + pulumi.log.info("Preview detected: skipping UI build (build_ui_in_pulumi=true).") + return + + if not os.path.isdir(ui_dir): + raise Exception(f"UI directory not found: {ui_dir}") + + cmd = f""" +set -euo pipefail +cd "{ui_dir}" + +# 
Make nvm available in non-interactive shells. +# We avoid relying on ~/.bash_profile because it may not be sourced cleanly +# (and Pulumi runs this in a subprocess). +export NVM_DIR="${{NVM_DIR:-$HOME/.nvm}}" +if [ -s "$NVM_DIR/nvm.sh" ]; then + . "$NVM_DIR/nvm.sh" +else + echo "nvm.sh not found at $NVM_DIR/nvm.sh" + exit 1 +fi + +# Use .nvmrc if present, otherwise fall back to default nvm alias +if [ -f ".nvmrc" ]; then + nvm use +else + nvm use default || nvm use +fi + +yarn install +yarn build +""" + + pulumi.log.info("Building UI inside Pulumi: cd ui && nvm use && yarn install && yarn build") + subprocess.run(["bash", "-lc", cmd], check=True) + + +# ========================================================= +# 6.1 BUILD UI +# ========================================================= +run_ui_build_if_enabled() + + +# ========================================================= +# 6.2 S3 BUCKET (CREATE A NEW ONE) +# ========================================================= +ui_bucket = aws.s3.Bucket( + "antenna-ui-bucket", + bucket=ui_bucket_name, +) + +# Object ownership controls (recommended with OAC) +ui_ownership = aws.s3.BucketOwnershipControls( + "antenna-ui-bucket-ownership", + bucket=ui_bucket.id, + rule=aws.s3.BucketOwnershipControlsRuleArgs( + object_ownership="BucketOwnerPreferred" + ), +) + +# Block public access ON (CloudFront OAC will read privately) +ui_public_access_block = aws.s3.BucketPublicAccessBlock( + "antenna-ui-bucket-public-access", + bucket=ui_bucket.id, + block_public_acls=True, + ignore_public_acls=True, + block_public_policy=True, + restrict_public_buckets=True, +) + +# Default encryption ON +ui_encryption = aws.s3.BucketServerSideEncryptionConfiguration( + "antenna-ui-bucket-encryption", + bucket=ui_bucket.id, + rules=[ + aws.s3.BucketServerSideEncryptionConfigurationRuleArgs( + apply_server_side_encryption_by_default= + aws.s3.BucketServerSideEncryptionConfigurationRuleApplyServerSideEncryptionByDefaultArgs( + sse_algorithm="AES256" + 
), + bucket_key_enabled=True, + ) + ], +) + + +# ========================================================= +# 6.3 CLOUDFRONT (S3 + EB ORIGIN, /api/* PROXY) +# ========================================================= + +# --------------------------------------------------------- +# Managed CloudFront policies +# --------------------------------------------------------- +caching_optimized = aws.cloudfront.get_cache_policy_output(name="Managed-CachingOptimized") +caching_disabled = aws.cloudfront.get_cache_policy_output(name="Managed-CachingDisabled") +all_viewer = aws.cloudfront.get_origin_request_policy_output(name="Managed-AllViewer") + +# OAC (private S3 origin access) +oac = aws.cloudfront.OriginAccessControl( + "antenna-ui-oac", + description="OAC for Antenna UI S3 origin", + origin_access_control_origin_type="s3", + signing_behavior="always", + signing_protocol="sigv4", +) + +# Backend origin domain = EB endpoint (no scheme) +backend_origin_domain = env_pulumi.cname.apply(normalize_origin_domain) + +# CloudFront distribution +cf_distribution = aws.cloudfront.Distribution( + "antenna-ui-prod", + enabled=True, + comment="CloudFront distribution for Antenna UI (S3) + EB API proxy (/api/v2/*)", + default_root_object="index.html", + + origins=[ + # UI origin (S3) + aws.cloudfront.DistributionOriginArgs( + origin_id="antenna-ui-origin", + domain_name=ui_bucket.bucket_regional_domain_name, + origin_access_control_id=oac.id, + s3_origin_config=aws.cloudfront.DistributionOriginS3OriginConfigArgs( + origin_access_identity="" + ), + ), + # Backend origin (Elastic Beanstalk) + aws.cloudfront.DistributionOriginArgs( + origin_id="antenna-backend-origin", + domain_name=backend_origin_domain, + custom_origin_config=aws.cloudfront.DistributionOriginCustomOriginConfigArgs( + http_port=80, + https_port=443, + # IMPORTANT: + # Your EB endpoint is NOT serving HTTPS (443 was failing to connect). + # CloudFront must talk to EB over HTTP, otherwise /api/* will 504. 
+ origin_protocol_policy="http-only", + origin_ssl_protocols=["TLSv1.2"], + # Reduce 504s on slower app cold starts / DB connects + origin_read_timeout=60, + origin_keepalive_timeout=60, + ), + ), + ], + + # Default: UI from S3 (caching optimized) + default_cache_behavior=aws.cloudfront.DistributionDefaultCacheBehaviorArgs( + target_origin_id="antenna-ui-origin", + viewer_protocol_policy="redirect-to-https", + allowed_methods=["GET", "HEAD", "OPTIONS"], + cached_methods=["GET", "HEAD"], + compress=True, + cache_policy_id=caching_optimized.id, # Managed-CachingOptimized + ), + + # /api/v2/* -> EB (caching disabled) + ordered_cache_behaviors=[ + aws.cloudfront.DistributionOrderedCacheBehaviorArgs( + path_pattern="/api/v2/*", + target_origin_id="antenna-backend-origin", + viewer_protocol_policy="redirect-to-https", + allowed_methods=["GET", "HEAD", "OPTIONS", "PUT", "POST", "PATCH", "DELETE"], + cached_methods=["GET", "HEAD"], + compress=True, + cache_policy_id=caching_disabled.id, # Managed-CachingDisabled + origin_request_policy_id=all_viewer.id, # Managed-AllViewer + ), + + aws.cloudfront.DistributionOrderedCacheBehaviorArgs( + path_pattern="/api/*", + target_origin_id="antenna-backend-origin", + viewer_protocol_policy="redirect-to-https", + allowed_methods=["GET", "HEAD", "OPTIONS", "PUT", "POST", "PATCH", "DELETE"], + cached_methods=["GET", "HEAD"], + compress=True, + cache_policy_id=caching_disabled.id, # Managed-CachingDisabled + origin_request_policy_id=all_viewer.id, # Managed-AllViewer + ), + ], + + # SPA fallback to index.html + custom_error_responses=[ + aws.cloudfront.DistributionCustomErrorResponseArgs( + error_code=403, response_code=200, response_page_path="/index.html" + ), + aws.cloudfront.DistributionCustomErrorResponseArgs( + error_code=404, response_code=200, response_page_path="/index.html" + ), + ], + + price_class="PriceClass_All", + restrictions=aws.cloudfront.DistributionRestrictionsArgs( + 
geo_restriction=aws.cloudfront.DistributionRestrictionsGeoRestrictionArgs( + restriction_type="none" + ) + ), + viewer_certificate=aws.cloudfront.DistributionViewerCertificateArgs( + cloudfront_default_certificate=True + ), + tags={"Environment": "production"}, +) + +# Bucket policy: allow ONLY this distribution to read (OAC) +caller = aws.get_caller_identity_output() + +ui_bucket_policy = aws.s3.BucketPolicy( + "antenna-ui-bucket-policy", + bucket=ui_bucket.id, + policy=pulumi.Output.all(ui_bucket.bucket, caller.account_id, cf_distribution.id).apply( + lambda args: f""" +{{ + "Version": "2012-10-17", + "Statement": [ + {{ + "Sid": "AllowCloudFrontReadOnlyViaOAC", + "Effect": "Allow", + "Principal": {{ "Service": "cloudfront.amazonaws.com" }}, + "Action": "s3:GetObject", + "Resource": "arn:aws:s3:::{args[0]}/*", + "Condition": {{ + "StringEquals": {{ + "AWS:SourceArn": "arn:aws:cloudfront::{args[1]}:distribution/{args[2]}" + }} + }} + }} + ] +}} +""".strip() + ), + opts=pulumi.ResourceOptions(depends_on=[ui_public_access_block, ui_ownership]), +) + + +# ========================================================= +# 6.2 UPLOAD dist/ CONTENTS TO S3 +# ========================================================= +uploaded_objects = [] + +if os.path.isdir(ui_dist_dir): + for root, _, files in os.walk(ui_dist_dir): + for filename in files: + full_path = os.path.join(root, filename) + rel_key = os.path.relpath(full_path, ui_dist_dir).replace(os.sep, "/") + + # Keep Pulumi resource names short + stable + key_hash = hashlib.md5(rel_key.encode("utf-8")).hexdigest()[:12] + res_name = f"ui-obj-{key_hash}" + + obj = aws.s3.BucketObject( + res_name, + bucket=ui_bucket.id, + key=rel_key, + source=pulumi.FileAsset(full_path), + content_type=guess_content_type(full_path), + cache_control=cache_control_for_key(rel_key), + opts=pulumi.ResourceOptions(depends_on=[ui_bucket_policy]), + ) + uploaded_objects.append(obj) +else: + pulumi.log.warn( + f"UI build output not found: {ui_dist_dir}. 
" + "If you disabled build_ui_in_pulumi, run the UI build locally first." + ) + + +# ========================================================= +# 6.4 INVALIDATE CLOUDFRONT AFTER DEPLOY +# ========================================================= +# Pulumi AWS classic does NOT expose an Invalidation resource. +# Pulumi's recommended workaround is to run a post-deploy task using runtime logic + SDK. +def register_invalidation(distribution_id: str) -> None: + if pulumi.runtime.is_dry_run(): + pulumi.log.info("Preview detected: skipping CloudFront invalidation.") + return + + def _do_invalidate() -> None: + import boto3 # keep import here so previews don’t require boto3 + + pulumi.log.info(f"Creating CloudFront invalidation for distribution {distribution_id} (paths: /*)") + client = boto3.client("cloudfront") + result = client.create_invalidation( + DistributionId=distribution_id, + InvalidationBatch={ + "CallerReference": f"pulumi-{time.time()}", + "Paths": { + "Quantity": 1, + "Items": ["/*"], + }, + }, + ) + status = result["Invalidation"]["Status"] + inval_id = result["Invalidation"]["Id"] + pulumi.log.info(f"CloudFront invalidation created: {inval_id} (status: {status})") + + # run once the program is about to exit (after resources have been applied) + atexit.register(_do_invalidate) + + +# Trigger invalidation after any update where this program runs +cf_distribution.id.apply(lambda d: register_invalidation(d)) + + +# ========================================================= +# 6.5 OUTPUTS +# ========================================================= +pulumi.export("cloudfront_domain", cf_distribution.domain_name) +pulumi.export("ui_bucket_name", ui_bucket.bucket) +pulumi.export("backend_origin_domain", backend_origin_domain) +pulumi.export("ui_dist_dir", ui_dist_dir) +pulumi.export("build_ui_in_pulumi", build_ui_in_pulumi) + +pulumi.export("debug_eb_endpoint_url", env_pulumi.endpoint_url) +pulumi.export("debug_eb_cname", env_pulumi.cname) diff --git 
a/aws-infra/configurations/.ebextensions/00_setup.config_template b/aws-infra/configurations/.ebextensions/00_setup.config_template
new file mode 100644
index 000000000..df7ad4591
--- /dev/null
+++ b/aws-infra/configurations/.ebextensions/00_setup.config_template
@@ -0,0 +1,48 @@
+option_settings:
+  aws:elasticbeanstalk:application:environment:
+    # Core Django settings
+    DJANGO_SETTINGS_MODULE: "config.settings.production"
+    DJANGO_SECRET_KEY: ""
+    DJANGO_DEBUG: "False"
+    DJANGO_ALLOWED_HOSTS: ","
+    DJANGO_SECURE_SSL_REDIRECT: "False"  # disable HTTPS redirect to stop infinite loop (CloudFront talks to EB over HTTP only)
+    DJANGO_ADMIN_URL: ""
+    EB_HEALTHCHECK: "1"
+
+    # AWS + S3
+    DJANGO_AWS_ACCESS_KEY_ID: ""
+    DJANGO_AWS_SECRET_ACCESS_KEY: ""
+    DJANGO_AWS_STORAGE_BUCKET_NAME: ""
+    DJANGO_AWS_S3_REGION_NAME: ""
+
+    # Database
+    POSTGRES_DB: ""
+    POSTGRES_USER: ""
+    POSTGRES_PASSWORD: ""
+    POSTGRES_HOST: ""
+    POSTGRES_PORT: "5432"
+    DATABASE_URL: "postgres://:@:5432/"
+
+    # Redis / Celery
+    REDIS_URL: "rediss://:6379/0?ssl_cert_reqs=none"
+    CELERY_BROKER_URL: "rediss://:6379/0?ssl_cert_reqs=none"
+
+    # ML backend
+    DEFAULT_PROCESSING_SERVICE_ENDPOINT: "http://ml-backend-minimal:2000"
+
+    # Other integrations
+    SENTRY_DSN: ""
+    SENDGRID_API_KEY: ""
+
+  # Health check settings for Elastic Beanstalk
+  aws:elasticbeanstalk:environment:process:default:
+    HealthCheckPath: "/api/v2/"
+    HealthCheckProtocol: "HTTP"  # use HTTP since HTTPS not terminated internally
+
+  aws:autoscaling:launchconfiguration:
+    RootVolumeSize: 30
+
+container_commands:
+  01_run_migrations:
+    command: "docker exec $(docker ps -q -f name=django) python manage.py migrate --noinput || true"
+    leader_only: true
diff --git a/aws-infra/configurations/Dockerrun.aws.json_template b/aws-infra/configurations/Dockerrun.aws.json_template
new file mode 100644
index 000000000..9ff6330b7
--- /dev/null
+++ b/aws-infra/configurations/Dockerrun.aws.json_template
@@ -0,0 +1,113 @@
+{
+  "AWSEBDockerrunVersion": 2,
+  "containerDefinitions": [
+
+    {
+      "name": 
"django", + "image": ".dkr.ecr..amazonaws.com/antenna-backend:", + "essential": true, + "memory": 1024, + "portMappings": [ + { "hostPort": 80, "containerPort": 5000 } + ], + "command": ["/start"], + "environment": [ + { "name": "USE_DOCKER", "value": "yes" }, + { "name": "DEFAULT_PROCESSING_SERVICE_ENDPOINT", "value": "http://ml-backend-minimal:2000" }, + { "name": "DEFAULT_PROCESSING_SERVICE_NAME", "value": "Default ML Service" } + ], + "links": [ + "ml-backend-minimal", + "ml-backend-example" + ] + }, + + { + "name": "celeryworker", + "image": ".dkr.ecr..amazonaws.com/antenna-backend:", + "essential": false, + "memory": 512, + "command": ["/start-celeryworker"], + "environment": [ + { "name": "USE_DOCKER", "value": "yes" }, + { "name": "DEFAULT_PROCESSING_SERVICE_ENDPOINT", "value": "http://ml-backend-minimal:2000" } + ], + "links": [ + "ml-backend-minimal", + "ml-backend-example" + ] + }, + + { + "name": "celerybeat", + "image": ".dkr.ecr..amazonaws.com/antenna-backend:", + "essential": false, + "memory": 512, + "command": ["/start-celerybeat"], + "environment": [ + { "name": "USE_DOCKER", "value": "yes" }, + { "name": "DEFAULT_PROCESSING_SERVICE_ENDPOINT", "value": "http://ml-backend-minimal:2000" } + ], + "links": [ + "ml-backend-minimal", + "ml-backend-example" + ] + }, + + { + "name": "flower", + "image": ".dkr.ecr..amazonaws.com/antenna-backend:", + "essential": false, + "memory": 512, + "portMappings": [ + { "hostPort": 5555, "containerPort": 5555 } + ], + "command": ["/start-flower"], + "environment": [ + { "name": "CELERY_FLOWER_USER", "value": "" }, + { "name": "CELERY_FLOWER_PASSWORD", "value": "" }, + { "name": "USE_DOCKER", "value": "yes" }, + { "name": "DEFAULT_PROCESSING_SERVICE_ENDPOINT", "value": "http://ml-backend-minimal:2000" } + ], + "links": [ + "ml-backend-minimal", + "ml-backend-example" + ] + }, + + { + "name": "awscli", + "image": ".dkr.ecr..amazonaws.com/antenna-awscli:", + "essential": false, + "memory": 256, + "command": ["sleep", 
"9999999"], + "environment": [ + { "name": "AWS_REGION", "value": "" }, + { "name": "USE_DOCKER", "value": "yes" } + ] + }, + + { + "name": "ml-backend-minimal", + "image": ".dkr.ecr..amazonaws.com/antenna-ml-minimal:", + "essential": false, + "memory": 512, + "hostname": "ml-backend-minimal", + "portMappings": [ + { "hostPort": 2000, "containerPort": 2000 } + ] + }, + + { + "name": "ml-backend-example", + "image": ".dkr.ecr..amazonaws.com/antenna-ml-example:", + "essential": false, + "memory": 512, + "hostname": "ml-backend-example", + "portMappings": [ + { "hostPort": 2003, "containerPort": 2000 } + ] + } + + ] +} diff --git a/aws-infra/eb.py b/aws-infra/eb.py new file mode 100644 index 000000000..57694dde7 --- /dev/null +++ b/aws-infra/eb.py @@ -0,0 +1,437 @@ +import os +import json +import zipfile +import hashlib + +import pulumi +import pulumi_aws as aws + +from pulumi import ResourceOptions, CustomTimeouts + + +# --------------------------------------------------------- +# ECR repos + docker build/push +# --------------------------------------------------------- +import ecr +import images + +# --------------------------------------------------------- +# IAM roles + networking +# --------------------------------------------------------- +from iam_roles import ( + ec2_instance_profile_pulumi, + service_role_pulumi, + ecs_execution_role, + eb_ec2_passrole_ecs_execution, +) +from networking.subnets import redis_default_subnets +from networking.security_group import eb_sg + +# --------------------------------------------------------- +# Infra outputs +# --------------------------------------------------------- +from rds import rds_instance +from redis import redis + +# --------------------------------------------------------- +# EB_ENV contains: +# - plain env vars +# - *_SECRET_ARN pointers used to populate Dockerrun "secrets" +# --------------------------------------------------------- +from secrets_manager import EB_ENV + +# 
--------------------------------------------------------- +# Constants +# --------------------------------------------------------- +BUILD_DIR = "build_eb_bundle" +DOCKERRUN_PATH = os.path.join(BUILD_DIR, "Dockerrun.aws.json") +DEPLOY_ZIP_PATH = os.path.join(BUILD_DIR, "deploy.zip") + +BACKEND_TAG = "latest" +AWSCLI_TAG = "latest" +ML_MIN_TAG = "latest" +ML_EX_TAG = "latest" + + +def ensure_build_dir() -> None: + os.makedirs(BUILD_DIR, exist_ok=True) + + +def _split_plain_env_and_secret_arns(env: dict) -> tuple[list[dict], dict]: + """ + EB_ENV contains: + - plain env vars (strings) + - secret ARN pointers (keys ending with _SECRET_ARN) + + We generate: + - "environment": list[{name,value}] for plain vars + - "secrets": list[{name,valueFrom}] for secret values + (container gets env var NAME with secret VALUE at runtime) + """ + plain_env_list: list[dict] = [] + secret_arns: dict[str, str] = {} + + for k, v in env.items(): + if k.endswith("_SECRET_ARN"): + secret_arns[k] = v + continue + + plain_env_list.append({"name": k, "value": v}) + + return plain_env_list, secret_arns + + +def _require_secret_arn(secret_arns: dict[str, str], key: str) -> str: + """ + Fail early with a helpful message if a required *_SECRET_ARN entry is missing. + """ + if key not in secret_arns or not secret_arns[key]: + raise Exception(f"Missing required secret ARN in EB_ENV: {key}") + return secret_arns[key] + + +def build_dockerrun_and_zip( + backend_repo_url: str, + awscli_repo_url: str, + mlmin_repo_url: str, + mlex_repo_url: str, + execution_role_arn: str, + postgres_host: str, + eb_env: dict, +) -> str: + """ + Generate Dockerrun.aws.json (AWSEBDockerrunVersion 2) + and zip it into deploy.zip with Dockerrun at ZIP ROOT. 
+ """ + ensure_build_dir() + + backend_image = f"{backend_repo_url}:{BACKEND_TAG}" + awscli_image = f"{awscli_repo_url}:{AWSCLI_TAG}" + ml_min_image = f"{mlmin_repo_url}:{ML_MIN_TAG}" + ml_ex_image = f"{mlex_repo_url}:{ML_EX_TAG}" + + # Split EB_ENV into plain vars vs secret ARN pointers + plain_env_list, secret_arns = _split_plain_env_and_secret_arns(eb_env) + + # Force POSTGRES_HOST from RDS output so it's always correct/reliable + plain_env_list = [e for e in plain_env_list if e["name"] != "POSTGRES_HOST"] + plain_env_list.append({"name": "POSTGRES_HOST", "value": postgres_host}) + + # Force SSL for postgres clients + plain_env_list = [e for e in plain_env_list if e["name"] != "PGSSLMODE"] + plain_env_list.append({"name": "PGSSLMODE", "value": "require"}) + + # Shared runtime env for all backend containers (django/celery/flower) + backend_environment = [{"name": "USE_DOCKER", "value": "yes"}] + plain_env_list + + # ECS secrets injection: + # valueFrom MUST be secret ARN; "name" becomes the env var in the container. 
+ backend_secrets: list[dict] = [ + {"name": "DJANGO_SECRET_KEY", "valueFrom": _require_secret_arn(secret_arns, "DJANGO_SECRET_KEY_SECRET_ARN")}, + {"name": "POSTGRES_PASSWORD", "valueFrom": _require_secret_arn(secret_arns, "POSTGRES_PASSWORD_SECRET_ARN")}, + {"name": "DATABASE_URL", "valueFrom": _require_secret_arn(secret_arns, "DATABASE_URL_SECRET_ARN")}, + ] + + # S3 credentials + backend_secrets += [ + {"name": "DJANGO_AWS_ACCESS_KEY_ID", "valueFrom": _require_secret_arn(secret_arns, "DJANGO_AWS_ACCESS_KEY_ID_SECRET_ARN")}, + {"name": "DJANGO_AWS_SECRET_ACCESS_KEY", "valueFrom": _require_secret_arn(secret_arns, "DJANGO_AWS_SECRET_ACCESS_KEY_SECRET_ARN")}, + ] + + # Other secrets + backend_secrets += [ + {"name": "SENDGRID_API_KEY", "valueFrom": _require_secret_arn(secret_arns, "SENDGRID_API_KEY_SECRET_ARN")}, + {"name": "SENTRY_DSN", "valueFrom": _require_secret_arn(secret_arns, "SENTRY_DSN_SECRET_ARN")}, + {"name": "REDIS_URL", "valueFrom": _require_secret_arn(secret_arns, "REDIS_URL_SECRET_ARN")}, + {"name": "CELERY_BROKER_URL", "valueFrom": _require_secret_arn(secret_arns, "CELERY_BROKER_URL_SECRET_ARN")}, + ] + + dockerrun = { + "AWSEBDockerrunVersion": 2, + "executionRoleArn": execution_role_arn, + "containerDefinitions": [ + { + "name": "ml-backend-minimal", + "image": ml_min_image, + "essential": False, + "memory": 512, + "hostname": "ml-backend-minimal", + "portMappings": [{"hostPort": 2000, "containerPort": 2000}], + }, + { + "name": "ml-backend-example", + "image": ml_ex_image, + "essential": False, + "memory": 512, + "hostname": "ml-backend-example", + "portMappings": [{"hostPort": 2003, "containerPort": 2000}], + }, + { + "name": "awscli", + "image": awscli_image, + "essential": False, + "memory": 256, + "command": ["sleep", "9999999"], + "environment": [{"name": "AWS_REGION", "value": "us-west-2"}], + }, + { + "name": "django", + "image": backend_image, + "essential": True, + "memory": 1024, + "entryPoint": ["/entrypoint"], + "portMappings": 
[{"hostPort": 80, "containerPort": 5000}], + "command": ["/start"], + # NOTE: only ONE DEFAULT_PROCESSING_SERVICE_ENDPOINT key + "environment": backend_environment + [ + {"name": "DEFAULT_PROCESSING_SERVICE_ENDPOINT", "value": "http://ml-backend-minimal:2000"}, + {"name": "DEFAULT_PROCESSING_SERVICE_NAME", "value": "Default ML Service"}, + ], + "links": ["ml-backend-minimal", "ml-backend-example"], + "secrets": backend_secrets, + "dependsOn": [ + {"containerName": "ml-backend-minimal", "condition": "START"}, + {"containerName": "ml-backend-example", "condition": "START"}, + ], + }, + { + "name": "celeryworker", + "image": backend_image, + "essential": False, + "memory": 512, + "entryPoint": ["/entrypoint"], + "command": ["/start-celeryworker"], + "environment": backend_environment + [ + {"name": "DEFAULT_PROCESSING_SERVICE_ENDPOINT", "value": "http://ml-backend-minimal:2000"}, + ], + "links": ["ml-backend-minimal", "ml-backend-example"], + "secrets": backend_secrets, + "dependsOn": [{"containerName": "django", "condition": "START"}], + }, + { + "name": "celerybeat", + "image": backend_image, + "essential": False, + "memory": 512, + "entryPoint": ["/entrypoint"], + "command": ["/start-celerybeat"], + "environment": backend_environment + [ + {"name": "DEFAULT_PROCESSING_SERVICE_ENDPOINT", "value": "http://ml-backend-minimal:2000"}, + ], + "links": ["ml-backend-minimal", "ml-backend-example"], + "secrets": backend_secrets, + "dependsOn": [{"containerName": "django", "condition": "START"}], + }, + { + "name": "flower", + "image": backend_image, + "essential": False, + "memory": 512, + "entryPoint": ["/entrypoint"], + "portMappings": [{"hostPort": 5555, "containerPort": 5555}], + "command": ["/start-flower"], + "environment": backend_environment + [ + {"name": "CELERY_FLOWER_USER", "value": "admin"}, + {"name": "CELERY_FLOWER_PASSWORD", "value": "password"}, + {"name": "DEFAULT_PROCESSING_SERVICE_ENDPOINT", "value": "http://ml-backend-minimal:2000"}, + ], + "links": 
["ml-backend-minimal", "ml-backend-example"], + "secrets": backend_secrets, + "dependsOn": [{"containerName": "django", "condition": "START"}], + }, + ], + } + + with open(DOCKERRUN_PATH, "w") as f: + json.dump(dockerrun, f, indent=2) + + if os.path.exists(DEPLOY_ZIP_PATH): + os.remove(DEPLOY_ZIP_PATH) + + with zipfile.ZipFile(DEPLOY_ZIP_PATH, "w", zipfile.ZIP_DEFLATED) as zf: + zf.write(DOCKERRUN_PATH, arcname="Dockerrun.aws.json") + + return DEPLOY_ZIP_PATH + + +def file_sha256(path: str) -> str: + h = hashlib.sha256() + with open(path, "rb") as f: + for chunk in iter(lambda: f.read(1024 * 1024), b""): + h.update(chunk) + return h.hexdigest()[:12] + + +def make_bundle_key_and_asset(zip_path: str): + sha = file_sha256(zip_path) + return sha, f"deploy-{sha}.zip", pulumi.FileAsset(zip_path) + + +# --------------------------------------------------------- +# 1) Create deploy bundle +# --------------------------------------------------------- +zip_meta_output = ( + pulumi.Output.all( + ecr.ecr_repos["antenna-backend-pulumi"].repository_url, + ecr.ecr_repos["antenna-awscli-pulumi"].repository_url, + ecr.ecr_repos["antenna-ml-minimal-pulumi"].repository_url, + ecr.ecr_repos["antenna-ml-example-pulumi"].repository_url, + ecs_execution_role.arn, + rds_instance.address, # authoritative POSTGRES_HOST + EB_ENV, # contains both plain vars and secret ARN pointers + ) + .apply(lambda args: build_dockerrun_and_zip(*args)) + .apply(make_bundle_key_and_asset) +) +# zip_meta_output -> (sha, key, FileAsset) + + +# --------------------------------------------------------- +# 2) EB prereqs +# --------------------------------------------------------- +default_vpc = aws.ec2.get_vpc_output(default=True) + +ecs_solution_stack = aws.elasticbeanstalk.get_solution_stack( + name_regex="64bit Amazon Linux 2.*running ECS", + most_recent=True, +) + +# --------------------------------------------------------- +# 3) EB Application +# --------------------------------------------------------- 
+eb_app_pulumi = aws.elasticbeanstalk.Application( + "antenna-django-eb-app-pulumi", + name="antenna-django-eb-app-pulumi", +) + +# --------------------------------------------------------- +# 4) S3 bundle bucket + object +# --------------------------------------------------------- +eb_bundle_bucket = aws.s3.Bucket( + "antenna-eb-bundles-pulumi", + force_destroy=True, +) + +eb_bundle_object = aws.s3.BucketObject( + "antenna-eb-deploy-zip", + bucket=eb_bundle_bucket.bucket, + key=zip_meta_output.apply(lambda x: x[1]), + source=zip_meta_output.apply(lambda x: x[2]), +) + +# --------------------------------------------------------- +# 5) EB Application Version +# --------------------------------------------------------- +app_version = aws.elasticbeanstalk.ApplicationVersion( + "antenna-eb-app-version-pulumi", + application=eb_app_pulumi.name, + bucket=eb_bundle_bucket.bucket, + key=eb_bundle_object.key, + name=zip_meta_output.apply(lambda x: f"deploy-{x[0]}"), +) + +# --------------------------------------------------------- +# 6) EB Environment Settings +# NOTE: console visibility only. +# IMPORTANT: do NOT inject *_SECRET_ARN keys here. +# --------------------------------------------------------- +eb_console_env = {k: v for k, v in EB_ENV.items() if not k.endswith("_SECRET_ARN")} + +eb_app_env_settings = [ + aws.elasticbeanstalk.EnvironmentSettingArgs( + namespace="aws:elasticbeanstalk:application:environment", + name=k, + value=v, + ) + for k, v in eb_console_env.items() +] + + +# (POSTGRES_HOST is already in eb_console_env after filtering secrets, but we keep this override +# to ensure the RDS output wins if anything drifted.) 
+eb_app_env_settings += [ + aws.elasticbeanstalk.EnvironmentSettingArgs( + namespace="aws:elasticbeanstalk:application:environment", + name="POSTGRES_HOST", + value=rds_instance.address, + ), + aws.elasticbeanstalk.EnvironmentSettingArgs( + namespace="aws:elasticbeanstalk:application:environment", + name="REDIS_HOST", + value=redis.primary_endpoint_address, + ), +] + +eb_env_settings = [ + aws.elasticbeanstalk.EnvironmentSettingArgs(namespace="aws:ec2:vpc", name="VPCId", value=default_vpc.id), + aws.elasticbeanstalk.EnvironmentSettingArgs(namespace="aws:ec2:vpc", name="AssociatePublicIpAddress", value="true"), + aws.elasticbeanstalk.EnvironmentSettingArgs( + namespace="aws:ec2:vpc", + name="Subnets", + value=pulumi.Output.all(*[s.id for s in redis_default_subnets]).apply(",".join), + ), + aws.elasticbeanstalk.EnvironmentSettingArgs( + namespace="aws:elasticbeanstalk:environment", + name="EnvironmentType", + value="SingleInstance", + ), + aws.elasticbeanstalk.EnvironmentSettingArgs( + namespace="aws:autoscaling:launchconfiguration", + name="InstanceType", + value="t3.large", + ), + aws.elasticbeanstalk.EnvironmentSettingArgs( + namespace="aws:autoscaling:launchconfiguration", + name="SecurityGroups", + value=eb_sg.id, + ), + aws.elasticbeanstalk.EnvironmentSettingArgs( + namespace="aws:autoscaling:launchconfiguration", + name="RootVolumeSize", + value="30", + ), + aws.elasticbeanstalk.EnvironmentSettingArgs( + namespace="aws:autoscaling:launchconfiguration", + name="IamInstanceProfile", + value=ec2_instance_profile_pulumi.name, + ), + aws.elasticbeanstalk.EnvironmentSettingArgs( + namespace="aws:elasticbeanstalk:environment", + name="ServiceRole", + value=service_role_pulumi.name, + ), +] + +eb_env_settings += eb_app_env_settings + +# --------------------------------------------------------- +# 7) EB Environment +# --------------------------------------------------------- +env_pulumi = aws.elasticbeanstalk.Environment( + "antenna-django-eb-env-pulumi-v2", + 
application=eb_app_pulumi.name,
+    solution_stack_name=ecs_solution_stack.name,
+    version=app_version.name,
+    settings=eb_env_settings,
+    opts=pulumi.ResourceOptions(
+        depends_on=[
+            ecs_execution_role,
+            eb_ec2_passrole_ecs_execution,
+            app_version,
+            images.backend_cmd,
+            images.awscli_cmd,
+            images.ml_min_cmd,
+            images.ml_ex_cmd,
+        ],
+        custom_timeouts=CustomTimeouts(create="60m", update="60m"),
+    ),
+)
+
+# ---------------------------------------------------------
+# 8) Outputs
+# ---------------------------------------------------------
+pulumi.export("eb_env_name_pulumi", env_pulumi.name)
+pulumi.export("eb_url_pulumi", env_pulumi.endpoint_url)
+pulumi.export("eb_bundle_bucket", eb_bundle_bucket.bucket)
+pulumi.export("eb_deploy_zip_s3_key", eb_bundle_object.key)
+pulumi.export("eb_deploy_version_label", app_version.name)
diff --git a/aws-infra/ecr.py b/aws-infra/ecr.py
new file mode 100644
index 000000000..e621b63b2
--- /dev/null
+++ b/aws-infra/ecr.py
@@ -0,0 +1,50 @@
+import pulumi
+import pulumi_aws as aws
+
+# Repositories to create
+REPOS = [
+    "antenna-awscli-pulumi",
+    "antenna-backend-pulumi",
+    "antenna-ml-example-pulumi",
+    "antenna-ml-minimal-pulumi",
+]
+
+ecr_repos = {}
+
+for repo in REPOS:
+    ecr_repos[repo] = aws.ecr.Repository(
+        repo,
+        name=repo,
+
+        force_delete=True,
+
+        # Scan on push = OFF (set scan_on_push=True to enable vulnerability scanning)
+        image_scanning_configuration=aws.ecr.RepositoryImageScanningConfigurationArgs(
+            scan_on_push=False
+        ),
+
+        # Mutable tags
+        image_tag_mutability="MUTABLE",
+
+        # Encryption: AES-256 (SSE-S3)
+        encryption_configurations=[
+            aws.ecr.RepositoryEncryptionConfigurationArgs(
+                encryption_type="AES256"
+            )
+        ],
+
+        tags={
+            "Name": repo,
+            "ManagedBy": "Pulumi",
+            "Project": "Antenna",
+        },
+    )
+
+# outputs for EB Dockerrun generation
+pulumi.export("ecr_backend_repo_url", ecr_repos["antenna-backend-pulumi"].repository_url)
+pulumi.export("ecr_awscli_repo_url", ecr_repos["antenna-awscli-pulumi"].repository_url)
+pulumi.export("ecr_ml_min_repo_url", 
ecr_repos["antenna-ml-minimal-pulumi"].repository_url) +pulumi.export("ecr_ml_ex_repo_url", ecr_repos["antenna-ml-example-pulumi"].repository_url) diff --git a/aws-infra/iam_roles.py b/aws-infra/iam_roles.py new file mode 100644 index 000000000..1bc7f1e2c --- /dev/null +++ b/aws-infra/iam_roles.py @@ -0,0 +1,240 @@ +import json +import pulumi +import pulumi_aws as aws + +from storage.s3 import assets_bucket # used for S3 bucket ARN resolution + +# ========================================================= +# 1) ECS TASK EXECUTION ROLE +# +# Used by ECS TASKS (containers) to: +# - Pull images from ECR +# - Read secrets via ECS `valueFrom` +# ========================================================= + +ecs_execution_role = aws.iam.Role( + "antenna-ecs-task-execution-role-pulumi", + name="antenna-ecs-task-execution-role-pulumi", + assume_role_policy=json.dumps( + { + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Principal": {"Service": "ecs-tasks.amazonaws.com"}, + "Action": "sts:AssumeRole", + } + ], + } + ), + tags={"ManagedBy": "Pulumi", "Project": "Antenna"}, +) + +# Standard ECS execution policy (ECR pulls + CloudWatch logs) +aws.iam.RolePolicyAttachment( + "ecs-execution-policy-attach", + role=ecs_execution_role.name, + policy_arn="arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy", +) + +aws.iam.RolePolicy( + "ecs-execution-secrets-readonly", + role=ecs_execution_role.name, + policy=json.dumps( + { + "Version": "2012-10-17", + "Statement": [ + { + "Sid": "AllowSecretsReadForEcsTasks", + "Effect": "Allow", + "Action": [ + "secretsmanager:GetSecretValue", + "secretsmanager:DescribeSecret", + ], + "Resource": "arn:aws:secretsmanager:*:*:secret:*", + } + ], + } + ), +) + +pulumi.export("ecs_execution_role_arn", ecs_execution_role.arn) + +# ========================================================= +# 2) ELASTIC BEANSTALK EC2 INSTANCE ROLE +# +# Used by: +# - EB platform +# - ECS agent on the EB host +# - EB deploy hooks +# 
- SSM Session Manager + +# ========================================================= + +ec2_role_pulumi = aws.iam.Role( + "aws-elasticbeanstalk-ec2-role_pulumi", + name="aws-elasticbeanstalk-ec2-role_pulumi", + assume_role_policy=json.dumps( + { + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Principal": {"Service": "ec2.amazonaws.com"}, + "Action": "sts:AssumeRole", + } + ], + } + ), + tags={"ManagedBy": "Pulumi", "Project": "Antenna"}, +) + +# Standard EB + ECS host permissions +ec2_policy_arns = [ + "arn:aws:iam::aws:policy/AWSElasticBeanstalkWebTier", + "arn:aws:iam::aws:policy/AWSElasticBeanstalkWorkerTier", + "arn:aws:iam::aws:policy/AWSElasticBeanstalkMulticontainerDocker", + "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly", + "arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy", + "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore", # SSM Session Manager +] + +for i, policy_arn in enumerate(ec2_policy_arns): + aws.iam.RolePolicyAttachment( + f"aws-elasticbeanstalk-ec2-policy-{i}_pulumi", + role=ec2_role_pulumi.name, + policy_arn=policy_arn, + ) + +# EC2 role MUST be able to read Secrets Manager (for ECS/EB bootstrapping + testing) +aws.iam.RolePolicy( + "eb-ec2-secretsmanager-readonly", + role=ec2_role_pulumi.name, + policy=json.dumps( + { + "Version": "2012-10-17", + "Statement": [ + { + "Sid": "AllowSecretsReadFromEc2", + "Effect": "Allow", + "Action": [ + "secretsmanager:GetSecretValue", + "secretsmanager:DescribeSecret", + ], + "Resource": "arn:aws:secretsmanager:*:*:secret:*", + } + ], + } + ), +) + + + +assets_bucket_arn = assets_bucket.arn +assets_objects_arn = assets_bucket.arn.apply(lambda a: f"{a}/*") + +aws.iam.RolePolicy( + "eb-ec2-assets-s3-access", + role=ec2_role_pulumi.name, + policy=pulumi.Output.all(assets_bucket_arn, assets_objects_arn).apply( + lambda arns: json.dumps( + { + "Version": "2012-10-17", + "Statement": [ + # Bucket-level + { + "Sid": "AllowAssetsBucketListAndLocation", + "Effect": 
"Allow", + "Action": [ + "s3:ListBucket", + "s3:GetBucketLocation", + ], + "Resource": arns[0], + }, + # Object-level + { + "Sid": "AllowAssetsObjectRW", + "Effect": "Allow", + "Action": [ + "s3:GetObject", + "s3:PutObject", + "s3:DeleteObject", + ], + "Resource": arns[1], + }, + ], + } + ) + ), +) + +# Allow EB EC2 host to pass the ECS execution role (used by ECS task execution) +eb_ec2_passrole_ecs_execution = aws.iam.RolePolicy( + "eb-ec2-passrole-ecs-execution-role", + role=ec2_role_pulumi.name, + policy=ecs_execution_role.arn.apply( + lambda arn: json.dumps( + { + "Version": "2012-10-17", + "Statement": [ + { + "Sid": "AllowPassRoleToEcs", + "Effect": "Allow", + "Action": "iam:PassRole", + "Resource": arn, + } + ], + } + ) + ), +) + +ec2_instance_profile_pulumi = aws.iam.InstanceProfile( + "aws-elasticbeanstalk-ec2-instance-profile_pulumi", + name="aws-elasticbeanstalk-ec2-instance-profile_pulumi", + role=ec2_role_pulumi.name, + tags={"ManagedBy": "Pulumi", "Project": "Antenna"}, +) + +# ========================================================= +# 3) ELASTIC BEANSTALK SERVICE ROLE +# +# Used by EB control plane (health, updates) +# ========================================================= + +service_role_pulumi = aws.iam.Role( + "aws-elasticbeanstalk-service-role_pulumi", + name="aws-elasticbeanstalk-service-role_pulumi", + assume_role_policy=json.dumps( + { + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Principal": {"Service": "elasticbeanstalk.amazonaws.com"}, + "Action": "sts:AssumeRole", + } + ], + } + ), + tags={"ManagedBy": "Pulumi", "Project": "Antenna"}, +) + +service_policy_arns = [ + "arn:aws:iam::aws:policy/AWSElasticBeanstalkManagedUpdatesCustomerRolePolicy", + "arn:aws:iam::aws:policy/service-role/AWSElasticBeanstalkEnhancedHealth", +] + +for i, policy_arn in enumerate(service_policy_arns): + aws.iam.RolePolicyAttachment( + f"aws-elasticbeanstalk-service-policy-{i}_pulumi", + role=service_role_pulumi.name, + 
policy_arn=policy_arn, + ) + +# ========================================================= +# Outputs +# ========================================================= +pulumi.export("eb_ec2_role_name_pulumi", ec2_role_pulumi.name) +pulumi.export("eb_ec2_instance_profile_name_pulumi", ec2_instance_profile_pulumi.name) +pulumi.export("eb_service_role_name_pulumi", service_role_pulumi.name) diff --git a/aws-infra/images.py b/aws-infra/images.py new file mode 100644 index 000000000..93595e199 --- /dev/null +++ b/aws-infra/images.py @@ -0,0 +1,242 @@ +import os +import hashlib +import pulumi +import pulumi_aws as aws +from pulumi_command import local +import ecr + +# ------------------------------------------------------------------- +# AWS account + region info +# ------------------------------------------------------------------- +caller = aws.get_caller_identity() # gets AWS account ID +region = aws.get_region() # gets current AWS region + +# ------------------------------------------------------------------- +# Repo root (used to resolve relative Docker paths reliably) +# ------------------------------------------------------------------- +REPO_ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), "..")) + +# ------------------------------------------------------------------- +# Isolated Docker config (avoids conflicts with ~/.docker) +# ------------------------------------------------------------------- +DOCKER_CONFIG_DIR = "/tmp/pulumi-docker-config" +DOCKER_CONFIG_JSON = os.path.join(DOCKER_CONFIG_DIR, "config.json") + +# ------------------------------------------------------------------- +# Docker image tags +# ------------------------------------------------------------------- +BACKEND_TAG = "latest" +AWSCLI_TAG = "latest" +ML_MIN_TAG = "latest" +ML_EX_TAG = "latest" + +# ------------------------------------------------------------------- +# ECR repository URLs (created in ecr.py) +# ------------------------------------------------------------------- 
+backend_repo_url = ecr.ecr_repos["antenna-backend-pulumi"].repository_url
+awscli_repo_url = ecr.ecr_repos["antenna-awscli-pulumi"].repository_url
+mlmin_repo_url = ecr.ecr_repos["antenna-ml-minimal-pulumi"].repository_url
+mlex_repo_url = ecr.ecr_repos["antenna-ml-example-pulumi"].repository_url
+
+# -------------------------------------------------------------------
+# Helper: convert relative paths to absolute paths
+# (Docker + Pulumi are safer with absolute paths)
+# -------------------------------------------------------------------
+def _abs(path: str) -> str:
+    if os.path.isabs(path):
+        return path
+    return os.path.abspath(os.path.join(REPO_ROOT, path))
+
+# -------------------------------------------------------------------
+# Helper: hash the Docker build inputs, used as a Pulumi trigger so the
+# build command reruns when the inputs change.
+#
+# The build context path is the directory Docker is allowed to see
+# during the build.
+#
+# Note: the Dockerfile contents are hashed, but the build context is
+# identified only by its path (hashing an entire context is expensive),
+# so context-only changes will not trigger a rebuild on their own.
+# -------------------------------------------------------------------
+def _hash_build_inputs(dockerfile_path: str, context_path: str) -> str:
+    h = hashlib.sha256()
+    h.update(dockerfile_path.encode("utf-8"))  # include Dockerfile path
+    h.update(context_path.encode("utf-8"))  # include build context path
+    if os.path.isfile(dockerfile_path):
+        with open(dockerfile_path, "rb") as f:
+            h.update(f.read())  # include Dockerfile contents, so edits rebuild
+    return h.hexdigest()
+
+# -------------------------------------------------------------------
+# Docker setup script
+# - resets Docker config
+# - restores buildx plugin (needed when DOCKER_CONFIG is custom)
+#
+# buildx:
+# Docker build extension that supports modern features like BuildKit
+# and cross-platform builds (e.g. linux/amd64 for AWS).
+# Required here so images built on Mac (ARM) run correctly on AWS (x86).
+
+# -------------------------------------------------------------------
+DOCKER_SETUP_BLOCK = f"""
+export DOCKER_CONFIG="{DOCKER_CONFIG_DIR}"
+mkdir -p "$DOCKER_CONFIG"
+rm -f "{DOCKER_CONFIG_JSON}"
+
+# Create a clean Docker config (no cached auth)
+cat > "{DOCKER_CONFIG_JSON}" << "EOF"
+{{
+  "auths": {{}},
+  "credsStore": "",
+  "credHelpers": {{}}
+}}
+EOF
+
+# Docker looks for plugins in $DOCKER_CONFIG/cli-plugins
+mkdir -p "$DOCKER_CONFIG/cli-plugins"
+
+# Restore docker-buildx plugin from common install locations
+if [ -f "$HOME/.docker/cli-plugins/docker-buildx" ]; then
+  ln -sf "$HOME/.docker/cli-plugins/docker-buildx" "$DOCKER_CONFIG/cli-plugins/docker-buildx"
+elif [ -f "/usr/local/lib/docker/cli-plugins/docker-buildx" ]; then
+  ln -sf "/usr/local/lib/docker/cli-plugins/docker-buildx" "$DOCKER_CONFIG/cli-plugins/docker-buildx"
+elif [ -f "/opt/homebrew/lib/docker/cli-plugins/docker-buildx" ]; then
+  ln -sf "/opt/homebrew/lib/docker/cli-plugins/docker-buildx" "$DOCKER_CONFIG/cli-plugins/docker-buildx"
+fi
+
+# Debug info to confirm setup
+echo "Using DOCKER_CONFIG=$DOCKER_CONFIG"
+ls -la "$DOCKER_CONFIG/cli-plugins" || true
+"""
+
+# =========================================================
+# Helper: authenticate Docker to ECR WITHOUT using macOS Keychain
+# - avoids `docker login` credential helper storing duplicates in Keychain
+# - writes auth directly into the isolated DOCKER_CONFIG/config.json
+# =========================================================
+ECR_AUTH_BLOCK = r"""
+PASS="$(aws ecr get-login-password --region "$REGION")"
+AUTH="$(printf 'AWS:%s' "$PASS" | base64)"
+cat > "$DOCKER_CONFIG/config.json" <<EOF
+{
+  "auths": {
+    "$REGISTRY": { "auth": "$AUTH" }
+  }
+}
+EOF
+echo "ECR auth configured -> $REGISTRY"
+"""
+
+# =========================================================
+# One-time ECR login (prepares the isolated Docker config + auth)
+# =========================================================
+ecr_login = local.Command(
+    "ecr-login",
+    create=pulumi.Output.all(caller.account_id, region.name).apply(
+        lambda args: f"""
+set -euo pipefail
+
+ACCOUNT_ID="{args[0]}"
+REGION="{args[1]}"
+REGISTRY="$ACCOUNT_ID.dkr.ecr.$REGION.amazonaws.com"
+
+{DOCKER_SETUP_BLOCK}
+{ECR_AUTH_BLOCK}
+
+docker buildx version
+"""
+    ),
+    # Re-run login if account or region changes
+    triggers=[pulumi.Output.all(caller.account_id, region.name)],
+)
+
+# =========================================================
+# Helper: build + push Docker image to ECR
+# =========================================================
+def build_push_image(name: str, tag: str, repo_url, dockerfile_path: str, context_path: str):
+    dockerfile_abs = _abs(dockerfile_path)  # absolute Dockerfile path
+    context_abs = _abs(context_path)  # absolute build context
+    input_hash = _hash_build_inputs(dockerfile_abs, context_abs)
+
+    # Triggers decide when Pulumi reruns this command
+    triggers = [tag, dockerfile_abs, context_abs, input_hash, repo_url]
+
+    return local.Command(
+        name,
+        create=pulumi.Output.all(repo_url, caller.account_id, region.name).apply(
+            lambda args: f"""
+set -euo pipefail
+
+REPO_URL="{args[0]}"
+ACCOUNT_ID="{args[1]}"
+REGION="{args[2]}"
+REGISTRY="$ACCOUNT_ID.dkr.ecr.$REGION.amazonaws.com"
+
+export DOCKER_BUILDKIT=1
+
+{DOCKER_SETUP_BLOCK}
+
+# Auth again to ensure Docker can push (no Keychain writes)
+{ECR_AUTH_BLOCK}
+
+echo "Building + pushing -> $REPO_URL:{tag}"
+echo "Dockerfile -> {dockerfile_abs}"
+echo "Context -> {context_abs}"
+
+docker buildx version
+
+# Build for AWS-compatible architecture and push to ECR
+docker buildx build --platform linux/amd64 \
+    -f "{dockerfile_abs}" \
+    -t "${{REPO_URL}}:{tag}" \
+    "{context_abs}" \
+    --push
+
+echo "Pushed -> $REPO_URL:{tag}"
+"""
+        ),
+        # Ensure ECR login happens first
+        opts=pulumi.ResourceOptions(depends_on=[ecr_login]),
+        triggers=triggers,
+    )
+
+# =========================================================
+# Image builds
+# =========================================================
+backend_cmd = build_push_image(
+    "build-push-backend",
+    BACKEND_TAG,
+    backend_repo_url,
+    "compose/production/django/Dockerfile",
+    ".",
+)
+
+awscli_cmd = build_push_image(
+    "build-push-awscli",
+    AWSCLI_TAG,
+    awscli_repo_url,
+    "compose/production/aws/Dockerfile",
+    ".",
+)
+
+ml_min_cmd = build_push_image(
+    "build-push-ml-minimal",
+    ML_MIN_TAG,
+    mlmin_repo_url,
+    "processing_services/minimal/Dockerfile",
+    "processing_services/minimal",
+)
+
+ml_ex_cmd = build_push_image(
+    "build-push-ml-example",
+    ML_EX_TAG,
+    mlex_repo_url,
"processing_services/example/Dockerfile", + "processing_services/example", +) + +# ========================================================= +# Outputs (final image URIs) +# ========================================================= +pulumi.export("backend_image_uri", pulumi.Output.concat(backend_repo_url, ":", BACKEND_TAG)) +pulumi.export("awscli_image_uri", pulumi.Output.concat(awscli_repo_url, ":", AWSCLI_TAG)) +pulumi.export("ml_min_image_uri", pulumi.Output.concat(mlmin_repo_url, ":", ML_MIN_TAG)) +pulumi.export("ml_ex_image_uri", pulumi.Output.concat(mlex_repo_url, ":", ML_EX_TAG)) diff --git a/aws-infra/images/aws_architecture_backend.svg b/aws-infra/images/aws_architecture_backend.svg new file mode 100644 index 000000000..25afed397 --- /dev/null +++ b/aws-infra/images/aws_architecture_backend.svg @@ -0,0 +1 @@ + diff --git a/aws-infra/images/aws_architecture_frontend.svg b/aws-infra/images/aws_architecture_frontend.svg new file mode 100644 index 000000000..3ac0efa88 --- /dev/null +++ b/aws-infra/images/aws_architecture_frontend.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/aws-infra/modifications/storage.py b/aws-infra/modifications/storage.py new file mode 100644 index 000000000..e3a9ec495 --- /dev/null +++ b/aws-infra/modifications/storage.py @@ -0,0 +1,161 @@ +""" + +This module provides helper functions for setting up and working with +S3-based storage used by Antenna projects. + +## Function Overview +- create_storage_source(...) + Called during project setup to ensure an S3 storage source exists + and is accessible for a given Project. + +- populate_bucket(...) + Called only for demo or test setup to upload generated image data + into S3/MinIO for pipeline validation. + + +## What Was Modified and Why +Previously, this module assumed a test-only MinIO setup and always tried +to create the S3 bucket at runtime. 
+ +This works locally, but causes issues in AWS because: +- S3 buckets already exist and are managed outside the application +- Attempting to create them again can fail or behave incorrectly + +The logic was updated to: +- Automatically select AWS S3 in production and MinIO locally +- Assume the bucket already exists in AWS +- Verify access by writing a small placeholder file instead of creating + the bucket + +This allows the same code to run safely in both local and AWS +environments without duplication. + +""" + + + +import io +import logging + +from django.conf import settings + +from ami.main.models import Project, S3StorageSource +from ami.tests.fixtures.images import GeneratedTestFrame, generate_moth_series +from ami.utils import s3 + +logger = logging.getLogger(__name__) + +# --------------------------------------------------------------------- +# SAFE CONFIG LOGIC +# Uses REAL AWS S3 in production (EB) +# Uses FALLBACK MINIO fake config during Docker build / test mode +# --------------------------------------------------------------------- + +USE_AWS = all([ + hasattr(settings, "DJANGO_AWS_ACCESS_KEY_ID"), + hasattr(settings, "DJANGO_AWS_SECRET_ACCESS_KEY"), + hasattr(settings, "DJANGO_AWS_STORAGE_BUCKET_NAME"), +]) + +if USE_AWS: + # REAL AWS CONFIG (for Elastic Beanstalk) + S3_TEST_CONFIG = s3.S3Config( + endpoint_url=None, # boto3 auto-selects correct S3 endpoint + access_key_id=settings.DJANGO_AWS_ACCESS_KEY_ID, + secret_access_key=settings.DJANGO_AWS_SECRET_ACCESS_KEY, + bucket_name=settings.DJANGO_AWS_STORAGE_BUCKET_NAME, + prefix="demo-data", + public_base_url=f"https://{settings.DJANGO_AWS_STORAGE_BUCKET_NAME}.s3.amazonaws.com/demo-data", + ) +else: + # FALLBACK CONFIG (for Docker build/test) + S3_TEST_CONFIG = s3.S3Config( + endpoint_url="http://minio:9000", + access_key_id="minioadmin", + secret_access_key="minioadmin", + bucket_name="ami-test", + prefix="test_prefix", + public_base_url="http://minio:9000/ami-test/test_prefix", + ) + 
logger.warning("Using fallback MinIO test config (AWS settings not found).") + + +# --------------------------------------------------------------------- +# CREATE STORAGE SOURCE +# --------------------------------------------------------------------- +def create_storage_source(project: Project, name: str, prefix: str = S3_TEST_CONFIG.prefix) -> S3StorageSource: + + # Try creating tiny placeholder to verify prefix + placeholder_key = f"{prefix}/.placeholder" + + try: + s3.write_file(S3_TEST_CONFIG, placeholder_key, b"") + logger.info(f"[S3] Verified prefix exists: {placeholder_key}") + except Exception as e: + logger.error(f"[S3] Could not verify prefix '{prefix}': {e}") + + data_source, _created = S3StorageSource.objects.get_or_create( + project=project, + name=name, + defaults=dict( + bucket=S3_TEST_CONFIG.bucket_name, + prefix=prefix, + endpoint_url=S3_TEST_CONFIG.endpoint_url, + access_key=S3_TEST_CONFIG.access_key_id, + secret_key=S3_TEST_CONFIG.secret_access_key, + public_base_url=S3_TEST_CONFIG.public_base_url, + ), + ) + + return data_source + + +# --------------------------------------------------------------------- +# POPULATE BUCKET WITH DEMO IMAGES +# --------------------------------------------------------------------- +def populate_bucket( + config: s3.S3Config, + subdir: str = "deployment_1", + num_nights: int = 2, + images_per_day: int = 3, + minutes_interval: int = 45, + minutes_interval_variation: int = 10, + skip_existing: bool = True, +) -> list[GeneratedTestFrame]: + + created = [] + + # Skip if images already exist + if skip_existing: + keys = s3.list_files(config=config, subdir=subdir, limit=10) + existing_keys = [key.key for key, i in keys if key] + + if existing_keys: + logger.info(f"[S3] Skipping: Found existing images in {subdir}: {existing_keys}") + return [] + + logger.info(f"[S3] Generating {num_nights * images_per_day} demo frames…") + + for _ in range(num_nights): + for frame in generate_moth_series( + num_frames=images_per_day, + 
minutes_interval=minutes_interval, + minutes_interval_variation=minutes_interval_variation, + save_images=False, + ): + + # Convert image to bytes + img_bytes = io.BytesIO() + frame.image.save(img_bytes, format="JPEG") + img_bytes = img_bytes.getvalue() + + # S3 key + key = f"{subdir}/{frame.filename}" + + logger.info(f"[S3] Uploading: {key} → {config.bucket_name}") + s3.write_file(config, key, img_bytes) + + frame.object_store_key = key + created.append(frame) + + return created diff --git a/aws-infra/networking/routes.py b/aws-infra/networking/routes.py new file mode 100644 index 000000000..db96063d6 --- /dev/null +++ b/aws-infra/networking/routes.py @@ -0,0 +1,34 @@ +""" +This creates a dedicated private route table that is associated only with the custom RDS private subnets. +The default AWS route table and default subnets are left untouched. +Since this route table contains only the default local VPC route and no internet or NAT gateway routes, +any resources launched in these subnets can communicate only within the VPC. +This ensures the RDS database remains fully private and inaccessible from the public internet. 
+"""
+
+import pulumi
+import pulumi_aws as aws
+from .subnets import default_vpc, private_db_subnets
+
+
+# ---------------------------------------------------------
+# Private route table
+# ---------------------------------------------------------
+private_rt = aws.ec2.RouteTable(
+    "antenna-pulumi-private-rt",
+    vpc_id=default_vpc.id,
+    tags={"Name": "antenna-pulumi-private-rt"},
+)
+
+# ---------------------------------------------------------
+# Associate ALL DB private subnets. Any subnet left out would
+# fall back to the VPC's main route table, which in the default
+# VPC includes an internet gateway route.
+# ---------------------------------------------------------
+for i, subnet in enumerate(private_db_subnets):
+    aws.ec2.RouteTableAssociation(
+        f"antenna-pulumi-private-rt-assoc-{i}",
+        subnet_id=subnet.id,
+        route_table_id=private_rt.id,
+    )
+
+pulumi.export("private_route_table_id", private_rt.id)
+pulumi.export("attached_subnet_ids", [s.id for s in private_db_subnets])
diff --git a/aws-infra/networking/security_group.py b/aws-infra/networking/security_group.py
new file mode 100644
index 000000000..08c6e6019
--- /dev/null
+++ b/aws-infra/networking/security_group.py
@@ -0,0 +1,198 @@
+import pulumi
+import pulumi_aws as aws
+
+# ---------------------------------------------------------
+# Fetch the default VPC
+# ---------------------------------------------------------
+
+default_vpc = aws.ec2.get_vpc_output(default=True)
+
+# ---------------------------------------------------------
+# Elastic Beanstalk Security Group
+# ---------------------------------------------------------
+# This security group is attached to the Elastic Beanstalk
+# environment.
+#
+# Purpose:
+# - Allow the application to make outbound connections
+#   (to RDS, Redis, external APIs, etc.)
+# - Allow restricted inbound admin/debug access + +# --------------------------------------------------------- + +eb_sg = aws.ec2.SecurityGroup( + "antenna-eb-sg-pulumi", + description="SG attached to EB instance (inbound admin/debug + outbound app traffic)", + vpc_id=default_vpc.id, + + # ----------------- + # INGRESS RULES + # ----------------- + # Ingress controls who can initiate connections INTO + # the Elastic Beanstalk environment. + + ingress=[ + # Allow incoming HTTP traffic on port 80 from anywhere. + # This is needed because the Django app is directly exposed + # on the EC2 instance (Single-Instance Elastic Beanstalk). + aws.ec2.SecurityGroupIngressArgs( + protocol="tcp", + from_port=80, + to_port=80, + cidr_blocks=["0.0.0.0/0"], # Public access + description="Public HTTP access to the Django web app", + ), + + # Allow incoming HTTPS traffic on port 443 from anywhere. + # Use this ONLY if TLS/SSL is terminated on the EC2 instance itself. + # If SSL is handled elsewhere (e.g., ALB or CloudFront), this can be removed. + aws.ec2.SecurityGroupIngressArgs( + protocol="tcp", + from_port=443, + to_port=443, + cidr_blocks=["0.0.0.0/0"], # Public access + description="Public HTTPS access to the Django web app (optional)", + ), + + + # Admin UI (e.g. Flower) - restricted to your IP + aws.ec2.SecurityGroupIngressArgs( + protocol="tcp", + from_port=5555, # Port used for admin UI (e.g. Flower) + to_port=5555, + cidr_blocks=[""], # Only allow your personal IP + description="Admin access (Flower)", + ), + + ], + + # ----------------- + # EGRESS RULES + # ----------------- + # Egress controls where the application can connect OUT to. + # Allowing all outbound traffic is standard for app services + # so they can reach databases, caches, and external APIs. 
+ egress=[ + aws.ec2.SecurityGroupEgressArgs( + protocol="-1", # -1 means all protocols + from_port=0, + to_port=0, + cidr_blocks=["0.0.0.0/0"], # Allow outbound traffic to anywhere + description="Allow all outbound traffic", + ) + ], + + tags={ + "Name": "antenna-eb-sg-pulumi" + } +) + +# --------------------------------------------------------- +# RDS (PostgreSQL) Security Group +# --------------------------------------------------------- +# This security group protects the PostgreSQL database. +# +# Design: +# - Database does NOT accept public traffic +# - Only the application SG and your IP can connect +# --------------------------------------------------------- + +rds_sg = aws.ec2.SecurityGroup( + "antenna-rds-sg-pulumi", + description="Security group for RDS PostgreSQL", + vpc_id=default_vpc.id, + + ingress=[ + # Allow Postgres access FROM the Elastic Beanstalk application + aws.ec2.SecurityGroupIngressArgs( + protocol="tcp", + from_port=5432, # PostgreSQL default port + to_port=5432, + security_groups=[eb_sg.id], + description="Postgres access from application", + ), + + # Allow Postgres access FROM your IP for manual admin/debugging + aws.ec2.SecurityGroupIngressArgs( + protocol="tcp", + from_port=5432, + to_port=5432, + cidr_blocks=[""], # Replace with your IP + description="Postgres admin access", + ), + ], + + # Databases need open egress so they can respond to clients. + # Security groups are stateful, so this does NOT expose the DB publicly. + egress=[ + aws.ec2.SecurityGroupEgressArgs( + protocol="-1", + from_port=0, + to_port=0, + cidr_blocks=["0.0.0.0/0"], + ) + ], + + tags={ + "Name": "antenna-rds-sg-pulumi" + } +) + +# --------------------------------------------------------- +# Redis Security Group +# --------------------------------------------------------- +# This security group protects the Redis cache. 
+# +# Design mirrors RDS: +# - EB App can connect +# - You can connect from your IP +# - No public access +# --------------------------------------------------------- + +redis_sg = aws.ec2.SecurityGroup( + "antenna-redis-sg-pulumi", + description="Security group for Redis", + vpc_id=default_vpc.id, + + ingress=[ + # Allow Redis access FROM the Elastic Beanstalk application + aws.ec2.SecurityGroupIngressArgs( + protocol="tcp", + from_port=6379, # Redis default port + to_port=6379, + security_groups=[eb_sg.id], + description="Redis access from application", + ), + + # Allow Redis access FROM your IP for debugging/admin + aws.ec2.SecurityGroupIngressArgs( + protocol="tcp", + from_port=6379, + to_port=6379, + cidr_blocks=[""], # Replace with your IP + description="Redis admin access", + ), + ], + + # Open egress so Redis can respond to allowed inbound connections + egress=[ + aws.ec2.SecurityGroupEgressArgs( + protocol="-1", + from_port=0, + to_port=0, + cidr_blocks=["0.0.0.0/0"], + ) + ], + + tags={ + "Name": "antenna-redis-sg-pulumi" + } +) + +# --------------------------------------------------------- +# Export Security Group IDs +# --------------------------------------------------------- + +pulumi.export("eb_sg_id", eb_sg.id) +pulumi.export("rds_sg_id", rds_sg.id) +pulumi.export("redis_sg_id", redis_sg.id) diff --git a/aws-infra/networking/subnets.py b/aws-infra/networking/subnets.py new file mode 100644 index 000000000..261e3fa92 --- /dev/null +++ b/aws-infra/networking/subnets.py @@ -0,0 +1,77 @@ +""" +Subnets strategy: + +Redis (ElastiCache) uses the AWS default subnets in the default VPC because ElastiCache nodes do not receive public IPs +by default and are never directly exposed to the internet. Access to Redis is strictly controlled through a dedicated +security group that only allows inbound traffic from the application security group and a single admin IP for debugging, +which is sufficient to keep it private. 
+ + +In contrast, RDS requires explicitly creating private database subnets because AWS mandates that RDS be launched +inside a DB subnet group. By creating new private subnets across multiple AZs, we ensure the database has no internet +gateway routes and cannot be accidentally exposed. + + +Since default subnets typically occupy the first /20 CIDR blocks, we carve out new /20 ranges away from the defaults +to maintain clear separation and stronger network isolation for the database. +""" + +import ipaddress +import pulumi +import pulumi_aws as aws + + +# --------------------------------------------------------- +# Fetch the default VPC +# --------------------------------------------------------- +default_vpc = aws.ec2.get_vpc(default=True) + + +# --------------------------------------------------------- +# Redis: Use AWS DEFAULT subnets (in the default VPC) +# --------------------------------------------------------- + +# Only keep AWS "default subnets" +aws_default_subnet_ids = aws.ec2.get_subnets( + filters=[ + {"name": "vpc-id", "values": [default_vpc.id]}, + {"name": "default-for-az", "values": ["true"]}, + ] +).ids + +redis_default_subnets = [ + aws.ec2.Subnet.get(f"aws-default-subnet-{i}", subnet_id) + for i, subnet_id in enumerate(aws_default_subnet_ids) +] + +pulumi.export("redis_default_subnet_ids", aws_default_subnet_ids) + + +# --------------------------------------------------------- +# RDS: Create 3 NEW PRIVATE DB subnets +# --------------------------------------------------------- +vpc_cidr = ipaddress.ip_network(default_vpc.cidr_block) + +# Split /16 into /20s +cidr_blocks = list(vpc_cidr.subnets(new_prefix=20)) + +# SAFETY: skip first 4 blocks (these are typically AWS default subnets) +db_cidrs = [cidr_blocks[4], cidr_blocks[7], cidr_blocks[8]] # exactly 3 subnets + + +# Pick 3 AZs (High Availability) +azs = aws.get_availability_zones(state="available").names[:3] + +private_db_subnets = [] +for i, az in enumerate(azs): + subnet = aws.ec2.Subnet( + 
f"antenna-pulumi-private-{az[-1]}", # ex: antenna-pulumi-private-a, antenna-pulumi-private-b + vpc_id=default_vpc.id, + cidr_block=str(db_cidrs[i]), + availability_zone=az, + map_public_ip_on_launch=False, + tags={"Name": f"antenna-pulumi-private-{az[-1]}"}, + ) + private_db_subnets.append(subnet) + +pulumi.export("private_db_subnet_ids", [s.id for s in private_db_subnets]) diff --git a/aws-infra/networking/vpc.py b/aws-infra/networking/vpc.py new file mode 100644 index 000000000..f155c05f1 --- /dev/null +++ b/aws-infra/networking/vpc.py @@ -0,0 +1,8 @@ +import pulumi +import pulumi_aws as aws + +# We do not create a VPC. We just use the AWS default VPC. + +default_vpc = aws.ec2.get_vpc(default=True) + +pulumi.export("vpc_id", default_vpc.id) diff --git a/aws-infra/pulumi_docs/installation.md b/aws-infra/pulumi_docs/installation.md new file mode 100644 index 000000000..6522ce268 --- /dev/null +++ b/aws-infra/pulumi_docs/installation.md @@ -0,0 +1,397 @@ +# Infrastructure Setup (Pulumi + AWS) + +This document explains how to deploy the Antenna infrastructure using Pulumi on AWS and how to complete required post-deployment steps (Django migrations) that Pulumi does not handle automatically. + +--- + +## Prerequisites + +- AWS account with sufficient permissions +- Python 3.10+ +- AWS CLI +- Pulumi CLI +- Docker (local only, for image builds if enabled) + +--- + +## 1. Install AWS CLI + +### macOS +```bash +brew install awscli +``` + +### Windows (PowerShell) +```powershell +winget install Amazon.AWSCLI +``` + +--- + +## 2. Configure AWS CLI +```bash +aws configure +``` + +You will be prompted for: + +- AWS Access Key ID +- AWS Secret Access Key +- Default region: `us-west-2` +- Default output format: `json` + +--- + +## 3. Install Pulumi + +### macOS +```bash +brew install pulumi +``` + +### Windows (PowerShell) +```powershell +winget install Pulumi.Pulumi +``` + +--- + +## 4. Pulumi Login + +Pulumi requires an access token. 
+ +```bash +export PULUMI_ACCESS_TOKEN= # macOS +setx PULUMI_ACCESS_TOKEN # Windows + +pulumi login +``` + +--- + +## 5. Initialize Pulumi Project +```bash +pulumi new aws-python --name aws_infra --stack dev --yes --force +``` + +This creates: +- Pulumi.yaml +- Pulumi.dev.yaml +- Python virtual environment (if needed) + +--- + +## 6. Pulumi Configuration (REQUIRED) : RUN ONCE IN YOUR TERMINAL +```bash + + +# ========================================================= +# AWS & Pulumi BASICS +# ========================================================= + +# AWS region where all infrastructure will be deployed +pulumi config set aws:region us-west-2 + + +# ========================================================= +# PROJECT / ENVIRONMENT METADATA +# ========================================================= + +# Project name +pulumi config set aws_infra:project antenna + +# Deployment environment +pulumi config set aws_infra:environment prod + +# AWS account ID where resources will be created +pulumi config set aws_infra:aws_account_id + + +# ========================================================= +# DJANGO APPLICATION SETTINGS +# ========================================================= + +# Django settings module to use in this environment +pulumi config set aws_infra:DJANGO_SETTINGS_MODULE "config.settings.production" + +# Disable debug mode in production +pulumi config set aws_infra:DJANGO_DEBUG "False" + +# Allowed hosts for Django (can be restricted later) +pulumi config set aws_infra:DJANGO_ALLOWED_HOSTS "*" + +# Disable forced HTTPS redirect at Django level +# (TLS may be terminated elsewhere, e.g., ALB or CloudFront) +pulumi config set aws_infra:DJANGO_SECURE_SSL_REDIRECT "False" + +# Admin URL for basic security hardening +pulumi config set aws_infra:DJANGO_ADMIN_URL + + +# ========================================================= +# POSTGRES (RDS) DATABASE CONFIG +# ========================================================= + +# Database username +pulumi 
config set aws_infra:POSTGRES_USER "postgres" + +# Database name +pulumi config set aws_infra:POSTGRES_DB "postgres" + +# Database port (PostgreSQL default) +pulumi config set aws_infra:POSTGRES_PORT "5432" + + +# ========================================================= +# REDIS (ELASTICACHE) CONFIG +# ========================================================= + +# Redis port (default) +pulumi config set aws_infra:REDIS_PORT "6379" + + +# ========================================================= +# DJANGO STATIC FILES (S3 STORAGE) +# ========================================================= + +# S3 bucket used for Django static/media files +pulumi config set aws_infra:DJANGO_AWS_STORAGE_BUCKET_NAME "antenna-prod-assets-pulumi" + +# AWS region where the S3 bucket lives +pulumi config set aws_infra:DJANGO_AWS_S3_REGION_NAME "us-west-2" + + +# ========================================================= +# SERVICE ENDPOINTS & HEALTH CHECKS +# ========================================================= + +# Internal endpoint for the ML / processing service +pulumi config set aws_infra:DEFAULT_PROCESSING_SERVICE_ENDPOINT \ +"http://ml-backend-example:2000" + +# Health check path used by Elastic Beanstalk +pulumi config set aws_infra:EB_HEALTHCHECK "/health/" + + +# ========================================================= +# DEPLOYMENT TOGGLES +# ========================================================= + +# Whether to deploy Elastic Beanstalk infrastructure +pulumi config set aws_infra:deployEb true + +# Whether to build Docker images as part of Pulumi +pulumi config set aws_infra:buildImages true + +# Whether to deploy the frontend +pulumi config set deployFrontend true + + +# ========================================================= +# FRONTEND (UI) BUILD & DEPLOY CONFIG +# ========================================================= + +# Path to frontend source directory +pulumi config set aws_infra:ui_dir ../ui + +# Path to built frontend assets +pulumi config set 
aws_infra:ui_dist_dir ../ui/build + +# Build the frontend inside Pulumi execution +pulumi config set aws_infra:build_ui_in_pulumi true + +# S3 bucket where the frontend will be deployed +pulumi config set aws_infra:ui_bucket_name antenna-prod-ui-pulumi + + +# ========================================================= +# SECRETS (ENCRYPTED BY PULUMI) +# ========================================================= + +# Django secret key +pulumi config set --secret aws_infra:DJANGO_SECRET_KEY \ + +# SendGrid API key for email delivery +pulumi config set --secret aws_infra:SENDGRID_API_KEY \ + + +# Sentry DSN for error tracking +pulumi config set --secret aws_infra:SENTRY_DSN \ + + + +# ========================================================= +# AWS CREDENTIALS FOR DJANGO (S3 ACCESS) +# ========================================================= + +pulumi config set --secret DJANGO_AWS_ACCESS_KEY_ID \ + + +pulumi config set --secret DJANGO_AWS_SECRET_ACCESS_KEY \ + + + +# ========================================================= +# DATABASE CONNECTION (FULL URL) +# ========================================================= + +# Password +pulumi config set --secret POSTGRES_PASSWORD + +# Full database connection string used by Django +pulumi config set --secret aws_infra:DATABASE_URL \ + + +``` + + +## 7. Deploy Infrastructure +```bash +pulumi up +``` + +This provisions: +- VPC networking +- RDS (Postgres) +- ElastiCache (Redis with TLS enabled) +- ECR repositories +- Elastic Beanstalk (ECS-based) +- Secrets Manager secrets +- Cloudfront + +--- + +## 8. IMPORTANT: Django Migrations (REQUIRED) + +Pulumi does not run Django migrations. + +--- + +## 9. Access Elastic Beanstalk EC2 via SSM +```bash +aws ssm start-session --target --region us-west-2 +``` + +Find the instance ID in: +- AWS Console -> EC2 -> Instances (linked to the EB environment) + +--- + +## 10. 
Run Django Migrations Inside the Container +```bash +DJANGO_CONTAINER=$(sudo docker ps --format '{{.Names}}' | grep -E 'django-' | head -n 1) +echo "Django container: $DJANGO_CONTAINER" + +sudo docker exec -it "$DJANGO_CONTAINER" sh -lc 'python manage.py migrate --noinput' +``` + +--- + + + +## 11. Verify Application Health +```bash +sudo docker exec -it "$DJANGO_CONTAINER" sh -lc 'python - << "PY" +import urllib.request, socket +socket.setdefaulttimeout(10) + +for path in ["/api/v2/events/", "/api/v2/storage/"]: + url = "http://127.0.0.1:5000" + path + print("\nGET", url) + with urllib.request.urlopen(url) as r: + print("status:", r.status) + print("sample:", r.read(200)) +PY' +``` + +Expected: +- HTTP 200 +- JSON response + +--- + +## Redis + TLS Notes + +Redis runs with TLS enabled. + +Injected URLs intentionally use: +```text +rediss://:6379/0?ssl_cert_reqs=none +``` + +This is required for: +- Django cache +- Celery broker +--- + + +### Access Flower (Celery Dashboard) via SSM Port Forwarding + +Flower runs inside the Elastic Beanstalk ECS host and listens on port 5555. +Even if you open port 5555 on a security group, the Elastic Beanstalk environment DNS does not reliably expose arbitrary ports like :5555 (EB is primarily designed for web traffic via 80/443). + +Because Flower is an admin dashboard, we intentionally do not expose it publicly. +Instead, we use AWS SSM Port Forwarding to securely tunnel port 5555 from the EB EC2 instance to your laptop. + +Why this approach + +Secure by default: Flower is not public on the internet. + +No IP allowlist headaches: your IP can change (Wi-Fi/VPN), but SSM still works. + +Works even when EB DNS doesn’t serve :5555 + +Auditable: access is logged via AWS Systems Manager. 
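The SSM session document in the steps below takes its port mapping as a small JSON parameters object. As a quick illustration, a helper like the following (hypothetical, not part of the repo; pure string construction) renders that JSON for any port pair:

```python
import json


def ssm_port_forward_params(remote_port: int, local_port: int) -> str:
    """Render the --parameters JSON for AWS-StartPortForwardingSession."""
    # SSM expects both values as lists of strings
    return json.dumps(
        {"portNumber": [str(remote_port)], "localPortNumber": [str(local_port)]}
    )


# Forward Flower (5555) to the same local port
print(ssm_port_forward_params(5555, 5555))
# → {"portNumber": ["5555"], "localPortNumber": ["5555"]}
```

The rendered string is exactly what Step 2 passes to `--parameters` on the command line.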
+ +## Step 1: Find the EB EC2 Instance ID + +If you know the EB environment name (for example antenna-django-eb-env-pulumi-v2), you can fetch the instance ID like this: +```bash + +aws ec2 describe-instances \ + --region us-west-2 \ + --filters "Name=tag:elasticbeanstalk:environment-name,Values=antenna-django-eb-env-pulumi-v2" "Name=instance-state-name,Values=running" \ + --query "Reservations[].Instances[].InstanceId" \ + --output text + +``` + +Copy the output instance ID, for example: + +i-050060d3e7473792b + +## Step 2: Start an SSM Port Forwarding Session (5555 -> localhost:5555) + +Run this on your laptop: + +```bash +aws ssm start-session \ + --region us-west-2 \ + --target i-050060d3e7473792b \ + --document-name AWS-StartPortForwardingSession \ + --parameters '{"portNumber":["5555"],"localPortNumber":["5555"]}' +``` + +Keep this terminal session running. + +## Step 3: Open Flower in Your Browser + +Once the port-forward session is active, open: + +http://localhost:5555 + +You should see the Flower UI. 
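If the page does not load, first confirm that the tunnel is actually listening locally before debugging Flower itself. A small check like this (illustrative only; any TCP client works) distinguishes a dead tunnel from a Flower-side problem:

```python
import socket


def port_is_open(host: str = "127.0.0.1", port: int = 5555, timeout: float = 2.0) -> bool:
    """Return True if a TCP listener accepts connections on host:port."""
    try:
        # create_connection raises OSError (e.g. connection refused) if nothing is listening
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If this returns `False`, the SSM session from Step 2 has likely ended; restart it and retry.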
---

## References

- Pulumi Docs: https://www.pulumi.com/docs/
- AWS Systems Manager: https://docs.aws.amazon.com/systems-manager/
- Elastic Beanstalk: https://docs.aws.amazon.com/elasticbeanstalk/

diff --git a/aws-infra/rds.py b/aws-infra/rds.py
new file mode 100644
index 000000000..fc01b49dc
--- /dev/null
+++ b/aws-infra/rds.py
@@ -0,0 +1,143 @@
import pulumi
import pulumi_aws as aws

from networking.subnets import private_db_subnets
from networking.security_group import rds_sg

config = pulumi.Config()

# -----------------------------
# DB config keys
# -----------------------------
POSTGRES_USER = config.get("POSTGRES_USER") or "postgres"
POSTGRES_DB = config.get("POSTGRES_DB") or "postgres"
POSTGRES_PORT = int(config.require("POSTGRES_PORT"))
POSTGRES_PASSWORD = config.require_secret("POSTGRES_PASSWORD")

# -----------------------------
# Get the AWS-managed RDS KMS key ARN
# -----------------------------
rds_kms_key = aws.kms.get_key(key_id="alias/aws/rds")

# ---------------------------------------------------------
# IAM Role for Enhanced Monitoring
# ---------------------------------------------------------
monitoring_role = aws.iam.Role(
    "antenna-rds-monitoring-role-pulumi",
    name="antenna-rds-monitoring-role-pulumi",
    assume_role_policy="""{
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": { "Service": "monitoring.rds.amazonaws.com" },
            "Action": "sts:AssumeRole"
        }]
    }""",
    tags={
        "Name": "antenna-rds-monitoring-role-pulumi",
        "ManagedBy": "Pulumi",
        "Project": "Antenna",
    },
)

aws.iam.RolePolicyAttachment(
    "antenna-rds-monitoring-policy-attach-pulumi",
    role=monitoring_role.name,
    policy_arn="arn:aws:iam::aws:policy/service-role/AmazonRDSEnhancedMonitoringRole",
)

# ---------------------------------------------------------
# RDS Subnet Group
# ---------------------------------------------------------
rds_subnet_group = aws.rds.SubnetGroup(
    "antenna-private-db-subnet-group-pulumi",
    name="antenna-private-db-subnet-group-pulumi",
    description="Private DB subnet group for Antenna Postgres (Pulumi)",
    subnet_ids=[s.id for s in private_db_subnets],
    tags={
        "Name": "antenna-private-db-subnet-group-pulumi",
        "ManagedBy": "Pulumi",
        "Project": "Antenna",
    },
)

# ---------------------------------------------------------
# RDS Instance
# ---------------------------------------------------------
rds_instance = aws.rds.Instance(
    "antenna-postgres1-pulumi",
    identifier="antenna-postgres1-pulumi",

    # Engine
    engine="postgres",
    engine_version="17.6",
    instance_class="db.t4g.small",

    # DB init
    username=POSTGRES_USER,
    password=POSTGRES_PASSWORD,
    db_name=POSTGRES_DB,
    port=POSTGRES_PORT,

    # Networking
    db_subnet_group_name=rds_subnet_group.name,
    vpc_security_group_ids=[rds_sg.id],
    publicly_accessible=False,

    # Storage (gp3 with provisioned IOPS/throughput)
    storage_type="gp3",
    allocated_storage=400,
    iops=12000,
    storage_throughput=500,
    max_allocated_storage=1000,

    # Encryption
    storage_encrypted=True,
    kms_key_id=rds_kms_key.arn,

    # Availability
    multi_az=False,

    # Backups
    backup_retention_period=7,
    backup_window="10:40-11:10",

    # Maintenance
    maintenance_window="wed:06:24-wed:06:54",
    auto_minor_version_upgrade=True,

    # Enhanced Monitoring
    monitoring_interval=60,
    monitoring_role_arn=monitoring_role.arn,

    # Performance Insights
    performance_insights_enabled=True,
    performance_insights_retention_period=7,

    # Destroy behavior (dev-friendly: no deletion protection, no final snapshot)
    deletion_protection=False,
    skip_final_snapshot=True,

    tags={
        "Name": "antenna-postgres1-pulumi",
        "ManagedBy": "Pulumi",
        "Project": "Antenna",
    },
)

# ---------------------------------------------------------
# Outputs
# ---------------------------------------------------------
pulumi.export("rds_instance_id", rds_instance.id)
pulumi.export("rds_endpoint", rds_instance.endpoint)
pulumi.export("rds_address",
              rds_instance.address)
pulumi.export("rds_port", POSTGRES_PORT)
pulumi.export("rds_subnet_group", rds_subnet_group.name)
pulumi.export("rds_security_group_id", rds_sg.id)
pulumi.export("private_db_subnet_ids", [s.id for s in private_db_subnets])
pulumi.export("rds_monitoring_role_arn", monitoring_role.arn)

diff --git a/aws-infra/redis.py b/aws-infra/redis.py
new file mode 100644
index 000000000..e1440a975
--- /dev/null
+++ b/aws-infra/redis.py
@@ -0,0 +1,112 @@
import pulumi
import pulumi_aws as aws

from networking.subnets import redis_default_subnets
from networking.security_group import redis_sg

config = pulumi.Config()

# -------------------------------------------------------------------
REDIS_REPLICATION_GROUP_ID = "antenna-redis-pulumi"
REDIS_SUBNET_GROUP_NAME = "antenna-redis-subnet-group"

# Redis port from Pulumi config
REDIS_PORT = int(config.get("REDIS_PORT") or "6379")


# -------------------------------------------------------------------
# Uses the AWS default subnets in the default VPC (redis_default_subnets)
# -------------------------------------------------------------------
def get_or_create_subnet_group():
    try:
        existing = aws.elasticache.get_subnet_group(name=REDIS_SUBNET_GROUP_NAME)
        pulumi.log.info(
            f"[Redis] Using existing ElastiCache subnet group: {existing.name}"
        )
        return existing.name
    except Exception:
        pulumi.log.warn(
            f"[Redis] Subnet group '{REDIS_SUBNET_GROUP_NAME}' not found. Creating new one..."
        )

    sg = aws.elasticache.SubnetGroup(
        "antenna-redis-subnet-group-pulumi",
        name=REDIS_SUBNET_GROUP_NAME,
        description="Subnet group for Antenna Redis (default VPC default subnets)",
        subnet_ids=[s.id for s in redis_default_subnets],
        tags={
            "Name": REDIS_SUBNET_GROUP_NAME,
            "ManagedBy": "Pulumi",
            "Project": "Antenna",
        },
    )
    return sg.name


redis_subnet_group_name = get_or_create_subnet_group()


# -------------------------------------------------------------------
# Redis Replication Group
# - Cluster mode disabled
# - Multi-AZ disabled
# - Auto-failover disabled
# - Engine Redis 7.1
# - cache.t4g.micro
# - TLS required
# - No Redis AUTH token; network access is restricted by the security group
# -------------------------------------------------------------------
redis = aws.elasticache.ReplicationGroup(
    "antenna-redis-pulumi",
    replication_group_id=REDIS_REPLICATION_GROUP_ID,
    description="Private Redis cache for Antenna (Celery broker + backend)",

    engine="redis",
    engine_version="7.1",
    port=REDIS_PORT,

    node_type="cache.t4g.micro",
    parameter_group_name="default.redis7",

    num_cache_clusters=1,  # single node, no replicas
    multi_az_enabled=False,
    automatic_failover_enabled=False,

    # Networking
    subnet_group_name=redis_subnet_group_name,
    security_group_ids=[redis_sg.id],

    # Security
    at_rest_encryption_enabled=True,
    transit_encryption_enabled=True,
    transit_encryption_mode="required",

    # Backups disabled
    snapshot_retention_limit=0,

    # Maintenance window (Fri 09:30 UTC for 1 hour)
    maintenance_window="fri:09:30-fri:10:30",
    auto_minor_version_upgrade=True,

    tags={
        "Name": REDIS_REPLICATION_GROUP_ID,
        "ManagedBy": "Pulumi",
        "Project": "Antenna",
    },
)


# -------------------------------------------------------------------
# Outputs
# -------------------------------------------------------------------
pulumi.export("redis_replication_group_id", redis.id)
pulumi.export("redis_primary_endpoint",
              redis.primary_endpoint_address)
pulumi.export("redis_reader_endpoint", redis.reader_endpoint_address)
pulumi.export("redis_port", REDIS_PORT)
pulumi.export("redis_subnet_group_name", redis_subnet_group_name)
pulumi.export("redis_security_group_id", redis_sg.id)
pulumi.export("redis_subnet_ids", [s.id for s in redis_default_subnets])

diff --git a/aws-infra/requirements.txt b/aws-infra/requirements.txt
new file mode 100644
index 000000000..98a7761f0
--- /dev/null
+++ b/aws-infra/requirements.txt
@@ -0,0 +1,4 @@
pulumi>=3.0.0,<4.0.0
pulumi-aws>=7.0.0,<8.0.0
pulumi-command>=0.9.0,<1.0.0
boto3>=1.26,<2.0

diff --git a/aws-infra/secrets_manager.py b/aws-infra/secrets_manager.py
new file mode 100644
index 000000000..8e65883be
--- /dev/null
+++ b/aws-infra/secrets_manager.py
@@ -0,0 +1,216 @@
import json
import urllib.parse

import pulumi
import pulumi_aws as aws

from rds import rds_instance
from redis import redis, REDIS_PORT  # local redis.py module, not the redis-py package

config = pulumi.Config()
STACK = pulumi.get_stack()
PROJECT = pulumi.get_project()


# =========================================================
# Helper: create a Secrets Manager secret + version
# =========================================================
def create_secret(
    key: str,
    value: pulumi.Input[str],
    description: str = "",
) -> aws.secretsmanager.Secret:
    secret_name = f"{PROJECT}-{STACK}-{key}"

    secret = aws.secretsmanager.Secret(
        secret_name,
        name=secret_name,
        description=description or f"Managed by Pulumi: {secret_name}",
        tags={
            "Name": secret_name,
            "ManagedBy": "Pulumi",
            "Project": "Antenna",
            "PulumiProject": PROJECT,
            "PulumiStack": STACK,
        },
    )

    aws.secretsmanager.SecretVersion(
        f"{secret_name}-version",
        secret_id=secret.id,
        secret_string=value,
    )

    pulumi.export(f"{key}_SECRET_ARN", secret.arn)
    return secret


# =========================================================
# TRUE SECRETS (Pulumi encrypted secrets)
# =========================================================

DJANGO_SECRET_KEY_secret = create_secret(
    "DJANGO_SECRET_KEY",
    config.require_secret("DJANGO_SECRET_KEY"),
    "Django secret key",
)

SENDGRID_API_KEY_secret = create_secret(
    "SENDGRID_API_KEY",
    config.require_secret("SENDGRID_API_KEY"),
    "SendGrid API key",
)

SENTRY_DSN_secret = create_secret(
    "SENTRY_DSN",
    config.require_secret("SENTRY_DSN"),
    "Sentry DSN",
)

# =========================================================
# S3 CREDENTIALS (Pulumi encrypted secrets)
# =========================================================

DJANGO_AWS_ACCESS_KEY_ID_secret = create_secret(
    "DJANGO_AWS_ACCESS_KEY_ID",
    config.require_secret("DJANGO_AWS_ACCESS_KEY_ID"),
    "AWS access key ID for Django S3 storage",
)

DJANGO_AWS_SECRET_ACCESS_KEY_secret = create_secret(
    "DJANGO_AWS_SECRET_ACCESS_KEY",
    config.require_secret("DJANGO_AWS_SECRET_ACCESS_KEY"),
    "AWS secret access key for Django S3 storage",
)

# =========================================================
# DERIVED SECRETS (Redis / Celery URLs)
# =========================================================

REDIS_URL = redis.primary_endpoint_address.apply(
    lambda host: f"rediss://{host}:{REDIS_PORT}/0?ssl_cert_reqs=none"
)

CELERY_BROKER_URL = redis.primary_endpoint_address.apply(
    lambda host: f"rediss://{host}:{REDIS_PORT}/0?ssl_cert_reqs=none"
)

REDIS_URL_secret = create_secret(
    "REDIS_URL",
    REDIS_URL,
    "Redis URL for Django/Celery (TLS)",
)

CELERY_BROKER_URL_secret = create_secret(
    "CELERY_BROKER_URL",
    CELERY_BROKER_URL,
    "Celery broker URL (TLS)",
)

# =========================================================
# Postgres config
# =========================================================

POSTGRES_PASSWORD_PULUMI = config.require_secret("POSTGRES_PASSWORD")
POSTGRES_USER = config.require("POSTGRES_USER")

# =========================================================
# NON-SECRET CONFIG (needed by runtime)
# =========================================================

DJANGO_SETTINGS_MODULE = config.require("DJANGO_SETTINGS_MODULE")
DJANGO_DEBUG = config.require("DJANGO_DEBUG")
DJANGO_ALLOWED_HOSTS = config.require("DJANGO_ALLOWED_HOSTS")
DJANGO_SECURE_SSL_REDIRECT = config.require("DJANGO_SECURE_SSL_REDIRECT")
DJANGO_ADMIN_URL = config.require("DJANGO_ADMIN_URL")

POSTGRES_DB = config.require("POSTGRES_DB")
POSTGRES_PORT = config.require("POSTGRES_PORT")

DEFAULT_PROCESSING_SERVICE_ENDPOINT = config.require("DEFAULT_PROCESSING_SERVICE_ENDPOINT")
EB_HEALTHCHECK = config.require("EB_HEALTHCHECK")

DJANGO_AWS_STORAGE_BUCKET_NAME = config.require("DJANGO_AWS_STORAGE_BUCKET_NAME")
DJANGO_AWS_S3_REGION_NAME = config.require("DJANGO_AWS_S3_REGION_NAME")

# Derived non-secret
POSTGRES_HOST = rds_instance.address

# =========================================================
# Secrets we inject into containers
# =========================================================

POSTGRES_PASSWORD_secret = create_secret(
    "POSTGRES_PASSWORD",
    POSTGRES_PASSWORD_PULUMI,
    "Postgres password",
)

# =========================================================
# DERIVED DATABASE_URL SECRET (URL-encoded password)
# =========================================================

DATABASE_URL = pulumi.Output.all(
    POSTGRES_USER,
    POSTGRES_PASSWORD_PULUMI,
    POSTGRES_HOST,
    POSTGRES_PORT,
    POSTGRES_DB,
).apply(
    lambda args: (
        "postgres://"
        # safe="" so that "/" in the password is percent-encoded as well
        f"{args[0]}:{urllib.parse.quote(args[1], safe='')}"
        f"@{args[2]}:{args[3]}/{args[4]}"
    )
)

DATABASE_URL_secret = create_secret(
    "DATABASE_URL",
    DATABASE_URL,
    "Full Postgres connection URL for Django",
)

# =========================================================
# EB ENV (console-visible only)
# =========================================================

EB_ENV = {
    # Django (plain)
    "DJANGO_SETTINGS_MODULE": DJANGO_SETTINGS_MODULE,
    "DJANGO_DEBUG": DJANGO_DEBUG,
    "DJANGO_ALLOWED_HOSTS":
        DJANGO_ALLOWED_HOSTS,
    "DJANGO_SECURE_SSL_REDIRECT": DJANGO_SECURE_SSL_REDIRECT,
    "DJANGO_ADMIN_URL": DJANGO_ADMIN_URL,

    # Postgres pieces (plain)
    "POSTGRES_HOST": POSTGRES_HOST,
    "POSTGRES_PORT": POSTGRES_PORT,
    "POSTGRES_DB": POSTGRES_DB,
    "POSTGRES_USER": POSTGRES_USER,

    # Force SSL
    "PGSSLMODE": "require",

    # App config (plain)
    "DEFAULT_PROCESSING_SERVICE_ENDPOINT": DEFAULT_PROCESSING_SERVICE_ENDPOINT,
    "EB_HEALTHCHECK": EB_HEALTHCHECK,

    # S3 (plain)
    "DJANGO_AWS_STORAGE_BUCKET_NAME": DJANGO_AWS_STORAGE_BUCKET_NAME,
    "DJANGO_AWS_S3_REGION_NAME": DJANGO_AWS_S3_REGION_NAME,

    # Secret ARNs
    "DJANGO_SECRET_KEY_SECRET_ARN": DJANGO_SECRET_KEY_secret.arn,
    "POSTGRES_PASSWORD_SECRET_ARN": POSTGRES_PASSWORD_secret.arn,
    "DATABASE_URL_SECRET_ARN": DATABASE_URL_secret.arn,
    "SENDGRID_API_KEY_SECRET_ARN": SENDGRID_API_KEY_secret.arn,
    "SENTRY_DSN_SECRET_ARN": SENTRY_DSN_secret.arn,
    "REDIS_URL_SECRET_ARN": REDIS_URL_secret.arn,
    "CELERY_BROKER_URL_SECRET_ARN": CELERY_BROKER_URL_secret.arn,

    # S3 credential secrets
    "DJANGO_AWS_ACCESS_KEY_ID_SECRET_ARN": DJANGO_AWS_ACCESS_KEY_ID_secret.arn,
    "DJANGO_AWS_SECRET_ACCESS_KEY_SECRET_ARN": DJANGO_AWS_SECRET_ACCESS_KEY_secret.arn,
}

pulumi.export("EB_ENV", EB_ENV)

diff --git a/aws-infra/storage/__init__.py b/aws-infra/storage/__init__.py
new file mode 100644
index 000000000..a27877e60
--- /dev/null
+++ b/aws-infra/storage/__init__.py
@@ -0,0 +1,2 @@
from .s3 import *
from .s3_policies import *

diff --git a/aws-infra/storage/s3.py b/aws-infra/storage/s3.py
new file mode 100644
index 000000000..7e7b5a03f
--- /dev/null
+++ b/aws-infra/storage/s3.py
@@ -0,0 +1,65 @@
import pulumi
import pulumi_aws as aws


config = pulumi.Config("aws_infra")

PROJECT = config.require("project")
ENVIRONMENT = config.require("environment")

bucket_name = f"{PROJECT}-{ENVIRONMENT}-assets-pulumi"

# ---------------------------------------------------------
# S3 Bucket
# ---------------------------------------------------------
assets_bucket = aws.s3.Bucket(
    bucket_name,
    bucket=bucket_name,
    force_destroy=True,
)

# ---------------------------------------------------------
# Object Ownership (ObjectWriter: ACLs remain honored;
# use BucketOwnerEnforced to disable ACLs entirely)
# ---------------------------------------------------------
ownership = aws.s3.BucketOwnershipControls(
    f"{bucket_name}-ownership",
    bucket=assets_bucket.id,
    rule=aws.s3.BucketOwnershipControlsRuleArgs(
        object_ownership="ObjectWriter"
    ),
)

# ---------------------------------------------------------
# Block Public Access (partial): public bucket policies are
# blocked here, ACL-related blocks are left off. Note that
# s3_policies.py later relaxes these settings for this bucket.
# ---------------------------------------------------------
public_access = aws.s3.BucketPublicAccessBlock(
    f"{bucket_name}-public-access",
    bucket=assets_bucket.id,
    block_public_acls=False,
    ignore_public_acls=False,
    block_public_policy=True,
    restrict_public_buckets=True,
)

# ---------------------------------------------------------
# Default Encryption (SSE-S3 / AES256)
# ---------------------------------------------------------
encryption = aws.s3.BucketServerSideEncryptionConfiguration(
    f"{bucket_name}-encryption",
    bucket=assets_bucket.id,
    rules=[
        aws.s3.BucketServerSideEncryptionConfigurationRuleArgs(
            apply_server_side_encryption_by_default=(
                aws.s3.BucketServerSideEncryptionConfigurationRuleApplyServerSideEncryptionByDefaultArgs(
                    sse_algorithm="AES256"
                )
            ),
            bucket_key_enabled=True,
        )
    ],
)

# ---------------------------------------------------------
# Exports
# ---------------------------------------------------------
pulumi.export("assets_bucket_name", assets_bucket.bucket)
pulumi.export("assets_bucket_arn", assets_bucket.arn)

diff --git a/aws-infra/storage/s3_policies.py b/aws-infra/storage/s3_policies.py
new file mode 100644
index 000000000..d08b37227
--- /dev/null
+++ b/aws-infra/storage/s3_policies.py
@@ -0,0 +1,122 @@
import json

import pulumi
import pulumi_aws as aws

from storage.s3 import assets_bucket

# Reference the EB EC2 role created in this stack
from iam_roles import ec2_role_pulumi as eb_ec2_role

config = pulumi.Config("aws_infra")

AWS_ACCOUNT_ID = config.require("aws_account_id")
CLOUDFRONT_DISTRIBUTION_ID = config.get("CLOUDFRONT_DISTRIBUTION_ID")


# The app fetches images using *plain S3 HTTPS URLs*, e.g.
#   https://<bucket>.s3.amazonaws.com/<key>
#
# That request is an *anonymous* HTTP GET (no AWS SigV4 signing), so IAM role
# permissions DO NOT apply. A private bucket returns 403 AccessDenied, and the
# API surfaces it as a 500.
#
# Since we don't want to change app code to use boto3 (signed GetObject) or
# CloudFront URLs, the only infrastructure-only fix is to:
#   1) allow public read (Principal="*") for s3:GetObject, AND
#   2) turn off the bucket's PublicAccessBlock settings that currently prevent
#      public bucket policies from taking effect.
#
# This makes every object in the bucket publicly readable by anyone who knows
# or can guess the URL. Only do this if the bucket contents are safe to be
# public. Prefer a future fix where the Django app uses CloudFront URLs or
# signed S3 access.
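
# Illustrative sketch of that preferred future fix: the app hands out URLs
# explicitly instead of relying on a public-read bucket policy. The helper
# names are ours and nothing below is wired into this stack; the presigned
# variant assumes boto3 credentials are available at runtime.
def plain_object_url(bucket: str, key: str) -> str:
    # The unsigned URL form the app requests today; works only with public read.
    return f"https://{bucket}.s3.amazonaws.com/{key}"


def presigned_object_url(bucket: str, key: str, expires: int = 3600) -> str:
    # Time-limited signed URL; lets the bucket stay fully private.
    import boto3  # deferred so this sketch stays importable without boto3

    s3 = boto3.client("s3")
    return s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires,
    )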
# ---------------------------------------------------------
# Public Access Block (bucket-level)
# ---------------------------------------------------------

public_access_block = aws.s3.BucketPublicAccessBlock(
    "assets-bucket-public-access-block",
    bucket=assets_bucket.id,
    block_public_acls=False,
    ignore_public_acls=False,
    block_public_policy=False,
    restrict_public_buckets=False,
)

# ---------------------------------------------------------
# Bucket Policy
# ---------------------------------------------------------
bucket_policy = aws.s3.BucketPolicy(
    "assets-bucket-policy",
    bucket=assets_bucket.id,
    policy=pulumi.Output.all(
        assets_bucket.bucket,
        eb_ec2_role.arn,
    ).apply(
        lambda args: json.dumps(
            {
                "Version": "2012-10-17",
                "Statement": [
                    # 1) CloudFront read-only. Note: if CLOUDFRONT_DISTRIBUTION_ID
                    #    is unset in config, this condition can never match.
                    {
                        "Sid": "AllowCloudFrontServicePrincipalOnly",
                        "Effect": "Allow",
                        "Principal": {"Service": "cloudfront.amazonaws.com"},
                        "Action": ["s3:GetObject"],
                        "Resource": f"arn:aws:s3:::{args[0]}/*",
                        "Condition": {
                            "StringEquals": {
                                "AWS:SourceArn": (
                                    f"arn:aws:cloudfront::{AWS_ACCOUNT_ID}:distribution/"
                                    f"{CLOUDFRONT_DISTRIBUTION_ID}"
                                )
                            }
                        },
                    },

                    # 2) EB EC2 role: bucket-level permissions
                    {
                        "Sid": "AllowEbEc2RoleBucketAccess",
                        "Effect": "Allow",
                        "Principal": {"AWS": args[1]},
                        "Action": [
                            "s3:ListBucket",
                            "s3:GetBucketLocation",
                        ],
                        "Resource": f"arn:aws:s3:::{args[0]}",
                    },

                    # 3) EB EC2 role: object-level permissions (RW)
                    {
                        "Sid": "AllowEbEc2RoleObjectRW",
                        "Effect": "Allow",
                        "Principal": {"AWS": args[1]},
                        "Action": [
                            "s3:GetObject",
                            "s3:PutObject",
                            "s3:DeleteObject",
                        ],
                        "Resource": f"arn:aws:s3:::{args[0]}/*",
                    },

                    # 4) PUBLIC READ (infrastructure-only fix for anonymous S3 URL GETs)
                    {
                        "Sid": "AllowPublicReadObjectsForUnsignedHttpGet",
                        "Effect": "Allow",
                        "Principal": "*",
                        "Action": ["s3:GetObject"],
                        "Resource": f"arn:aws:s3:::{args[0]}/*",
                    },
                ],
            }
        )
    ),
    # Ensure the public-access-block config is applied before the policy
    opts=pulumi.ResourceOptions(depends_on=[public_access_block]),
)

pulumi.export("assets_bucket_policy_id", bucket_policy.id)
pulumi.export("assets_bucket_public_access_block_id", public_access_block.id)
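
# How a container can consume the "*_SECRET_ARN" values exported in EB_ENV
# (see secrets_manager.py above): read the ARN from the environment, then
# fetch the secret value from Secrets Manager at startup. A sketch under the
# assumption that boto3 is available in the container; helper names are ours
# and this is not wired into the stack.
import os


def secret_arn_env_key(name: str) -> str:
    # EB_ENV exposes each secret as "<NAME>_SECRET_ARN".
    return f"{name}_SECRET_ARN"


def resolve_secret(name: str, region: str = "us-west-2"):
    arn = os.environ.get(secret_arn_env_key(name))
    if arn is None:
        return None
    import boto3  # deferred so the sketch stays importable without boto3

    sm = boto3.client("secretsmanager", region_name=region)
    return sm.get_secret_value(SecretId=arn)["SecretString"]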