This is the worker environment for the CodeReviewer application. It handles long-running security scan tasks via an SQS queue.
The worker environment:
- Runs an HTTP server that receives POST requests from Elastic Beanstalk's SQS daemon (sqsd)
- Relies on sqsd to poll the SQS queue and forward each message as an HTTP POST request
- Runs security scanning tools (Semgrep, OpenGrep, Gitleaks, Checkov, Trivy)
- Updates scan results in the PostgreSQL database
- Runs on AWS Elastic Beanstalk Worker tier with t4g.medium instances
    SQS Queue → sqsd (Beanstalk daemon) → HTTP POST → Worker App
                                                          ↓
                                                     Process Job
                                                          ↓
                                              Return 200 OK / 500 Error
                                                          ↓
                                            sqsd deletes message (if 200)
                                                 or retries (if 500)
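The flow above can be sketched as a small HTTP handler. This is a minimal illustration, not the worker's actual code: `ProcessJob`, `statusForJob`, `readBody`, and `createWorkerServer` are hypothetical names, and the only behavior taken from this README is that a 200 response lets sqsd delete the message while a 500 response makes it retry.

```typescript
import { createServer, type IncomingMessage } from "node:http";

// Hypothetical job processor; the real worker's entry point may differ.
type ProcessJob = (job: unknown) => Promise<void>;

// Map the outcome of one job to the status code sqsd acts on:
// 200 lets sqsd delete the message, 500 makes it retry later.
// A body that is not valid JSON is also reported as 500 in this sketch.
export async function statusForJob(rawBody: string, processJob: ProcessJob): Promise<number> {
  try {
    await processJob(JSON.parse(rawBody));
    return 200;
  } catch (err) {
    console.error("Job failed:", err);
    return 500;
  }
}

// Collect the full request body as a UTF-8 string.
async function readBody(req: IncomingMessage): Promise<string> {
  const chunks: Buffer[] = [];
  for await (const chunk of req) chunks.push(chunk as Buffer);
  return Buffer.concat(chunks).toString("utf8");
}

// Build (but do not start) a server that answers each POST with 200 or 500.
export function createWorkerServer(processJob: ProcessJob) {
  return createServer((req, res) => {
    if (req.method !== "POST") {
      res.writeHead(405).end();
      return;
    }
    readBody(req)
      .then((body) => statusForJob(body, processJob))
      .catch(() => 500) // a broken request stream is treated as a failed job
      .then((status) => {
        res.writeHead(status).end();
      });
  });
}
```

Starting it would look like `createWorkerServer(runScan).listen(8080)`, where `runScan` stands in for whatever function actually performs the scan.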
Module System: This worker uses ES Modules (ESM) ("type": "module" in package.json). This is required because web-tree-sitter is an ESM-only package. Do not convert back to CommonJS as it will break the tree-sitter parser.
- Install dependencies:
    npm install
- Configure environment variables (.env):
    POSTGRES_URL=postgresql://user:password@host:port/database
    PORT=8080
    # GitHub App Configuration (required for private repo access)
    GITHUB_APP_ID=your_github_app_id
    GITHUB_APP_PRIVATE_KEY_PATH=./config/github-app-private-key.pem

Note: SQS queue configuration is handled by Elastic Beanstalk's sqsd daemon, not by the application directly.
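One way to read and validate these variables at startup is sketched below. `WorkerConfig` and `loadConfig` are hypothetical names, and falling back to port 8080 when `PORT` is unset is an assumption of this sketch rather than documented behavior.

```typescript
// Hypothetical config shape; field names are illustrative only.
interface WorkerConfig {
  postgresUrl: string;
  port: number;
  githubAppId: string | undefined;
  githubAppPrivateKeyPath: string | undefined;
}

// Validate the environment once so misconfiguration fails fast at startup.
export function loadConfig(env: Record<string, string | undefined>): WorkerConfig {
  const postgresUrl = env["POSTGRES_URL"];
  if (!postgresUrl) {
    throw new Error("POSTGRES_URL is required");
  }

  // Assumption: default to 8080 when PORT is not set.
  const port = Number(env["PORT"] ?? "8080");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env["PORT"]}`);
  }

  return {
    postgresUrl,
    port,
    // The GitHub App settings are optional here: without them only
    // public repositories can be accessed.
    githubAppId: env["GITHUB_APP_ID"] || undefined,
    githubAppPrivateKeyPath: env["GITHUB_APP_PRIVATE_KEY_PATH"] || undefined,
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)`.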
The worker uses GitHub App authentication to access private repositories. This is the same GitHub App used by the main CodeReviewer application.
How it works:
- The main app stores GitHub App installations in the database with access tokens
- When a security scan job is queued, it includes the repo's installation ID
- The worker fetches or refreshes the installation token from the database
- The token is used to authenticate with GitHub when downloading the repository
Configuration:
- GITHUB_APP_ID: Your GitHub App ID (found in GitHub App settings)
- GITHUB_APP_PRIVATE_KEY_PATH: Absolute path to the PEM file for your GitHub App private key. By default the repo ships with config/github-app-private-key.pem so you can mount or replace it as needed.
- (Legacy) GITHUB_APP_PRIVATE_KEY: Only for backwards compatibility. Prefer storing the PEM on disk and pointing GITHUB_APP_PRIVATE_KEY_PATH at it.
The worker will automatically:
- Use cached installation tokens if still valid (1-hour expiry)
- Refresh tokens when expired or about to expire
- Fall back to public repo access if no authentication is available
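The caching behavior described above might look roughly like the following. This is a sketch under stated assumptions: `CachedToken`, `isTokenUsable`, and `getInstallationToken` are hypothetical names, the `refresh` callback stands in for whatever code requests a new installation token, and the 5-minute margin used for "about to expire" is an illustrative value, not one taken from the worker.

```typescript
// Hypothetical shape of a token row cached in the database.
interface CachedToken {
  token: string;
  expiresAt: Date;
}

// Assumption: treat a token as "about to expire" within 5 minutes of expiry.
const REFRESH_MARGIN_MS = 5 * 60 * 1000;

// A cached token is usable only if it outlives the refresh margin.
export function isTokenUsable(cached: CachedToken | null, now: Date = new Date()): boolean {
  if (!cached) return false;
  return cached.expiresAt.getTime() - now.getTime() > REFRESH_MARGIN_MS;
}

// Resolve the token to use for a job:
// 1. reuse the cached token while it is still valid,
// 2. otherwise refresh it if GitHub App credentials are available,
// 3. otherwise return null, meaning "try public repo access".
export async function getInstallationToken(
  cached: CachedToken | null,
  refresh: (() => Promise<CachedToken>) | null,
  now: Date = new Date(),
): Promise<string | null> {
  if (cached && isTokenUsable(cached, now)) return cached.token;
  if (refresh) return (await refresh()).token;
  return null;
}
```

Passing `now` explicitly keeps the expiry logic easy to test without mocking the clock.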
- Build:
    npm run build
- Run locally (for testing):
    npm run dev

The worker is deployed to AWS Elastic Beanstalk Worker tier via the infrastructure in CodeReviewerInfra/elastic_beanstalk_worker.tf.
- Build the application:
    npm run build
- Create a deployment package:
    zip -r worker-deploy.zip package.json dist/ src/ .ebextensions/ Dockerfile
- Deploy to Elastic Beanstalk:
    eb deploy codereview-production-worker

The worker environment includes the following security scanning tools:
- Semgrep: SAST using Trail of Bits rules
- OpenGrep: Fast SAST fork of Semgrep
- Gitleaks: Secret and credential detection
- Checkov: Infrastructure as Code scanning
- Trivy: Dependency vulnerability scanning
All tools are installed via .ebextensions/01_security_tools.config.
The worker receives HTTP POST requests from sqsd with the job in the request body:
    {
      "scanId": "scan_123456_abc",
      "repoId": 1,
      "repoUrl": "https://github.com/owner/repo.git",
      "branch": "main",
      "installationId": "12345678"
    }

Legacy format (still supported):
    {
      "scanId": "scan_123456_abc",
      "repoId": 1,
      "repoUrl": "https://github.com/owner/repo.git",
      "branch": "main",
      "token": "ghp_..."
    }

The worker:
- Receives HTTP POST request from sqsd (on port 8080, path /)
- Parses the job from the request body
- Downloads the repository
- Runs all security scanners in parallel
- Updates the database with results
- Returns HTTP 200 (success) or 500 (failure)
- sqsd automatically deletes the message from SQS if 200, or retries if 500
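The parsing step could be sketched as follows, accepting both the current format (`installationId`) and the legacy format (`token`). `ScanJob` and `parseJob` are hypothetical names, and normalizing the two formats into a single `auth` field is a design choice of this sketch, not something the worker is known to do.

```typescript
// Hypothetical normalized job; both wire formats map onto this shape.
interface ScanJob {
  scanId: string;
  repoId: number;
  repoUrl: string;
  branch: string;
  auth:
    | { kind: "installation"; installationId: string }
    | { kind: "token"; token: string } // legacy format
    | { kind: "none" }; // no credentials: public repo access only
}

// Parse and validate the request body sent by sqsd.
// Throws on malformed input so the caller can answer with HTTP 500.
export function parseJob(rawBody: string): ScanJob {
  const data: unknown = JSON.parse(rawBody);
  if (typeof data !== "object" || data === null) {
    throw new Error("Job must be a JSON object");
  }
  const fields = data as Record<string, unknown>;

  // Read a required, non-empty string field.
  const requireString = (key: string): string => {
    const value = fields[key];
    if (typeof value !== "string" || value === "") {
      throw new Error(`Missing or invalid "${key}"`);
    }
    return value;
  };

  const repoId = fields["repoId"];
  if (typeof repoId !== "number") {
    throw new Error('Missing or invalid "repoId"');
  }

  const installationId = fields["installationId"];
  const token = fields["token"];
  let auth: ScanJob["auth"] = { kind: "none" };
  if (typeof installationId === "string") {
    auth = { kind: "installation", installationId };
  } else if (typeof token === "string") {
    auth = { kind: "token", token };
  }

  return {
    scanId: requireString("scanId"),
    repoId,
    repoUrl: requireString("repoUrl"),
    branch: requireString("branch"),
    auth,
  };
}
```

Validating up front means a malformed message fails immediately instead of partway through a scan.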
- CloudWatch Logs: /aws/elasticbeanstalk/codereview-production-worker/
- CloudWatch Alarms:
  - codereview-production-sqs-dlq-messages: Alerts when messages appear in the dead letter queue
  - codereview-production-sqs-message-age: Alerts when messages are not being processed
- Check CloudWatch Logs for errors
- Verify worker HTTP server is running (should see "Worker HTTP server listening on port 8080")
- Check sqsd is configured correctly in Beanstalk worker settings
- Verify security tools are installed correctly
- Test the worker endpoint manually:
    curl -X POST http://localhost:8080/ -d '{"scanId":"test","repoId":1,...}'
The worker has a 1-hour timeout for each scan job. If scans are taking longer:
- Increase the SQS visibility timeout
- Increase the worker instance size
- Optimize the scan configuration (skip certain tools)
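For illustration, a per-job time limit can be expressed with `Promise.race`. How the worker actually enforces its 1-hour limit is not shown in this README, so `withTimeout` and `SCAN_TIMEOUT_MS` below are hypothetical; the sketch only demonstrates the general technique.

```typescript
// One hour, matching the per-job limit described above.
export const SCAN_TIMEOUT_MS = 60 * 60 * 1000;

// Reject if `work` does not settle within `ms` milliseconds.
// Note: this does not cancel the underlying work; it only stops waiting for it.
export function withTimeout<T>(work: Promise<T>, ms: number, label = "scan"): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards
  // so a finished job does not keep the process alive.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}
```

A job would then be wrapped as `withTimeout(runScan(job), SCAN_TIMEOUT_MS)`, where `runScan` is a placeholder. If this limit is raised, the SQS visibility timeout should be at least as long, otherwise the message may become visible again while the scan is still running.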
- Create a new scanner file in src/security/scanners/
- Add the scanner to orchestrator.ts
- Update .ebextensions/01_security_tools.config to install the tool
- Update the Dockerfile to include the tool installation
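As a rough picture of the first two steps, a scanner can be modeled as an object with a name and an async `scan` method, and the orchestrator as a function that runs every registered scanner in parallel. The `Finding`, `Scanner`, and `ScanSummary` types and the `runScanners` function are assumptions made for this sketch; the real interfaces in src/security/scanners/ and orchestrator.ts may differ.

```typescript
// Hypothetical shapes; check the existing scanners for the real interface.
interface Finding {
  tool: string;
  file: string;
  message: string;
}

interface Scanner {
  name: string;
  scan(repoDir: string): Promise<Finding[]>;
}

interface ScanSummary {
  findings: Finding[];
  failedTools: string[];
}

// Run all scanners concurrently. A scanner that throws is recorded in
// failedTools instead of aborting the others (a design choice of this sketch).
export async function runScanners(scanners: Scanner[], repoDir: string): Promise<ScanSummary> {
  const outcomes = await Promise.all(
    scanners.map(async (scanner) => {
      try {
        return { name: scanner.name, ok: true, findings: await scanner.scan(repoDir) };
      } catch {
        return { name: scanner.name, ok: false, findings: [] as Finding[] };
      }
    }),
  );

  const summary: ScanSummary = { findings: [], failedTools: [] };
  for (const outcome of outcomes) {
    if (outcome.ok) summary.findings.push(...outcome.findings);
    else summary.failedTools.push(outcome.name);
  }
  return summary;
}
```

Under this model, adding a tool means writing one more object that satisfies `Scanner` and including it in the array passed to `runScanners`.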