An enterprise-grade, multi-cloud Terraform platform for provisioning and governing infrastructure across AWS, Azure, and Google Cloud with strict environment isolation, policy enforcement, drift detection, and hardened CI/CD.
flowchart LR
DEV[Developer PR] --> PLAN[Terraform Plan Workflow]
PLAN --> CKV[Checkov Scan]
PLAN --> OPA[OPA Custom Policies]
PLAN --> COST[Infracost]
CKV --> REVIEW[PR Comment with Plan + Cost + Policy]
OPA --> REVIEW
COST --> REVIEW
REVIEW --> MERGE[Merge to main]
MERGE --> APPLY[Terraform Apply Workflow]
APPLY --> AWS[AWS Accounts]
APPLY --> AZ[Azure Subscriptions]
APPLY --> GCP[GCP Projects]
DRIFT[Scheduled Drift Detection] --> AWS
DRIFT --> AZ
DRIFT --> GCP
DRIFT --> ISSUE[Auto-create GitHub Issue]
This platform is built around five core principles:
- Multi-cloud parity: consistent IaC patterns across AWS, Azure, and GCP.
- Environment isolation: separate dev, staging, and prod state and deployment flows.
- Governance first: cloud-native org policies plus policy-as-code checks in CI.
- Safe delivery: PR-based planning, artifacted outputs, controlled applies, and state locking.
- Operational resilience: drift detection, concurrency controls, and failure notifications.
For each cloud, modular Terraform components are organized by domain:
- Networking
- Compute
- Database
Each cloud has dedicated environment directories:
- dev
- staging
- prod
Each environment composes modules and can receive distinct variables, policy guardrails, and approvals.
- AWS Service Control Policies (SCPs)
- Azure Policy assignments
- GCP Organization Policies
- CI static policy checks (Checkov)
- CI custom policy checks (OPA)
.
├── .github/
│   └── workflows/
│       ├── terraform-plan.yml
│       ├── terraform-apply.yml
│       └── drift-detection.yml
├── infra/
│   ├── modules/
│   │   ├── aws/
│   │   │   ├── networking/
│   │   │   ├── compute/
│   │   │   └── database/
│   │   ├── azure/
│   │   │   ├── networking/
│   │   │   ├── compute/
│   │   │   └── database/
│   │   └── gcp/
│   │       ├── networking/
│   │       ├── compute/
│   │       └── database/
│   ├── envs/
│   │   ├── dev/
│   │   ├── staging/
│   │   └── prod/
│   └── global/
│       ├── org-policies/
│       │   ├── aws/
│       │   ├── azure/
│       │   └── gcp/
│       └── policies/
│           ├── checkov/
│           └── opa/
├── backend/
└── frontend/
Triggered on pull requests that change infrastructure code or workflow files.
Pipeline stages:
- Authenticate by cloud using OIDC.
- terraform init.
- Checkov scan.
- terraform validate.
- terraform plan.
- OPA evaluation on plan JSON.
- Infracost estimate.
- Post enriched PR comment (changes, cost, policy status).
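The last stage above can be illustrated with a minimal Python sketch that summarizes a plan and renders a comment body. It assumes the plan has been exported with `terraform show -json` and reads only the `resource_changes[].change.actions` field of that format. The function names `summarize_plan` and `render_comment`, the comment layout, and the way the cost and policy results are passed in are hypothetical; the actual workflow may assemble its comment differently.

```python
from collections import Counter


def summarize_plan(plan: dict) -> Counter:
    """Count planned actions from the JSON form of a Terraform plan."""
    counts: Counter = Counter()
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        # Data reads and unchanged resources are not interesting to reviewers.
        if actions in (["no-op"], ["read"]):
            continue
        # A replacement is reported as a delete/create pair in either order.
        if "create" in actions and "delete" in actions:
            counts["replace"] += 1
        else:
            for action in actions:
                counts[action] += 1
    return counts


def render_comment(counts: Counter, monthly_cost_delta: str, policy_passed: bool) -> str:
    """Build a plain-text PR comment (hypothetical layout)."""
    status = "passed" if policy_passed else "failed"
    lines = [
        "Terraform plan summary",
        f"- create: {counts['create']}",
        f"- update: {counts['update']}",
        f"- replace: {counts['replace']}",
        f"- delete: {counts['delete']}",
        f"- estimated monthly cost change: {monthly_cost_delta}",
        f"- policy checks: {status}",
    ]
    return "\n".join(lines)
```

Counting replacements separately is a design choice: a delete followed by a create is usually the riskiest change in a plan, so it helps to surface it on its own line.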
Triggered on pushes to main that change infrastructure code or workflow files.
Pipeline stages:
- Authenticate by cloud using OIDC.
- terraform init.
- Checkov scan.
- terraform plan with lock timeout.
- OPA policy evaluation for prod.
- terraform apply with lock timeout.
- Slack alert on failure.
Hardening:
- One concurrency group per cloud and environment pair to prevent overlapping apply runs.
- State lock timeout to reduce lock race failures.
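As a rough illustration of the two hardening measures above, the Python sketch below derives one concurrency key per cloud and environment pair and builds an apply command that waits for the state lock. The key format `terraform-apply-<cloud>-<env>`, the helper names, and the saved plan file name `tfplan` are assumptions made for this example; only the `-lock-timeout` flag and the `300s` value come from the commands shown elsewhere in this document.

```python
from itertools import product

CLOUDS = ("aws", "azure", "gcp")
ENVIRONMENTS = ("dev", "staging", "prod")


def concurrency_group(cloud: str, environment: str) -> str:
    """Key shared by all apply runs that target the same state (hypothetical format)."""
    return f"terraform-apply-{cloud}-{environment}"


def all_concurrency_groups() -> list[str]:
    """One group per cloud/environment pair in the matrix."""
    return [concurrency_group(c, e) for c, e in product(CLOUDS, ENVIRONMENTS)]


def apply_command(plan_file: str = "tfplan", lock_timeout: str = "300s") -> list[str]:
    """Arguments for applying a saved plan while waiting for the state lock."""
    return [
        "terraform",
        "apply",
        "-input=false",
        f"-lock-timeout={lock_timeout}",
        plan_file,  # hypothetical name of the saved plan artifact
    ]
```

Two runs that resolve to the same key would be serialized by the CI system, while runs for different clouds or environments can still proceed in parallel.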
Triggered every 6 hours and on demand.
Behavior:
- Runs terraform plan -detailed-exitcode across the cloud/environment matrix.
- If drift exists (exit code 2), creates a GitHub issue with a plan excerpt.
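The exit code handling described above can be sketched in Python as follows. With `-detailed-exitcode`, `terraform plan` returns 0 when there are no changes, 1 on error, and 2 when changes are present. The names `PlanStatus`, `classify_exit_code`, and `drift_issue`, the issue title format, and the 40-line excerpt limit are illustrative assumptions, not the workflow's actual implementation.

```python
from enum import Enum


class PlanStatus(Enum):
    NO_CHANGES = 0  # plan succeeded, infrastructure matches code
    ERROR = 1       # plan itself failed
    DRIFT = 2       # plan succeeded and found differences


def classify_exit_code(code: int) -> PlanStatus:
    """Map a `terraform plan -detailed-exitcode` exit code to a status."""
    try:
        return PlanStatus(code)
    except ValueError:
        # Any other code (signal, missing binary, ...) is treated as an error.
        return PlanStatus.ERROR


def drift_issue(cloud: str, environment: str, plan_text: str, max_lines: int = 40) -> dict:
    """Build the title and body of a drift issue with a truncated plan excerpt."""
    excerpt = "\n".join(plan_text.splitlines()[:max_lines])
    return {
        "title": f"Drift detected: {cloud}/{environment}",
        "body": (
            f"Scheduled drift detection found changes in {cloud}/{environment}.\n\n"
            f"Plan excerpt:\n\n{excerpt}"
        ),
    }
```

Treating exit code 1 separately from 2 matters here: a failed plan should not open a drift issue, because nothing is known about the real state in that case.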
SCP policy-as-code definitions include examples for:
- Deny disabling security services (CloudTrail/Config).
- Deny leaving organization.
- Deny KMS key deletion (except break-glass role pattern).
- Restrict regions.
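To make the region restriction concrete, here is a small Python sketch that generates an SCP document of the commonly used shape: a Deny statement on all actions except exempted global services, conditioned on `aws:RequestedRegion`. The function name, the statement ID, and any region or exemption lists passed in are assumptions for illustration; the definitions in this repository may be structured differently.

```python
import json


def region_restriction_scp(allowed_regions: list[str], exempt_actions: list[str]) -> dict:
    """Build an SCP that denies requests outside the allowed regions.

    `exempt_actions` lists global services (for example "iam:*") that are
    not tied to a region and would otherwise be blocked.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideAllowedRegions",  # hypothetical statement ID
                "Effect": "Deny",
                "NotAction": sorted(exempt_actions),
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": sorted(allowed_regions)
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # Example values only; adapt to your organization's standards.
    policy = region_restriction_scp(
        allowed_regions=["eu-west-1", "us-east-1"],
        exempt_actions=["iam:*", "organizations:*", "sts:*", "support:*"],
    )
    print(json.dumps(policy, indent=2))
```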
Subscription policy assignments include:
- Allowed locations.
- Required Environment tag value.
- Deny public network access for storage.
- Allowed VM SKUs.
Folder-level organization policies include:
- Disable serial port access.
- Restrict Cloud SQL public IP.
- Enforce uniform bucket-level access.
Current baseline custom policy:
- Deny aws_instance type t2.micro in production plans.
Path:
- infra/global/policies/opa/deny_t2_micro_prod.rego
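For readers who want to reason about this rule without running OPA, the following Python sketch mirrors its intent against the JSON form of a plan. It is not the Rego file itself. It assumes the environment name is passed in explicitly and that the instance type appears under `resource_changes[].change.after.instance_type`, as in `terraform show -json` output; the function name and message text are hypothetical.

```python
def deny_t2_micro_prod(plan: dict, environment: str) -> list[str]:
    """Return one violation message per t2.micro aws_instance planned in prod."""
    if environment != "prod":
        return []
    violations = []
    for rc in plan.get("resource_changes", []):
        if rc.get("type") != "aws_instance":
            continue
        # `after` is null when the resource is being destroyed.
        after = rc.get("change", {}).get("after") or {}
        if after.get("instance_type") == "t2.micro":
            address = rc.get("address", "<unknown>")
            violations.append(f"{address}: t2.micro is not allowed in prod")
    return violations
```

An empty list means the plan passes. A CI step built on this idea would fail the job whenever the list is non-empty and print each message.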
- Terraform 1.6+
- Cloud credentials via OIDC or local authenticated sessions
- Optional local tools for parity with CI:
  - Python 3.11+
  - Checkov
  - OPA
  - Infracost
  - jq
AWS dev:
cd infra/envs/dev/aws
terraform init
terraform validate
terraform plan -lock-timeout=300s

Azure dev:
cd infra/envs/dev/azure
terraform init
terraform validate
terraform plan -lock-timeout=300s

GCP dev:
cd infra/envs/dev/gcp
terraform init
terraform validate
terraform plan -lock-timeout=300s

Principles:
- Keep non-sensitive settings in environment-specific variable files.
- Inject secrets through GitHub Actions secrets or a cloud secret manager.
- Do not commit live credentials.
Recommended GitHub secrets:
- DB_PASSWORD
- DB_USERNAME
- AWS_ROLE_ARN
- AZURE_CLIENT_ID
- AZURE_TENANT_ID
- AZURE_SUBSCRIPTION_ID
- GCP_WORKLOAD_IDENTITY_PROVIDER
- GCP_SERVICE_ACCOUNT
- GCP_PROJECT_ID
- SLACK_WEBHOOK_URL
- AWS SCPs, Azure Policy assignments, and GCP Organization Policies are preventive controls enforced at runtime.
- Checkov and OPA are preventive controls at PR and apply time.
- Drift detection is a detective control for out-of-band changes.
- Infracost adds financial risk visibility before merges.
- Concurrency groups and lock timeouts reduce the risk of state corruption.
- Enable workflows for dev only first.
- Validate policy behavior and false positives.
- Expand matrix to staging.
- Add manual approvals for prod environments in GitHub environments.
- Expand alerting to on-call channels.
- Plan fails on missing sensitive variables: Set required TF_VAR_* values or configure GitHub secrets.
- Checkov failures block PR: Fix resources or add explicitly reviewed exceptions.
- OPA fails in prod: Inspect plan.json and rego outputs to identify denied resources.
- Drift issue created unexpectedly: Review the plan diff and decide whether to reconcile to code or accept and import the change.
- Apply blocked by lock: Wait for the existing run to complete or investigate a stale lock in the backend.
| Capability | Status |
|---|---|
| Multi-cloud Terraform modules | Complete |
| Env isolation (dev/staging/prod) | Complete |
| PR plan + comments | Complete |
| Cloud governance (AWS/Azure/GCP) | Complete |
| Policy-as-code (Checkov + OPA) | Complete |
| Drift detection | Complete |
| Cost estimation in PR | Complete |
| Apply hardening + notifications | Complete |
This repository is designed as an enterprise platform foundation. Adapt policy IDs, region restrictions, and approval controls to your organizational standards before production onboarding.