A complete, production-ready e-commerce platform infrastructure deployed on AWS EKS, demonstrating enterprise-level DevOps practices, cloud architecture, and platform engineering skills.
Live Demo: Deployment Guide
Architecture: System Design
Monitoring: Observability Stack
- AWS EKS Cluster with multi-AZ deployment
- Terraform IaC for complete infrastructure automation
- Auto-scaling with HPA and Cluster Autoscaler
- Cost Optimization via spot instances and nightly cleanup (~93% savings)
- Prometheus for metrics collection (15-day retention)
- Grafana with 3 custom dashboards
- 15+ Alert Rules for proactive monitoring
- Custom Metrics for application performance
- GitHub Actions with 5 automated workflows
- Multi-environment support (dev/staging/production)
- Automated Rollback on deployment failures
- Security Scanning with Trivy on every PR
- Nightly Cleanup for cost optimization
- Load Testing with k6 (4 test scenarios)
- Performance Thresholds (P95 < 500ms, P99 < 1s)
- Capacity Planning (validated up to 500 concurrent users)
- Stability Testing (70-minute soak tests)
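The performance thresholds above (P95 < 500ms, P99 < 1s) can be illustrated with a small check over raw latency samples. This is a minimal sketch, not the project's k6 configuration: the function names and the sample data are hypothetical, and it uses a simple nearest-rank percentile, which may differ slightly from how k6 computes percentiles.

```python
def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at
    least pct% of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids float rounding surprises.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def check_thresholds(latencies_ms, p95_limit_ms=500, p99_limit_ms=1000):
    """Compare P95 and P99 latency against the README's stated limits."""
    p95 = percentile(latencies_ms, 95)
    p99 = percentile(latencies_ms, 99)
    passed = p95 < p95_limit_ms and p99 < p99_limit_ms
    return {"p95_ms": p95, "p99_ms": p99, "passed": passed}


# Hypothetical sample of 100 request latencies in milliseconds.
latencies = [120] * 90 + [300] * 6 + [450] * 3 + [900]
result = check_thresholds(latencies)
print(result)
```

A single slow request (900ms here) does not fail the run, because both limits apply to percentiles rather than to the maximum.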
┌─────────────────────────────────────────────────────────┐
│                        AWS Cloud                        │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  ┌──────────────────────────────────────────────────┐   │
│  │  VPC (Multi-AZ)                                  │   │
│  │  ├── Public Subnets (3 AZs)                      │   │
│  │  ├── Private Subnets (3 AZs)                     │   │
│  │  └── NAT Gateways (HA)                           │   │
│  └──────────────────────────────────────────────────┘   │
│                                                         │
│  ┌──────────────────────────────────────────────────┐   │
│  │  EKS Cluster                                     │   │
│  │  ├── Control Plane (Managed)                     │   │
│  │  ├── Worker Nodes (On-demand + Spot)             │   │
│  │  ├── Cluster Autoscaler                          │   │
│  │  └── AWS Load Balancer Controller                │   │
│  └──────────────────────────────────────────────────┘   │
│                                                         │
│  ┌──────────────────────────────────────────────────┐   │
│  │  Application Layer                               │   │
│  │  ├── Frontend (React + Nginx)                    │   │
│  │  ├── Backend (Node.js + Express)                 │   │
│  │  ├── Database (PostgreSQL)                       │   │
│  │  └── Ingress (ALB)                               │   │
│  └──────────────────────────────────────────────────┘   │
│                                                         │
│  ┌──────────────────────────────────────────────────┐   │
│  │  Monitoring Stack                                │   │
│  │  ├── Prometheus (Metrics)                        │   │
│  │  ├── Grafana (Dashboards)                        │   │
│  │  └── AlertManager (Notifications)                │   │
│  └──────────────────────────────────────────────────┘   │
│                                                         │
└─────────────────────────────────────────────────────────┘
- AWS Account with credentials configured
- Terraform >= 1.6.0
- kubectl >= 1.28
- Docker
- Helm >= 3.0
# 1. Configure AWS credentials
aws configure
# 2. Deploy infrastructure
cd infrastructure/terraform/aws
cp terraform.tfvars.example terraform.tfvars
# Edit terraform.tfvars with your settings
terraform init
terraform apply
# 3. Configure kubectl
aws eks update-kubeconfig --name demo-eks-cluster --region us-east-1
# 4. Deploy application (return to the repository root first)
cd ../../..
kubectl apply -k infrastructure/kubernetes/base/
# 5. Install monitoring
cd monitoring
./install-monitoring.sh
# 6. Get application URL
kubectl get ingress
make deploy
- Max Concurrent Users: 500
- P95 Latency: < 500ms
- P99 Latency: < 1000ms
- Error Rate: < 5%
- Requests/Second: 200+
- CPU Usage: 40-60% under normal load
- Memory Usage: 50-70% under normal load
- Auto-scaling: 2-20 pods based on demand
- Database Connections: < 80% pool utilization
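To make the "2-20 pods based on demand" figure concrete, here is a minimal sketch of the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler: `desired = ceil(current * currentMetric / targetMetric)`, skipped when the ratio is within a tolerance (0.1 by default), then clamped to the configured bounds. The 70% target utilization is an assumption for illustration; the source does not state the actual HPA target.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas=2, max_replicas=20, tolerance=0.1):
    """Approximate the HPA scaling decision for one utilization metric."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the 2-20 pod range stated in the README.
    return max(min_replicas, min(max_replicas, desired))


# Assumed target of 70% CPU utilization (hypothetical value).
print(desired_replicas(4, 90, 70))    # scale up
print(desired_replicas(10, 200, 70))  # capped at max_replicas
print(desired_replicas(4, 20, 70))    # floored at min_replicas
print(desired_replicas(5, 72, 70))    # within tolerance, unchanged
```

The real controller also accounts for pod readiness, missing metrics, and stabilization windows, none of which are modeled here.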
- Full Deployment: ~$150/month
- With Optimization: ~$10/month
- Savings: 93% via spot instances + nightly cleanup
- Cloud: AWS (EKS, VPC, RDS, ALB)
- IaC: Terraform
- Orchestration: Kubernetes
- Package Manager: Helm
- Frontend: React, Vite, Nginx
- Backend: Node.js, Express
- Database: PostgreSQL
- Containerization: Docker
- Metrics: Prometheus
- Visualization: Grafana
- Alerting: AlertManager
- Logging: CloudWatch
- Pipeline: GitHub Actions
- Registry: Docker Hub
- Testing: k6, Jest
- Security: Trivy
.
├── infrastructure/
│   ├── terraform/aws/       # AWS infrastructure
│   └── kubernetes/          # K8s manifests
├── app/
│   ├── frontend/            # React application
│   ├── backend/             # Node.js API
│   └── docker/              # Docker Compose
├── monitoring/
│   ├── prometheus/          # Prometheus config
│   └── grafana/             # Grafana dashboards
├── ci-cd/
│   └── .github/workflows/   # GitHub Actions
├── load-tests/
│   └── scenarios/           # k6 test scripts
├── docs/                    # Documentation
└── screenshots/             # Portfolio screenshots
- Multi-AZ deployment across 3 availability zones
- Auto-scaling based on CPU and memory metrics
- Health checks and automatic pod recovery
- Load balancing with AWS ALB
- Private subnets for application workloads
- Security groups with least privilege
- Secrets management via Kubernetes secrets
- IMDSv2 for EC2 metadata
- Vulnerability scanning in CI/CD
- Real-time metrics with Prometheus
- Custom Grafana dashboards
- Proactive alerting for critical issues
- Application performance monitoring
- Resource utilization tracking
- Spot instances for 70% cost savings
- Nightly infrastructure cleanup
- Auto-scaling to match demand
- Resource limits and requests
- Optional RDS vs in-cluster PostgreSQL
- Cluster CPU and memory usage
- Pod count by namespace
- Node status and health
- Network I/O metrics
- Request rate by endpoint
- Latency percentiles (P50, P95, P99)
- Error rates and status codes
- Active connections
- Connection pool utilization
- Query performance
- Cache hit ratio
- Transaction rates
- Build and Deploy - Triggered on push to main
- PR Checks - Linting, validation, security scanning
- Deploy Infrastructure - Manual Terraform deployment
- Nightly Cleanup - Automated cost optimization
- Destroy - Safe infrastructure teardown
Code Push → Build → Test → Security Scan → Deploy → Health Check → Rollback (if needed)
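The control flow of that pipeline can be sketched as a sequence of gated stages followed by a post-deploy health check. This is an illustrative model only, not the project's GitHub Actions workflow: `run_pipeline`, the stage callables, and `fake_rollback` are hypothetical stand-ins. In a real workflow the rollback step might be something like `kubectl rollout undo`, but the source does not say how rollback is implemented.

```python
def run_pipeline(stages, health_check, rollback):
    """Run stages in order and stop at the first failure.
    After all stages succeed, verify health and roll back on failure."""
    log = []
    for name, step in stages:
        if not step():
            log.append(f"{name}: failed")
            return log  # later stages, including deploy, never run
        log.append(f"{name}: ok")
    if health_check():
        log.append("health check: ok")
    else:
        log.append("health check: failed")
        rollback()
        log.append("rollback: done")
    return log


# Hypothetical stand-ins for real build, test, scan, and deploy steps.
state = {"rolled_back": False}


def fake_rollback():
    state["rolled_back"] = True


stages = [
    ("build", lambda: True),
    ("test", lambda: True),
    ("security scan", lambda: True),
    ("deploy", lambda: True),
]
log = run_pipeline(stages, health_check=lambda: False, rollback=fake_rollback)
print("\n".join(log))
```

Because a failed stage returns early, a failing test or security scan prevents the deploy step from running at all, and rollback is only reached when a deployment has actually happened.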
Full Deployment:
- EKS Control Plane: $73
- EC2 Instances: $60
- RDS (optional): $30
- Load Balancer: $20
- Monitoring: $20
- Total: ~$150/month
Optimized:
- Spot Instances: -70% on EC2
- In-cluster PostgreSQL: -$30
- Nightly Cleanup: -90% uptime
- Total: ~$10/month
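The arithmetic behind these estimates can be reproduced with a short sketch. It rests on assumptions the source does not confirm: that "-90% uptime" means resources run about 10% of the time, and that every line item scales linearly with uptime. Under those assumptions the listed items sum to $203 with RDS ($173 without), and the optimized figure comes out near $13, in the same range as the ~$10 quoted above.

```python
# Monthly line items in USD, copied from the breakdown above.
FULL_MONTHLY_USD = {
    "eks_control_plane": 73,
    "ec2_instances": 60,
    "rds": 30,
    "load_balancer": 20,
    "monitoring": 20,
}


def optimized_cost(items, spot_discount=0.70, uptime_fraction=0.10, use_rds=False):
    """Apply the three optimizations: spot pricing on EC2, in-cluster
    PostgreSQL instead of RDS, and reduced uptime from nightly cleanup."""
    costs = dict(items)
    costs["ec2_instances"] *= 1 - spot_discount
    if not use_rds:
        costs["rds"] = 0
    # Assumption: all remaining costs scale linearly with uptime.
    return sum(costs.values()) * uptime_fraction


full = sum(FULL_MONTHLY_USD.values())
optimized = optimized_cost(FULL_MONTHLY_USD)
print(f"full: ${full}/month, optimized: ${optimized:.2f}/month")
```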
- AWS services (EKS, VPC, RDS, ALB, IAM)
- Infrastructure as Code (Terraform)
- Kubernetes orchestration
- Multi-AZ high availability
- Cost optimization strategies
- CI/CD pipeline design
- Automated deployment
- Monitoring and alerting
- Incident response
- Capacity planning
- Container orchestration
- Service mesh concepts
- Auto-scaling strategies
- Resource management
- Performance optimization
- Network security (VPC, security groups)
- Secrets management
- Vulnerability scanning
- Least privilege access
- Security best practices
- Makefile Commands: make help
- Terraform Docs: infrastructure/terraform/aws/README.md
- Kubernetes Docs: infrastructure/kubernetes/README.md
- Monitoring Docs: monitoring/README.md
This is a portfolio project for demonstration purposes.