🚀 I build production-grade cloud platforms that scale
⚙️ I automate everything that shouldn’t be manual
🔥 I run systems after go-live, not just draw diagrams
- Design and operate high-availability Kubernetes platforms
- Run GPU-backed ML & LLM inference workloads
- Build CI/CD pipelines that ship fast without breaking prod
- Optimize cloud cost, performance, and reliability
- Help engineers unblock themselves and ship with confidence
I care about uptime, p99s, and root cause analysis — not buzzwords.
Cloud: AWS • GCP • Hybrid / On-Prem
Containers: Kubernetes • Docker • Helm
MLOps: vLLM • NVIDIA Triton • CUDA • MIG • MLflow
IaC: Terraform • Ansible
CI/CD: Argo CD • Jenkins • GitHub Actions
Observability: Prometheus • Grafana • Loki • VictoriaMetrics
Security: Vault • Kyverno • Trivy • Falco
Languages: Python • Bash • Go (working knowledge)
- ☸️🐋 Migrated 60+ monoliths → microservices, scaling to 10k RPS with p99 < 500ms
- 🤖 Deployed LLM inference platforms with multi-GPU scheduling using NVIDIA MIG
- 🔄 Built CI/CD systems delivering zero-downtime releases
- 📊 Ran data pipelines handling billions of clickstream events per day
- 💸 Cut cloud costs by 25%+ without sacrificing performance
- 🚨 Owned production incidents, led RCAs, and fixed problems permanently
- Production-first mindset
- Strong bias toward automation
- Calm during incidents
- Direct, honest communication
- Big believer in “teach, don’t gatekeep”
If developers are stuck, I jump in.
- 💼 LinkedIn: https://www.linkedin.com/in/sam-sundar-6a4746130/
- 📧 Email: samsundar997@yahoo.com
- 🌍 Location: San Diego, CA
💡 “Make it reliable first. Make it fast second. Make it pretty later.”




