Short summary: AURA is a research platform to build a Multi-Agent RL autoscaler that coordinates horizontal scaling (replicas + optional soft resource limits) of a 3-tier microservice (frontend → backend → database). We train in simulation, validate locally on k3d, then run final experiments on a real Kubernetes cluster to compare MARL vs HPA/VPA baselines.
AURA/
├── docker-compose.yml
├── infra/
│   ├── k3d-setup.md
│   ├── manifests/
│   │   └── three-tier/
│   └── helm/
├── microservices/
│   ├── three-tier/
│   └── locust/
├── metrics/
│   ├── prometheus/
│   └── grafana/
├── simulator/
├── marl/
├── deployment/
├── experiments/
├── docs/
└── tools/
    ├── setup_project.sh
    ├── setup_k3d.sh
    ├── deploy_stack.sh
    └── run_training.sh
These commands assume the repository is already cloned and you are in the repository root. Do not use absolute paths.
# from wherever you keep your code locally
git clone https://github.com/Zane-Dev14/AURA.git
cd AURA
# ensure tools are executable and run
chmod +x tools/setup_k3d.sh
./tools/setup_k3d.sh
# deploy manifests (3-tier app + monitoring)
chmod +x tools/deploy_stack.sh
./tools/deploy_stack.sh
1️⃣ Prerequisites
Install the following on your machine:
Docker Desktop (with WSL 2 backend if on Windows)
kubectl
Helm
k3d
Check versions:
docker --version
kubectl version --client
helm version
k3d version
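The prerequisite check can also be scripted. The following is a minimal sketch, assuming only that the four tools listed above should be discoverable on `PATH`; the function name `missing_tools` is hypothetical and not part of the repository.

```python
import shutil

# Tools named in the prerequisites list above.
REQUIRED_TOOLS = ["docker", "kubectl", "helm", "k3d"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH (hypothetical helper)."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All prerequisites found on PATH.")
```

This only confirms that each binary exists; it does not check versions or whether the Docker daemon is running.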
2️⃣ Create k3d Cluster
cd infra
k3d cluster create aura --config k3d-cluster.yaml
Exposes NodePorts for Prometheus (30090) and Grafana (32322).
Cluster has 1 server + 2 agents.
Set kubeconfig for your shell session:
export KUBECONFIG=$(pwd)/kubeconfig-aura.yaml
kubectl get nodes
3️⃣ Install Prometheus + Grafana
Add Helm repo:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
Install stack:
kubectl create namespace monitoring || true
helm install kube-prom prometheus-community/kube-prometheus-stack \
  -n monitoring -f metrics/prometheus/prometheus-values.yaml
Verify pods:
kubectl get pods -n monitoring
kubectl get svc -n monitoring
Access UIs in browser:
Prometheus → http://localhost:30090
Grafana → http://localhost:32322 (user: admin, pass: admin)
4️⃣ Deploy Microservices
Apply manifests:
kubectl apply -f infra/manifests/three-tier/
Check pods/services:
kubectl get pods
kubectl get svc
Port-forward microservices for local testing:
kubectl port-forward svc/api 5001:5001
# port-forward runs in the foreground; run curl from a second terminal
curl http://localhost:5001/metrics # Should return Prometheus metrics
5️⃣ Verify Metrics Pipeline
Check Prometheus scrape targets:
kubectl port-forward svc/kube-prom-kube-prometheus-prometheus 9090:9090 -n monitoring
Open browser: http://localhost:9090/targets
You should see api, app, node-exporters, and kube-state-metrics as UP.
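The same check can be done without a browser through the Prometheus HTTP API. The sketch below assumes the port-forward from this step is still running on `localhost:9090` and reads the `activeTargets` list returned by `/api/v1/targets`. The helper names `unhealthy_targets` and `fetch_targets` are hypothetical.

```python
import json
import urllib.request

# Assumes the port-forward from this step is running on localhost:9090.
TARGETS_URL = "http://localhost:9090/api/v1/targets"


def unhealthy_targets(payload):
    """Return (job, health) pairs for active targets whose health is not 'up'."""
    targets = payload.get("data", {}).get("activeTargets", [])
    return [
        (t.get("labels", {}).get("job", "unknown"), t.get("health", "unknown"))
        for t in targets
        if t.get("health") != "up"
    ]


def fetch_targets(url=TARGETS_URL, timeout=5):
    """Fetch and decode the targets payload from Prometheus."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `unhealthy_targets(fetch_targets())` should return an empty list once every scrape target is UP.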
6️⃣ Optional — Grafana Dashboards
Import dashboards via JSON in metrics/grafana/dashboards/.
Or create dashboards manually in Grafana.
7️⃣ Test Microservice + Metrics End-to-End
kubectl port-forward svc/kube-prom-kube-prometheus-prometheus 9090:9090 -n monitoring
# port-forward runs in the foreground; run curl from a second terminal
curl 'http://localhost:9090/api/v1/query?query=api_requests_total'
kubectl port-forward svc/api 5001:5001
# again from a separate terminal
curl http://localhost:5001/
curl http://localhost:5001/metrics
Prometheus should now record the requests.
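To confirm this programmatically, the instant query used above can be issued from Python. This is a minimal sketch, assuming Prometheus is port-forwarded to `localhost:9090` and that `api_requests_total` is exported as shown in this step. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

PROMETHEUS_URL = "http://localhost:9090"  # assumes the port-forward above


def build_query_url(expr, base_url=PROMETHEUS_URL):
    """Build an instant-query URL with the expression safely URL-encoded."""
    return base_url + "/api/v1/query?" + urllib.parse.urlencode({"query": expr})


def sum_instant_vector(payload):
    """Sum the sample values of an instant-vector query result."""
    results = payload.get("data", {}).get("result", [])
    # Each sample's "value" is [timestamp, "<value as string>"].
    return sum(float(sample["value"][1]) for sample in results)


def query_total(expr="api_requests_total", base_url=PROMETHEUS_URL, timeout=5):
    """Run the query against Prometheus and return the summed value."""
    url = build_query_url(expr, base_url)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return sum_instant_vector(json.load(resp))
```

Calling `query_total()` before and after the `curl` requests should show the total increasing, though only after the next scrape has completed.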
Grafana dashboards will update accordingly.
Check status:
kubectl get nodes
kubectl get pods --all-namespaces
kubectl get svc --all-namespaces
We use a simple, safe workflow:
- main: production/stable; protected
- dev: integration branch; everything merges here first
- feature/<short-name>: personal feature branches
# ensure you have latest dev
git fetch origin
git checkout dev
git pull origin dev
# create your feature branch from dev
git checkout -b feature/your-feature-name
# do work, then:
git add .
git commit -m "feat: short description of work"
git push -u origin feature/your-feature-name
- Open a Pull Request on GitHub: base = dev, compare = feature/your-feature-name
- Request reviewers and link related issue(s).
# fetch upstream changes
git fetch origin
# rebase your feature branch on latest dev
git checkout feature/your-feature-name
git rebase origin/dev
# if rebase succeeded:
git push --force-with-lease origin feature/your-feature-name
If you prefer merge instead of rebase:
git checkout feature/your-feature-name
git merge origin/dev
# resolve conflicts if any, commit, then push
git push origin feature/your-feature-name
- After PR review, tests, and approvals, merge the PR (prefer Squash and merge to keep history clean).
- CI must pass before merging (we recommend enabling CI checks).
If a rebase reveals conflicts:
# you're mid-rebase and git stops with conflict
# open files, fix conflicts, then:
git add <fixed-files>
git rebase --continue
# if you want to abort:
git rebase --abort
If you merged dev into your branch and see conflicts:
- Resolve them in your editor, git add the resolved files, git commit, then git push origin feature/....
After rebasing, pushing requires force (safe):
git push --force-with-lease origin feature/your-feature-name
--force-with-lease is safer than --force: it refuses to overwrite the remote branch if someone else has pushed to it since your last fetch.
- Open GitHub Desktop and choose File → Clone repository → AURA.
- From the Current Branch dropdown, select New Branch.
  - Base branch: dev (pull dev from origin first if not present).
  - Name: feature/your-feature-name.
- Make edits in your code editor.
- In GitHub Desktop: write a commit message → Commit to feature/your-feature-name.
- Click Push origin (top bar).
- Click Branch → Create pull request (or open the PR on GitHub.com).
- To update your branch with dev:
  - Fetch origin (Fetch origin button).
  - Switch to the dev branch, click Fetch origin, then Pull.
  - Switch to your feature branch → Branch → Merge into current branch → choose dev to merge in dev changes. Resolve conflicts in your editor and commit.
- On GitHub.com: go to your repository → Settings → Manage access (or Collaborators & teams).
- Click Invite teams or people → add teammates' GitHub usernames or emails.
- Choose role:
  - Write (can push branches and create PRs)
  - Maintain / Admin for more permissions (be careful)

- Contributors fork the repo, push to their fork, and open PRs to dev.
- For forks, we recommend adding the original repo as upstream:
git remote add upstream https://github.com/<ORG_OR_USER>/AURA.git
git fetch upstream
git checkout -b feature/xyz upstream/dev
(Repository → Settings → Branches → Branch protection rules)
Recommended rules for main and dev:
- Require PR reviews before merge (1 or 2 reviewers)
- Require status checks / CI to pass
- Require branches to be up to date with the base branch before merging
- Restrict who can push to main (disallow direct pushes)
Add a CODEOWNERS file in .github/ if specific teams must review specific paths.
- Code builds locally
- Lint and unit tests passed
- If changed service images, updated Dockerfile and tested image build
- If K8s manifests changed, validated locally with kubectl apply --dry-run=server -f
- Added/updated documentation (README or infra docs)
- Assigned reviewer(s)
- DevOps / Infra (Parik)
  - k3d scripts, cluster manifests, helm values, RBAC for agent
  - tools/setup_k3d.sh, tools/deploy_stack.sh
- Microservices Dev (Vaish)
  - Integrate 3-tier app, containerize, expose metrics endpoints
  - Locust load scripts
- Metrics / Observability (brainrot)
  - Prometheus scrape config, Grafana dashboards, PromQL evaluation queries
- Simulator / Env (Parik)
  - Simulator API that mimics K8s pod delays and scrape lag
- ML / MARL (Muscle Man)
  - PettingZoo env, QMIX trainer, checkpoints in marl/policies/
- Agent Controller (Parik + ML)
  - deployment/agent-controller.py, safe action clipping, cooldown, RBAC
- Experiments / Evaluation (All)
  - Run baselines (HPA/VPA), collect pod-hours, P95 latency; produce plots & CSVs
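The two evaluation metrics named in the last item can be computed from raw samples as follows. This is a sketch under stated assumptions, not the project's evaluation code: replica counts are treated as constant between consecutive samples, and P95 uses the nearest-rank definition. Both function names are hypothetical.

```python
def pod_hours(samples):
    """Integrate replica counts over time.

    samples: list of (timestamp_seconds, replica_count), sorted by time.
    Each count is assumed to hold until the next sample (step function).
    """
    total_pod_seconds = 0.0
    for (t0, replicas), (t1, _) in zip(samples, samples[1:]):
        total_pod_seconds += replicas * (t1 - t0)
    return total_pod_seconds / 3600.0


def percentile_nearest_rank(values, pct=95):
    """Nearest-rank percentile; pct must be an integer in 1..100."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, avoiding float rounding issues.
    rank = -(-pct * len(ordered) // 100)
    return ordered[max(rank, 1) - 1]
```

For example, two replicas for 30 minutes followed by four replicas for 30 minutes gives `pod_hours([(0, 2), (1800, 4), (3600, 4)])` equal to 3.0 pod-hours. Other percentile definitions (for instance, linear interpolation) give slightly different values on small samples, so the same definition should be used for MARL and the baselines.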
Each feature task should be implemented on its own feature/* branch. Assign reviewers from the role most relevant to the change.
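The "safe action clipping, cooldown" responsibility listed for the Agent Controller could look roughly like the sketch below. It is illustrative only and does not describe the contents of deployment/agent-controller.py. The class name `SafeScaler` and all default values (replica bounds, maximum step, cooldown length) are assumptions.

```python
import time


class SafeScaler:
    """Clamp requested replica counts and enforce a cooldown between changes."""

    def __init__(self, min_replicas=1, max_replicas=10, max_step=2,
                 cooldown_s=60.0, clock=time.monotonic):
        # All defaults are hypothetical placeholders, not project settings.
        self.min_replicas = min_replicas
        self.max_replicas = max_replicas
        self.max_step = max_step
        self.cooldown_s = cooldown_s
        self.clock = clock
        self._last_change = None

    def decide(self, current, requested):
        """Return the replica count to apply given the agent's request."""
        now = self.clock()
        cooling = (self._last_change is not None
                   and now - self._last_change < self.cooldown_s)
        if cooling:
            return current  # hold the current count until the cooldown ends
        # Limit how far one action can move, then clamp to the allowed range.
        step = max(-self.max_step, min(self.max_step, requested - current))
        target = max(self.min_replicas, min(self.max_replicas, current + step))
        if target != current:
            self._last_change = now
        return target
```

Injecting `clock` keeps the cooldown logic testable without sleeping. In this sketch one scaler instance would be kept per tier so that cooldowns are tracked independently.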
- Use docker + k3d locally (scripts provided in tools/)
- Store model checkpoints & experiment outputs in experiments/results/ (persisted volume if running containers)
- Add requirements.txt for Python components, and pin versions in Dockerfiles
- Pick one of the sample 3-tier apps linked above and add it under microservices/three-tier (submodule recommended).
- Push the dev branch to the remote and protect main & dev in repo settings.
- Add teammates as collaborators (or instruct them to fork).
- Each teammate creates feature/* branches from dev and follows the workflow above.
- Optionally: add a simple GitHub Action that lints Python and builds Docker images on PRs.
If anything fails during setup, include:
- git status output
- kubectl get pods --all-namespaces
- ./tools/setup_k3d.sh logs (paste terminal output)
Open an issue in the repo with that information and tag @Parik (Infra) or @Vaish (Microservices).
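Gathering the requested command output can be automated. The following is a minimal sketch, assuming `git` and `kubectl` are on `PATH`. The helper names `run_command` and `build_report` are hypothetical, and the setup script logs still need to be pasted by hand because the script is not re-run here.

```python
import subprocess

# Commands the troubleshooting note above asks for.
DIAGNOSTIC_COMMANDS = [
    ["git", "status"],
    ["kubectl", "get", "pods", "--all-namespaces"],
]


def run_command(cmd):
    """Run cmd and return combined stdout/stderr; never raises on failure."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
        return proc.stdout + proc.stderr
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"<failed to run: {exc}>"


def build_report(commands=DIAGNOSTIC_COMMANDS, runner=run_command):
    """Format each command and its output as a section of a plain-text report."""
    sections = []
    for cmd in commands:
        sections.append("$ " + " ".join(cmd) + "\n" + runner(cmd).rstrip())
    return "\n\n".join(sections) + "\n"
```

Running `print(build_report())` from the repository root produces text that can be pasted directly into the issue.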