Every bug starts as an assumption that went unchecked.
Assumption Miner finds them before your users do.
Most production incidents don't come from code that's obviously wrong. They come from code that looks right but relies on assumptions nobody wrote down.
Things like: "this API always returns in under 500ms," "the database connection will be there when we need it," "file uploads will never exceed 10MB," or "payments always succeed on the first try."
Linters won't catch these. Test suites won't flag them. Code reviews might, if the reviewer happens to think about it. But usually, nobody does. These assumptions sit quietly in the codebase, invisible, until something changes in production and they all break at once.
The worst part? Once something breaks, nobody can point to where the assumption was made. It's not documented. It's not tracked. It's just a belief that was baked into the code months ago by someone who's probably on a different team now.
This is assumption debt, and every codebase carries it. It grows silently sprint after sprint, and it's the number one source of "but it worked on my machine" incidents.
Assumption Miner is a GitLab Duo Agent that detects implicit assumptions in your code, tracks them over time, and helps you fix them before they cause outages.
It hooks into your GitLab workflow and runs automatically when you create a merge request. Under the hood, it combines AST-level code analysis with AI-powered reasoning (via Groq or OpenRouter) to identify patterns that static analysis tools miss entirely: unhandled failure modes, hardcoded thresholds, missing input validation, and undocumented environmental dependencies.
Every assumption gets a DNA fingerprint, so it can be tracked across renames, refactors, and branch merges. The agent doesn't just find problems; it predicts which assumptions will break next, maps them to security frameworks (CWE, OWASP, SOC2), and auto-generates fix MRs so your team can resolve issues in one click.
The result: fewer surprises in production, clearer risk visibility for your team, and a codebase that documents its own blind spots.
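To make the DNA-fingerprinting idea concrete, here is a hedged Python sketch. The real analyzer (in backend/python/analyzer/ and the Rust engine) has richer normalization rules; the heuristic and field names below are illustrative assumptions only:

```python
import hashlib
import re

def assumption_dna(kind: str, snippet: str, file_role: str) -> str:
    """Illustrative sketch: hash what the assumption *is* (its kind and
    normalized code shape), not where it lives, so the ID survives renames,
    moves, and refactors. The normalization below is a guess, not the
    project's actual algorithm."""
    # Collapse whitespace and case so formatting churn doesn't change the ID
    normalized = re.sub(r"\s+", "", snippet).lower()
    # Replace numeric literals: a tweaked threshold is still the same assumption
    normalized = re.sub(r"\d+", "N", normalized)
    return hashlib.sha256(f"{kind}|{file_role}|{normalized}".encode()).hexdigest()[:16]
```

Under this sketch, `timeout=500` and `timeout = 750` in the same context yield the same fingerprint, so the tracked assumption persists even when the literal value changes.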
```mermaid
graph TB
    subgraph GitLab["GitLab Platform"]
        MR["Merge Request Created"]
        DUO["GitLab Duo Chat"]
        CICD["CI/CD Pipeline"]
    end
    subgraph Agent["Assumption Miner Agent"]
        direction TB
        SCAN["AST Parser + Custom Rules"]
        AI["AI Classification<br/>Groq / OpenRouter"]
        DNA["DNA Fingerprinting"]
        SEC["Security Mapper<br/>CWE · OWASP · SOC2"]
    end
    subgraph Scoring["Scoring Engine"]
        RUST["Rust + WASM<br/>Monte Carlo Simulation"]
        ML["Trend Prediction<br/>Linear Regression + WMA"]
    end
    subgraph Cloud["Cloud Infrastructure"]
        GCR["Google Cloud Run<br/>FastAPI Backend"]
        VERCEL["Vercel<br/>React Frontend"]
    end
    subgraph Output["Output Layer"]
        COMMENT["MR Comment with Findings"]
        AUTOFIX["Auto-Fix MR"]
        GATE["Quality Gate Verdict"]
        DASH["React Dashboard<br/>3D Graph · Timeline · Health"]
    end
    MR -->|trigger| SCAN
    DUO -->|on-demand| SCAN
    SCAN --> AI
    AI --> DNA
    DNA --> SEC
    SEC --> RUST
    RUST --> ML
    ML --> COMMENT
    ML --> AUTOFIX
    ML --> GATE
    ML --> DASH
    CICD -->|quality gate| GATE
    GCR -->|REST API| VERCEL
    VERCEL -->|serves| DASH
```
Most code quality tools look at what your code does. Assumption Miner looks at what your code believes.
| What makes it unique | Why it matters |
|---|---|
| Finds what linters can't | Detects implicit assumptions (not just syntax or style issues) using AI reasoning on top of AST parsing |
| DNA fingerprinting | Each assumption gets a stable identity that survives renames, moves, and refactors, so nothing gets lost |
| Predicts before it breaks | Forecasts your health score 2 to 4 sprints ahead, so you can prioritize before things go wrong |
| Security-aware by default | Auto-maps every assumption to CWE, OWASP Top 10, and compliance frameworks (SOC2, PCI-DSS, GDPR) |
| One-click auto-fix | Generates fix patches, creates a branch, and opens a merge request. No context-switching required |
| Lives in your pipeline | Plugs directly into GitLab CI/CD as a quality gate. If the health score drops, the merge gets blocked |
| Gets smarter over time | Learns from developer feedback (slash commands, emoji reactions) to calibrate confidence and reduce noise |
| Sub-millisecond scoring | Health scoring runs in Rust compiled to WebAssembly with Monte Carlo simulation, directly in the browser |
| Production-ready deployment | Backend on Google Cloud Run (auto-scaling, zero cold start), frontend on Vercel (global CDN, instant deploys) |
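The "sub-millisecond scoring" row refers to the Rust/WASM engine; the idea behind Monte Carlo health scoring can be sketched in Python. The severity weights, break probabilities, and grade cutoffs below are illustrative assumptions, not the engine's actual parameters:

```python
import random

def health_score(assumptions, runs=10_000, seed=42):
    """Hedged sketch: each assumption carries a severity weight and an
    estimated probability of breaking. Simulate many hypothetical sprints
    and average the surviving score (0-100)."""
    rng = random.Random(seed)
    total_weight = sum(a["severity"] for a in assumptions) or 1.0
    scores = []
    for _ in range(runs):
        # Weight of assumptions that "broke" in this simulated sprint
        broken = sum(a["severity"] for a in assumptions
                     if rng.random() < a["break_prob"])
        scores.append(100.0 * (1 - broken / total_weight))
    return sum(scores) / runs

def grade(score):
    # Cutoffs are a guess at how an A+ .. F scale might be bucketed
    for cutoff, letter in [(90, "A+"), (80, "A"), (70, "B"), (60, "C"), (50, "D")]:
        if score >= cutoff:
            return letter
    return "F"
```

The real engine does this in Rust compiled to WebAssembly, which is how it stays sub-millisecond even for large assumption sets.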
| Icon | Feature | What It Does |
|---|---|---|
| 🔍 | AST + AI Analysis | Parses code structure and uses LLMs to classify assumptions that static analysis misses |
| 🧬 | DNA Fingerprinting | Assigns a stable identity to each assumption so it survives renames, moves, and refactors |
| 🕸️ | Dependency Graph | Interactive 3D holographic visualization showing how assumptions cluster and connect |
| ⏳ | Timeline & Time Travel | Tracks how assumptions evolve sprint over sprint with time-travel navigation |
| 🏥 | Health Scoring | A+ through F grade powered by a Rust WASM engine with Monte Carlo simulation |
| Icon | Feature | What It Does |
|---|---|---|
| 🛠️ | Auto-Fix MR Creation | Generates fix patches from templates (error handling, null checks, timeout, input validation, type safety, hardcoded values), creates a branch, and opens a merge request |
| 🚦 | CI/CD Quality Gate | Plugs into your pipeline and blocks merges when the health score dips below threshold |
| 📈 | Trend Prediction | Forecasts your health score 2 to 4 sprints ahead using regression and weighted moving average on historical snapshots |
| 🎯 | Breaking Change Predictor | Flags which assumptions are most likely to cause failures during upcoming refactors |
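The Trend Prediction feature combines regression with a weighted moving average; a small sketch of how that blend might work (the linear weights and 50/50 blend are assumptions — the actual model lives in backend/python/ml/):

```python
def forecast_health(history, sprints_ahead=3):
    """Blend a weighted moving average (recent sprints count more) with a
    least-squares linear trend extrapolated N sprints ahead.
    Assumes at least two sprints of history."""
    n = len(history)
    weights = list(range(1, n + 1))  # linearly increasing recency weights
    wma = sum(w * s for w, s in zip(weights, history)) / sum(weights)
    # Ordinary least-squares slope over sprint index
    xs = list(range(n))
    x_mean = sum(xs) / n
    y_mean = sum(history) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, history)) \
        / sum((x - x_mean) ** 2 for x in xs)
    trend = y_mean + slope * ((n - 1 + sprints_ahead) - x_mean)
    # Clamp to a valid health score and blend the two estimates 50/50
    return max(0.0, min(100.0, 0.5 * wma + 0.5 * trend))
```

A steadily declining history (say 80, 78, 76, 74) produces a forecast below the latest score, which is what lets the agent flag degradation before the gate actually fails.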
| Icon | Feature | What It Does |
|---|---|---|
| 🛡️ | Security Impact Mapping | Links every assumption to CWE entries, OWASP Top 10 categories, and compliance controls (SOC2, PCI-DSS, GDPR) |
| 💬 | Feedback Learning Loop | Developers respond with feedback slash commands or emoji reactions; the agent adjusts its confidence weights accordingly |
| ⚙️ | Custom Rules Engine | Define project-specific detection rules in YAML with regex, AST, literal, function call, and context matching |
| 🌐 | Cross-Repo Drift Detection | Spots assumption divergence across microservices sharing the same interfaces |
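To illustrate the Custom Rules Engine, a YAML rule might look something like the following. This schema is a guess for illustration only — consult the guides in docs/ for the real format:

```yaml
# Hypothetical rule shape, not the project's actual schema
id: no-hardcoded-timeout
description: Flag hardcoded timeout values that assume stable network latency
severity: medium
match:
  type: function_call        # one of: regex, ast, literal, function_call, context
  pattern: "requests\\.(get|post)\\("
  context: "timeout="
message: >
  Hardcoded timeout found. This assumes the upstream service always responds
  within a fixed window; consider reading the value from configuration.
```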
```shell
git clone https://gitlab.com/gitlab-ai-hackathon/participants/34658878.git assumption-miner
cd assumption-miner
cp .env.example .env
# Edit .env and fill in your keys
```

| Variable | Provider | Notes |
|---|---|---|
| GROQ_API_KEY | Groq | Free tier available, fastest inference |
| OPENROUTER_API_KEY | OpenRouter | Access to Claude, Gemini, Mixtral, and others |
| GITLAB_TOKEN | GitLab | Personal access token for MR creation and API access |
```shell
# Option 1: Using the setup script
chmod +x scripts/setup.sh scripts/run.sh
./scripts/setup.sh
./scripts/run.sh

# Option 2: Manual setup
# Backend
pip install -r backend/python/requirements.txt
uvicorn backend.python.api.main:app --port 8000

# Frontend (separate terminal)
cd frontend && npm install && npm run dev
```

The frontend runs at http://localhost:3000 and the backend at http://localhost:8000 (Swagger docs at /docs).
```shell
docker compose up -d
```

Assumption Miner is deployed as two independent services: a Python backend on Google Cloud Run and a React frontend on Vercel. Both are production-ready and publicly accessible.
The FastAPI backend is containerized and deployed on Google Cloud Run, Google's fully managed serverless container platform.
| Property | Detail |
|---|---|
| Platform | Google Cloud Run (asia-southeast1) |
| Runtime | Python 3.11 + FastAPI in Docker container |
| Scaling | Auto-scales from 0 to N instances based on traffic |
| Auth | Public endpoint, no authentication required for demo |
| Swagger UI | Available at /docs on the backend URL |
Why Google Cloud Run?
- Zero infrastructure management - no servers to provision or maintain
- Scales to zero when idle (cost-efficient for a hackathon project)
- Instant scale-up when the GitLab agent pushes scan results
- Full container support - identical to local Docker environment
Deploy to Cloud Run:
```shell
# Build and push container
gcloud builds submit --tag gcr.io/YOUR_PROJECT_ID/assumption-miner-backend

# Deploy to Cloud Run
gcloud run deploy assumption-miner-backend \
  --image gcr.io/YOUR_PROJECT_ID/assumption-miner-backend \
  --platform managed \
  --region asia-southeast1 \
  --allow-unauthenticated \
  --set-env-vars GROQ_API_KEY=your_key,OPENROUTER_API_KEY=your_key,GITLAB_TOKEN=your_token
```

Or using the included script:

```shell
chmod +x scripts/deploy-gcloud.sh
./scripts/deploy-gcloud.sh
```

The React frontend is deployed on Vercel, connected directly to the GitLab repository. Every push to main triggers an automatic deployment.
| Property | Detail |
|---|---|
| Platform | Vercel (Global Edge Network) |
| Framework | React + Vite (TypeScript) |
| Live URL | assumption-miner.vercel.app |
| CDN | Served from Vercel's global edge - fast everywhere |
| Deploys | Automatic on every push to main |
Why Vercel?
- Zero-config deployment for Vite + React
- Global CDN ensures fast load times for judges and reviewers worldwide
- Preview deployments for every branch - easy to review UI changes
- Environment variables managed via Vercel dashboard (no secrets in repo)
Environment variable required in Vercel dashboard:
```shell
VITE_API_URL=https://your-cloud-run-backend-url
```
This tells the frontend where the backend lives. Set it in Vercel → Project → Settings → Environment Variables.
Manual deploy (if needed):
```shell
npm install -g vercel
cd frontend
vercel --prod
```

```
GitLab Duo Agent
        │
        ▼
.assumption-miner-latest.json
        │
        ▼
python scripts/push_agent_results.py
        │ (HTTP POST)
        ▼
Google Cloud Run (FastAPI Backend)
        │ (REST API via VITE_API_URL)
        ▼
Vercel (React Frontend)
        │
        ▼
assumption-miner.vercel.app
```
The agent writes scan results to .assumption-miner-latest.json. The push script reads this file and POSTs it to the Cloud Run backend via /api/v1/analyze. The React frontend (served by Vercel) then fetches the latest results from the backend and renders the dashboard in real-time.
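A minimal sketch of that push step is below. The real scripts/push_agent_results.py may differ; the payload shape here is an assumption, and only the endpoint path comes from the API reference:

```python
import json
import urllib.request

def build_push_request(results_path=".assumption-miner-latest.json",
                       api_url="http://localhost:8000"):
    """Read the agent's saved findings and prepare a POST to the backend's
    /api/v1/analyze endpoint. The JSON body is forwarded as-is here; the
    real script may add metadata such as project id or commit SHA."""
    with open(results_path) as f:
        findings = json.load(f)
    return urllib.request.Request(
        f"{api_url}/api/v1/analyze",
        data=json.dumps(findings).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# To actually send it (backend must be running):
#   urllib.request.urlopen(build_push_request())
```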
| Layer | Technology | Why |
|---|---|---|
| Orchestration | GitLab Duo Agent + Flow (YAML) | Native GitLab integration; triggers on MR events and schedules |
| Backend | Python, FastAPI | AST parsing, AI orchestration, REST API |
| Backend Hosting | Google Cloud Run | Serverless containers, auto-scaling, zero ops |
| Scoring Engine | Rust compiled to WebAssembly | Sub-millisecond health scoring and Monte Carlo simulation in the browser |
| AI | Groq (Llama 3.3), OpenRouter | Pattern classification, risk assessment, and fix generation |
| ML | Python (linear regression, WMA) | Trend prediction and feedback-driven confidence calibration |
| Frontend | React, TypeScript, Tailwind CSS, Three.js | Dashboard, 3D dependency graph, and real-time health visualization |
| Frontend Hosting | Vercel | Global CDN, automatic deploys on push, zero config |
| Storage | SQLite | Assumption registry, feedback history, and pattern adjustments |
```mermaid
flowchart TD
    A["Trigger: MR created or weekly schedule"] --> B["Parse code with AST + custom rules"]
    B --> C["AI classifies and scores risk\nDNA fingerprint assigned"]
    C --> D["Map to CWE / OWASP / SOC2"]
    D --> E["Output"]
    E --> E1["MR comment with findings"]
    E --> E2["Auto-fix MR for critical issues"]
    E --> E3["CI/CD quality gate verdict"]
    E --> E4["Health score + trend forecast"]
    E --> E5["Issue creation for unresolved items"]
    E4 --> F["Push to Google Cloud Run backend"]
    F --> G["React dashboard on Vercel updates in real-time"]
```
Assumption Miner integrates with GitLab Duo as both an Agent and a Flow:
The agent is published as Assumption Miner Agent V9 in the GitLab AI Hackathon group. Chat with it directly in the GitLab Duo sidebar to analyze files, scan for assumptions, or explain risk scores.
```
@assumption-miner scan src/api/payment.py for implicit assumptions
```
You can also trigger a full scan, auto-save results, and sync to the dashboard in one instruction:
```
scan assumptions in this repository, save the results to .assumption-miner-latest.json, then tell me to run python scripts/push_agent_results.py
```
Or use the built-in slash command provided by the Agent Skill:
```
scan-assumptions backend/python/services/scorer.py
```
After completing its analysis, the agent automatically saves all findings as structured JSON to .assumption-miner-latest.json in the repository root. Run the push script locally to sync results to the dashboard:
```shell
git pull hackathon main
python scripts/push_agent_results.py
```

The flow triggers automatically on merge requests, running a three-step pipeline:

- Scan & Classify: reads changed files, identifies assumptions, maps to CWE/OWASP
- Save Results: writes findings as structured JSON to .assumption-miner-latest.json in the repo root
- Report to MR: posts a structured comment with health grade, findings table, and priority fixes
To sync flow results to the dashboard after a scan:
```shell
git pull hackathon main
python scripts/push_agent_results.py
```

See flows/assumption-miner-flow.yml for the flow definition.
The included CI template blocks merges when the health score drops below your configured threshold:
```yaml
include:
  - local: '.gitlab/ci-templates/quality-gate.yml'
```

See .gitlab-ci.yml for the full pipeline configuration.
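Under the hood, the gate job presumably calls the /api/v1/health-gate endpoint and fails the pipeline on a bad verdict. A hedged sketch of that check — the response field names ('passed', 'score') are assumptions about the gate's response shape:

```python
def gate_passes(response: dict, threshold: float = 70.0) -> bool:
    """Decide whether the pipeline may proceed. If the backend already
    rendered a verdict, trust it; otherwise compare the raw score to the
    configured threshold. Field names are assumptions, not the real API."""
    if "passed" in response:
        return bool(response["passed"])
    return response.get("score", 0.0) >= threshold

# In CI, the dict would come from POSTing to /api/v1/health-gate, and the
# job would exit nonzero when gate_passes(...) is False, blocking the merge.
```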
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/analyze | Run assumption analysis on submitted files |
| GET | /api/v1/health/{repo} | Retrieve the current health score and grade |
| GET | /api/v1/timeline | Fetch the assumption evolution timeline |
| POST | /api/v1/auto-fix/generate | Preview generated fixes without side effects |
| POST | /api/v1/auto-fix/apply | Apply fixes: create branch, commit, and open MR |
| POST | /api/v1/health-gate | Evaluate health score against threshold (CI/CD) |
| GET | /api/v1/health-gate/badge/{id} | Embeddable SVG health badge |
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/v1/predictions/{project_id} | Forecast health score N sprints ahead |
| GET | /api/v1/security/{assumption_id} | CWE, OWASP, and compliance mapping |
| GET | /api/v1/security/report/{project_id} | Full project security posture report |
| GET | /api/v1/compliance/{project_id} | Compliance status by framework |
| POST | /api/v1/feedback | Submit developer feedback |
| POST | /api/v1/feedback/learn | Trigger learning cycle |
| POST | /api/v1/webhooks/gitlab | GitLab webhook receiver |
| GET | /api/v1/rules/{project_id} | List all active rules |
| POST | /api/v1/rules/{project_id} | Create a new detection rule |
| POST | /api/v1/rules/{project_id}/test | Test a rule against code |
Full API documentation is also available at docs/api/.
```
assumption-miner/
├── AGENTS.md                          # Repository-level agent context and guidelines
├── agents/
│   ├── assumption-miner.yml           # GitLab Duo Agent definition
│   └── agent.yml.template             # Agent template
├── flows/
│   ├── assumption-miner-flow.yml      # GitLab Duo Flow definition
│   └── flow.yml.template              # Flow template
├── skills/
│   └── scan-assumptions/
│       └── SKILL.md                   # scan-assumptions slash command skill
├── .gitlab/
│   ├── agents/assumption-miner.yml    # Agent registration
│   └── ci-templates/quality-gate.yml  # CI quality gate template
├── backend/python/
│   ├── ai/                            # Groq + OpenRouter clients, prompt templates
│   ├── analyzer/                      # AST parser, DNA fingerprinting, multi-lang support, patterns
│   ├── api/                           # FastAPI routes: core, auto-fix, feedback, predictions, quality gate, rules, security, webhooks
│   ├── data/                          # CWE, OWASP, compliance databases
│   ├── db/                            # SQLAlchemy models, migrations
│   ├── ml/                            # Trend model, forecaster, feature extractor, feedback learner
│   ├── models/                        # Data models: assumption, graph, health
│   ├── rules/                         # Custom rules engine with matchers (literal, pattern, context, function call)
│   ├── services/                      # Business logic: scorer, predictor, auto-fix, MR creator, security mapper, graph builder, cross-repo, time travel
│   │   └── templates/fixes/           # Fix templates: error handling, null checks, timeout, input validation, type safety, hardcoded values
│   └── utils/                         # Git and GitLab utilities
├── backend/rust/
│   └── src/                           # WASM scoring engine: scorer, Monte Carlo, DNA, graph, types
├── frontend/
│   ├── public/                        # Static assets
│   └── src/
│       ├── 3d/                        # Three.js holographic graph: Scene, ParticleField, AnimatedEdge, AssumptionSphere, CentralCore
│       ├── components/                # Dashboard, GraphView, SecurityPage, PredictionsPage, TimelinePage, SettingsPage, AboutPage, AutoFixModal
│       ├── data/                      # Demo data
│       ├── hooks/                     # Zustand store, API hooks, WASM hooks, animation hooks
│       ├── styles/                    # Global styles, animations
│       ├── types/                     # TypeScript type definitions
│       └── utils/                     # API client, WASM loader, formatting, color utilities
├── docs/
│   ├── api/                           # API endpoint docs and model reference
│   └── guides/                        # Getting started, customization, GitLab integration guides
├── examples/
│   ├── ASSUMPTIONS.md                 # Example assumption documentation
│   └── sample-repo/                   # Sample Python files to test against
├── scripts/                           # Setup, run, build, deploy, test, quality-gate scripts
│   ├── push_agent_results.py          # Reads .assumption-miner-latest.json and POSTs to backend dashboard
│   └── deploy-gcloud.sh               # Deploy backend to Google Cloud Run
├── tests/
│   ├── python/                        # Backend tests: analyzer, API, graph, scorer
│   ├── rust/                          # WASM engine tests: scorer, graph
│   └── frontend/                      # React component tests
├── .gitlab-ci.yml                     # CI/CD pipeline configuration
├── docker-compose.yml                 # Docker setup
├── Makefile                           # Build commands
└── package.json                       # Root package.json
```
Detailed documentation is available in the docs/ directory:
| Guide | Description |
|---|---|
| Architecture | System design and component overview |
| Getting Started | Step-by-step setup guide |
| GitLab Integration | Configuring Duo Agent, Flow, and CI/CD |
| Customization | Custom rules, thresholds, and project-specific tuning |
| API Endpoints | Full API reference |
| API Models | Request/response schemas |
| Contributing | How to contribute |
MIT License © 2026 Wiqi Lee. Built for the GitLab AI Hackathon 2026. See LICENSE for details.
This project was created by Wiqi Lee as a submission for the GitLab AI Hackathon 2026. The project template and license structure are provided by GitLab under the MIT License.
If you use, fork, or build upon this code, please:
- Give proper attribution. Credit the original author (Wiqi Lee) and link back to this repository.
- Keep the license intact. Do not remove or alter the MIT License file.
- Don't misrepresent authorship. Do not claim this work as your own in any competition, portfolio, or submission.
- Respect the spirit of open source. Contribute back improvements when possible, and use this code to learn and build, not to plagiarize.
"Good code is shared freely. Good ethics means acknowledging where it came from."