
🧬 Assumption Miner

Every bug starts as an assumption that went unchecked.

Assumption Miner finds them before your users do.

GitLab AI Hackathon 2026 Powered by Groq Powered by OpenRouter Python Rust TypeScript

Backend on Google Cloud Backend on Google Cloud (Docs-API explorer) Frontend on Vercel Demo Video


⚡ The Problem

Most production incidents don't come from code that's obviously wrong. They come from code that looks right but relies on assumptions nobody wrote down.

Things like: "this API always returns in under 500ms," "the database connection will be there when we need it," "file uploads will never exceed 10MB," or "payments always succeed on the first try."

Linters won't catch these. Test suites won't flag them. Code reviews might, if the reviewer happens to think about it. But usually, nobody does. These assumptions sit quietly in the codebase, invisible, until something changes in production and they all break at once.

The worst part? Once something breaks, nobody can point to where the assumption was made. It's not documented. It's not tracked. It's just a belief that was baked into the code months ago by someone who's probably on a different team now.

This is assumption debt, and every codebase carries it. It grows silently sprint after sprint, and it's the number one source of "but it worked on my machine" incidents.


💡 The Solution

Assumption Miner is a GitLab Duo Agent that detects implicit assumptions in your code, tracks them over time, and helps you fix them before they cause outages.

It hooks into your GitLab workflow and runs automatically when you create a merge request. Under the hood, it combines AST-level code analysis with AI-powered reasoning (via Groq or OpenRouter) to identify patterns that static analysis tools miss entirely: unhandled failure modes, hardcoded thresholds, missing input validation, and undocumented environmental dependencies.

Every assumption gets a DNA fingerprint, so it can be tracked across renames, refactors, and branch merges. The agent doesn't just find problems; it predicts which assumptions will break next, maps them to security frameworks (CWE, OWASP, SOC2), and auto-generates fix MRs so your team can resolve issues in one click.
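The exact fingerprinting scheme isn't spelled out here, but the core idea of an identity that survives renames can be sketched by hashing a normalized AST: identifiers, literals, and source positions are excluded so only structure contributes. The `assumption_dna` helper below is a hypothetical illustration, not the project's actual implementation:

```python
import ast
import hashlib

def assumption_dna(source: str) -> str:
    """Hash a normalized AST so the fingerprint survives renames and moves.

    Hypothetical sketch: records node types only, deliberately excluding
    variable names, literal values, and line numbers, so a rename or a
    moved block keeps the same fingerprint.
    """
    tree = ast.parse(source)
    parts = [type(node).__name__ for node in ast.walk(tree)]
    return hashlib.sha256("|".join(parts).encode()).hexdigest()[:16]
```

With this normalization, `assumption_dna("timeout = 500")` equals `assumption_dna("limit = 900")` (same structure, different names and values), while structurally different code hashes differently.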

The result: fewer surprises in production, clearer risk visibility for your team, and a codebase that documents its own blind spots.




🧪 In Action

Frontend on Vercel

📊 Dashboard: System Health Overview



🕸️ Dependency Graph: Hidden Assumption Relationships



⏱️ Timeline: Evolution of Risks Over Time



🔮 Predictions: Future Failure Signals



🛡️ Security: Risk & Vulnerability Insights



⚙️ Settings Page


🏗️ Architecture

```mermaid
graph TB
    subgraph GitLab["GitLab Platform"]
        MR["Merge Request Created"]
        DUO["GitLab Duo Chat"]
        CICD["CI/CD Pipeline"]
    end

    subgraph Agent["Assumption Miner Agent"]
        direction TB
        SCAN["AST Parser + Custom Rules"]
        AI["AI Classification<br/>Groq / OpenRouter"]
        DNA["DNA Fingerprinting"]
        SEC["Security Mapper<br/>CWE · OWASP · SOC2"]
    end

    subgraph Scoring["Scoring Engine"]
        RUST["Rust + WASM<br/>Monte Carlo Simulation"]
        ML["Trend Prediction<br/>Linear Regression + WMA"]
    end

    subgraph Cloud["Cloud Infrastructure"]
        GCR["Google Cloud Run<br/>FastAPI Backend"]
        VERCEL["Vercel<br/>React Frontend"]
    end

    subgraph Output["Output Layer"]
        COMMENT["MR Comment with Findings"]
        AUTOFIX["Auto-Fix MR"]
        GATE["Quality Gate Verdict"]
        DASH["React Dashboard<br/>3D Graph · Timeline · Health"]
    end

    MR -->|trigger| SCAN
    DUO -->|on-demand| SCAN
    SCAN --> AI
    AI --> DNA
    DNA --> SEC
    SEC --> RUST
    RUST --> ML
    ML --> COMMENT
    ML --> AUTOFIX
    ML --> GATE
    ML --> DASH
    CICD -->|quality gate| GATE
    GCR -->|REST API| VERCEL
    VERCEL -->|serves| DASH
```

🎯 Why It's Different

Most code quality tools look at what your code does. Assumption Miner looks at what your code believes.

| What makes it unique | Why it matters |
|---|---|
| Finds what linters can't | Detects implicit assumptions (not just syntax or style issues) using AI reasoning on top of AST parsing |
| DNA fingerprinting | Each assumption gets a stable identity that survives renames, moves, and refactors, so nothing gets lost |
| Predicts before it breaks | Forecasts your health score 2 to 4 sprints ahead, so you can prioritize before things go wrong |
| Security-aware by default | Auto-maps every assumption to CWE, OWASP Top 10, and compliance frameworks (SOC2, PCI-DSS, GDPR) |
| One-click auto-fix | Generates fix patches, creates a branch, and opens a merge request. No context switching required |
| Lives in your pipeline | Plugs directly into GitLab CI/CD as a quality gate. If the health score drops, the merge gets blocked |
| Gets smarter over time | Learns from developer feedback (slash commands, emoji reactions) to calibrate confidence and reduce noise |
| Sub-millisecond scoring | Health scoring runs in Rust compiled to WebAssembly with Monte Carlo simulation, directly in the browser |
| Production-ready deployment | Backend on Google Cloud Run (auto-scaling, scales to zero), frontend on Vercel (global CDN, instant deploys) |

✨ Features

Detection & Tracking

| Icon | Feature | What It Does |
|---|---|---|
| 🔍 | AST + AI Analysis | Parses code structure and uses LLMs to classify assumptions that static analysis misses |
| 🧬 | DNA Fingerprinting | Assigns a stable identity to each assumption so it survives renames, moves, and refactors |
| 🕸️ | Dependency Graph | Interactive 3D holographic visualization showing how assumptions cluster and connect |
| ⏱️ | Timeline & Time Travel | Tracks how assumptions evolve sprint over sprint with time-travel navigation |
| 🏥 | Health Scoring | A+ through F grade powered by a Rust WASM engine with Monte Carlo simulation |
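The actual scoring engine is Rust compiled to WASM, but the idea behind Monte Carlo health scoring can be illustrated in Python: each trial randomly samples which assumptions "fire" and the resulting penalties are averaged into a score. The probabilities, penalties, and grade cutoffs below are invented for the example, not the project's real formula:

```python
import random

# Illustrative grade cutoffs; the real engine's thresholds may differ.
GRADES = [(90, "A+"), (80, "A"), (70, "B"), (60, "C"), (50, "D"), (0, "F")]

def health_score(assumption_risks, trials=5000, seed=42):
    """Monte Carlo health score over (failure_probability, penalty) pairs.

    Each trial starts at 100 and subtracts the penalty of every assumption
    that fires; the mean over all trials becomes the score and grade.
    """
    rng = random.Random(seed)  # fixed seed keeps the demo reproducible
    total = 0.0
    for _ in range(trials):
        score = 100.0
        for prob, penalty in assumption_risks:
            if rng.random() < prob:
                score -= penalty
        total += max(score, 0.0)
    mean = total / trials
    grade = next(g for cutoff, g in GRADES if mean >= cutoff)
    return mean, grade
```

For example, two findings with failure probabilities 0.5 and 0.1 and penalties 10 and 30 give an expected score near 92, landing in the A range.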

Remediation & Prevention

| Icon | Feature | What It Does |
|---|---|---|
| 🛠️ | Auto-Fix MR Creation | Generates fix patches from templates (error handling, null checks, timeout, input validation, type safety, hardcoded values), creates a branch, and opens a merge request |
| 🚦 | CI/CD Quality Gate | Plugs into your pipeline and blocks merges when the health score dips below threshold |
| 📈 | Trend Prediction | Forecasts your health score 2 to 4 sprints ahead using regression and weighted moving average on historical snapshots |
| 🎯 | Breaking Change Predictor | Flags which assumptions are most likely to cause failures during upcoming refactors |
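The forecasting approach above (linear regression plus a weighted moving average) fits in a few lines. This is a sketch of the technique, not the repository's actual `ml/` code, which may weight snapshots or clamp results differently:

```python
def forecast_health(history, sprints_ahead=3):
    """Project a health score N sprints ahead from per-sprint snapshots.

    Combines a weighted moving average baseline (recent sprints count
    more) with an ordinary least-squares trend, clamped to [0, 100].
    """
    n = len(history)
    # Weighted moving average: weight sprint i by i + 1.
    weights = range(1, n + 1)
    wma = sum(w * h for w, h in zip(weights, history)) / sum(weights)
    # Least-squares slope over the sprint index.
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history))
             / sum((x - mean_x) ** 2 for x in xs))
    projected = wma + slope * sprints_ahead
    return max(0.0, min(100.0, projected))
```

A steadily declining history such as `[90, 85, 80, 75]` (slope of -5 per sprint, WMA of 80) projects to 65 three sprints out.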

Intelligence & Compliance

| Icon | Feature | What It Does |
|---|---|---|
| 🛡️ | Security Impact Mapping | Links every assumption to CWE entries, OWASP Top 10 categories, and compliance controls (SOC2, PCI-DSS, GDPR) |
| 💬 | Feedback Learning Loop | Developers respond with slash commands or emoji reactions; the agent adjusts confidence weights accordingly |
| ⚙️ | Custom Rules Engine | Define project-specific detection rules in YAML with regex, AST, literal, function call, and context matching |
| 🌐 | Cross-Repo Drift Detection | Spots assumption divergence across microservices sharing the same interfaces |
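To give a feel for what a regex-style custom rule does, the sketch below evaluates one hypothetical rule against source text. The actual YAML schema and matcher types in `backend/python/rules/` are richer than this:

```python
import re

# A hypothetical rule, shaped roughly like a YAML rules-file entry;
# the project's real schema and field names may differ.
RULE = {
    "id": "hardcoded-timeout",
    "type": "regex",
    "pattern": r"timeout\s*=\s*\d+",
    "message": "Hardcoded timeout: assumes latency never changes",
}

def apply_rule(rule, source):
    """Return one finding per source line matching a regex rule."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if re.search(rule["pattern"], line):
            findings.append({
                "rule": rule["id"],
                "line": lineno,
                "message": rule["message"],
            })
    return findings
```

Running it over a snippet containing `timeout = 500` yields a single finding pointing at that line.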

🚀 Quick Start

Clone

```bash
git clone https://gitlab.com/gitlab-ai-hackathon/participants/34658878.git assumption-miner
cd assumption-miner
```

Environment Variables

```bash
cp .env.example .env
# Edit .env and fill in your keys
```

| Variable | Provider | Notes |
|---|---|---|
| `GROQ_API_KEY` | Groq | Free tier available, fast inference |
| `OPENROUTER_API_KEY` | OpenRouter | Access to Claude, Gemini, Mixtral, and others |
| `GITLAB_TOKEN` | GitLab | Personal access token for MR creation and API access |

Install & Run

```bash
# Option 1: Using the setup script
chmod +x scripts/setup.sh scripts/run.sh
./scripts/setup.sh
./scripts/run.sh

# Option 2: Manual setup
# Backend
pip install -r backend/python/requirements.txt
uvicorn backend.python.api.main:app --port 8000

# Frontend (separate terminal)
cd frontend && npm install && npm run dev
```

The frontend runs at http://localhost:3000 and the backend at http://localhost:8000 (Swagger docs at /docs).

Docker

```bash
docker compose up -d
```

☁️ Deployment

Assumption Miner is deployed as two independent services: a Python backend on Google Cloud Run and a React frontend on Vercel. Both are production-ready and publicly accessible.

Backend - Google Cloud Run

Backend on Google Cloud Backend on Google Cloud (Docs-API explorer)



The FastAPI backend is containerized and deployed on Google Cloud Run, Google's fully managed serverless container platform.

| Property | Detail |
|---|---|
| Platform | Google Cloud Run (asia-southeast1) |
| Runtime | Python 3.11 + FastAPI in a Docker container |
| Scaling | Auto-scales from 0 to N instances based on traffic |
| Auth | Public endpoint; no authentication required for the demo |
| Swagger UI | Available at `/docs` on the backend URL |

Why Google Cloud Run?

  • Zero infrastructure management - no servers to provision or maintain
  • Scales to zero when idle (cost-efficient for a hackathon project)
  • Instant scale-up when the GitLab agent pushes scan results
  • Full container support - identical to local Docker environment

Deploy to Cloud Run:

```bash
# Build and push container
gcloud builds submit --tag gcr.io/YOUR_PROJECT_ID/assumption-miner-backend

# Deploy to Cloud Run
gcloud run deploy assumption-miner-backend \
  --image gcr.io/YOUR_PROJECT_ID/assumption-miner-backend \
  --platform managed \
  --region asia-southeast1 \
  --allow-unauthenticated \
  --set-env-vars GROQ_API_KEY=your_key,OPENROUTER_API_KEY=your_key,GITLAB_TOKEN=your_token
```

Or using the included script:

```bash
chmod +x scripts/deploy-gcloud.sh
./scripts/deploy-gcloud.sh
```

Frontend - Vercel

Frontend on Vercel

The React frontend is deployed on Vercel, connected directly to the GitLab repository. Every push to main triggers an automatic deployment.

| Property | Detail |
|---|---|
| Platform | Vercel (Global Edge Network) |
| Framework | React + Vite (TypeScript) |
| Live URL | assumption-miner.vercel.app |
| CDN | Served from Vercel's global edge network |
| Deploys | Automatic on every push to `main` |

Why Vercel?

  • Zero-config deployment for Vite + React
  • Global CDN ensures fast load times for judges and reviewers worldwide
  • Preview deployments for every branch - easy to review UI changes
  • Environment variables managed via Vercel dashboard (no secrets in repo)

Environment variable required in Vercel dashboard:

```
VITE_API_URL=https://your-cloud-run-backend-url
```

This tells the frontend where the backend lives. Set it in Vercel → Project → Settings → Environment Variables.

Manual deploy (if needed):

```bash
npm install -g vercel
cd frontend
vercel --prod
```

How Frontend and Backend Connect

```
GitLab Duo Agent
      │
      ▼
.assumption-miner-latest.json
      │
      ▼
python scripts/push_agent_results.py
      │  (HTTP POST)
      ▼
Google Cloud Run (FastAPI Backend)
      │  (REST API via VITE_API_URL)
      ▼
Vercel (React Frontend)
      │
      ▼
assumption-miner.vercel.app
```

The agent writes scan results to .assumption-miner-latest.json. The push script reads this file and POSTs it to the Cloud Run backend via /api/v1/analyze. The React frontend (served by Vercel) then fetches the latest results from the backend and renders the dashboard in real-time.


🛠️ Tech Stack

| Layer | Technology | Why |
|---|---|---|
| Orchestration | GitLab Duo Agent + Flow (YAML) | Native GitLab integration; triggers on MR events and schedules |
| Backend | Python, FastAPI | AST parsing, AI orchestration, REST API |
| Backend Hosting | Google Cloud Run | Serverless containers, auto-scaling, zero ops |
| Scoring Engine | Rust compiled to WebAssembly | Sub-millisecond health scoring and Monte Carlo simulation in the browser |
| AI | Groq (Llama 3.3), OpenRouter | Pattern classification, risk assessment, and fix generation |
| ML | Python (linear regression, WMA) | Trend prediction and feedback-driven confidence calibration |
| Frontend | React, TypeScript, Tailwind CSS, Three.js | Dashboard, 3D dependency graph, and real-time health visualization |
| Frontend Hosting | Vercel | Global CDN, automatic deploys on push, zero config |
| Storage | SQLite | Assumption registry, feedback history, and pattern adjustments |

⚙️ How It Works

```mermaid
flowchart TD
    A["Trigger: MR created or weekly schedule"] --> B["Parse code with AST + custom rules"]
    B --> C["AI classifies and scores risk\nDNA fingerprint assigned"]
    C --> D["Map to CWE / OWASP / SOC2"]
    D --> E["Output"]
    E --> E1["MR comment with findings"]
    E --> E2["Auto-fix MR for critical issues"]
    E --> E3["CI/CD quality gate verdict"]
    E --> E4["Health score + trend forecast"]
    E --> E5["Issue creation for unresolved items"]
    E4 --> F["Push to Google Cloud Run backend"]
    F --> G["React dashboard on Vercel updates in real-time"]
```

🤖 GitLab Duo Integration

Assumption Miner integrates with GitLab Duo as both an Agent and a Flow:

Agent (On-Demand)

The agent is published as Assumption Miner Agent V9 in the GitLab AI Hackathon group. Chat with it directly in the GitLab Duo sidebar to analyze files, scan for assumptions, or explain risk scores.

```
@assumption-miner scan src/api/payment.py for implicit assumptions
```

You can also trigger a full scan, auto-save results, and sync to the dashboard in one instruction:

```
scan assumptions in this repository, save the results to .assumption-miner-latest.json, then tell me to run python scripts/push_agent_results.py
```

Or use the built-in slash command provided by the Agent Skill:

```
scan-assumptions backend/python/services/scorer.py
```

After completing its analysis, the agent automatically saves all findings as structured JSON to .assumption-miner-latest.json in the repository root. Run the push script locally to sync results to the dashboard:

```bash
git pull hackathon main
python scripts/push_agent_results.py
```

Flow (Automated)

The flow triggers automatically on merge requests, running a three-step pipeline:

  1. Scan & Classify: reads changed files, identifies assumptions, maps to CWE/OWASP
  2. Save Results: writes findings as structured JSON to .assumption-miner-latest.json in the repo root
  3. Report to MR: posts a structured comment with health grade, findings table, and priority fixes

To sync flow results to the dashboard after a scan:

```bash
git pull hackathon main
python scripts/push_agent_results.py
```

See flows/assumption-miner-flow.yml for the flow definition.

CI/CD Quality Gate

The included CI template blocks merges when the health score drops below your configured threshold:

```yaml
include:
  - local: '.gitlab/ci-templates/quality-gate.yml'
```

See .gitlab-ci.yml for the full pipeline configuration.
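The gate's verdict ultimately reduces to a threshold comparison that maps to a CI exit code. As a conceptual sketch (the 70.0 default is illustrative, not the project's configured value):

```python
def gate_verdict(health_score: float, threshold: float = 70.0) -> int:
    """CI exit code for the quality gate: 0 lets the merge proceed,
    1 fails the job and blocks it.

    The default threshold here is hypothetical; configure per project.
    """
    return 0 if health_score >= threshold else 1
```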


📚 API Reference

Core, Auto-Fix, Quality Gate

| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/v1/analyze` | Run assumption analysis on submitted files |
| GET | `/api/v1/health/{repo}` | Retrieve the current health score and grade |
| GET | `/api/v1/timeline` | Fetch the assumption evolution timeline |
| POST | `/api/v1/auto-fix/generate` | Preview generated fixes without side effects |
| POST | `/api/v1/auto-fix/apply` | Apply fixes: create a branch, commit, and open an MR |
| POST | `/api/v1/health-gate` | Evaluate the health score against a threshold (CI/CD) |
| GET | `/api/v1/health-gate/badge/{id}` | Embeddable SVG health badge |

Predictions, Security, Feedback, Rules

| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/v1/predictions/{project_id}` | Forecast the health score N sprints ahead |
| GET | `/api/v1/security/{assumption_id}` | CWE, OWASP, and compliance mapping |
| GET | `/api/v1/security/report/{project_id}` | Full project security posture report |
| GET | `/api/v1/compliance/{project_id}` | Compliance status by framework |
| POST | `/api/v1/feedback` | Submit developer feedback |
| POST | `/api/v1/feedback/learn` | Trigger a learning cycle |
| POST | `/api/v1/webhooks/gitlab` | GitLab webhook receiver |
| GET | `/api/v1/rules/{project_id}` | List all active rules |
| POST | `/api/v1/rules/{project_id}` | Create a new detection rule |
| POST | `/api/v1/rules/{project_id}/test` | Test a rule against code |

Full API documentation is also available at docs/api/.


🗂️ Project Structure

```
assumption-miner/
├── AGENTS.md                          # Repository-level agent context and guidelines
├── agents/
│   ├── assumption-miner.yml           # GitLab Duo Agent definition
│   └── agent.yml.template             # Agent template
├── flows/
│   ├── assumption-miner-flow.yml      # GitLab Duo Flow definition
│   └── flow.yml.template              # Flow template
├── skills/
│   └── scan-assumptions/
│       └── SKILL.md                   # scan-assumptions slash command skill
├── .gitlab/
│   ├── agents/assumption-miner.yml    # Agent registration
│   └── ci-templates/quality-gate.yml  # CI quality gate template
├── backend/python/
│   ├── ai/                # Groq + OpenRouter clients, prompt templates
│   ├── analyzer/          # AST parser, DNA fingerprinting, multi-lang support, patterns
│   ├── api/               # FastAPI routes: core, auto-fix, feedback, predictions, quality gate, rules, security, webhooks
│   ├── data/              # CWE, OWASP, compliance databases
│   ├── db/                # SQLAlchemy models, migrations
│   ├── ml/                # Trend model, forecaster, feature extractor, feedback learner
│   ├── models/            # Data models: assumption, graph, health
│   ├── rules/             # Custom rules engine with matchers (literal, pattern, context, function call)
│   ├── services/          # Business logic: scorer, predictor, auto-fix, MR creator, security mapper, graph builder, cross-repo, time travel
│   │   └── templates/fixes/  # Fix templates: error handling, null checks, timeout, input validation, type safety, hardcoded values
│   └── utils/             # Git and GitLab utilities
├── backend/rust/
│   └── src/               # WASM scoring engine: scorer, Monte Carlo, DNA, graph, types
├── frontend/
│   ├── public/            # Static assets
│   └── src/
│       ├── 3d/            # Three.js holographic graph: Scene, ParticleField, AnimatedEdge, AssumptionSphere, CentralCore
│       ├── components/    # Dashboard, GraphView, SecurityPage, PredictionsPage, TimelinePage, SettingsPage, AboutPage, AutoFixModal
│       ├── data/          # Demo data
│       ├── hooks/         # Zustand store, API hooks, WASM hooks, animation hooks
│       ├── styles/        # Global styles, animations
│       ├── types/         # TypeScript type definitions
│       └── utils/         # API client, WASM loader, formatting, color utilities
├── docs/
│   ├── api/               # API endpoint docs and model reference
│   └── guides/            # Getting started, customization, GitLab integration guides
├── examples/
│   ├── ASSUMPTIONS.md     # Example assumption documentation
│   └── sample-repo/       # Sample Python files to test against
├── scripts/               # Setup, run, build, deploy, test, quality-gate scripts
│   ├── push_agent_results.py  # Reads .assumption-miner-latest.json and POSTs to backend dashboard
│   └── deploy-gcloud.sh       # Deploy backend to Google Cloud Run
├── tests/
│   ├── python/            # Backend tests: analyzer, API, graph, scorer
│   ├── rust/              # WASM engine tests: scorer, graph
│   └── frontend/          # React component tests
├── .gitlab-ci.yml         # CI/CD pipeline configuration
├── docker-compose.yml     # Docker setup
├── Makefile               # Build commands
└── package.json           # Root package.json
```

📖 Documentation

Detailed documentation is available in the docs/ directory:

| Guide | Description |
|---|---|
| Architecture | System design and component overview |
| Getting Started | Step-by-step setup guide |
| GitLab Integration | Configuring Duo Agent, Flow, and CI/CD |
| Customization | Custom rules, thresholds, and project-specific tuning |
| API Endpoints | Full API reference |
| API Models | Request/response schemas |
| Contributing | How to contribute |

⚖️ License

MIT License © 2026 Wiqi Lee. Built for the GitLab AI Hackathon 2026. See LICENSE for details.

🤝 Ethics & Attribution

This project was created by Wiqi Lee as a submission for the GitLab AI Hackathon 2026. The project template and license structure are provided by GitLab under the MIT License.

If you use, fork, or build upon this code, please:

  • Give proper attribution. Credit the original author (Wiqi Lee) and link back to this repository.
  • Keep the license intact. Do not remove or alter the MIT License file.
  • Don't misrepresent authorship. Do not claim this work as your own in any competition, portfolio, or submission.
  • Respect the spirit of open source. Contribute back improvements when possible, and use this code to learn and build, not to plagiarize.

> "Good code is shared freely. Good ethics means acknowledging where it came from."


Wiqi Lee · Built for GitLab AI Hackathon 2026 · Discord · X
