AI-Powered Developer Intelligence Platform & Repository Analytics Engine
Index any GitHub repository • MoE LLM Router • AST-aware code parsing • Hybrid RAG retrieval • Architecture Visualization • AI PR Review • Code Health Analysis
Developers lack a single place to understand their GitHub activity at a glance — contribution velocity, language breakdown, streak tracking — and there's no easy way to have an AI conversation about a codebase without copy-pasting files. GitMetrix solves both problems in one tool, and goes further with architecture visualization, automated PR reviews, code health analysis, and developer productivity analytics.
- Velocity Score — Weighted composite score measuring overall development output
- Active Streak — Consecutive days with at least one contribution, tracked over time
- Total Output — Commits, PRs merged, and issues resolved with yearly breakdown
- Language Breakdown — Pie chart distribution of languages across all repos
- Activity Graph — Monthly commit activity visualization using Recharts
- Top Repositories — Sorted by stars, showing language, stars, and forks
- Code Health Tab — Static analysis with health gauge, risk files, and AI recommendations
- Advanced Analytics Tab — Bus Factor, Contributor Influence scores, and team insights
- User Repo Selector — Browse and select your own repositories to chat with
- Universal Search — Paste any public GitHub URL to index and chat
- AST-Aware Parsing — Multi-language symbol extraction (functions, classes, methods, imports, exports) across TypeScript, JavaScript, Python, Java, Go, Rust, C/C++, Ruby, and PHP
- Intelligent Chunking — Code is chunked by logical structure (1 function = 1 chunk, 1 class = 1 chunk) instead of fixed-size splits, with a 100-char minimum to eliminate noise
- Dependency Graph — Import/export relationships are tracked across files, enabling contextual reasoning about how modules connect
- Hybrid Retrieval — Combines pgvector cosine similarity search with PostgreSQL full-text search for higher accuracy
- Multi-Query Search — Each user question generates 3 semantic variations, all searched independently and merged for broader coverage
- Cohere Cross-Encoder Reranking — Top results are reranked using Cohere's reranker API for precision (gracefully falls back if unavailable)
- Neighbor Chunk Expansion — Retrieved chunks automatically include adjacent chunks (±1) for fuller code context
- File Importance Weighting — Chunks from `src/`, `app/`, `lib/`, and `core/` get a 1.15× relevance boost over test/docs files
- Streaming Responses — Real-time SSE streaming via MoE LLM Router
- AI-Generated Suggestions — Three repo-specific starter questions generated by analyzing symbol metadata and language breakdown
- File References — Each response includes clickable file path tags showing which files were referenced
- Interactive Dependency Graph — React Flow–powered visualization of file-level dependencies
- Circular Dependency Detection — DFS-based algorithm highlights circular imports in red
- Node Importance Scoring — Nodes colored by in-degree + out-degree importance
- Detail Panel — Click any node to see dependents, dependencies, language, and risk
- Stats Overlay — Total files, edges, and circular dependency count at a glance
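The DFS-based circular-import detection described above can be sketched as a depth-first search that tracks the current recursion path; any back edge into that path closes a cycle. This is an illustrative sketch, not GitMetrix's actual implementation, and the `Edges` adjacency shape is an assumption:

```typescript
// Sketch of DFS-based circular dependency detection.
// `edges` maps a file to the files it imports (assumed shape).
type Edges = Record<string, string[]>;

function findCircularDependencies(edges: Edges): string[][] {
  const cycles: string[][] = [];
  const visited = new Set<string>();
  const stack: string[] = [];        // current DFS path
  const onStack = new Set<string>(); // fast membership test for the path

  function dfs(node: string): void {
    visited.add(node);
    stack.push(node);
    onStack.add(node);
    for (const dep of edges[node] ?? []) {
      if (onStack.has(dep)) {
        // Back edge found: the cycle is the path from `dep` to the top.
        cycles.push([...stack.slice(stack.indexOf(dep)), dep]);
      } else if (!visited.has(dep)) {
        dfs(dep);
      }
    }
    stack.pop();
    onStack.delete(node);
  }

  for (const node of Object.keys(edges)) {
    if (!visited.has(node)) dfs(node);
  }
  return cycles;
}
```

Each reported cycle is a closed path such as `a.ts → b.ts → c.ts → a.ts`, which the UI can then highlight in red.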
- GitHub PR Diff Analysis — Paste any PR URL for automated code review
- Categorized Findings — Bugs, security issues, optimizations, and suggestions
- Severity Classification — Critical, warning, and info severity badges
- Score Gauge — Overall quality score (0–100) with visual indicator
- AI Summary — Concise actionable summary generated via deep reasoning
- Static Analysis — Cyclomatic complexity, nesting depth, function lengths, risk scoring
- Health Score — Aggregate health gauge for the entire repository
- Risk File Listing — Top risky files ranked with progress bar visualization
- AI Recommendations — DeepSeek-powered refactoring recommendations via OpenRouter
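A source-level complexity score like the one above is often approximated by counting branching constructs. The sketch below is one common heuristic (1 plus the number of decision points), not necessarily the exact formula GitMetrix's static analyzer uses:

```typescript
// Rough cyclomatic-complexity heuristic: 1 + count of branching tokens.
// A regex token count is a common source-level approximation; the
// function name and token set here are illustrative.
function estimateCyclomaticComplexity(source: string): number {
  const branchPattern = /\b(if|for|while|case|catch)\b|&&|\|\||\?/g;
  const matches = source.match(branchPattern);
  return 1 + (matches ? matches.length : 0);
}
```

A straight-line function scores 1; each `if`, loop, `case`, `catch`, boolean short-circuit, or ternary adds one, which is enough to rank files by risk.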
- Bus Factor — Contributor concentration risk analysis
- Contributor Influence Score — Weighted scoring by commits and code changes
- Influence Leaderboard — Animated progress bars ranking team members
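One common way to compute a Bus Factor like the one above is the smallest set of contributors who together account for at least half of all commits. The 50% threshold in this sketch is a convention, not necessarily the cutoff GitMetrix uses:

```typescript
// Bus Factor sketch: the smallest number of contributors who together
// account for at least half of all commits. The 50% threshold is an
// assumption for illustration.
function busFactor(commitsByAuthor: Record<string, number>): number {
  const counts = Object.values(commitsByAuthor).sort((a, b) => b - a);
  const total = counts.reduce((sum, n) => sum + n, 0);
  let covered = 0;
  for (let i = 0; i < counts.length; i++) {
    covered += counts[i];
    if (covered * 2 >= total) return i + 1; // i+1 contributors cover >= 50%
  }
  return counts.length;
}
```

A result of 1 means a single contributor dominates the history, which is exactly the concentration risk the dashboard flags.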
- Entry Point Detection — Identifies starting files from natural language queries
- Multi-Hop Dependency Traversal — Walks dependency graph up to 3 levels deep
- Architecture Explanation — AI-generated code flow explanations referencing specific files
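The multi-hop traversal above amounts to a breadth-first walk over the import graph, capped at three levels. The sketch below illustrates the idea; the adjacency shape is an assumption:

```typescript
// Multi-hop traversal sketch: starting from detected entry points, walk
// the import graph breadth-first up to `maxDepth` hops (3 in GitMetrix).
function traverseDependencies(
  edges: Record<string, string[]>,
  entryPoints: string[],
  maxDepth = 3,
): Set<string> {
  const reached = new Set<string>(entryPoints);
  let frontier = entryPoints;
  for (let depth = 0; depth < maxDepth; depth++) {
    const next: string[] = [];
    for (const file of frontier) {
      for (const dep of edges[file] ?? []) {
        if (!reached.has(dep)) {
          reached.add(dep);
          next.push(dep);
        }
      }
    }
    frontier = next; // next ring of the BFS
  }
  return reached;
}
```

The reached set becomes the file list handed to the LLM when generating the architecture explanation.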
- GitHub OAuth via Clerk — Sign in with GitHub, access profile and repos
- Protected Routes — Dashboard, Chat, Architecture, and PR Review require authentication
- Environment-based Config — All secrets stored as environment variables
User selects repo → GitHub API fetches file tree → Redis-cached tree (1hr TTL)
→ Files prioritized (src/app/lib first) → Capped at 1500 files
→ 15 concurrent file fetches with rate-limit backoff
→ AST parser extracts functions, classes, imports per file
→ Chunked by symbol boundaries (100-char min, 800-token max)
→ Embeddings generated via HuggingFace (batch 32)
→ Stored in Supabase pgvector (batch 100 rows)
→ Dependency graph built and stored
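The symbol-boundary chunking step above can be sketched as follows. The `ParsedSymbol` shape is an assumption about what the AST extractor returns; the 100-character floor matches the pipeline's noise filter:

```typescript
// Symbol-boundary chunking sketch: one chunk per extracted symbol,
// dropping anything under 100 characters. `ParsedSymbol` is an assumed
// shape for the parser's output.
interface ParsedSymbol {
  name: string;
  kind: "function" | "class" | "method";
  startLine: number; // 1-based, inclusive
  endLine: number;   // 1-based, inclusive
}

function chunkBySymbols(source: string, symbols: ParsedSymbol[]) {
  const lines = source.split("\n");
  return symbols
    .map((sym) => ({
      symbol: sym.name,
      kind: sym.kind,
      startLine: sym.startLine,
      endLine: sym.endLine,
      text: lines.slice(sym.startLine - 1, sym.endLine).join("\n"),
    }))
    .filter((chunk) => chunk.text.length >= 100); // 100-char noise floor
}
```

Because each chunk is a whole function or class, retrieval returns syntactically complete units rather than arbitrary windows.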
Indexing Pipeline (Inngest):
- `/api/index` receives a repo URL, creates a DB record, and sends an Inngest event
- The Inngest function `index-github-repo` runs in the background:
  - Validates the repository exists in the database
  - Fetches the full repo tree via GitHub API (default branch → commit SHA → tree SHA)
  - Caches the tree in Redis (`repo-tree:{owner}/{repo}`, 1-hour TTL)
  - Filters out 36+ excluded directories, binaries, lock files, and files > 200KB
  - Prioritizes `src/`, `app/`, `lib/`, `packages/`, and `core/` directories first
  - Caps at 1,500 files for large-repository stability
  - Fetches file content at 15 concurrent requests with rate-limit pauses
  - Parses each file using regex-based AST extractors (10 languages supported)
  - Chunks by symbol boundaries — 1 function = 1 chunk, skipping chunks < 100 chars
  - Generates embeddings in batches of 32 via HuggingFace MiniLM-L6-v2
  - Inserts rows in batches of 100 into Supabase
  - Builds a dependency graph from import/export relationships
  - Updates progress every 30 processed files
- `/api/index/status` polls for completion with enriched stats (files discovered, processed, chunks, vectors, languages)
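The 15-way concurrent fetch with a rate-limit pause could be structured as a worker pool draining a shared queue. This is a sketch under assumptions: `fetchFile`, the 403 check, and the 60-second backoff are illustrative stand-ins, not the project's actual code:

```typescript
// Concurrency-limited fetcher sketch: a pool of workers drains a shared
// index, pausing and retrying when the GitHub rate limit (403) is hit.
// `fetchFile` and the backoff delay are illustrative assumptions.
async function fetchAllFiles(
  paths: string[],
  fetchFile: (path: string) => Promise<string>,
  concurrency = 15,
): Promise<Map<string, string>> {
  const results = new Map<string, string>();
  let next = 0; // shared cursor; safe because JS workers never preempt mid-statement

  async function worker(): Promise<void> {
    while (next < paths.length) {
      const path = paths[next++];
      try {
        results.set(path, await fetchFile(path));
      } catch (err: any) {
        if (err?.status === 403) {
          // Rate limited: pause this worker, then retry the same file once.
          await new Promise((resolve) => setTimeout(resolve, 60_000));
          results.set(path, await fetchFile(path));
        } else {
          throw err;
        }
      }
    }
  }

  await Promise.all(
    Array.from({ length: Math.min(concurrency, paths.length) }, worker),
  );
  return results;
}
```

The pool shape keeps at most 15 requests in flight without needing an external semaphore library.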
Chat Pipeline (Hybrid RAG with MoE Router):
- User query → LLM generates 2 alternative search queries (3 total) via MoE router
- All 3 queries embedded → pgvector cosine similarity search (8 results each)
- PostgreSQL full-text search runs in parallel using GIN index
- Vector + FTS results merged and deduplicated
- Top 5 matches expand: neighbor chunks (±1 chunk index) fetched
- Cohere cross-encoder reranking applied (top 12 results)
- Dependency graph neighbors retrieved for matched files
- Architecture query detection for enhanced dependency context
- File importance weighting applied (1.15× for `src/`, `app/`, `lib/`, `core/`)
- Top 12 context blocks assembled with symbol metadata and line ranges
- Context + conversation history → MoE Router selects optimal provider
- Response streamed back via Server-Sent Events
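The merge, dedup, and importance-weighting steps above can be sketched as follows. The `Hit` shape and normalized score scale are assumptions for illustration; the 1.15× boost and top-12 cap come from the pipeline description:

```typescript
// Hybrid merge sketch: deduplicate vector and full-text hits by chunk id,
// keeping the higher score, then apply the 1.15× boost for core paths.
// The `Hit` shape and 0..1 score scale are illustrative assumptions.
interface Hit {
  chunkId: string;
  filePath: string;
  score: number; // assumed normalized to 0..1
}

const BOOSTED_PREFIXES = ["src/", "app/", "lib/", "core/"];

function mergeHits(vectorHits: Hit[], ftsHits: Hit[], limit = 12): Hit[] {
  const byId = new Map<string, Hit>();
  for (const hit of [...vectorHits, ...ftsHits]) {
    const existing = byId.get(hit.chunkId);
    if (!existing || hit.score > existing.score) byId.set(hit.chunkId, hit);
  }
  return [...byId.values()]
    .map((hit) => ({
      ...hit,
      score: BOOSTED_PREFIXES.some((p) => hit.filePath.startsWith(p))
        ? hit.score * 1.15 // importance boost for core directories
        : hit.score,
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```

Keeping the higher of the two scores per chunk lets a result that ranks well in either channel survive into the reranking stage.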
| Layer | Technology |
|---|---|
| Framework | Next.js 16 (App Router, Turbopack) |
| Language | TypeScript 5 |
| Styling | Tailwind CSS 4 (@theme tokens) |
| Auth | Clerk (GitHub OAuth) |
| Database | Supabase (PostgreSQL + pgvector) |
| Cache | Upstash Redis (tree caching, 1hr TTL) |
| Embeddings | HuggingFace Inference API (MiniLM-L6-v2, 384-dim) |
| LLM Router | Mixture-of-Experts task-based routing |
| LLM — Chat Streaming | Cerebras (LLaMA 3.3 70B) → Groq fallback |
| LLM — Query Expansion | Groq (LLaMA 3.3 70B) → OpenRouter fallback |
| LLM — Deep Reasoning | DeepSeek V3 via OpenRouter → Groq fallback |
| LLM — Large Context | Gemini Flash (1M context) → OpenRouter fallback |
| LLM — Consensus | OpenRouter (Qwen 2.5 7B) → Together AI fallback |
| Reranking | Cohere Cross-Encoder |
| Graph Visualization | React Flow (@xyflow/react) |
| Background Jobs | Inngest (serverless, runs on Vercel) |
| GitHub API | Octokit |
| Charts | Recharts |
| Animations | Framer Motion |
| Validation | Zod |
| Icons | Lucide React |
GitMetrix/
├── src/
│ ├── app/
│ │ ├── api/
│ │ │ ├── analytics/
│ │ │ │ └── route.ts # Bus Factor + Contributor Influence API
│ │ │ ├── architecture/
│ │ │ │ └── graph/
│ │ │ │ └── route.ts # Dependency graph API with circular detection
│ │ │ ├── chat/
│ │ │ │ ├── route.ts # Hybrid RAG chat (multi-query + Cohere reranking)
│ │ │ │ └── suggestions/
│ │ │ │ └── route.ts # Symbol-aware starter questions
│ │ │ ├── code-health/
│ │ │ │ └── route.ts # Static analysis + AI recommendations
│ │ │ ├── index/
│ │ │ │ ├── route.ts # Start repo indexing via Inngest
│ │ │ │ └── status/
│ │ │ │ └── route.ts # Enriched indexing progress stats
│ │ │ ├── inngest/
│ │ │ │ └── route.ts # Inngest serve endpoint
│ │ │ ├── pr-review/
│ │ │ │ └── route.ts # AI PR review (diff analysis + scoring)
│ │ │ └── repos/
│ │ │ └── route.ts # Fetch user's GitHub repos
│ │ ├── architecture/
│ │ │ ├── layout.tsx # Architecture page layout with navigation
│ │ │ └── page.tsx # Interactive dependency graph visualizer
│ │ ├── chat/
│ │ │ └── page.tsx # User repo chat page
│ │ ├── dashboard/
│ │ │ └── page.tsx # Dashboard with Overview/Health/Analytics tabs
│ │ ├── pr-review/
│ │ │ ├── layout.tsx # PR review layout with navigation
│ │ │ └── page.tsx # AI pull request reviewer
│ │ ├── search/
│ │ │ └── page.tsx # Universal repo search chat
│ │ ├── sign-in/[[...sign-in]]/
│ │ │ └── page.tsx
│ │ ├── sign-up/[[...sign-up]]/
│ │ │ └── page.tsx
│ │ ├── globals.css # Tailwind v4 theme + custom animations
│ │ ├── layout.tsx # Root layout with Clerk + fonts
│ │ └── page.tsx # Landing page
│ ├── components/
│ │ ├── ui/
│ │ │ ├── card.tsx # Glassmorphism card component
│ │ │ ├── charts.tsx # Recharts activity + language charts
│ │ │ └── skeleton.tsx # Loading skeleton component
│ │ ├── analytics-tab.tsx # Bus Factor + Influence leaderboard
│ │ ├── animated-background.tsx # Animated beams + glow background
│ │ ├── architecture-graph.tsx # React Flow dependency graph component
│ │ ├── chat-interface.tsx # Full chat UI (phases, streaming, markdown)
│ │ ├── code-health-tab.tsx # Health gauge + risk files + AI recommendations
│ │ ├── dashboard-content.tsx # Dashboard cards + grid layout
│ │ ├── dashboard-header.tsx # Glassmorphism navbar + mobile drawer
│ │ ├── dashboard-tabs.tsx # Tab switcher (Overview/Health/Analytics)
│ │ ├── navigation.tsx # Shared nav bar with mobile hamburger
│ │ ├── repo-selector.tsx # Repository browser with search/filter
│ │ └── username-search.tsx # GitHub username search input
│ ├── inngest/
│ │ └── index-github-repo.ts # Multi-step background indexing function
│ ├── lib/
│ │ ├── agentic-navigator.ts # Multi-hop dependency traversal + AI explanation
│ │ ├── chunker.ts # AST-aware code chunking (100-char min, symbol boundaries)
│ │ ├── dependency-graph.ts # Import/export graph builder + BFS traversal
│ │ ├── embeddings.ts # HuggingFace embedding client (batch 32, retry)
│ │ ├── github.ts # GitHub API helpers (dashboard stats, repos)
│ │ ├── groq.ts # Groq SDK singleton (legacy, used by llm/groq.ts)
│ │ ├── indexer.ts # Pipeline orchestrator (fetch→parse→chunk→embed→store)
│ │ ├── inngest.ts # Inngest client instance
│ │ ├── llm/
│ │ │ ├── cerebras.ts # Cerebras provider (LLaMA 3.3 — fast streaming)
│ │ │ ├── cohere.ts # Cohere cross-encoder reranker
│ │ │ ├── gemini.ts # Gemini Flash provider (1M context window)
│ │ │ ├── groq.ts # Groq provider (LLaMA 3.3 70B)
│ │ │ ├── llmRouter.ts # MoE task-based router (6 task types)
│ │ │ ├── openrouter.ts # OpenRouter provider (Qwen 2.5 + DeepSeek V3)
│ │ │ └── together.ts # Together AI provider (Qwen 2.5 7B Turbo)
│ │ ├── parser.ts # Multi-language AST symbol extractor (10 languages)
│ │ ├── redis.ts # Upstash Redis client
│ │ ├── static-analysis.ts # Cyclomatic complexity + nesting + risk scoring
│ │ ├── supabase.ts # Supabase client
│ │ ├── types.ts # TypeScript type definitions
│ │ ├── utils.ts # Utility functions
│ │ └── validators.ts # Zod validation schemas
│ └── middleware.ts # Clerk auth middleware
├── supabase/
│ ├── migration.sql # Base schema (repositories, repository_files, pgvector)
│ ├── migration_v2.sql # Symbol metadata columns + dependency_edges table
│ ├── migration_v3.sql # GIN full-text search index
│ └── migration_v4.sql # Code metrics, PR reviews, file hashes tables
├── .gitignore
├── package.json
├── tsconfig.json
└── next.config.ts
NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY=
CLERK_SECRET_KEY=
NEXT_PUBLIC_CLERK_SIGN_IN_URL=/sign-in
NEXT_PUBLIC_CLERK_SIGN_UP_URL=/sign-up
GITHUB_TOKEN=
NEXT_PUBLIC_SUPABASE_URL=
SUPABASE_SERVICE_ROLE_KEY=
UPSTASH_REDIS_REST_URL=
UPSTASH_REDIS_REST_TOKEN=
HUGGINGFACE_API_KEY=
GROQ_API_KEY=
OPENROUTER_API_KEY=
TOGETHER_API_KEY=
CEREBRAS_API_KEY=
GEMINI_API_KEY=
COHERE_API_KEY=
INNGEST_EVENT_KEY=
INNGEST_SIGNING_KEY=

GitMetrix is designed for Vercel deployment. Inngest runs as a serverless function via the `/api/inngest` route — no separate server or CLI required.
- Push to GitHub
- Import into Vercel
- Add all environment variables
- Connect Inngest Cloud to your Vercel deployment URL
- Run the Supabase migrations in order:
  - `supabase/migration.sql` — base schema
  - `supabase/migration_v2.sql` — symbol metadata + dependency edges
  - `supabase/migration_v3.sql` — full-text search index
  - `supabase/migration_v4.sql` — code metrics, PR reviews, file hashes
User Query → MoE Router → Task Classification
│
├── chat_stream → Cerebras → Groq → OpenRouter → Together
├── query_expansion → Groq → OpenRouter → Together
├── large_context → Gemini Flash → OpenRouter → Together
├── deep_reasoning → DeepSeek V3 (via OpenRouter) → Groq
├── consensus → OpenRouter (Qwen) → Together → Groq
└── general → Groq → Cerebras → OpenRouter → Together
The system uses a task-based Mixture-of-Experts router:
- Each task type has a prioritized fallback chain of LLM providers
- DeepSeek V3 is accessed through OpenRouter (no direct API) for deep reasoning tasks
- Cerebras handles fast chat streaming with Groq as fallback
- Gemini Flash processes large context tasks (up to 1M tokens)
- Cohere provides cross-encoder reranking for RAG retrieval precision
- Auto-escalates to `large_context` when token count exceeds 30K
- The system works with only `GROQ_API_KEY` + `OPENROUTER_API_KEY` — other providers activate when their keys are present
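The fallback behavior above amounts to walking an ordered provider chain per task until one call succeeds. This sketch takes the chains from the routing diagram; the `callProvider` signature is an assumption, not the router's real API:

```typescript
// MoE router fallback sketch: each task type maps to an ordered provider
// chain (taken from the routing diagram above); the router tries each
// provider until one succeeds. `callProvider` is an assumed signature.
type Task =
  | "chat_stream" | "query_expansion" | "large_context"
  | "deep_reasoning" | "consensus" | "general";

const CHAINS: Record<Task, string[]> = {
  chat_stream: ["cerebras", "groq", "openrouter", "together"],
  query_expansion: ["groq", "openrouter", "together"],
  large_context: ["gemini", "openrouter", "together"],
  deep_reasoning: ["openrouter-deepseek", "groq"],
  consensus: ["openrouter", "together", "groq"],
  general: ["groq", "cerebras", "openrouter", "together"],
};

async function route(
  task: Task,
  prompt: string,
  callProvider: (provider: string, prompt: string) => Promise<string>,
): Promise<string> {
  let lastError: unknown;
  for (const provider of CHAINS[task]) {
    try {
      return await callProvider(provider, prompt);
    } catch (err) {
      lastError = err; // provider down or key missing: try the next one
    }
  }
  throw lastError ?? new Error(`no provider available for ${task}`);
}
```

Because a missing API key surfaces as a provider error, the same loop also explains why the system can run on just two keys: unavailable providers are simply skipped.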
| Metric | Value |
|---|---|
| File Fetch Concurrency | 15 parallel requests |
| Max Files Per Repo | 1,500 |
| Embedding Batch Size | 32 |
| DB Insert Batch Size | 100 rows |
| Redis Tree Cache TTL | 1 hour |
| Supported Languages | 10 (TS, JS, Python, Java, Go, Rust, C/C++, Ruby, PHP) |
| Retrieval Queries Per Chat | 3 (multi-query) |
| Context Chunks Per Response | Up to 12 (Cohere-reranked) |
| LLM Providers | 6 (Cerebras, Groq, Gemini, DeepSeek via OR, OpenRouter, Together) |
| Task Types | 6 (chat_stream, query_expansion, large_context, deep_reasoning, consensus, general) |
MIT