uprockcom/code-index

Code Index

A standalone, AI-oriented code indexing framework that speeds up code discovery for AI agents (like Claude or Gemini). Think of it as the indexing engine behind an IDE like IntelliJ or Eclipse, but exposed as a standalone service.

  • Supported Languages: Java, Kotlin, Rust, Go, C, C++
  • Interfaces: MCP (stdio) for AI agents, JSON-RPC (HTTP) for custom scripts
  • Storage: Fast, local SQLite database

What it does

Instead of forcing AI agents to grep blindly through hundreds of files, Code Index parses your source code and builds a relational graph of your codebase.

An AI agent can instantly ask precise questions like:

  • "What methods does the UserController class have?"
  • "Find every function that calls processPayment()."
  • "What classes implement the PaymentGateway interface?"
  • "Show me the definition of the User struct."

It supports full cross-file resolution, tracks type hierarchies, and even updates incrementally in real-time as you modify files. Mixed-language projects (e.g., Chromium with C++ and Java, or Android apps with Java and Kotlin) are indexed into a single unified graph.
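For example, the first question above can be answered with one structured tool call instead of a series of greps. A minimal sketch of what such a call might look like — the tool name and argument shape here are illustrative placeholders, not the server's actual API:

```python
# Hypothetical MCP tool call (tool and argument names are illustrative only):
tool_call = {
    "tool": "list_class_members",
    "arguments": {"class": "UserController", "kind": "method"},
}

# The index answers with structured records (name, file, line) rather than
# raw file text, which is what keeps agent token usage low.
print(tool_call["tool"])
```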

Features

  • Blazing Fast Symbol Lookup: Instantly jump to classes, functions, methods, traits, and macros.
  • Call Graphs & Type Hierarchies: Navigate up and down the call stack and inheritance trees.
  • Cross-References: Find all reads, writes, and usages of a specific symbol.
  • Mixed-Language Projects: All supported languages are indexed into one database — query across language boundaries in a single project.
  • Live File Watching: The index stays up-to-date automatically as you type and save files.
  • Full Text & Semantic Search: Search by exact keywords (FTS5) or use AI embeddings to search by natural language meaning.
  • Build-System Aware: Automatically detects source roots for Gradle, Maven, and Cargo projects, ignoring generated folders like build/ or target/.

Installation

You need the Rust toolchain installed.

# Clone the repository
git clone https://github.com/your-username/code-index.git
cd code-index

# Build the release binary with semantic search enabled
cargo build --release -p code-index-server --features semantic

# The binary will be available at target/release/code-index-server

(Note: the first build may take a while, since it compiles the C sources for the Tree-sitter parsers and the bundled SQLite.)

Usage: Connecting AI Agents (MCP)

The easiest way to use Code Index is by connecting it to an AI CLI tool using the Model Context Protocol (MCP). The configuration is exactly the same for both Gemini CLI and Claude CLI.

Create or update the appropriate settings file for your agent:

  • Gemini CLI: .gemini/settings.json (in your project root)
  • Claude CLI: .mcp.json (in your project root)

Add the following configuration:

{
  "mcpServers": {
    "code-index": {
      "command": "/absolute/path/to/code-index/target/release/code-index-server",
      "args": ["--transport", "stdio", "--root", "."]
    }
  }
}

The CLI will automatically detect the server and provide the indexing and search tools.

Setting up Semantic Search (Optional)

By default, Code Index uses exact-match and structural search. If you want the ability to search your codebase using natural language (e.g., "How is the user profile updated?"), you can enable Semantic Search.

Semantic Search requires an embedding provider. The server is compatible with both paid OpenAI models and free, locally-hosted models (like Ollama).

Using a Free Local Model (Ollama)

  1. Install Ollama on Linux or macOS:
    curl -fsSL https://ollama.com/install.sh | sh
  2. Download the local embedding model (nomic-embed-text is a small ~274MB model perfect for code):
    ollama pull nomic-embed-text
  3. Start the code-index-server with the following environment variables (you can put these in your .bashrc or export them before running your AI CLI):
    export CODE_INDEX_EMBEDDING_API_URL="http://localhost:11434/v1/embeddings"
    export CODE_INDEX_EMBEDDING_MODEL="nomic-embed-text"
    export CODE_INDEX_EMBEDDING_API_KEY="dummy"  # Required by the client, but ignored by Ollama
    export CODE_INDEX_EMBEDDING_DIMENSIONS="768"
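Before pointing the server at Ollama, it can be worth sanity-checking the endpoint. The sketch below uses only the Python standard library and assumes Ollama's default port and its OpenAI-compatible /v1/embeddings route; the actual network call is left commented out since it needs a running Ollama.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/v1/embeddings"  # Ollama's default port

def embedding_request(text, model="nomic-embed-text"):
    """Build the JSON body for an OpenAI-compatible /v1/embeddings call."""
    return {"model": model, "input": text}

def embed(text, url=OLLAMA_URL):
    """POST the request to a local Ollama instance and return the vector."""
    body = json.dumps(embedding_request(text)).encode()
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["data"][0]["embedding"]

# Requires Ollama to be running. nomic-embed-text produces 768-dimensional
# vectors, which is why CODE_INDEX_EMBEDDING_DIMENSIONS is set to 768 above.
# print(len(embed("fn main() {}")))
```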

Using OpenAI

Export your API key. The server will default to text-embedding-3-small.

export CODE_INDEX_EMBEDDING_API_KEY="sk-your-openai-key"

Note: If no API key is provided, the server will gracefully disable semantic search and continue providing fast structural and text searches.

Advanced Usage

Running as a standalone JSON-RPC Server

You can run the server directly and query it using HTTP/JSON-RPC (defaults to port 9120):

cargo run -p code-index-server -- --root /path/to/your/project
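Once the server is up, any script can talk to it with plain JSON-RPC 2.0 over HTTP. Below is a minimal stdlib-only client sketch — the method name find_symbol and its parameters are hypothetical placeholders; see the design docs for the real API surface.

```python
import json
import urllib.request

def rpc_request(method, params, req_id=1):
    """Build a JSON-RPC 2.0 request body."""
    return {"jsonrpc": "2.0", "id": req_id, "method": method, "params": params}

def call(method, params, url="http://localhost:9120"):
    """POST a JSON-RPC request to the server and return the decoded response."""
    body = json.dumps(rpc_request(method, params)).encode()
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Hypothetical method and params -- check the API reference for real names.
# print(call("find_symbol", {"name": "UserController"}))
```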

Brand / Flavor Configuration (Android)

For Android projects with build variants (flavors), use --brand and --source-rule to control which source sets get indexed:

{
  "mcpServers": {
    "code-index": {
      "command": "/absolute/path/to/code-index/target/release/code-index-server",
      "args": [
        "--transport", "stdio",
        "--root", ".",
        "--brand", "exampleBrand",
        "--source-rule", "common:src/main/java",
        "--source-rule", "brand:src/{brand}/java"
      ]
    }
  }
}

To switch brands at runtime without restarting, agents can use the set_brand MCP tool.

Large codebases

For projects with more than 10,000 source files, the default per-file indexing may be slow. Use --bulk to switch to a faster bulk indexing pipeline that resolves imports, calls, and type relations via SQL batch operations:

code-index index --root /path/to/project --bulk

Or in .mcp.json:

{
  "mcpServers": {
    "code-index": {
      "command": "code-index-server",
      "args": ["--transport", "stdio", "--root", ".", "--bulk"]
    }
  }
}

Benchmarks

We run ablation benchmarks to measure the impact of Code Index on AI agent performance. Each benchmark task is executed twice — once with MCP tools enabled, once without — to compare token usage, cost, and speed.

Methodology: Average of 3 runs, Claude CLI with Opus 4.6 (1M context).

Android project (~860 files, ~8000 symbols)

| Task type | With MCP | Without MCP | Winner |
|---|---|---|---|
| File symbol listing | 18.4s, $0.13 | 69.4s, $0.30 | MCP (3.8x faster, 2.3x cheaper) |
| Call graph traversal | 44.5s, $0.24 | 79.2s, $0.30 | MCP (1.8x faster, 1.3x cheaper) |
| Cross-module tracing | 39.3s, $0.19 | 66.8s, $0.24 | MCP (1.7x faster, 1.2x cheaper) |
| Find implementations | 22.5s, $0.14 | 29.4s, $0.16 | MCP (1.3x faster, 1.2x cheaper) |
| Find definition | 20.3s, $0.23 | 16.8s, $0.21 | Grep (faster and cheaper for simple lookup) |

Quality: 100% in both modes across all tasks. Overall: MCP is 44% faster and 23% cheaper ($2.78 vs $3.63).

Chromium project (~86K files, ~1.8M symbols, C++ and Java)

| Task type | With MCP | Without MCP | Winner |
|---|---|---|---|
| C++ cross-module trace | 314s, $0.50, 93% quality | 334s, $1.39, 100% quality | MCP (2.8x cheaper, similar speed) |
| C++ call graph | 253s, $0.51, 93% quality | 344s, $0.61, 93% quality | MCP (1.4x faster, 1.2x cheaper) |
| C++ find definition | 66s, $0.31, 100% quality | 84s, $0.39, 100% quality | MCP (1.3x faster, 1.3x cheaper) |
| C++ file symbols | 268s, $0.89, 80% quality | 333s, $1.03, 100% quality | MCP faster/cheaper, grep better quality |
| C++ interface impls | 232s, $0.45, 93% quality | 162s, $0.42, 60% quality | MCP better quality, grep faster |
| Java interface impls | 50s, $0.16, 100% quality | 72s, $0.23, 100% quality | MCP (1.4x faster, 1.5x cheaper) |
| Java call graph | 106s, $0.30, 83% quality | 111s, $0.35, 100% quality | MCP cheaper, grep better quality |
| Java find definition | 49s, $0.15, 100% quality | 33s, $0.12, 100% quality | Grep (faster and cheaper) |

Overall: MCP 28% cheaper ($9.81 vs $13.67) and 9% faster (4014s vs 4421s). Quality is similar (MCP 93% vs grep 94%) — MCP occasionally misses facts on complex exploration, while grep occasionally misses implementations that require structural traversal to find.

Summary

| Metric | Android (860 files) | Chromium (86K files) |
|---|---|---|
| MCP speed advantage | 44% faster | 9% faster |
| MCP cost advantage | 23% cheaper | 28% cheaper |
| Quality | identical (100%) | similar (93% vs 94%) |
| MCP cost-effective wins | 4/5 tasks | 5/8 tasks |

Takeaway: Code Index consistently reduces cost across both project sizes. The speed advantage is largest on the small project (44%) where grep has to search many files per query. On the large project, the cost savings are more significant (28%) because grep operations on 86K files are expensive. Quality is comparable — MCP sometimes explores too aggressively (missing facts in 26-turn call graph sessions), while grep sometimes misses implementations that require structural traversal.

When Code Index helps most

Best for:

  • Cost reduction on large codebases — grep on 86K files is expensive; structured index queries are cheap per call
  • Multi-hop structural queries — call graphs, cross-module tracing, type hierarchies
  • Completeness-sensitive tasks — finding all implementations of a widely-inherited class
  • Disambiguation — querying a specific connect method out of thousands of matches

Not needed for:

  • Simple "where is X defined?" lookups — grep is faster for single-hop queries
  • Tasks where answer quality is critical and exploration depth is unpredictable

Running your own benchmarks

Results and scripts are in tests/benchmark/.

./tests/benchmark/bench.sh --prompts prompts_example.txt --runs 3

See prompts_example.txt for the prompt format.

Contributing and Technical Details

If you are looking to contribute, understand the architecture, or view the raw JSON-RPC API payloads, please see:

  • CLAUDE.md - For workspace layout, build instructions, and development guidelines.
  • Design Docs - For architecture, API reference, and resolution algorithm.
