A high-performance MCP (Model Context Protocol) server designed specifically for Claude Code that enables precision semantic search through codebases and documentation. Built for agentic search.
Unlike traditional search tools that flood agents with irrelevant data, MCP Vector Search uses a sophisticated 3-stage pipeline to deliver only the most relevant results.
```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Claude Code   │    │  Vector Search  │    │    Reranker     │    │    AI Filter    │
│      Query      │───▶│                 │───▶│                 │───▶│                 │───▶ Results
│ "find auth code"│    │    ChromaDB     │    │    VoyageAI     │    │   Claude CLI    │
│                 │    │ ~20 candidates  │    │   Relevance     │    │   Precision     │
└─────────────────┘    └─────────────────┘    │    Scoring      │    │   Filtering     │
                                              └─────────────────┘    └─────────────────┘

Stage Impact:          Recall:    ~100%       Precision: ~19%        Precision: ~55%
                       Precision: ~5%         Speed:     +1s         Speed:     +14s
                       Speed:     ~0.2s       (Optional)             (Optional)
```
Pipeline Stages:
- Vector Search: Semantic similarity matching using embeddings
- Reranker: Advanced relevance scoring (VoyageAI rerank models)
- AI Filter: Intelligent result filtering using Claude for precision
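Conceptually the three stages form a funnel: each stage receives the previous stage's candidates and either re-scores or drops them. A minimal sketch of that flow with stand-in stages (the function bodies below are illustrative toys, not the server's actual internals):

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    path: str
    score: float  # similarity (stage 1) or relevance (stage 2), on a 0-1 scale

# Stand-ins for ChromaDB, the VoyageAI reranker, and the Claude CLI judge.
def vector_search(query: str, n_results: int) -> list[Candidate]:
    corpus = [Candidate("auth/login.py", 0.81), Candidate("auth/tokens.py", 0.78),
              Candidate("README.md", 0.34), Candidate("ui/theme.css", 0.12)]
    return sorted(corpus, key=lambda c: c.score, reverse=True)[:n_results]

def rerank(query: str, candidates: list[Candidate]) -> list[Candidate]:
    return candidates  # pretend the reranker returned the same relevance scores

def ai_filter(query: str, candidates: list[Candidate]) -> list[Candidate]:
    return candidates  # pretend Claude judged every survivor relevant

def search_pipeline(query: str, *, threshold: float = 0.7,
                    use_reranker: bool = True, use_ai_filter: bool = False) -> list[Candidate]:
    candidates = vector_search(query, n_results=20)              # stage 1: broad recall
    if use_reranker:                                             # stage 2: adaptive cut-off
        candidates = [c for c in rerank(query, candidates) if c.score >= threshold]
    if use_ai_filter:                                            # stage 3: LLM relevance judge
        candidates = ai_filter(query, candidates)
    return candidates

print(search_pipeline("find auth code"))  # only the two auth files survive the 0.7 threshold
```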
All performance tests were conducted on 506 Netwrix Auditor knowledge base articles. Each article is paired with a specific question, and automated test scripts measure whether search finds the correct article for that question. Results show real-world search accuracy across different configurations.
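In the tables below, Recall@k is the fraction of questions whose correct article appears in the top k results, MRR is the mean reciprocal rank of the correct article, and Mean Position is its average rank. A minimal sketch of how these fall out of the per-question ranks (illustrative, not the project's benchmark script):

```python
def evaluate(ranks: list[int], ks: tuple[int, ...] = (1, 3, 5, 10, 20, 50)) -> dict[str, float]:
    """ranks[i] is the 1-based position of the correct article for question i."""
    n = len(ranks)
    metrics = {f"recall@{k}": sum(r <= k for r in ranks) / n for k in ks}
    metrics["mrr"] = sum(1 / r for r in ranks) / n
    metrics["mean_position"] = sum(ranks) / n
    return metrics

# Toy run: three questions whose correct articles were ranked 1st, 2nd and 4th.
print(evaluate([1, 2, 4]))
# recall@1 = 0.33, recall@3 = 0.67, recall@5 = 1.0, mrr = 0.58, mean_position = 2.33
```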
| Configuration | Recall@1 | Recall@3 | Recall@5 | Recall@10 | Recall@20 | Recall@50 | MRR | Mean Position | Avg Search Time |
|---|---|---|---|---|---|---|---|---|---|
| voyage-3-large-2048dimensions | 86.03% | 93.41% | 97.60% | 100.00% | 100.00% | 100.00% | 0.907 | 1.37 | 0.232s |
| voyage-3-large-1024dimensions | 84.43% | 93.81% | 98.20% | 100.00% | 100.00% | 100.00% | 0.900 | 1.37 | 0.227s |
| voyage-3.5-lite-2048dimensions | 83.23% | 93.01% | 96.41% | 99.20% | 99.60% | 99.80% | 0.888 | 1.79 | 0.249s |
| openai-3-large-chunk-1024 | 77.05% | 92.61% | 97.41% | 99.00% | 99.40% | 99.80% | 0.856 | 1.75 | 0.371s |
| ollama-snowflake-chunk-128 | 72.26% | 87.62% | 91.42% | 95.61% | 97.41% | 98.40% | 0.807 | 4.01 | 0.139s |
Voyage AI leads with its 2048-dimension models. Chunking is disabled by default (the full context window is used); large files are chunked automatically only when they exceed the model limit.
Optimal chunk sizes when chunking is enabled: OpenAI (1024 tokens), Ollama (128 tokens).
Enterprise usage: Voyage AI (open docs only), OpenAI (company-supported), Ollama (local/private).
Distribution of reranker relevance scores assigned to the correct document (lower percentiles across all questions):
| Configuration | Min | p1 | p3 | p5 | p10 | p25 | Median | Mean | Max |
|---|---|---|---|---|---|---|---|---|---|
| voyage-3-large-2048-rerank-2.5 | 0.746 | 0.797 | 0.844 | 0.859 | 0.891 | 0.922 | 0.941 | 0.932 | 0.973 |
The reranker requires a Voyage AI API key, so it is suitable for open documentation only.
RERANKER_THRESHOLD controls the precision/recall trade-off. On the Netwrix Auditor dataset the lowest relevance score assigned to a correct file is 0.746, so a threshold of 0.7 is safe and keeps every relevant file.
Recommended thresholds: documentation (0.6-0.7), code (0.5).
| Threshold | Description | Mean Docs | Median | Min | Max | Precision* |
|---|---|---|---|---|---|---|
| 0.700 | Current (baseline) | 6.52 | 5 | 1 | 20 | 15.3% |
| 0.797 | Safe (99% pass) | 3.83 | 2 | 0 | 20 | 26.1% |
| 0.844 | Balanced (97% pass) | 2.65 | 1 | 0 | 16 | 37.7% |
| 0.859 | Aggressive (95% pass) | 2.45 | 1 | 0 | 15 | 40.8% |
*Precision = 1 / Mean Docs × 100%
Higher thresholds reduce document count and improve precision from 15.3% to 40.8%.
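The footnote's formula is just the expected share of relevant documents in a response when exactly one document is relevant per query; a quick sanity check against the rows above:

```python
# Precision = 1 / mean documents returned (one relevant document per query)
for threshold, mean_docs in [(0.700, 6.52), (0.797, 3.83), (0.844, 2.65), (0.859, 2.45)]:
    print(f"threshold {threshold:.3f}: precision = {1 / mean_docs:.1%}")
# threshold 0.700: precision = 15.3%
# threshold 0.797: precision = 26.1%
# threshold 0.844: precision = 37.7%
# threshold 0.859: precision = 40.8%
```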
| Model | Precision | Recall | F1 Score | Accuracy |
|---|---|---|---|---|
| claude-sonnet-4-20250514 | 0.900 | 0.900 | 0.900 | 0.900 |
| claude-opus-4-1-20250805 | 0.818 | 0.900 | 0.857 | 0.850 |
| claude-3-5-haiku-20241022 | 0.750 | 0.300 | 0.429 | 0.600 |
Claude Sonnet 4: 90% precision/recall - optimal for filtering. Claude Opus 4: 81.8% precision - acceptable with more false positives. Claude Haiku: 30% recall - unsuitable for filtering.
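F1 is the harmonic mean of precision and recall, which is why Haiku's decent precision cannot make up for its 30% recall; the table's F1 values follow directly:

```python
def f1(precision: float, recall: float) -> float:
    return 2 * precision * recall / (precision + recall)

print(round(f1(0.900, 0.900), 3))  # 0.9   -> claude-sonnet-4
print(round(f1(0.818, 0.900), 3))  # 0.857 -> claude-opus-4-1
print(round(f1(0.750, 0.300), 3))  # 0.429 -> claude-3-5-haiku
```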
| Configuration | Recall@1 | Recall@3 | Recall@5 | Recall@10 | MRR | Mean Position | Precision* | Avg Search Time |
|---|---|---|---|---|---|---|---|---|
| voyage-3-large-rerank-threshold-0.888 | 81.4% | 88.8% | 89.4% | 90.2% | 0.852 | 1.17 | 54.9% | ~1.0s |
| voyage-3-large-rerank-aifilter | 77.45% | 82.04% | 83.23% | 84.63% | 0.800 | 1.22 | 55.25% | 14.78s |
| voyage-3-large-2048-rerank | 88.02% | 96.61% | 98.40% | 100.00% | 0.927 | 1.26 | 18.98% | 0.985s |
| voyage-3-large-2048dimensions | 86.03% | 93.41% | 97.60% | 100.00% | 0.907 | 1.37 | 5.00% | 0.232s |
*Precision = 1 / Mean Documents Returned
Key Findings:
- Optimal approach: the reranker with threshold=0.888 reaches roughly the same precision as the AI Filter (54.9% vs 55.25%) while keeping better recall and far lower latency
- Performance comparison at ~55% precision:
- Reranker (0.888): Recall@1=81.4%, Recall@5=89.4%, Speed=1s
- AI Filter: Recall@1=77.45%, Recall@5=83.23%, Speed=15s
Recommendations by Use Case:
- Open documentation: Use Voyage AI with high reranker threshold (0.5-0.8) for optimal speed and precision
- Company codebases: Use OpenAI models with AI Filter to achieve similar precision results when Voyage AI unavailable
Why Precision is Critical for Claude Code:
Precision is the most important metric for agent search systems due to Claude Code's limited context window. Low precision has severe consequences:
Context Window Waste: With 5% precision (semantic search only), returning 20 documents means only 1 is relevant - wasting 95% of valuable context on irrelevant data. Claude Code agents often perform multiple searches, and context depletion ends conversations prematurely.
Model Confusion: Irrelevant documents can mislead the model, causing incorrect conclusions or actions.
Semantic Search Limitations:
- Always returns K results regardless of relevance
- No concept of "no relevant results": it still returns K documents even when none are relevant
- Cannot adapt the result count: it returns only K results even when more than K relevant files exist
Reranker Solution: Unlike semantic search, reranker evaluates query-document relevance with threshold filtering:
- No relevant docs: Returns empty results (threshold not met)
- Multiple relevant docs: Returns all above threshold
- Adaptive results: Result count varies based on actual relevance
This threshold-based filtering is the key improvement that makes search practical for agent use.
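A concrete illustration of the difference, using made-up reranker scores for two queries (this mirrors the behaviour described above, not the server's actual code):

```python
def top_k(scored: list[tuple[str, float]], k: int = 4) -> list[str]:
    """Plain semantic search: always returns k paths, relevant or not."""
    return [path for path, _ in sorted(scored, key=lambda x: x[1], reverse=True)[:k]]

def above_threshold(scored: list[tuple[str, float]], threshold: float = 0.7) -> list[str]:
    """Reranker-style filtering: the result count adapts to actual relevance."""
    return [path for path, score in scored if score >= threshold]

off_topic  = [("ui/theme.css", 0.21), ("docs/faq.md", 0.18), ("Makefile", 0.09), ("README.md", 0.31)]
auth_heavy = [("auth/login.py", 0.91), ("auth/tokens.py", 0.88),
              ("auth/middleware.py", 0.84), ("auth/oauth.py", 0.79), ("ui/theme.css", 0.22)]

print(top_k(off_topic))             # 4 irrelevant files, because k is fixed
print(above_threshold(off_topic))   # [] (nothing met the threshold)
print(above_threshold(auth_heavy))  # all 4 genuinely relevant files
```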
- Python 3.10+
- One of the following embedding providers:
- VoyageAI API key for cloud embeddings
- OpenAI API key for cloud embeddings
- Ollama for local embeddings
- (Optional) Claude CLI for AI filtering
Pick the extra that matches your embedding provider:
```bash
# VoyageAI only
pip install "git+https://github.com/Amico1285/mcp-vector-search.git#egg=mcp-vector-search[voyage]"

# OpenAI only
pip install "git+https://github.com/Amico1285/mcp-vector-search.git#egg=mcp-vector-search[openai]"

# Ollama only
pip install "git+https://github.com/Amico1285/mcp-vector-search.git#egg=mcp-vector-search[ollama]"

# Everything
pip install "git+https://github.com/Amico1285/mcp-vector-search.git#egg=mcp-vector-search[all]"
```

This installs the `mcp-vector-search` command into your active environment.
```bash
git clone https://github.com/Amico1285/mcp-vector-search.git
cd mcp-vector-search
python3 -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate
pip install -e ".[all]"
```

Run the server once to verify the install:

```bash
mcp-vector-search
```

You should see the FastMCP startup banner. Press Ctrl+C to stop.
If mcp-vector-search isn't on your PATH (typical when installed inside a virtualenv that isn't activated), you can run the package as a module instead — same result:
```bash
python -m code_search_mcp
```

This MCP server is designed to work seamlessly with Claude Code, providing agents with powerful search and database management capabilities.
Option A: Using Claude CLI command
```bash
# Add as project-scoped server (creates .mcp.json in project root)
claude mcp add code-search --scope project mcp-vector-search
```

Option B: Manual configuration
- Copy `.mcp.json.example` to `.mcp.json` in your project root and fill in the values
- Exit Claude Code
- Open Claude Code again in the same project
- When prompted "Use MCP servers found in this project?", answer "Yes"
- The MCP server is now connected
Edit the .mcp.json file to add your configuration. Choose one of the embedding providers:
VoyageAI configuration:

```json
{
  "mcpServers": {
    "code-search": {
      "command": "mcp-vector-search",
      "env": {
        "CODEBASE_PATH": "/path/to/your/codebase",
        "EMBEDDING_PROVIDER": "voyage",
        "VOYAGE_API_KEY": "your-voyage-api-key-here",
        "VOYAGE_EMBEDDING_MODEL": "voyage-3-large",
        "VOYAGE_OUTPUT_DIMENSION": "2048",
        "SEMANTIC_SEARCH_N_RESULTS": "20",
        "RERANKER_ENABLED": "true",
        "RERANKER_THRESHOLD": "0.7",
        "RERANKER_MODEL": "rerank-2.5",
        "RERANKER_INSTRUCTIONS": "",
        "AI_FILTER_ENABLED": "false",
        "AI_FILTER_MODEL": "claude-sonnet-4-20250514",
        "AI_FILTER_TIMEOUT_SECONDS": "120",
        "MAX_RESULTS": "10",
        "LOGGING_VERBOSE": "false",
        "LOGGING_FILE_ENABLED": "false",
        "LOGGING_FILE_PATH": "Logs/search_operations.log",
        "PREVIEW_LINES_VECTORIZATION": "-1",
        "PREVIEW_LINES_STORAGE": "-1",
        "PREVIEW_LINES_RERANKER": "-1",
        "PREVIEW_LINES_AI_FILTER": "-1",
        "PREVIEW_CHARS_OUTPUT": "0"
      }
    }
  }
}
```
"mcpServers": {
"code-search": {
"command": "mcp-vector-search",
"env": {
"CODEBASE_PATH": "/path/to/your/codebase",
"EMBEDDING_PROVIDER": "openai",
"OPENAI_API_KEY": "sk-...",
"OPENAI_EMBEDDING_MODEL": "text-embedding-3-large",
"SEMANTIC_SEARCH_N_RESULTS": "20",
"RERANKER_ENABLED": "false",
"AI_FILTER_ENABLED": "true",
"AI_FILTER_MODEL": "claude-sonnet-4-20250514",
"AI_FILTER_TIMEOUT_SECONDS": "120",
"MAX_RESULTS": "10",
"LOGGING_VERBOSE": "false",
"LOGGING_FILE_ENABLED": "false",
"LOGGING_FILE_PATH": "Logs/search_operations.log",
"PREVIEW_LINES_VECTORIZATION": "-1",
"PREVIEW_LINES_STORAGE": "-1",
"PREVIEW_LINES_RERANKER": "100",
"PREVIEW_LINES_AI_FILTER": "40",
"PREVIEW_CHARS_OUTPUT": "0"
}
}
}
}{
"mcpServers": {
"code-search": {
"command": "mcp-vector-search",
"env": {
"CODEBASE_PATH": "/path/to/your/codebase",
"EMBEDDING_PROVIDER": "ollama",
"OLLAMA_BASE_URL": "http://localhost:11434",
"OLLAMA_EMBEDDING_MODEL": "snowflake-arctic-embed2",
"SEMANTIC_SEARCH_N_RESULTS": "20",
"RERANKER_ENABLED": "false",
"AI_FILTER_ENABLED": "true",
"AI_FILTER_MODEL": "claude-sonnet-4-20250514",
"AI_FILTER_TIMEOUT_SECONDS": "120",
"MAX_RESULTS": "10",
"LOGGING_VERBOSE": "false",
"LOGGING_FILE_ENABLED": "false",
"LOGGING_FILE_PATH": "Logs/search_operations.log",
"PREVIEW_LINES_VECTORIZATION": "-1",
"PREVIEW_LINES_STORAGE": "-1",
"PREVIEW_LINES_AI_FILTER": "40",
"PREVIEW_CHARS_OUTPUT": "0"
}
}
}
}Note: If mcp-vector-search is installed inside a virtualenv that is not on Claude Code's PATH, set command to the absolute path of that binary, e.g. /path/to/venv/bin/mcp-vector-search.
In Claude Code, use the /mcp command to reconnect and verify the server is running.
Once connected, Claude Code agents can autonomously:
Step 1: Create configuration
"Please analyze my codebase and create search configuration"
The agent will:
- Analyze your codebase structure and detect frameworks
- Generate configuration showing which files will be indexed
- Display total file count and exclusion patterns
- Allow you to review and adjust the configuration
Step 2: Start vectorization
"Start vectorizing the database with current configuration"
After reviewing the configuration, the agent uses a separate tool to begin vectorization. For large codebases, enable logging in .mcp.json to track progress:
"LOGGING_VERBOSE": "true",
"LOGGING_FILE_ENABLED": "true""Find the authentication implementation"
"Show me where error handling happens"
"Locate the API endpoint definitions"
"Find documentation about deployment"
"Update the search index with the latest changes"
"Exclude .txt files from search"
"Exclude /folder files from search"
"Add .md files to the search scope"
The server behavior is customized via environment variables.
Important: After changing configuration in .mcp.json, you need to restart Claude Code for changes to take effect (this also restarts the MCP server). To preserve your conversation progress, use:
```bash
claude --continue
```

| Variable | Description | Default |
|---|---|---|
| `CODEBASE_PATH` | Absolute path to the codebase to index | (required) |
| `EMBEDDING_PROVIDER` | Embedding provider: `voyage`, `openai`, or `ollama` | `voyage` |
| `DB_NAME` | ChromaDB collection name. Use a unique value per indexed codebase to keep multiple projects separate. | `codebase_files` |
Only the variables for your selected EMBEDDING_PROVIDER are read; the others are ignored.
| Variable | Description | Default |
|---|---|---|
| `VOYAGE_API_KEY` | Your VoyageAI API key | (required) |
| `VOYAGE_EMBEDDING_MODEL` | VoyageAI model for embeddings | `voyage-code-3` |
| `VOYAGE_OUTPUT_DIMENSION` | Output vector dimensions (256/512/1024/2048) | (model default) |
| `VOYAGE_ENABLE_CHUNKING` | Enable chunking for Voyage models | `true` |
| `VOYAGE_MAX_CHUNK_TOKENS` | Maximum tokens per chunk for Voyage | (model-specific) |
| `VOYAGE_CHUNK_OVERLAP_TOKENS` | Overlap tokens between chunks for Voyage | (model-specific) |
| `VOYAGE_MIN_CHUNK_TOKENS` | Minimum tokens per chunk for Voyage | (model-specific) |
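When chunking is enabled, the three chunk-size variables describe a sliding window over a file's tokens. A rough sketch of that windowing, with a whitespace split standing in for the real tokenizer (parameter names mirror VOYAGE_MAX_CHUNK_TOKENS, VOYAGE_CHUNK_OVERLAP_TOKENS and VOYAGE_MIN_CHUNK_TOKENS; the actual chunker may differ in detail):

```python
def chunk_tokens(tokens: list[str], max_tokens: int = 512,
                 overlap_tokens: int = 64, min_tokens: int = 32) -> list[list[str]]:
    """Sliding-window chunking: consecutive chunks share `overlap_tokens` tokens."""
    step = max_tokens - overlap_tokens
    chunks = [tokens[i:i + max_tokens] for i in range(0, len(tokens), step)]
    # Drop a trailing fragment smaller than min_tokens (a real chunker might merge it instead).
    if len(chunks) > 1 and len(chunks[-1]) < min_tokens:
        chunks.pop()
    return chunks

source = "def login(user): return issue_token(user)\n" * 300   # pretend: a large source file
print(len(chunk_tokens(source.split())))                        # number of chunks produced
```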
| Variable | Description | Default |
|---|---|---|
| `OPENAI_API_KEY` | Your OpenAI API key | (required) |
| `OPENAI_EMBEDDING_MODEL` | OpenAI model for embeddings | `text-embedding-3-large` |
| `OPENAI_BATCH_SIZE` | Batch size for OpenAI API requests | `2048` |
| Variable | Description | Default |
|---|---|---|
| `OLLAMA_BASE_URL` | Ollama server URL | `http://localhost:11434` |
| `OLLAMA_EMBEDDING_MODEL` | Ollama model name | `snowflake-arctic-embed2` |
| Variable | Description | Default |
|---|---|---|
| `SEMANTIC_SEARCH_N_RESULTS` | Initial candidates from vector search | `20` |
| `RERANKER_ENABLED` | Enable VoyageAI reranker (Voyage provider only) | `true` |
| `RERANKER_THRESHOLD` | Minimum relevance score; files below are filtered out | `0.5` |
| `RERANKER_MODEL` | Reranker model name | `rerank-2.5` |
| `RERANKER_INSTRUCTIONS` | Custom instructions sent to the reranker to bias relevance | (empty) |
| `RERANKER_USE_CHUNKS` | Send chunk text to the reranker instead of full file content | `false` |
| `AI_FILTER_ENABLED` | Enable Claude CLI filtering | `true` |
| `AI_FILTER_MODEL` | Claude model used for filtering | `claude-sonnet-4-20250514` |
| `AI_FILTER_TIMEOUT_SECONDS` | Timeout for the Claude CLI call | `120` |
| `MAX_RESULTS` | Maximum results returned by `search_files` | `10` |
Disabled by default. Set HYBRID_SEARCH_ENABLED=true to combine BM25 keyword matching with semantic search using Reciprocal Rank Fusion. Useful when literal token matches matter (function names, error strings, exact phrases).
| Variable | Description | Default |
|---|---|---|
| `HYBRID_SEARCH_ENABLED` | Enable BM25 + vector + RRF pipeline | `false` |
| `BM25_ONLY_MODE` | Skip vector search and use only BM25 | `false` |
| `RRF_K_PARAMETER` | RRF k constant (1–1000) | `60` |
| `RRF_WEIGHTS_ENABLED` | Use weighted RRF instead of standard RRF | `false` |
| `RRF_VECTOR_WEIGHT` | Weight for vector results (0–1; pair must sum to ~1) | `0.6` |
| `RRF_BM25_WEIGHT` | Weight for BM25 results (0–1; pair must sum to ~1) | `0.4` |
| `BM25_K1_PARAMETER` | BM25 term-frequency saturation (0.1–3.0) | `1.2` |
| `BM25_B_PARAMETER` | BM25 length normalisation (0–1) | `0.75` |
| `BM25_N_RESULTS` | Top BM25 candidates passed to RRF | `20` |
| `BM25_MIN_TOKEN_LENGTH` | Minimum token length when indexing | `2` |
| `BM25_REMOVE_STOPWORDS` | Remove stopwords during indexing/search | `true` |
| `BM25_LANGUAGE` | Language for stemming and stopwords | `english` |
| `BM25_USE_STEMMING` | Apply stemming when tokenising | `false` |
| `BM25_USE_CHUNKING` | Index chunks (instead of full files) for BM25 | `false` |
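Reciprocal Rank Fusion merges the two rankings by giving each document a score of 1/(k + rank) per list and summing, optionally weighting the lists. A minimal sketch using the defaults above (k = 60, weights 0.6 / 0.4):

```python
def rrf_fuse(vector_ranking: list[str], bm25_ranking: list[str],
             k: int = 60, w_vector: float = 0.6, w_bm25: float = 0.4) -> list[str]:
    """Weighted Reciprocal Rank Fusion over two ranked lists of file paths."""
    scores: dict[str, float] = {}
    for weight, ranking in ((w_vector, vector_ranking), (w_bm25, bm25_ranking)):
        for rank, path in enumerate(ranking, start=1):
            scores[path] = scores.get(path, 0.0) + weight / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["auth/login.py", "auth/tokens.py", "docs/sso.md"]
bm25_hits = ["auth/tokens.py", "auth/login.py", "tests/test_auth.py"]
print(rrf_fuse(vector_hits, bm25_hits))
# Files ranked highly in both lists (login.py, tokens.py) end up at the top of the fused ranking.
```

With `RRF_WEIGHTS_ENABLED=false` the standard, unweighted form is used (both weights effectively 1).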
| Variable | Description | Default |
|---|---|---|
| `LOGGING_VERBOSE` | Enable verbose console logging | `false` |
| `LOGGING_FILE_ENABLED` | Enable file logging | `false` |
| `LOGGING_FILE_PATH` | Path to log file | `Logs/search_operations.log` |
Controls how many lines of code are used at each stage of the search pipeline:
| Variable | Description | Default | Details |
|---|---|---|---|
| `PREVIEW_LINES_VECTORIZATION` | Lines used to create search embeddings | `30` | First N lines of each file are vectorized for semantic search. Higher = better context but slower. `-1` = entire file |
| `PREVIEW_LINES_STORAGE` | Lines stored in database | `-1` | How much code to save per file. `-1` = entire file (recommended) |
| `PREVIEW_LINES_RERANKER` | Lines sent to VoyageAI reranker | `100` | Code context for relevance scoring. Balance between accuracy and speed. `-1` = entire file |
| `PREVIEW_LINES_AI_FILTER` | Lines sent to Claude for filtering | `40` | Code context for AI relevance evaluation. More lines = better judgment. `-1` = entire file |
| `PREVIEW_CHARS_OUTPUT` | Per-file content shown in search results | `0` | `0` = paths and scores only (agent reads files via Read/Grep/Glob); `N` = first N characters; `-1` = entire file |
All settings are configured via environment variables in your MCP server configuration.
The right embedding model and chunking strategy depend on the kind of corpus you're indexing. The four presets below are sensible starting points; tune from there with your own measurements. Every preset assumes you've also set CODEBASE_PATH and DB_NAME.
Best for: product docs, runbooks, user guides, mostly-prose corpora.
"EMBEDDING_PROVIDER": "voyage",
"VOYAGE_API_KEY": "...",
"VOYAGE_EMBEDDING_MODEL": "voyage-context-3",
"VOYAGE_ENABLE_CHUNKING": "true",
"VOYAGE_MAX_CHUNK_TOKENS": "64",
"SEMANTIC_SEARCH_N_RESULTS": "30",
"RERANKER_ENABLED": "true",
"RERANKER_THRESHOLD": "0.5",
"AI_FILTER_ENABLED": "false",
"MAX_RESULTS": "10",
"PREVIEW_CHARS_OUTPUT": "200"voyage-context-3 produces contextualised chunk embeddings — each chunk vector encodes its surrounding chunks, not just the chunk text in isolation. With small chunks (~64 tokens), this gives accurate retrieval on long documents. PREVIEW_CHARS_OUTPUT=200 shows the agent the first ~200 characters of each candidate so it can confirm relevance without an extra Read.
Best for: source-code repositories and monorepos, where exact file paths matter more than long paragraphs.
"EMBEDDING_PROVIDER": "voyage",
"VOYAGE_API_KEY": "...",
"VOYAGE_EMBEDDING_MODEL": "voyage-3-large",
"VOYAGE_OUTPUT_DIMENSION": "1024",
"VOYAGE_ENABLE_CHUNKING": "false",
"SEMANTIC_SEARCH_N_RESULTS": "20",
"RERANKER_ENABLED": "true",
"RERANKER_THRESHOLD": "0.5",
"AI_FILTER_ENABLED": "false",
"MAX_RESULTS": "10",
"PREVIEW_CHARS_OUTPUT": "0"voyage-3-large is general-purpose and embeds each file whole (no chunking). PREVIEW_CHARS_OUTPUT=0 returns just paths and scores — the agent then opens the most relevant files with Read/Grep/Glob. This keeps token cost low on large codebases.
Best for: corpora that can't leave the machine. No cloud API calls.
"EMBEDDING_PROVIDER": "ollama",
"OLLAMA_BASE_URL": "http://localhost:11434",
"OLLAMA_EMBEDDING_MODEL": "snowflake-arctic-embed2",
"SEMANTIC_SEARCH_N_RESULTS": "20",
"RERANKER_ENABLED": "false",
"AI_FILTER_ENABLED": "false",
"MAX_RESULTS": "10",
"PREVIEW_CHARS_OUTPUT": "100"The Voyage reranker requires API access, so it's off here. Search quality depends heavily on the Ollama model — snowflake-arctic-embed2 is a solid baseline; see OLLAMA_SETUP.md for alternatives.
Best for: quick prototyping, very large corpora, or environments already on the OpenAI stack.
"EMBEDDING_PROVIDER": "openai",
"OPENAI_API_KEY": "sk-...",
"OPENAI_EMBEDDING_MODEL": "text-embedding-3-large",
"SEMANTIC_SEARCH_N_RESULTS": "20",
"RERANKER_ENABLED": "false",
"AI_FILTER_ENABLED": "true",
"AI_FILTER_MODEL": "claude-sonnet-4-20250514",
"MAX_RESULTS": "10",
"PREVIEW_CHARS_OUTPUT": "0"OpenAI has no built-in reranker, so AI_FILTER_ENABLED=true brings Claude in to filter relevance after vector search. Requires the Claude CLI on PATH.
This tool excels in various scenarios:
- Large Codebases: Navigate complex projects with thousands of files effortlessly
- Documentation Search: Find relevant documentation sections instantly
- Code Reviews: Quickly locate related code sections during reviews
- Onboarding: Help new team members explore and understand the codebase
- Refactoring: Find all instances of patterns that need updating
- Debugging: Locate error handling and logging implementations
The package was installed but the binary isn't on Claude Code's PATH. This is typical when the install lives inside a virtualenv.
In .mcp.json, point at the absolute path to the binary:
"command": "/absolute/path/to/venv/bin/mcp-vector-search"Or invoke the module directly:
"command": "/absolute/path/to/venv/bin/python",
"args": ["-m", "code_search_mcp"]Dependencies were not installed for the active Python environment. Reinstall with the right extra:
pip install "git+https://github.com/Amico1285/mcp-vector-search.git#egg=mcp-vector-search[all]"Modern macOS/Linux protect system Python. Always install into a virtualenv:
```bash
python3 -m venv venv
source venv/bin/activate
pip install "git+https://github.com/Amico1285/mcp-vector-search.git#egg=mcp-vector-search[all]"
```

Add it to the env block of your `.mcp.json` — see `.mcp.json.example`.
The AI filter is optional. Set AI_FILTER_ENABLED=false, or install the Claude CLI.
- Check `get_server_info()` — has the codebase been vectorized yet?
- Run `update_db()` to (re)build the index.
- Verify the embedding provider's API key is correct.
- Enable verbose logs (`"LOGGING_VERBOSE": "true", "LOGGING_FILE_ENABLED": "true"`) and tail `Logs/search_operations.log`.
Install once, then point each project's .mcp.json at the same binary with a different DB_NAME:
```json
{
  "mcpServers": {
    "code-search": {
      "command": "mcp-vector-search",
      "env": {
        "CODEBASE_PATH": "/absolute/path/to/project1",
        "DB_NAME": "project1_db",
        "EMBEDDING_PROVIDER": "voyage",
        "VOYAGE_API_KEY": "your-api-key"
      }
    }
  }
}
```

If `mcp-vector-search` doesn't resolve through Claude Code on Windows, point at the venv binary directly:
"command": "C:\\path\\to\\venv\\Scripts\\mcp-vector-search.exe"Contributions are welcome! Please feel free to submit a Pull Request.