A high-performance MCP (Model Context Protocol) server designed specifically for Claude Code that enables precision semantic search through codebases and documentation. Built for agentic search.
Unlike traditional search tools that flood agents with irrelevant data, MCP Vector Search uses a sophisticated 3-stage pipeline to deliver only the most relevant results.
```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Claude Code   │    │  Vector Search  │    │    Reranker     │    │    AI Filter    │
│      Query      │───▶│                 │───▶│                 │───▶│                 │───▶ Results
│ "find auth code"│    │    ChromaDB     │    │    VoyageAI     │    │   Claude CLI    │
│                 │    │ ~20 candidates  │    │   Relevance     │    │   Precision     │
└─────────────────┘    └─────────────────┘    │    Scoring      │    │   Filtering     │
                                              └─────────────────┘    └─────────────────┘

Stage Impact:          Recall:    ~100%       Precision: ~19%        Precision: ~55%
                       Precision: ~5%         Speed:     +1s         Speed:     +14s
                       Speed:     ~0.2s       (Optional)             (Optional)
```
Pipeline Stages:
- Vector Search: Semantic similarity matching using embeddings
- Reranker: Advanced relevance scoring (VoyageAI rerank models)
- AI Filter: Intelligent result filtering using Claude for precision
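Conceptually the three stages form a funnel: each stage receives the previous stage's candidates and either re-scores or drops them. A minimal sketch of that flow with stand-in stages (the function bodies below are illustrative toys, not the server's actual internals):

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    path: str
    score: float  # similarity (stage 1) or relevance (stage 2), on a 0-1 scale

# Stand-ins for ChromaDB, the VoyageAI reranker, and the Claude CLI judge.
def vector_search(query: str, n_results: int) -> list[Candidate]:
    corpus = [Candidate("auth/login.py", 0.81), Candidate("auth/tokens.py", 0.78),
              Candidate("README.md", 0.34), Candidate("ui/theme.css", 0.12)]
    return sorted(corpus, key=lambda c: c.score, reverse=True)[:n_results]

def rerank(query: str, candidates: list[Candidate]) -> list[Candidate]:
    return candidates  # pretend the reranker returned the same relevance scores

def ai_filter(query: str, candidates: list[Candidate]) -> list[Candidate]:
    return candidates  # pretend Claude judged every survivor relevant

def search_pipeline(query: str, *, threshold: float = 0.7,
                    use_reranker: bool = True, use_ai_filter: bool = False) -> list[Candidate]:
    candidates = vector_search(query, n_results=20)              # stage 1: broad recall
    if use_reranker:                                             # stage 2: adaptive cut-off
        candidates = [c for c in rerank(query, candidates) if c.score >= threshold]
    if use_ai_filter:                                            # stage 3: LLM relevance judge
        candidates = ai_filter(query, candidates)
    return candidates

print(search_pipeline("find auth code"))  # only the two auth files survive the 0.7 threshold
```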
All performance tests were conducted on 506 Netwrix Auditor knowledge base articles. Each article is paired with a specific question, and automated test scripts measure whether search finds the correct article for that question. Results show real-world search accuracy across different configurations.
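In the tables below, Recall@k is the fraction of questions whose correct article appears in the top k results, MRR is the mean reciprocal rank of the correct article, and Mean Position is its average rank. A minimal sketch of how these fall out of the per-question ranks (illustrative, not the project's benchmark script):

```python
def evaluate(ranks: list[int], ks: tuple[int, ...] = (1, 3, 5, 10, 20, 50)) -> dict[str, float]:
    """ranks[i] is the 1-based position of the correct article for question i."""
    n = len(ranks)
    metrics = {f"recall@{k}": sum(r <= k for r in ranks) / n for k in ks}
    metrics["mrr"] = sum(1 / r for r in ranks) / n
    metrics["mean_position"] = sum(ranks) / n
    return metrics

# Toy run: three questions whose correct articles were ranked 1st, 2nd and 4th.
print(evaluate([1, 2, 4]))
# recall@1 = 0.33, recall@3 = 0.67, recall@5 = 1.0, mrr = 0.58, mean_position = 2.33
```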
| Configuration | Recall@1 | Recall@3 | Recall@5 | Recall@10 | Recall@20 | Recall@50 | MRR | Mean Position | Avg Search Time |
|---|---|---|---|---|---|---|---|---|---|
| voyage-3-large-2048dimensions | 86.03% | 93.41% | 97.60% | 100.00% | 100.00% | 100.00% | 0.907 | 1.37 | 0.232s |
| voyage-3-large-1024dimensions | 84.43% | 93.81% | 98.20% | 100.00% | 100.00% | 100.00% | 0.900 | 1.37 | 0.227s |
| voyage-3.5-lite-2048dimensions | 83.23% | 93.01% | 96.41% | 99.20% | 99.60% | 99.80% | 0.888 | 1.79 | 0.249s |
| openai-3-large-chunk-1024 | 77.05% | 92.61% | 97.41% | 99.00% | 99.40% | 99.80% | 0.856 | 1.75 | 0.371s |
| ollama-snowflake-chunk-128 | 72.26% | 87.62% | 91.42% | 95.61% | 97.41% | 98.40% | 0.807 | 4.01 | 0.139s |
Voyage AI leads with its 2048-dimension models. Chunking is disabled by default (the full context window is used); large files are chunked automatically only when they exceed the model limit.
Optimal chunk sizes when chunking is enabled: OpenAI (1024 tokens), Ollama (128 tokens).
Enterprise usage: Voyage AI (open docs only), OpenAI (company-supported), Ollama (local/private).
Distribution of reranker relevance scores assigned to the correct document (lower percentiles across all questions):
| Configuration | Min | p1 | p3 | p5 | p10 | p25 | Median | Mean | Max |
|---|---|---|---|---|---|---|---|---|---|
| voyage-3-large-2048-rerank-2.5 | 0.746 | 0.797 | 0.844 | 0.859 | 0.891 | 0.922 | 0.941 | 0.932 | 0.973 |
The reranker requires a Voyage AI API key, so it is suitable for open documentation only.
RERANKER_THRESHOLD controls the precision/recall trade-off. On the Netwrix Auditor dataset the lowest relevance score assigned to a correct file is 0.746, so a threshold of 0.7 is safe and keeps every relevant file.
Recommended thresholds: documentation (0.6-0.7), code (0.5).
| Threshold | Description | Mean Docs | Median | Min | Max | Precision* |
|---|---|---|---|---|---|---|
| 0.700 | Current (baseline) | 6.52 | 5 | 1 | 20 | 15.3% |
| 0.797 | Safe (99% pass) | 3.83 | 2 | 0 | 20 | 26.1% |
| 0.844 | Balanced (97% pass) | 2.65 | 1 | 0 | 16 | 37.7% |
| 0.859 | Aggressive (95% pass) | 2.45 | 1 | 0 | 15 | 40.8% |
*Precision = 1 / Mean Docs × 100%
Higher thresholds reduce document count and improve precision from 15.3% to 40.8%.
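The footnote's formula is just the expected share of relevant documents in a response when exactly one document is relevant per query; a quick sanity check against the rows above:

```python
# Precision = 1 / mean documents returned (one relevant document per query)
for threshold, mean_docs in [(0.700, 6.52), (0.797, 3.83), (0.844, 2.65), (0.859, 2.45)]:
    print(f"threshold {threshold:.3f}: precision = {1 / mean_docs:.1%}")
# threshold 0.700: precision = 15.3%
# threshold 0.797: precision = 26.1%
# threshold 0.844: precision = 37.7%
# threshold 0.859: precision = 40.8%
```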
| Model | Precision | Recall | F1 Score | Accuracy |
|---|---|---|---|---|
| claude-sonnet-4-20250514 | 0.900 | 0.900 | 0.900 | 0.900 |
| claude-opus-4-1-20250805 | 0.818 | 0.900 | 0.857 | 0.850 |
| claude-3-5-haiku-20241022 | 0.750 | 0.300 | 0.429 | 0.600 |
Claude Sonnet 4: 90% precision/recall - optimal for filtering. Claude Opus 4: 81.8% precision - acceptable with more false positives. Claude Haiku: 30% recall - unsuitable for filtering.
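F1 is the harmonic mean of precision and recall, which is why Haiku's decent precision cannot make up for its 30% recall; the table's F1 values follow directly:

```python
def f1(precision: float, recall: float) -> float:
    return 2 * precision * recall / (precision + recall)

print(round(f1(0.900, 0.900), 3))  # 0.9   -> claude-sonnet-4
print(round(f1(0.818, 0.900), 3))  # 0.857 -> claude-opus-4-1
print(round(f1(0.750, 0.300), 3))  # 0.429 -> claude-3-5-haiku
```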
| Configuration | Recall@1 | Recall@3 | Recall@5 | Recall@10 | MRR | Mean Position | Precision* | Avg Search Time |
|---|---|---|---|---|---|---|---|---|
| voyage-3-large-rerank-threshold-0.888 | 81.4% | 88.8% | 89.4% | 90.2% | 0.852 | 1.17 | 54.9% | ~1.0s |
| voyage-3-large-rerank-aifilter | 77.45% | 82.04% | 83.23% | 84.63% | 0.800 | 1.22 | 55.25% | 14.78s |
| voyage-3-large-2048-rerank | 88.02% | 96.61% | 98.40% | 100.00% | 0.927 | 1.26 | 18.98% | 0.985s |
| voyage-3-large-2048dimensions | 86.03% | 93.41% | 97.60% | 100.00% | 0.907 | 1.37 | 5.00% | 0.232s |
*Precision = 1 / Mean Documents Returned
Key Findings:
- Optimal approach: the reranker with threshold=0.888 reaches roughly the same precision as the AI Filter (54.9% vs 55.25%) while keeping better recall and far lower latency
- Performance comparison at ~55% precision:
- Reranker (0.888): Recall@1=81.4%, Recall@5=89.4%, Speed=1s
- AI Filter: Recall@1=77.45%, Recall@5=83.23%, Speed=15s
Recommendations by Use Case:
- Open documentation: Use Voyage AI with high reranker threshold (0.5-0.8) for optimal speed and precision
- Company codebases: Use OpenAI models with AI Filter to achieve similar precision results when Voyage AI unavailable
Why Precision is Critical for Claude Code:
Precision is the most important metric for agent search systems due to Claude Code's limited context window. Low precision has severe consequences:
Context Window Waste: With 5% precision (semantic search only), returning 20 documents means only 1 is relevant - wasting 95% of valuable context on irrelevant data. Claude Code agents often perform multiple searches, and context depletion ends conversations prematurely.
Model Confusion: Irrelevant documents can mislead the model, causing incorrect conclusions or actions.
Semantic Search Limitations:
- Always returns K results regardless of relevance
- No concept of "no relevant results": it still returns K documents even when none are relevant
- Cannot adapt the result count: it returns only K results even when more than K relevant files exist
Reranker Solution: Unlike semantic search, reranker evaluates query-document relevance with threshold filtering:
- No relevant docs: Returns empty results (threshold not met)
- Multiple relevant docs: Returns all above threshold
- Adaptive results: Result count varies based on actual relevance
This threshold-based filtering is the key improvement that makes search practical for agent use.
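A concrete illustration of the difference, using made-up reranker scores for two queries (this mirrors the behaviour described above, not the server's actual code):

```python
def top_k(scored: list[tuple[str, float]], k: int = 4) -> list[str]:
    """Plain semantic search: always returns k paths, relevant or not."""
    return [path for path, _ in sorted(scored, key=lambda x: x[1], reverse=True)[:k]]

def above_threshold(scored: list[tuple[str, float]], threshold: float = 0.7) -> list[str]:
    """Reranker-style filtering: the result count adapts to actual relevance."""
    return [path for path, score in scored if score >= threshold]

off_topic  = [("ui/theme.css", 0.21), ("docs/faq.md", 0.18), ("Makefile", 0.09), ("README.md", 0.31)]
auth_heavy = [("auth/login.py", 0.91), ("auth/tokens.py", 0.88),
              ("auth/middleware.py", 0.84), ("auth/oauth.py", 0.79), ("ui/theme.css", 0.22)]

print(top_k(off_topic))             # 4 irrelevant files, because k is fixed
print(above_threshold(off_topic))   # [] (nothing met the threshold)
print(above_threshold(auth_heavy))  # all 4 genuinely relevant files
```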
- Python 3.10+
- One of the following embedding providers:
- VoyageAI API key for cloud embeddings
- OpenAI API key for cloud embeddings
- Ollama for local embeddings
- (Optional) Claude CLI for AI filtering
Pick the extra that matches your embedding provider:
```bash
# VoyageAI only
pip install "git+https://github.com/Amico1285/mcp-vector-search.git#egg=mcp-vector-search[voyage]"

# OpenAI only
pip install "git+https://github.com/Amico1285/mcp-vector-search.git#egg=mcp-vector-search[openai]"

# Ollama only
pip install "git+https://github.com/Amico1285/mcp-vector-search.git#egg=mcp-vector-search[ollama]"

# Everything
pip install "git+https://github.com/Amico1285/mcp-vector-search.git#egg=mcp-vector-search[all]"
```

This installs the `mcp-vector-search` command into your active environment.
```bash
git clone https://github.com/Amico1285/mcp-vector-search.git
cd mcp-vector-search
python3 -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate
pip install -e ".[all]"
```

Run the server once to verify the install:

```bash
mcp-vector-search
```

You should see the FastMCP startup banner. Press Ctrl+C to stop.
If mcp-vector-search isn't on your PATH (typical when installed inside a virtualenv that isn't activated), you can run the package as a module instead — same result:
```bash
python -m code_search_mcp
```

This MCP server is designed to work seamlessly with Claude Code, providing agents with powerful search and database management capabilities.
Option A: Using Claude CLI command
```bash
# Add as project-scoped server (creates .mcp.json in project root)
claude mcp add code-search --scope project mcp-vector-search
```

Option B: Manual configuration
- Copy `.mcp.json.example` to `.mcp.json` in your project root and fill in the values
- Exit Claude Code
- Open Claude Code again in the same project
- When prompted "Use MCP servers found in this project?", answer "Yes"
- The MCP server is now connected
Edit the .mcp.json file to add your configuration. Choose one of the embedding providers:
VoyageAI configuration:

```json
{
  "mcpServers": {
    "code-search": {
      "command": "mcp-vector-search",
      "env": {
        "CODEBASE_PATH": "/path/to/your/codebase",
        "EMBEDDING_PROVIDER": "voyage",
        "VOYAGE_API_KEY": "your-voyage-api-key-here",
        "VOYAGE_EMBEDDING_MODEL": "voyage-3-large",
        "VOYAGE_OUTPUT_DIMENSION": "2048",
        "SEMANTIC_SEARCH_N_RESULTS": "20",
        "RERANKER_ENABLED": "true",
        "RERANKER_THRESHOLD": "0.7",
        "RERANKER_MODEL": "rerank-2.5",
        "RERANKER_INSTRUCTIONS": "",
        "AI_FILTER_ENABLED": "false",
        "AI_FILTER_MODEL": "claude-sonnet-4-20250514",
        "AI_FILTER_TIMEOUT_SECONDS": "120",
        "MAX_RESULTS": "10",
        "LOGGING_VERBOSE": "false",
        "LOGGING_FILE_ENABLED": "false",
        "LOGGING_FILE_PATH": "Logs/search_operations.log",
        "PREVIEW_LINES_VECTORIZATION": "-1",
        "PREVIEW_LINES_STORAGE": "-1",
        "PREVIEW_LINES_RERANKER": "-1",
        "PREVIEW_LINES_AI_FILTER": "-1",
        "PREVIEW_CHARS_OUTPUT": "0"
      }
    }
  }
}
```
"mcpServers": {
"code-search": {
"command": "mcp-vector-search",
"env": {
"CODEBASE_PATH": "/path/to/your/codebase",
"EMBEDDING_PROVIDER": "openai",
"OPENAI_API_KEY": "sk-...",
"OPENAI_EMBEDDING_MODEL": "text-embedding-3-large",
"SEMANTIC_SEARCH_N_RESULTS": "20",
"RERANKER_ENABLED": "false",
"AI_FILTER_ENABLED": "true",
"AI_FILTER_MODEL": "claude-sonnet-4-20250514",
"AI_FILTER_TIMEOUT_SECONDS": "120",
"MAX_RESULTS": "10",
"LOGGING_VERBOSE": "false",
"LOGGING_FILE_ENABLED": "false",
"LOGGING_FILE_PATH": "Logs/search_operations.log",
"PREVIEW_LINES_VECTORIZATION": "-1",
"PREVIEW_LINES_STORAGE": "-1",
"PREVIEW_LINES_RERANKER": "100",
"PREVIEW_LINES_AI_FILTER": "40",
"PREVIEW_CHARS_OUTPUT": "0"
}
}
}
}{
"mcpServers": {
"code-search": {
"command": "mcp-vector-search",
"env": {
"CODEBASE_PATH": "/path/to/your/codebase",
"EMBEDDING_PROVIDER": "ollama",
"OLLAMA_BASE_URL": "http://localhost:11434",
"OLLAMA_EMBEDDING_MODEL": "snowflake-arctic-embed2",
"SEMANTIC_SEARCH_N_RESULTS": "20",
"RERANKER_ENABLED": "false",
"AI_FILTER_ENABLED": "true",
"AI_FILTER_MODEL": "claude-sonnet-4-20250514",
"AI_FILTER_TIMEOUT_SECONDS": "120",
"MAX_RESULTS": "10",
"LOGGING_VERBOSE": "false",
"LOGGING_FILE_ENABLED": "false",
"LOGGING_FILE_PATH": "Logs/search_operations.log",
"PREVIEW_LINES_VECTORIZATION": "-1",
"PREVIEW_LINES_STORAGE": "-1",
"PREVIEW_LINES_AI_FILTER": "40",
"PREVIEW_CHARS_OUTPUT": "0"
}
}
}
}Note: If mcp-vector-search is installed inside a virtualenv that is not on Claude Code's PATH, set command to the absolute path of that binary, e.g. /path/to/venv/bin/mcp-vector-search.
In Claude Code, use the /mcp command to reconnect and verify the server is running.
Once connected, Claude Code agents can autonomously:
Step 1: Create configuration
"Please analyze my codebase and create search configuration"
The agent will:
- Analyze your codebase structure and detect frameworks
- Generate configuration showing which files will be indexed
- Display total file count and exclusion patterns
- Allow you to review and adjust the configuration
Step 2: Start vectorization
"Start vectorizing the database with current configuration"
After reviewing the configuration, the agent uses a separate tool to begin vectorization. For large codebases, enable logging in .mcp.json to track progress:
"LOGGING_VERBOSE": "true",
"LOGGING_FILE_ENABLED": "true""Find the authentication implementation"
"Show me where error handling happens"
"Locate the API endpoint definitions"
"Find documentation about deployment"
"Update the search index with the latest changes"
"Exclude .txt files from search"
"Exclude /folder files from search"
"Add .md files to the search scope"
The server behavior is customized via environment variables.
Important: After changing configuration in .mcp.json, you need to restart Claude Code for changes to take effect (this also restarts the MCP server). To preserve your conversation progress, use:
```bash
claude --continue
```

| Variable | Description | Default |
|---|---|---|
| `CODEBASE_PATH` | Absolute path to the codebase to index | (required) |
| `EMBEDDING_PROVIDER` | Embedding provider: `voyage`, `openai`, or `ollama` | `voyage` |
| `DB_NAME` | ChromaDB collection name. Use a unique value per indexed codebase to keep multiple projects separate. | `codebase_files` |
Only the variables for your selected EMBEDDING_PROVIDER are read; the others are ignored.
| Variable | Description | Default |
|---|---|---|
| `VOYAGE_API_KEY` | Your VoyageAI API key | (required) |
| `VOYAGE_EMBEDDING_MODEL` | VoyageAI model for embeddings | `voyage-code-3` |
| `VOYAGE_OUTPUT_DIMENSION` | Output vector dimensions (256/512/1024/2048) | (model default) |
| `VOYAGE_ENABLE_CHUNKING` | Enable chunking for Voyage models | `true` |
| `VOYAGE_MAX_CHUNK_TOKENS` | Maximum tokens per chunk for Voyage | (model-specific) |
| `VOYAGE_CHUNK_OVERLAP_TOKENS` | Overlap tokens between chunks for Voyage | (model-specific) |
| `VOYAGE_MIN_CHUNK_TOKENS` | Minimum tokens per chunk for Voyage | (model-specific) |
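When chunking is enabled, the three chunk-size variables describe a sliding window over a file's tokens. A rough sketch of that windowing, with a whitespace split standing in for the real tokenizer (parameter names mirror VOYAGE_MAX_CHUNK_TOKENS, VOYAGE_CHUNK_OVERLAP_TOKENS and VOYAGE_MIN_CHUNK_TOKENS; the actual chunker may differ in detail):

```python
def chunk_tokens(tokens: list[str], max_tokens: int = 512,
                 overlap_tokens: int = 64, min_tokens: int = 32) -> list[list[str]]:
    """Sliding-window chunking: consecutive chunks share `overlap_tokens` tokens."""
    step = max_tokens - overlap_tokens
    chunks = [tokens[i:i + max_tokens] for i in range(0, len(tokens), step)]
    # Drop a trailing fragment smaller than min_tokens (a real chunker might merge it instead).
    if len(chunks) > 1 and len(chunks[-1]) < min_tokens:
        chunks.pop()
    return chunks

source = "def login(user): return issue_token(user)\n" * 300   # pretend: a large source file
print(len(chunk_tokens(source.split())))                        # number of chunks produced
```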
| Variable | Description | Default |
|---|---|---|
| `OPENAI_API_KEY` | Your OpenAI API key | (required) |
| `OPENAI_EMBEDDING_MODEL` | OpenAI model for embeddings | `text-embedding-3-large` |
| `OPENAI_BATCH_SIZE` | Batch size for OpenAI API requests | `2048` |
| Variable | Description | Default |
|---|---|---|
| `OLLAMA_BASE_URL` | Ollama server URL | `http://localhost:11434` |
| `OLLAMA_EMBEDDING_MODEL` | Ollama model name | `snowflake-arctic-embed2` |
| Variable | Description | Default |
|---|---|---|
| `SEMANTIC_SEARCH_N_RESULTS` | Initial candidates from vector search | `20` |
| `RERANKER_ENABLED` | Enable VoyageAI reranker (Voyage provider only) | `true` |
| `RERANKER_THRESHOLD` | Minimum relevance score; files below are filtered out | `0.5` |
| `RERANKER_MODEL` | Reranker model name | `rerank-2.5` |
| `RERANKER_INSTRUCTIONS` | Custom instructions sent to the reranker to bias relevance | (empty) |
| `RERANKER_USE_CHUNKS` | Send chunk text to the reranker instead of full file content | `false` |
| `AI_FILTER_ENABLED` | Enable Claude CLI filtering | `true` |
| `AI_FILTER_MODEL` | Claude model used for filtering | `claude-sonnet-4-20250514` |
| `AI_FILTER_TIMEOUT_SECONDS` | Timeout for the Claude CLI call | `120` |
| `MAX_RESULTS` | Maximum results returned by `search_files` | `10` |
Disabled by default. Set HYBRID_SEARCH_ENABLED=true to combine BM25 keyword matching with semantic search using Reciprocal Rank Fusion. Useful when literal token matches matter (function names, error strings, exact phrases).
| Variable | Description | Default |
|---|---|---|
| `HYBRID_SEARCH_ENABLED` | Enable BM25 + vector + RRF pipeline | `false` |
| `BM25_ONLY_MODE` | Skip vector search and use only BM25 | `false` |
| `RRF_K_PARAMETER` | RRF k constant (1–1000) | `60` |
| `RRF_WEIGHTS_ENABLED` | Use weighted RRF instead of standard RRF | `false` |
| `RRF_VECTOR_WEIGHT` | Weight for vector results (0–1; pair must sum to ~1) | `0.6` |
| `RRF_BM25_WEIGHT` | Weight for BM25 results (0–1; pair must sum to ~1) | `0.4` |
| `BM25_K1_PARAMETER` | BM25 term-frequency saturation (0.1–3.0) | `1.2` |
| `BM25_B_PARAMETER` | BM25 length normalisation (0–1) | `0.75` |
| `BM25_N_RESULTS` | Top BM25 candidates passed to RRF | `20` |
| `BM25_MIN_TOKEN_LENGTH` | Minimum token length when indexing | `2` |
| `BM25_REMOVE_STOPWORDS` | Remove stopwords during indexing/search | `true` |
| `BM25_LANGUAGE` | Language for stemming and stopwords | `english` |
| `BM25_USE_STEMMING` | Apply stemming when tokenising | `false` |
| `BM25_USE_CHUNKING` | Index chunks (instead of full files) for BM25 | `false` |
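Reciprocal Rank Fusion merges the two rankings by giving each document a score of 1/(k + rank) per list and summing, optionally weighting the lists. A minimal sketch using the defaults above (k = 60, weights 0.6 / 0.4):

```python
def rrf_fuse(vector_ranking: list[str], bm25_ranking: list[str],
             k: int = 60, w_vector: float = 0.6, w_bm25: float = 0.4) -> list[str]:
    """Weighted Reciprocal Rank Fusion over two ranked lists of file paths."""
    scores: dict[str, float] = {}
    for weight, ranking in ((w_vector, vector_ranking), (w_bm25, bm25_ranking)):
        for rank, path in enumerate(ranking, start=1):
            scores[path] = scores.get(path, 0.0) + weight / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["auth/login.py", "auth/tokens.py", "docs/sso.md"]
bm25_hits = ["auth/tokens.py", "auth/login.py", "tests/test_auth.py"]
print(rrf_fuse(vector_hits, bm25_hits))
# Files ranked highly in both lists (login.py, tokens.py) end up at the top of the fused ranking.
```

With `RRF_WEIGHTS_ENABLED=false` the standard, unweighted form is used (both weights effectively 1).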
| Variable | Description | Default |
|---|---|---|
| `LOGGING_VERBOSE` | Enable verbose console logging | `false` |
| `LOGGING_FILE_ENABLED` | Enable file logging | `false` |
| `LOGGING_FILE_PATH` | Path to log file | `Logs/search_operations.log` |
Controls how many lines of code are used at each stage of the search pipeline:
| Variable | Description | Default | Details |
|---|---|---|---|
| `PREVIEW_LINES_VECTORIZATION` | Lines used to create search embeddings | `30` | First N lines of each file are vectorized for semantic search. Higher = better context but slower. `-1` = entire file |
| `PREVIEW_LINES_STORAGE` | Lines stored in database | `-1` | How much code to save per file. `-1` = entire file (recommended) |
| `PREVIEW_LINES_RERANKER` | Lines sent to VoyageAI reranker | `100` | Code context for relevance scoring. Balance between accuracy and speed. `-1` = entire file |
| `PREVIEW_LINES_AI_FILTER` | Lines sent to Claude for filtering | `40` | Code context for AI relevance evaluation. More lines = better judgment. `-1` = entire file |
| `PREVIEW_CHARS_OUTPUT` | Per-file content shown in search results | `0` | `0` = paths and scores only (agent reads files via Read/Grep/Glob); `N` = first N characters; `-1` = entire file |
All settings are configured via environment variables in your MCP server configuration.
The right embedding model and chunking strategy depend on the kind of corpus you're indexing. The four presets below are sensible starting points; tune from there with your own measurements. Every preset assumes you've also set CODEBASE_PATH and DB_NAME.
Best for: product docs, runbooks, user guides, mostly-prose corpora.
"EMBEDDING_PROVIDER": "voyage",
"VOYAGE_API_KEY": "...",
"VOYAGE_EMBEDDING_MODEL": "voyage-context-3",
"VOYAGE_ENABLE_CHUNKING": "true",
"VOYAGE_MAX_CHUNK_TOKENS": "64",
"SEMANTIC_SEARCH_N_RESULTS": "30",
"RERANKER_ENABLED": "true",
"RERANKER_THRESHOLD": "0.5",
"AI_FILTER_ENABLED": "false",
"MAX_RESULTS": "10",
"PREVIEW_CHARS_OUTPUT": "200"voyage-context-3 produces contextualised chunk embeddings — each chunk vector encodes its surrounding chunks, not just the chunk text in isolation. With small chunks (~64 tokens), this gives accurate retrieval on long documents. PREVIEW_CHARS_OUTPUT=200 shows the agent the first ~200 characters of each candidate so it can confirm relevance without an extra Read.
Best for: source-code repositories and monorepos, where exact file paths matter more than long paragraphs.
"EMBEDDING_PROVIDER": "voyage",
"VOYAGE_API_KEY": "...",
"VOYAGE_EMBEDDING_MODEL": "voyage-3-large",
"VOYAGE_OUTPUT_DIMENSION": "1024",
"VOYAGE_ENABLE_CHUNKING": "false",
"SEMANTIC_SEARCH_N_RESULTS": "20",
"RERANKER_ENABLED": "true",
"RERANKER_THRESHOLD": "0.5",
"AI_FILTER_ENABLED": "false",
"MAX_RESULTS": "10",
"PREVIEW_CHARS_OUTPUT": "0"voyage-3-large is general-purpose and embeds each file whole (no chunking). PREVIEW_CHARS_OUTPUT=0 returns just paths and scores — the agent then opens the most relevant files with Read/Grep/Glob. This keeps token cost low on large codebases.
Best for: corpora that can't leave the machine. No cloud API calls.
"EMBEDDING_PROVIDER": "ollama",
"OLLAMA_BASE_URL": "http://localhost:11434",
"OLLAMA_EMBEDDING_MODEL": "snowflake-arctic-embed2",
"SEMANTIC_SEARCH_N_RESULTS": "20",
"RERANKER_ENABLED": "false",
"AI_FILTER_ENABLED": "false",
"MAX_RESULTS": "10",
"PREVIEW_CHARS_OUTPUT": "100"The Voyage reranker requires API access, so it's off here. Search quality depends heavily on the Ollama model — snowflake-arctic-embed2 is a solid baseline; see OLLAMA_SETUP.md for alternatives.
Best for: quick prototyping, very large corpora, or environments already on the OpenAI stack.
"EMBEDDING_PROVIDER": "openai",
"OPENAI_API_KEY": "sk-...",
"OPENAI_EMBEDDING_MODEL": "text-embedding-3-large",
"SEMANTIC_SEARCH_N_RESULTS": "20",
"RERANKER_ENABLED": "false",
"AI_FILTER_ENABLED": "true",
"AI_FILTER_MODEL": "claude-sonnet-4-20250514",
"MAX_RESULTS": "10",
"PREVIEW_CHARS_OUTPUT": "0"OpenAI has no built-in reranker, so AI_FILTER_ENABLED=true brings Claude in to filter relevance after vector search. Requires the Claude CLI on PATH.
This tool excels in various scenarios:
- Large Codebases: Navigate complex projects with thousands of files effortlessly
- Documentation Search: Find relevant documentation sections instantly
- Code Reviews: Quickly locate related code sections during reviews
- Onboarding: Help new team members explore and understand the codebase
- Refactoring: Find all instances of patterns that need updating
- Debugging: Locate error handling and logging implementations
The package was installed but the binary isn't on Claude Code's PATH. This is typical when the install lives inside a virtualenv.
In .mcp.json, point at the absolute path to the binary:
"command": "/absolute/path/to/venv/bin/mcp-vector-search"Or invoke the module directly:
"command": "/absolute/path/to/venv/bin/python",
"args": ["-m", "code_search_mcp"]Dependencies were not installed for the active Python environment. Reinstall with the right extra:
pip install "git+https://github.com/Amico1285/mcp-vector-search.git#egg=mcp-vector-search[all]"Modern macOS/Linux protect system Python. Always install into a virtualenv:
```bash
python3 -m venv venv
source venv/bin/activate
pip install "git+https://github.com/Amico1285/mcp-vector-search.git#egg=mcp-vector-search[all]"
```

Add it to the env block of your `.mcp.json` — see `.mcp.json.example`.
The AI filter is optional. Set AI_FILTER_ENABLED=false, or install the Claude CLI.
- Check `get_server_info()` — has the codebase been vectorized yet?
- Run `update_db()` to (re)build the index.
- Verify the embedding provider's API key is correct.
- Enable verbose logs (`"LOGGING_VERBOSE": "true", "LOGGING_FILE_ENABLED": "true"`) and tail `Logs/search_operations.log`.
Install once, then point each project's .mcp.json at the same binary with a different DB_NAME:
```json
{
  "mcpServers": {
    "code-search": {
      "command": "mcp-vector-search",
      "env": {
        "CODEBASE_PATH": "/absolute/path/to/project1",
        "DB_NAME": "project1_db",
        "EMBEDDING_PROVIDER": "voyage",
        "VOYAGE_API_KEY": "your-api-key"
      }
    }
  }
}
```

If `mcp-vector-search` doesn't resolve through Claude Code on Windows, point at the venv binary directly:
"command": "C:\\path\\to\\venv\\Scripts\\mcp-vector-search.exe"Contributions are welcome! Please feel free to submit a Pull Request.