A Model Context Protocol (MCP) server that provides seamless integration with Cohere's AI platform. This server exposes Cohere's powerful language models, embeddings, and reranking capabilities through the MCP protocol, enabling AI assistants like Claude to leverage Cohere's tools.
- Chat & Completion - Conversational AI with Command models (including command-a-03-2025)
- Embeddings - Generate semantic embeddings for search, RAG, and clustering
- Reranking - Improve search relevance for RAG systems
- Multilingual Chat - 23+ language support with Aya models
- Text Summarization - Condense long documents
- Classification - Few-shot text classification
- Streaming Support - Real-time response streaming for chat
Chat models:

- `command-a-03-2025` - Latest Command model for complex reasoning
- `command-r-plus` - Excellent for RAG and tool use
- `command-r` - Balanced performance and cost
- `command-light` - Fast, lightweight model
Embedding models:

- `embed-english-v3.0` - Best English embedding model (1024 dimensions)
- `embed-multilingual-v3.0` - 100+ language support
- `embed-english-light-v3.0` - Lightweight English embeddings
- `embed-multilingual-light-v3.0` - Lightweight multilingual embeddings
Rerank models:

- `rerank-english-v3.0` - Best English reranking
- `rerank-multilingual-v3.0` - Multilingual reranking support
Aya models:

- `aya-expanse-32b` - Powerful multilingual model (23+ languages)
- `aya-expanse-8b` - Efficient multilingual model
- Python 3.10 or higher
- A Cohere API key (available from https://dashboard.cohere.com/api-keys)
- Clone or download this repository:
```bash
cd /home/<user>/Projects/cohere-mcp-server
```

- Install the package:
```bash
pip install -e .
```

- Set up your API key:
```bash
# Create a .env file in the project root
echo "COHERE_API_KEY=your-api-key-here" > .env
```

Or set it as an environment variable:
```bash
export COHERE_API_KEY="your-api-key-here"
```

Run the MCP server directly:
```bash
cohere-mcp
```

Or using Python:
```bash
python -m cohere_mcp.server
```

The server communicates via stdio and follows the MCP protocol specification.
Add the following to your Claude Desktop configuration file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json
```json
{
  "mcpServers": {
    "cohere": {
      "command": "cohere-mcp",
      "env": {
        "COHERE_API_KEY": "your-api-key-here"
      }
    }
  }
}
```

Or if using Python directly:
```json
{
  "mcpServers": {
    "cohere": {
      "command": "python",
      "args": ["-m", "cohere_mcp.server"],
      "env": {
        "COHERE_API_KEY": "your-api-key-here"
      }
    }
  }
}
```

Any MCP-compatible client can connect to this server. The server uses stdio transport and follows the MCP specification.
Chat with Cohere's Command models for conversational AI and reasoning tasks.
Parameters:
- `message` (string, required) - The user message
- `model` (string) - Model to use (default: `"command-a-03-2025"`)
- `temperature` (number) - Sampling temperature, 0-1 (default: 0.7)
- `max_tokens` (number) - Maximum tokens to generate (default: 4096)
- `system_prompt` (string) - Optional system instructions
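The defaults above can be sketched as a small payload builder. This is illustrative only; `build_chat_request` is a hypothetical helper, and the real server applies these defaults internally.

```python
# Illustrative sketch: assemble a cohere_chat tool-call payload with the
# documented defaults applied. Not part of the server's actual code.

def build_chat_request(message, model="command-a-03-2025",
                       temperature=0.7, max_tokens=4096, system_prompt=None):
    if not message:
        raise ValueError("message is required")
    if not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    payload = {
        "message": message,
        "model": model,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    if system_prompt is not None:
        payload["system_prompt"] = system_prompt
    return payload

req = build_chat_request("Explain quantum computing in simple terms")
```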
Example:
```json
{
  "message": "Explain quantum computing in simple terms",
  "model": "command-a-03-2025",
  "temperature": 0.7
}
```

Streaming version of `cohere_chat` for real-time responses.
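A streaming response arrives as incremental text chunks that the client accumulates. The sketch below uses a stub generator in place of the real stream; actual chunk shapes depend on the MCP client in use.

```python
# Sketch of consuming a streamed chat response. fake_stream() is a stand-in
# for the real chunk source; only the accumulation pattern is the point.

def fake_stream():
    yield from ["Quantum ", "computers ", "use ", "qubits."]

def collect_stream(chunks, on_chunk=None):
    """Accumulate streamed chunks into the full response text."""
    parts = []
    for chunk in chunks:
        if on_chunk:
            on_chunk(chunk)  # e.g. print each chunk as it arrives
        parts.append(chunk)
    return "".join(parts)

text = collect_stream(fake_stream())
```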
Parameters: Same as `cohere_chat`
Generate embeddings for semantic search, RAG, and clustering.
Parameters:
- `texts` (array of strings, required) - Texts to embed (max 96 per request)
- `model` (string) - Embedding model (default: `"embed-english-v3.0"`)
- `input_type` (string) - One of `"search_document"`, `"search_query"`, `"classification"`, or `"clustering"`
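Because the tool accepts at most 96 texts per request, larger corpora must be embedded in batches. A minimal batching sketch (the helper name is illustrative):

```python
# Split a corpus into request-sized batches for cohere_embed, which
# accepts at most 96 texts per call.
MAX_TEXTS_PER_REQUEST = 96

def batch_texts(texts, batch_size=MAX_TEXTS_PER_REQUEST):
    """Split a list of texts into batches no larger than batch_size."""
    return [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]

batches = batch_texts([f"doc {i}" for i in range(200)])
```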
Example:
```json
{
  "texts": ["Document 1", "Document 2"],
  "model": "embed-english-v3.0",
  "input_type": "search_document"
}
```

Rerank documents based on relevance to a query (ideal for RAG systems).
Parameters:
- `query` (string, required) - The search query
- `documents` (array of strings, required) - Documents to rerank (max 1000)
- `model` (string) - Rerank model (default: `"rerank-english-v3.0"`)
- `top_n` (number) - Number of results to return (default: 10)
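The result has a simple shape: the `top_n` documents ordered by relevance score. The sketch below illustrates that shape with toy scores; real scores come from the rerank model, not from this function.

```python
# Illustrative only: mimic the shape of a rerank result (top_n documents
# sorted by descending relevance score) using hand-picked toy scores.

def top_n_by_score(documents, scores, top_n=10):
    ranked = sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True)
    return [{"document": doc, "relevance_score": s} for doc, s in ranked[:top_n]]

results = top_n_by_score(
    ["Doc about ML", "Doc about cooking", "Doc about AI"],
    [0.92, 0.05, 0.71],
    top_n=2,
)
```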
Example:
```json
{
  "query": "What is machine learning?",
  "documents": ["Doc about ML", "Doc about cooking", "Doc about AI"],
  "top_n": 5
}
```

Chat using multilingual Aya models (supports 23+ languages).
Parameters:
- `message` (string, required) - User message in any supported language
- `model` (string) - Aya model (default: `"aya-expanse-32b"`)
- `language` (string) - Target response language (optional)
- `temperature` (number) - Sampling temperature (default: 0.7)
- `max_tokens` (number) - Maximum tokens (default: 4096)
Summarize text content.
Parameters:
- `text` (string, required) - Text to summarize
- `model` (string) - Model to use (default: `"command-r"`)
- `length` (string) - `"short"`, `"medium"`, or `"long"`
- `format` (string) - `"paragraph"` or `"bullets"`
Classify texts based on example training data.
Parameters:
- `inputs` (array of strings, required) - Texts to classify
- `examples` (array of objects, required) - Training examples with `"text"` and `"label"` keys
- `model` (string) - Model to use (default: `"embed-english-v3.0"`)
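Each training example is an object with `"text"` and `"label"` keys. A small sketch converting labeled pairs into that shape (`make_examples` is an illustrative helper, not part of the server):

```python
# Build the examples array expected by the classify tool from
# (text, label) pairs.

def make_examples(pairs):
    return [{"text": text, "label": label} for text, label in pairs]

examples = make_examples([
    ("I love this product", "positive"),
    ("Terrible experience", "negative"),
])
```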
Lists all available Cohere models with their capabilities, context lengths, and recommended use cases.
Shows current server configuration including default models and settings.
The server can be configured via environment variables:
| Variable | Description | Default |
|---|---|---|
| `COHERE_API_KEY` | Your Cohere API key | Required |
| `COHERE_DEFAULT_CHAT_MODEL` | Default chat model | `command-a-03-2025` |
| `COHERE_DEFAULT_EMBED_MODEL` | Default embedding model | `embed-english-v3.0` |
| `COHERE_DEFAULT_RERANK_MODEL` | Default rerank model | `rerank-english-v3.0` |
| `COHERE_TIMEOUT` | API request timeout (seconds) | `60` |
| `COHERE_MAX_RETRIES` | Maximum API retry attempts | `3` |
```bash
pip install -e ".[dev]"
```

This installs testing and linting tools:
- pytest - Testing framework
- black - Code formatter
- ruff - Linter
- mypy - Type checker
```bash
pytest
```

Run with coverage:
```bash
pytest --cov=cohere_mcp --cov-report=html
```

Format code:
```bash
black src/ tests/
```

Lint code:
```bash
ruff check src/ tests/
```

Type check:
```bash
mypy src/
```

```
cohere-mcp-server/
├── src/
│   └── cohere_mcp/
│       ├── __init__.py       # Package initialization
│       ├── config.py         # Configuration management
│       ├── client.py         # Cohere API client wrapper
│       └── server.py         # MCP server implementation
├── tests/                    # Test suite
│   ├── conftest.py
│   ├── test_config.py
│   ├── test_client.py
│   └── test_server.py
├── pyproject.toml            # Project configuration
└── README.md                 # This file
```
- Embed documents: Use `cohere_embed` with `input_type="search_document"`
- Embed query: Use `cohere_embed` with `input_type="search_query"`
- Rerank results: Use `cohere_rerank` to improve relevance
- Generate response: Use `cohere_chat` with retrieved context
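The four-step workflow above can be sketched end to end with stub functions standing in for the `cohere_embed`, `cohere_rerank`, and `cohere_chat` tool calls. The stubs return toy values; a real pipeline would route each step through an MCP client.

```python
# RAG pipeline skeleton. The three stubs below are stand-ins for the MCP
# tools; their bodies are toy logic, only the call sequence is meaningful.

def embed(texts, input_type):                 # stand-in for cohere_embed
    return [[float(len(t))] for t in texts]   # toy 1-d "embeddings"

def rerank(query, documents, top_n):          # stand-in for cohere_rerank
    return sorted(documents, key=len)[:top_n]

def chat(message):                            # stand-in for cohere_chat
    return f"Answer based on: {message}"

docs = ["short doc", "a much longer document about ML"]
doc_vecs = embed(docs, input_type="search_document")           # 1. embed documents
query_vec = embed(["What is ML?"], input_type="search_query")  # 2. embed query
top_docs = rerank("What is ML?", docs, top_n=1)                # 3. rerank results
answer = chat(f"Context: {top_docs[0]}\nQuestion: What is ML?")  # 4. generate
```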
- Index documents using `cohere_embed`
- Search with query embeddings
- Optionally rerank with `cohere_rerank`
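The search step typically ranks documents by cosine similarity between the query embedding and each document embedding. A dependency-free sketch with toy 2-d vectors (real embeddings from `embed-english-v3.0` have 1024 dimensions):

```python
# Rank documents by cosine similarity to a query vector. The vectors here
# are toy 2-d examples standing in for real embeddings.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

query_vec = [1.0, 0.0]
doc_vecs = {"doc_a": [0.9, 0.1], "doc_b": [0.0, 1.0]}
best = max(doc_vecs, key=lambda d: cosine_similarity(query_vec, doc_vecs[d]))
```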
Use `cohere_aya_chat` for conversations in:
- English, Spanish, French, German, Italian, Portuguese
- Arabic, Hebrew, Turkish
- Chinese, Japanese, Korean
- Hindi, Bengali, and many more
Make sure you've set the `COHERE_API_KEY` environment variable:

```bash
export COHERE_API_KEY="your-key-here"
```

- Verify your API key at https://dashboard.cohere.com/api-keys
- Ensure the key has proper permissions
- Check for any whitespace in the key value
- Check your internet connection
- Verify Cohere API status
- Increase the timeout via the `COHERE_TIMEOUT` environment variable
Refer to Cohere's pricing page for current API costs.
MIT License - See LICENSE file for details.
For issues and questions:
- Cohere API issues: Cohere Support
- MCP Server issues: Open an issue in this repository
- MCP Protocol: MCP Documentation
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests for new functionality
- Submit a pull request
Built with Cohere and Model Context Protocol