Trio is an AI model composition framework that creates new models out of three base models.
Let A, B, and C be the three base models.
Models A and B each generate a response to the prompt independently, without influencing each other. Model C then synthesizes the two perspectives into a final response.
```
          ┌───────────────────────────────────┐
          │            Trio Server            │
          │                                   │
Request ──┼──► Parse TrioModel                │
          │         │                         │
          │    ┌────┴────┐                    │
          │    │ Model A │────┐               │
          │    └─────────┘    │               │
          │    ┌─────────┐    ▼               │
          │    │ Model B │──►┌─────────┐      │
          │    └─────────┘   │ Model C │──────┼── Synthesize
          │                  └────┬────┘      │
          │                       │           │
          └───────────────────────┼───────────┘
                                  ▼
                        OpenAI-format response
```
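
A minimal sketch of this flow in Python, assuming an OpenAI-compatible backend at `BACKEND_URL`; the helper names and the synthesis prompt wording are illustrative, not Trio's actual implementation:

```python
# Illustrative sketch of the trio flow; not Trio's actual source.
import requests

BACKEND_URL = "http://localhost:4000/v1/chat/completions"  # assumed backend

def complete(model: str, messages: list[dict]) -> str:
    """Call an OpenAI-compatible chat endpoint and return the response text."""
    resp = requests.post(BACKEND_URL, json={"model": model, "messages": messages})
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

def trio(model_a: str, model_b: str, model_c: str, messages: list[dict]) -> str:
    # A and B answer independently; neither sees the other's output.
    response_a = complete(model_a, messages)
    response_b = complete(model_b, messages)
    # C sees both answers and synthesizes the final response.
    synthesis_prompt = (
        "Two independent answers to the same prompt follow.\n\n"
        f"Answer A:\n{response_a}\n\nAnswer B:\n{response_b}\n\n"
        "Synthesize them into a single final answer."
    )
    return complete(model_c, messages + [{"role": "user", "content": synthesis_prompt}])
```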
This pattern is inspired by how stereoscopic vision works. Models A and B are like two eyes providing independent perspectives, and model C is like the visual cortex of the brain that infers depth information by comparing the two perspectives.
By detecting what is invariant across the two perspectives, model C can surface information that is not accessible to any of the three base models in isolation.
To control the variance vector of a given pair of models, each model can be given a different system prompt. For example, A can be instructed to be detail-oriented while B is instructed to consider the bigger picture. Different dialectical polarities can be used for different scenarios; a sketch of assembling such a payload follows the table.
| A | B |
|---|---|
| short-term | long-term |
| optimistic | pessimistic |
| pragmatic | theoretical |
| simple | complex |
| quantitative | qualitative |
| specific | general |
| emotional | intellectual |
| goal-oriented | awareness-oriented |
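
As a sketch of how such a payload might be assembled (the `trio_model` helper and prompt templates below are hypothetical, but the field names follow the TrioMember schema documented later):

```python
# Hypothetical helper: build a TrioModel payload from one polarity pair.
def trio_model(model_a: str, model_b: str, model_c: str,
               pole_a: str, pole_b: str) -> dict:
    def member(model: str, pole: str) -> dict:
        return {
            "model": model,
            "messages": [{
                "role": "system",
                "content": f"Approach the question from a {pole} perspective.",
            }],
        }
    # C gets no polarity prompt; its job is synthesis.
    return {"trio": [member(model_a, pole_a),
                     member(model_b, pole_b),
                     {"model": model_c}]}

payload = trio_model("llama3.2:1b", "mistral", "llama3.2:3b",
                     "short-term", "long-term")
```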
Any model in the trio can itself be a trio, enabling hierarchical synthesis:
```json
{
  "model": {
    "trio": [
      {
        "model": {
          "trio": [
            {"model": "model-x"},
            {"model": "model-y"},
            {"model": "model-z"}
          ]
        }
      },
      {"model": "model-b"},
      {"model": "model-c"}
    ]
  }
}
```

This allows multiple variance vectors to be composed to construct thinking frameworks such as the Rumsfeld matrix (known-knowns, known-unknowns, unknown-knowns, unknown-unknowns) or the Eisenhower matrix (important and urgent, important and not urgent, not important and urgent, not important and not urgent).
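
For instance, an Eisenhower-style composition might vary urgency within the inner trios and importance across them. The system prompts and model choices below are illustrative, not prescriptions:

```python
# Illustrative: composing two variance vectors into an Eisenhower-style
# hierarchy. Prompts and model names are examples only.
URGENT = {"role": "system", "content": "Focus on what is urgent."}
NOT_URGENT = {"role": "system", "content": "Focus on what can wait."}
IMPORTANT = {"role": "system", "content": "Weigh everything by importance."}
NOT_IMPORTANT = {"role": "system", "content": "Attend to what seems unimportant."}

def urgency_trio(importance_msg: dict) -> dict:
    # Inner trio varies urgency; the shared message fixes the importance pole.
    return {"trio": [
        {"model": "llama3.2:1b", "messages": [importance_msg, URGENT]},
        {"model": "llama3.2:1b", "messages": [importance_msg, NOT_URGENT]},
        {"model": "llama3.2:3b"},
    ]}

model = {"trio": [
    {"model": urgency_trio(IMPORTANT)},
    {"model": urgency_trio(NOT_IMPORTANT)},
    {"model": "llama3.2:3b"},  # top-level synthesizer
]}
```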
Trio exposes an OpenAI-compatible API at `/v1/chat/completions`.
To use trio synthesis, pass a `TrioModel` object as the `model` parameter:
```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": {
      "trio": [
        {"model": "llama3.2:1b"},
        {"model": "mistral"},
        {"model": "llama3.2:3b"}
      ]
    },
    "messages": [{"role": "user", "content": "What is the meaning of life?"}]
  }'
```

Each member can have custom messages (typically system prompts) to create dialectical tension:
```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": {
      "trio": [
        {
          "model": "llama3.2:1b",
          "messages": [{"role": "system", "content": "Focus on short-term, practical implications."}]
        },
        {
          "model": "mistral",
          "messages": [{"role": "system", "content": "Consider long-term, strategic implications."}]
        },
        {"model": "llama3.2:3b"}
      ]
    },
    "messages": [{"role": "user", "content": "Should I change careers?"}]
  }'
```

If `model` is a string, Trio bypasses synthesis and forwards the request directly to the backend:
```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "mistral", "messages": [{"role": "user", "content": "Hello!"}]}'
```

Request Schema:

| Field | Type | Required | Description |
|---|---|---|---|
| `model` | `string \| TrioModel` | Yes | Model name (pass-through) or trio definition |
| `messages` | `ChatMessage[]` | Yes | Conversation history |
| `max_tokens` | `int` | No | Max tokens to generate (default: 500) |
| `temperature` | `float` | No | Sampling temperature (default: 0.7) |
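
For example, the optional fields can be set alongside a trio definition. A sketch using the Python `requests` library; field names are from the schema above, values are examples:

```python
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": {"trio": [{"model": "llama3.2:1b"},
                           {"model": "mistral"},
                           {"model": "llama3.2:3b"}]},
        "messages": [{"role": "user", "content": "Summarize the trade-offs."}],
        "max_tokens": 300,   # default: 500
        "temperature": 0.2,  # default: 0.7
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```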
TrioModel Schema:
| Field | Type | Description |
|---|---|---|
| `trio` | `TrioMember[3]` | Exactly 3 members: A, B, and C |
TrioMember Schema:
| Field | Type | Required | Description |
|---|---|---|---|
| `model` | `string \| TrioModel` | Yes | Model name or nested trio |
| `messages` | `ChatMessage[]` | No | Messages prepended to request messages |
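
That is, a member's messages are placed before the request's messages when its model is called. A minimal illustration:

```python
# Member messages come first, then the request messages.
member_messages = [{"role": "system", "content": "Focus on short-term implications."}]
request_messages = [{"role": "user", "content": "Should I change careers?"}]
effective_messages = member_messages + request_messages
# -> [system: "Focus on short-term implications.",
#     user: "Should I change careers?"]
```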
Responses use the standard OpenAI chat completion format:
```json
{
  "id": "trio-abc123",
  "object": "chat.completion",
  "created": 1705000000,
  "model": "trio-1.0",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "Synthesized response..."},
    "finish_reason": "stop"
  }],
  "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
}
```

X-Trio-Details Header:
The response includes a custom header with synthesis details:
```json
{
  "response_a": "Response from model A...",
  "response_b": "Response from model B...",
  "model_a": "llama3.2:1b",
  "model_b": "mistral",
  "model_c": "llama3.2:3b"
}
```
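
A sketch of reading the header with Python's `requests` (assumes the server from the Quick Start below is running):

```python
import json
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": {"trio": [{"model": "llama3.2:1b"},
                           {"model": "mistral"},
                           {"model": "llama3.2:3b"}]},
        "messages": [{"role": "user", "content": "What is 2+2?"}],
    },
)
details = json.loads(resp.headers["X-Trio-Details"])
print(details["response_a"])  # model A's answer before synthesis
print(details["response_b"])  # model B's answer before synthesis
```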
Quick Start:

```bash
# Start all services
docker compose up -d

# Pull required Ollama models
docker compose exec ollama ollama pull llama3.2:1b
docker compose exec ollama ollama pull llama3.2:3b
docker compose exec ollama ollama pull mistral

# Test the health endpoint
curl http://localhost:8000/health

# Make a trio request
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": {
      "trio": [
        {"model": "llama3.2:1b"},
        {"model": "mistral"},
        {"model": "llama3.2:3b"}
      ]
    },
    "messages": [{"role": "user", "content": "What is 2+2?"}]
  }'
```

Running from Source:

```bash
git clone https://github.com/ibis-coordination/trio.git
cd trio
pip install .
TRIO_BACKEND_URL=http://localhost:4000 uvicorn src.main:app --host 0.0.0.0 --port 8000
```

Environment variables:
| Variable | Description | Default |
|---|---|---|
| `TRIO_BACKEND_URL` | LiteLLM/Ollama URL | `http://litellm:4000` |
| `TRIO_PORT` | Service port | `8000` |
| `TRIO_TIMEOUT` | Request timeout (seconds) | `120` |
Development:

```bash
# Install dependencies
pip install -e ".[dev]"

# Run locally (requires LiteLLM backend)
TRIO_BACKEND_URL=http://localhost:4000 uvicorn src.main:app --reload

# Run tests
pytest

# Run type checker
mypy src/
```

See CHANGELOG.md for release history and breaking changes.