A quiver of methods for seeking truth through AI councils.
v1.2.0
Query multiple AI models simultaneously. Compare, debate, verify, refine, and synthesize their responses.
📊 More Screenshots
Track model performance, costs, and usage patterns.
See detailed rankings, scores by criteria, and AI-generated analysis.
Philosophy • The Quiver • Quick Start • Docker • Configuration
There is no single method that reliably produces truth.
But there are appropriate methods for different kinds of questions.
Consilium (Latin for "council" or "deliberation") is an epistemological framework instantiated in software. Different questions require different methods of inquiry. Each mode is an "arrow" in your quiver, designed for a specific epistemic target.
No single AI has all the answers. Each LLM has different training data, reasoning approaches, and blind spots. What one model gets wrong, another might get right.
| Challenge | How Consilium Helps |
|---|---|
| AI hallucinations | Cross-check answers across multiple models (Veritas mode) |
| Model bias | Anonymous deliberation removes reputation bias (Consensus, Arbitrium) |
| Finding the right model | Blind preference voting reveals true preferences (Arbitrium) |
| Complex decisions | Structured debate surfaces all arguments (Debate, Elenchus) |
| Quality output | Sequential refinement polishes content (Limatura) |
| Comprehensive answers | Synthesize insights from multiple sources (Synthesis) |
This is methodological pluralism — the philosophical position that different domains of inquiry require different approaches:
| Question Type | Method | Consilium Mode |
|---|---|---|
| Factual claims | Verification | Veritas |
| Complex trade-offs | Dialectic | Debate |
| Bias reduction | Deliberation | Consensus, Arbitrium |
| Quality assessment | Cross-examination | Analysis, Elenchus |
| Capability testing | Empiricism | Peira |
| Comprehensive coverage | Integration | Synthesis |
| Quality improvement | Iteration | Limatura |
Consilium provides 12 distinct modes — each an arrow designed for a different target:
| Mode | Shortcut | Purpose | When to Use |
|---|---|---|---|
| Forum | `Ctrl+1` | Compare & Judge | General questions, find best answer |
| Debate | `Ctrl+2` | Round-Robin Discussion | Complex topics with trade-offs |
| Consensus | `Ctrl+3` | Anonymous Deliberation | Bias-reduced conclusions |
| Analysis | `Ctrl+4` | Multi-Judge Critique | Deep evaluation of one answer |
| Synthesis | `Ctrl+5` | Combine into One | Comprehensive coverage needed |
| Analytics | `Ctrl+6` | Performance Stats | Review usage and costs |
| Peira | `Ctrl+7` | Capability Testing | Benchmark model abilities |
| Elenchus | `Ctrl+8` | Adversarial Red Team | Stress-test code/ideas |
| Versus | `Ctrl+9` | Local vs Commercial | Compare local to cloud models |
| Arbitrium | `Ctrl+0` | Blind Preference Vote | Discover true preferences |
| Veritas | `Ctrl+-` | Fact Check & Verify | Detect hallucinations |
| Limatura | `Ctrl+=` | Iterative Polish | Refine through multiple passes |
| Prompting | `Ctrl+G` | Prompting Guide | Learn effective prompting techniques |
Latin: "forum" — public place of discussion
All selected models answer your question simultaneously, then an AI judge ranks them.
┌─────────────────────────────────────────────────────────┐
│ YOUR QUESTION │
└───────────────────────┬─────────────────────────────────┘
│
┌───────────────┼───────────────┐
▼ ▼ ▼
┌─────────┐ ┌─────────┐ ┌─────────┐
│ Model A │ │ Model B │ │ Model C │
└────┬────┘ └────┬────┘ └────┬────┘
│ │ │
▼ ▼ ▼
┌─────────────────────────────────────────────────────────┐
│ SIDE-BY-SIDE COMPARISON │
│ + Blind Evaluation (judges see "Response A" not names)│
└─────────────────────────────────────────────────────────┘
Best for: General questions, comparing writing styles, finding the best model for your use case
Features:
- Real-time streaming responses
- Blind evaluation (prevents model reputation bias)
- Follow-up questions with context
- Auto-evaluation ranks responses when complete
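The blind-evaluation step can be pictured with a small sketch (illustrative JavaScript, not the actual Consilium source): responses are relabeled "Response A", "Response B", ... before the judge sees them, and a mapping is kept so the rankings can be de-anonymized afterwards. The function and field names here are assumptions for illustration only.

```javascript
// Hypothetical sketch of blind evaluation: strip model identities before judging.
function anonymizeResponses(responses) {
  const mapping = {}; // label -> real model name, used after judging
  const anonymized = responses.map((r, i) => {
    const label = `Response ${String.fromCharCode(65 + i)}`; // A, B, C, ...
    mapping[label] = r.model;
    return { label, text: r.text }; // the judge only ever sees this shape
  });
  return { anonymized, mapping };
}

const { anonymized, mapping } = anonymizeResponses([
  { model: 'claude-sonnet-4.5', text: 'Answer one...' },
  { model: 'gpt-5.2', text: 'Answer two...' },
]);
```

The key property is that nothing the judge receives contains a model name, so "I trust GPT" style reputation bias cannot enter the ranking.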
Latin: "debattuere" — to fight, contend
A structured multi-round discussion where models build on each other's ideas.
┌─────────────────────────────────────────────────────────┐
│ YOUR TOPIC │
└───────────────────────┬─────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ ROUND 1 │
│ Model A → Model B → Model C (sees all previous) │
└───────────────────────┬─────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────┐
│ ROUND 2 │
│ Model A → Model B → Model C (builds on Round 1) │
└───────────────────────┬─────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────┐
│ AUTOMATIC CONSENSUS SUMMARY │
└─────────────────────────────────────────────────────────┘
Best for: Complex topics with trade-offs, controversial questions, exploring all sides
How to use:
- Select 2+ Participants
- Set number of Rounds (1-5)
- Models discuss round-robin, building on previous responses
- Automatic Consensus Summary generated at the end
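The round-robin loop above can be sketched as follows (assumed structure, not the actual implementation): each model's prompt includes the full transcript so far, which is why later speakers can build on earlier ones. The `ask` callback stands in for the real LLM call.

```javascript
// Hypothetical round-robin debate loop: every turn sees all previous turns.
function runDebate(models, topic, rounds, ask) {
  const transcript = [];
  for (let round = 1; round <= rounds; round++) {
    for (const model of models) {
      const context = transcript
        .map((t) => `[Round ${t.round}] ${t.model}: ${t.text}`)
        .join('\n');
      const text = ask(model, topic, context); // real LLM call stubbed out
      transcript.push({ round, model, text });
    }
  }
  return transcript;
}

// Stub "ask" that just reports how much context each speaker saw.
const transcript = runDebate(['A', 'B'], 'topic', 2, (model, topic, ctx) =>
  `${model} saw ${ctx ? ctx.split('\n').length : 0} prior turns`);
```

With 2 models and 2 rounds, the last speaker sees three prior turns, which is exactly the "builds on Round 1" behavior in the diagram.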
Latin: "consensus" — agreement, harmony
Models deliberate anonymously over multiple rounds to find where they agree.
┌─────────────────────────────────────────────────────────┐
│ YOUR QUESTION │
└───────────────────────┬─────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ ROUND 0 - INITIAL POSITIONS │
│ Each model answers independently │
│ Responses anonymized: Position A, B, C, D... │
└───────────────────────┬─────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────┐
│ ROUNDS 1-3 - DELIBERATION │
│ Each model sees ALL anonymized positions │
│ (but NOT who said what - prevents bias) │
│ Task: Consider others, identify agreements/disputes, │
│ refine position, move toward consensus │
└───────────────────────┬─────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────┐
│ FINAL - ARBITER SYNTHESIS │
│ ✅ Consensus answer (if agreement reached) │
│ OR │
│ 📊 Summary: What they agree on + What remains disputed │
└─────────────────────────────────────────────────────────┘
Best for: Reducing model bias, finding fundamental agreements, cross-validated answers
Key difference from Debate: Models don't know who said what during deliberation, preventing "I agree with GPT because it's GPT" bias.
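The deliberation prompt might be assembled like this (a minimal sketch under assumed prompt wording, not Consilium's actual prompt): every position is shown, but only under an anonymous label.

```javascript
// Hypothetical deliberation prompt builder: positions are visible, authors are not.
function buildDeliberationPrompt(question, positions) {
  const listed = positions
    .map((p, i) => `Position ${String.fromCharCode(65 + i)}:\n${p.text}`)
    .join('\n\n');
  return [
    `Question: ${question}`,
    `Here are all current positions (authors hidden):`,
    listed,
    `Identify agreements and disputes, then refine your own position.`,
  ].join('\n\n');
}

const prompt = buildDeliberationPrompt('Is P=NP?', [
  { model: 'llama3.3', text: 'Probably not.' },
  { model: 'gpt-5.2', text: 'Unknown; most researchers believe not.' },
]);
```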
Greek: "analusis" — breaking up, investigation
One model answers, multiple analysts evaluate the response from different perspectives.
┌─────────────────────────────────────────────────────────┐
│ YOUR QUESTION │
└───────────────────────┬─────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ ANSWERER MODEL RESPONDS │
└───────────────────────┬─────────────────────────────────┘
│
┌───────────────┼───────────────┐
▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Analyst 1│ │ Analyst 2│ │ Analyst 3│
│ Evaluates│ │ Evaluates│ │ Evaluates│
└────┬─────┘ └────┬─────┘ └────┬─────┘
│ │ │
▼ ▼ ▼
┌─────────────────────────────────────────────────────────┐
│ MULTI-PERSPECTIVE EVALUATION & SCORING │
└─────────────────────────────────────────────────────────┘
Best for: Deep critique, understanding strengths/weaknesses, academic review
Greek: "sunthesis" — putting together
Multiple models answer, one synthesizer combines the best parts into a unified response.
┌─────────────────────────────────────────────────────────┐
│ YOUR QUESTION │
└───────────────────────┬─────────────────────────────────┘
│
┌───────────────┼───────────────┐
▼ ▼ ▼
┌─────────┐ ┌─────────┐ ┌─────────┐
│ Source 1│ │ Source 2│ │ Source 3│
│ Answers │ │ Answers │ │ Answers │
└────┬────┘ └────┬────┘ └────┬────┘
│ │ │
└──────────────┼──────────────┘
▼
┌─────────────────────────┐
│ SYNTHESIZER MODEL │
│ Combines all responses │
│ into unified answer │
└─────────────────────────┘
Best for: Research requiring comprehensive coverage, combining expertise, unified summaries
Greek: "πεῖρα" (peira) — trial, experiment, test
Systematically test what models can and cannot do with structured benchmarks.
┌─────────────────────────────────────────────────────────┐
│ SELECT TEST CATEGORY │
│ [Coding] [Math] [Reasoning] [Knowledge] [Creative] │
└───────────────────────┬─────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ SELECT MODELS TO TEST │
│ □ Claude Sonnet 4.5 □ GPT-5.2 □ Gemini 3 Pro │
└───────────────────────┬─────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ STRUCTURED TEST BATTERY │
│ Each model receives identical test prompts │
│ for fair comparison │
└───────────────────────┬─────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ CAPABILITY REPORT │
│ ┌─────────────────────────────────────────────────┐ │
│ │ Model │ Score │ Speed │ Style │ │
│ │ Claude Sonnet │ 92% │ 45t/s │ Detailed │ │
│ │ GPT-5.2 │ 89% │ 52t/s │ Concise │ │
│ │ Gemini 3 Pro │ 87% │ 61t/s │ Structured │ │
│ └─────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
Test Categories:
- Coding: Algorithm implementation, debugging, code review
- Math/Logic: Arithmetic, word problems, proofs, puzzles
- Reasoning: Syllogisms, analogies, causal reasoning
- Knowledge: Trivia, history, science, current events
- Creativity: Storytelling, poetry, brainstorming
Unique Value: This is the only mode where the question is fundamentally about the models themselves, not the world.
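A capability report like the one in the diagram reduces to simple aggregation over identical test batteries. This sketch uses made-up numbers and assumed field names; it is not the actual Peira scoring code.

```javascript
// Illustrative scoring: every model answers the same battery; the report
// aggregates percent correct and generation speed.
function scoreBattery(results) {
  // results: [{ model, passed, total, tokens, seconds }]
  return results.map((r) => ({
    model: r.model,
    score: Math.round((r.passed / r.total) * 100), // percent of tests passed
    speed: Math.round(r.tokens / r.seconds),       // tokens per second
  }));
}

const report = scoreBattery([
  { model: 'claude-sonnet', passed: 23, total: 25, tokens: 900, seconds: 20 },
  { model: 'gpt-5.2', passed: 22, total: 25, tokens: 1040, seconds: 20 },
]);
```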
Greek: "ἔλεγχος" (elenchus) — cross-examination, refutation (Socrates' method)
Stress-test ideas, code, or plans by having models attack them.
┌─────────────────────────────────────────────────────────┐
│ YOUR CONTENT TO BE CHALLENGED │
│ (code, argument, plan, proposal, idea) │
└───────────────────────┬─────────────────────────────────┘
│
┌───────────────┴───────────────┐
▼ ▼
┌───────────────┐ ┌───────────────┐
│ DEFENDER │ │ CHALLENGERS │
│ (1 model) │ ⚔️ VS ⚔️ │ (1+ models) │
│ Defends the │ │ Attack/find │
│ content │ │ flaws │
└───────┬───────┘ └───────┬───────┘
│ │
└───────────────┬───────────────┘
▼
┌─────────────────────────────────────────────────────────┐
│ ROUND 1: Challengers attack │
│ ROUND 2: Defender responds │
│ ROUND 3: Challengers counter │
│ ... │
└───────────────────────┬─────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────┐
│ ARBITER VERDICT (optional) │
│ • Vulnerabilities found │
│ • Defenses successful │
│ • Final assessment │
└─────────────────────────────────────────────────────────┘
Use Cases:
- Security Review: "Find vulnerabilities in this code"
- Argument Testing: "What's wrong with this reasoning?"
- Business Plans: "What could go wrong with this strategy?"
- Risk Assessment: "Why shouldn't I do this?"
Unique Value: Systematic adversarial testing. Truth survives challenge.
Latin: "versus" — against, turned toward
Compare your local models against commercial frontier models with blind evaluation.
┌─────────────────────────────────────────────────────────┐
│ YOUR PROMPT │
└───────────────────────┬─────────────────────────────────┘
│
┌───────────────┴───────────────┐
▼ ▼
┌───────────────────────┐ ┌───────────────────────┐
│ LOCAL COUNCIL │ │ COMMERCIAL COUNCIL │
│ • llama3.3:70b │ │ • Claude Sonnet 4.5 │
│ • qwen2.5:32b │ │ • GPT-5.2 │
│ • deepseek-r1:14b │ │ • Gemini 3 Pro │
└───────────┬───────────┘ └───────────┬───────────┘
│ │
▼ ▼
┌───────────────────────┐ ┌───────────────────────┐
│ SYNTHESIZE into │ │ SYNTHESIZE into │
│ one council answer │ │ one council answer │
└───────────┬───────────┘ └───────────┬───────────┘
│ │
└───────────┬───────────────┘
▼
┌─────────────────────────────────────────────────────────┐
│ BLIND JUDGE EVALUATION │
│ (Compares councils AND local vs each individual model) │
└───────────────────────┬─────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────┐
│ RESULTS & INSIGHTS │
│ 🏆 Winner: [Local Council/Commercial Council] │
│ 💰 Cost: Local $0 vs Commercial $X.XX │
│ 📊 Savings if local wins: $X.XX saved! │
│ │
│ 🎯 Local Council vs Individual Models: │
│ ┌────────┐ ┌────────┐ ┌────────┐ │
│ │ ✅ │ │ ❌ │ │ 🤝 │ │
│ │ Claude │ │ GPT-5 │ │ Gemini │ │
│ └────────┘ └────────┘ └────────┘ │
│ ✅ = Local council beats model │
│ ❌ = Model beats local council │
└─────────────────────────────────────────────────────────┘
How It Works:
- Both councils answer your question (models run serially for quality)
- Each council synthesizes individual responses into one unified answer
- Judge compares synthesized answers (Council A vs B) — blind, fair
- Judge also compares local synthesis vs each individual commercial model
- Results show: winner, cost saved, and whether your council beats frontier models individually
Best for: Testing if local models can replace paid APIs, finding which tasks locals handle well
Unique Value: Two levels of insight:
- "Is my local council as good as commercial?" (synthesis vs synthesis)
- "Can my local council beat individual frontier models?" (teamwork vs individuals)
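The cost line in the results panel follows from a simple rule, sketched here under the assumption (stated above) that local inference carries no per-token API fee:

```javascript
// Hypothetical savings calculation: if the local council wins the blind
// judgment, the commercial API spend counts as money saved.
function versusResult(winner, commercialCostUsd) {
  const localCost = 0; // local models have no per-token API fee
  return {
    winner,
    localCost,
    commercialCost: commercialCostUsd,
    savings: winner === 'local' ? commercialCostUsd - localCost : 0,
  };
}

const result = versusResult('local', 0.42);
```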
Latin: "arbitrium" — judgment, decision, free will
Discover your true preferences without model reputation bias.
┌─────────────────────────────────────────────────────────┐
│ YOUR QUESTION │
└───────────────────────┬─────────────────────────────────┘
│
┌───────────────┴───────────────┐
▼ ▼
┌───────────────────────┐ ┌───────────────────────┐
│ RESPONSE A │ │ RESPONSE B │
│ (Model hidden) │ │ (Model hidden) │
│ │ │ │
│ [Full response │ │ [Full response │
│ displayed here] │ │ displayed here] │
│ │ │ │
└───────────┬───────────┘ └───────────┬───────────┘
│ │
└───────────┬───────────────┘
▼
┌─────────────────────────────────────────────────────────┐
│ WHICH DO YOU PREFER? │
│ [Vote A] [Vote B] │
└───────────────────────┬─────────────────────────────────┘
│ (after voting)
▼
┌─────────────────────────────────────────────────────────┐
│ REVEAL: You chose Claude Sonnet 4.5! │
│ Your preference data feeds into personal analytics │
└─────────────────────────────────────────────────────────┘
Features:
- Blind by default — no peeking at model names
- Reveal after voting — see which model you actually preferred
- Preference tracking — builds personal model rankings over time
- Arena-style data — similar to LMSYS Chatbot Arena, but personal
Unique Value: Removes reputation bias. You might discover you prefer different models than you thought!
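The vote-then-reveal flow can be sketched in a few lines (assumed shape, not the actual Arbitrium code): the UI only ever renders the `shown` pair, and the model name is looked up only after the vote is cast.

```javascript
// Hypothetical blind A/B pair: names hidden until after the vote.
function blindPair(responses) {
  return {
    shown: [
      { label: 'A', text: responses[0].text },
      { label: 'B', text: responses[1].text },
    ],
    reveal(vote) {
      const chosen = vote === 'A' ? responses[0] : responses[1];
      return { chosenModel: chosen.model }; // feeds personal analytics
    },
  };
}

const pair = blindPair([
  { model: 'claude-sonnet-4.5', text: 'first answer' },
  { model: 'gpt-5.2', text: 'second answer' },
]);
const revealed = pair.reveal('A');
```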
Latin: "veritas" — truth
Detect hallucinations and verify factual claims through cross-model consensus.
┌─────────────────────────────────────────────────────────┐
│ CLAIM OR QUESTION TO VERIFY │
│ "The Great Wall of China is visible from space" │
└───────────────────────┬─────────────────────────────────┘
│
┌───────────────┼───────────────┐
▼ ▼ ▼
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
│ VERIFIER 1 │ │ VERIFIER 2 │ │ VERIFIER 3 │
│ Claude Sonnet │ │ GPT-5.2 │ │ Gemini Pro │
│ │ │ │ │ │
│ Verdict: FALSE│ │ Verdict: FALSE│ │ Verdict: FALSE│
│ Confidence:95%│ │ Confidence:92%│ │ Confidence:88%│
│ Citations: ✓ │ │ Citations: ✓ │ │ Citations: ✓ │
└───────────────┘ └───────────────┘ └───────────────┘
│ │ │
└───────────────┼───────────────┘
▼
┌─────────────────────────────────────────────────────────┐
│ ANALYZER SYNTHESIS │
│ ──────────────────────────────────────────────────── │
│ OVERALL VERDICT: FALSE │
│ CONFIDENCE: 92% │
│ │
│ CONSENSUS FACTS: │
│ ✅ All models agree the claim is false │
│ ✅ Cited NASA astronaut testimonies │
│ ✅ Referenced physics of human vision │
│ │
│ KEY EVIDENCE: │
│ • Wall is ~30ft wide, not visible at orbital altitude │
│ • Myth debunked by multiple astronauts │
│ │
│ DISPUTED: None │
│ UNSUPPORTED: None │
└─────────────────────────────────────────────────────────┘
Three Verification Methods:
| Method | Description | Best For |
|---|---|---|
| 🧠 Memory Only | Uses model training data only. "If unknown, say so." | Testing model knowledge without external sources |
| 🌐 Shared Research | One search, all models get same results | Fair comparison with consistent evidence |
| 🔍 Independent Research | Each model searches independently | Seeing how models approach verification differently |
Independent Research - Source Comparison: When using Independent Research mode, Veritas compares sources found by different models:
- Common Sources: URLs found by multiple models (high confidence)
- Unique Sources: URLs only one model found (may reveal blind spots)
- Search Queries: See what each model searched for
Verification Flow:
- Select verification method (Memory Only / Shared Research / Independent Research)
- Enter claim or question to verify
- Multiple verifier models independently assess truthfulness with citations
- Analyzer model synthesizes final report
- Report shows: consensus facts, disputed claims, confidence levels
Best for: Fact-checking before publishing, detecting hallucinations, verifying information
Unique Value: Structured hallucination detection with flexible research options. Trust but verify.
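One plausible way the analyzer could combine per-model verdicts (an assumption for illustration; the real analyzer is itself an LLM) is majority vote with confidence averaged over the agreeing models:

```javascript
// Sketch of verdict aggregation: majority verdict wins; confidence is the
// mean confidence of the models that backed the winning verdict.
function aggregateVerdicts(verdicts) {
  // verdicts: [{ model, verdict: 'TRUE'|'FALSE'|'UNCERTAIN', confidence }]
  const tally = {};
  for (const v of verdicts) tally[v.verdict] = (tally[v.verdict] || 0) + 1;
  const winner = Object.keys(tally).sort((a, b) => tally[b] - tally[a])[0];
  const backing = verdicts.filter((v) => v.verdict === winner);
  const confidence = Math.round(
    backing.reduce((sum, v) => sum + v.confidence, 0) / backing.length
  );
  return { verdict: winner, confidence, agreement: backing.length / verdicts.length };
}

const finalReport = aggregateVerdicts([
  { model: 'claude', verdict: 'FALSE', confidence: 95 },
  { model: 'gpt', verdict: 'FALSE', confidence: 92 },
  { model: 'gemini', verdict: 'FALSE', confidence: 88 },
]);
```

With the three unanimous FALSE verdicts from the diagram, this yields the 92% overall confidence shown there.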
Latin: "limatura" — filing, polishing, refinement
Polish and improve output through sequential model passes.
┌─────────────────────────────────────────────────────────┐
│ CONTENT TO POLISH │
│ (code, text, email, document) │
└───────────────────────┬─────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ V0: ORIGINAL (Model A creates initial response) │
│ "Here is my first draft of the email..." │
└───────────────────────┬─────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ V1: FIRST REFINEMENT (Model B improves V0) │
│ "Here is the improved version with clearer..." │
└───────────────────────┬─────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ V2: SECOND REFINEMENT (Model C improves V1) │
│ "Here is the polished final version..." │
└───────────────────────┬─────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ VERSION COMPARISON │
│ ┌─────────────────────────────────────────────────┐ │
│ │ V0 (Original) │ V1 (Refined) │ V2 (Polished) │ │
│ │ [View] │ [View] │ [View] ★ │ │
│ └─────────────────────────────────────────────────┘ │
│ Current: V2 by Model C │
│ [Copy Final] [Continue Refining] │
└─────────────────────────────────────────────────────────┘
Refinement Types:
- General improvement: "Make this better"
- Style refinement: "Make this more concise/formal/casual"
- Code refinement: "Optimize and clean this code"
- Custom instruction: User-defined refinement criteria
Best for: Code optimization, document drafting, email refinement, creative writing polish
Unique Value: Sequential improvement, not just comparison. Each model builds on the last.
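The chain structure is the important part: each refiner receives only the latest version, never the original. A minimal sketch (assumed shape, with stub refiners standing in for LLM calls):

```javascript
// Hypothetical refinement chain: V1 improves V0, V2 improves V1, and so on.
function refineChain(original, refiners) {
  const versions = [{ version: 'V0', model: 'original', text: original }];
  refiners.forEach((refiner, i) => {
    const prev = versions[versions.length - 1].text; // always the latest version
    versions.push({
      version: `V${i + 1}`,
      model: refiner.model,
      text: refiner.fn(prev),
    });
  });
  return versions;
}

// Stub refiners that just annotate the text they received.
const versions = refineChain('draft', [
  { model: 'B', fn: (t) => `${t}+B` },
  { model: 'C', fn: (t) => `${t}+C` },
]);
```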
Purpose: Learn and apply effective prompting techniques
A comprehensive guide to crafting effective AI prompts with 8 proven formulas.
┌─────────────────────────────────────────────────────────┐
│ PROMPTING GUIDE │
│ │
│ 📋 8 PROVEN FORMULAS: │
│ │
│ 1. RTCF - Role, Task, Context, Format │
│ 2. CREATE - Character, Request, Examples... │
│ 3. RISEN - Role, Instructions, Steps, End Goal... │
│ 4. Chain-of-Thought - Step-by-step reasoning │
│ 5. Few-Shot Learning - Input/output examples │
│ 6. STAR - Situation, Task, Action, Result │
│ 7. Code Generation - Language, Requirements... │
│ 8. Self-Critique - Generate, critique, improve │
│ │
│ Each formula includes: │
│ • Component breakdown │
│ • Real-world examples │
│ • Best use cases │
│ • One-click copy │
└─────────────────────────────────────────────────────────┘
Available Formulas:
| Formula | Components | Best For |
|---|---|---|
| RTCF | Role + Task + Context + Format | General structured prompts |
| CREATE | Character + Request + Examples + Adjustments + Type + Extras | Detailed specifications |
| RISEN | Role + Instructions + Steps + End Goal + Narrowing | Multi-step tasks |
| Chain-of-Thought | Step-by-step reasoning | Complex reasoning problems |
| Few-Shot | Input → Output examples | Pattern learning |
| STAR | Situation + Task + Action + Result | Problem-solving narratives |
| Code Generation | Language + Requirements + Standards + Edge Cases | Programming tasks |
| Self-Critique | Generate → Critique → Improve | Quality iteration |
Best for: Learning prompt engineering, improving query quality, teaching prompting techniques
Unique Value: Reference guide for effective prompting, always available with Ctrl+G.
Use this table to choose the right mode for your question:
| Question Type | Recommended Mode | Why |
|---|---|---|
| "What is X?" (Factual) | Forum or Veritas | Forum for comparison; Veritas for accuracy |
| "What's the best X?" (Opinion) | Consensus or Arbitrium | Consensus reduces bias; Arbitrium reveals preference |
| Creative writing | Forum or Limatura | Forum for variety; Limatura for polish |
| Coding/Technical | Forum or Elenchus | Forum for solutions; Elenchus for security review |
| Controversial/Ethical | Debate | Models engage with counterarguments |
| "Should I do X?" (Decision) | Consensus or Elenchus | Consensus for recommendation; Elenchus for risks |
| Research/Comprehensive | Synthesis + Veritas | Synthesis for coverage; Veritas for accuracy |
| Security Review | Elenchus | Adversarial testing finds vulnerabilities |
| Model Benchmarking | Peira | Structured capability testing |
| Quick Comparison | Arbitrium | Fast blind preference voting |
| Quality Polish | Limatura | Iterative improvement chain |
| Hallucination Check | Veritas | Cross-model fact verification |
| Local vs Cloud | Versus | Data-driven cost/quality comparison |
- Multi-Model Comparison — Query multiple LLMs simultaneously
- Streaming Responses — Real-time output from all models
- Blind Evaluation — Anonymized judging prevents bias
- URL Content Fetching — Include webpage content in prompts
- Session Management — Save, tag, search, reload sessions
- Export Options — JSON, Markdown, CSV
- Knowledge Base (RAG) — Upload documents for context-aware responses
- Vision Support — Upload images for multi-model analysis
- Research Mode (SearXNG) — Web search before querying models
- Conversation Continuity — Follow-up questions with context
- Prompt Templates — Reusable prompts with variables
- Cost Tracking — Estimated API costs per response
- Model Analytics — Track which models win evaluations
- Pin/Favorite Responses — Star great responses
- Keyboard Shortcuts — Full keyboard navigation
- Local Model Support — Ollama and LM Studio integration
- Dark/Light Themes — Beautiful UI in both modes
- Model Sync — Fetch latest models from OpenRouter API
- Benchmark Sync — Update benchmark scores from HuggingFace Leaderboard
- Prompting Guide — Learn effective prompting techniques
- Node.js 18+
- npm or yarn
- API key from OpenRouter
💡 Why OpenRouter? One API key = access to 25+ models (OpenAI, Anthropic, Google, xAI, Mistral, and more). Pay-as-you-go pricing.
```bash
# Clone the repository
git clone https://github.com/lafintiger/Consilium.git
cd Consilium

# Install backend dependencies
cd backend
npm install

# Configure environment
cp ../env.example.txt .env
# Edit .env and add your OPENROUTER_API_KEY

# Start backend
npm run dev

# In a new terminal, install and start frontend
cd frontend
npm install
npm run dev
```

- Frontend: http://localhost:3800
- Backend API: http://localhost:3801
```bash
# Copy and configure environment
cp env.example.txt .env
# Edit .env with your API keys

# Build and start
docker compose up -d

# View logs
docker compose logs -f

# Stop
docker compose down
```

To update to the latest version:

```bash
git pull
docker compose down
docker compose build --no-cache
docker compose up -d
```

| Shortcut | Action |
|---|---|
| `Ctrl+1` | Forum mode |
| `Ctrl+2` | Debate mode |
| `Ctrl+3` | Consensus mode |
| `Ctrl+4` | Analysis mode |
| `Ctrl+5` | Synthesis mode |
| `Ctrl+6` | Analytics dashboard |
| `Ctrl+7` | Peira (capability testing) |
| `Ctrl+8` | Elenchus (red team) |
| `Ctrl+9` | Versus (local vs commercial) |
| `Ctrl+0` | Arbitrium (blind voting) |
| `Ctrl+-` | Veritas (fact check) |
| `Ctrl+=` | Limatura (iterative polish) |
| `Ctrl+G` | Prompting Guide |
| `Ctrl+R` | Toggle Research mode |
Copy `env.example.txt` to `.env` in the backend folder:

```bash
# Required
OPENROUTER_API_KEY=sk-or-v1-your-key-here

# Optional - Local Models
OLLAMA_URL=http://localhost:11434
LMSTUDIO_URL=http://localhost:1234

# Optional - Research Mode
SEARXNG_URL=http://localhost:4000

# Performance
LOCAL_MODELS_SEQUENTIAL=true  # Run local models one at a time
```

Consilium supports 25+ models via OpenRouter:
| Provider | Models |
|---|---|
| Anthropic | Claude Sonnet 4.5, Opus 4.5, Haiku 4.5 |
| OpenAI | GPT-5.2, GPT-5.2 Pro, GPT-5.1, o3 |
| Gemini 3 Pro, Gemini 2.5 Pro/Flash | |
| xAI | Grok 4, Grok 4 Fast, Grok 3 |
| DeepSeek | DeepSeek V3.2, V3.2 Speciale |
| Mistral | Mistral Large 3, Devstral 2 |
| Local | Any Ollama/LM Studio model |
```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Pull models
ollama pull llama3.3
ollama pull qwen2.5:32b
ollama pull deepseek-r1:14b

# Start Ollama server
ollama serve
```

Consilium automatically detects running Ollama models.
| Scenario | `OLLAMA_URL` |
|---|---|
| Both native | `http://localhost:11434` |
| Consilium in Docker, Ollama native | `http://host.docker.internal:11434` |
```
Consilium/
├── backend/              # Express.js API server
│   ├── src/
│   │   ├── index.js      # Server entry point
│   │   ├── config/       # Model configs, benchmarks
│   │   ├── db/           # SQLite database
│   │   └── routes/       # API endpoints
│   └── package.json
│
├── frontend/             # React + Vite + Tailwind
│   ├── src/
│   │   ├── components/   # React components
│   │   ├── constants/    # Mode definitions
│   │   ├── stores/       # Zustand state
│   │   └── types/        # TypeScript definitions
│   └── package.json
│
├── docker-compose.yml
├── DEVELOPER_GUIDE.md    # Developer guide
└── README.md             # This file
```
Consilium includes a built-in Retrieval Augmented Generation (RAG) system that lets you upload documents and have AI models answer questions using your own content.
┌─────────────────────────────────────────────────────────────┐
│ KNOWLEDGE BASE WORKFLOW │
└───────────────────────┬─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ 1. UPLOAD DOCUMENTS │
│ • Click Database icon (🗄️) in header │
│ • Create collections: "Tech Docs", "Research", etc. │
│ • Upload PDFs, Word docs, text files, Markdown │
└───────────────────────┬─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ 2. AUTOMATIC PROCESSING (Background) │
│ • Parse document → Extract text │
│ • Chunk text → Smart segmentation (~500 tokens each) │
│ • Generate embeddings → Ollama qwen3-embedding:8b │
│ • Store in SQLite database │
└───────────────────────┬─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ 3. QUERY WITH KNOWLEDGE │
│ • Toggle "Knowledge" button in prompt input │
│ • Select specific collection or "All Collections" │
│ • Ask your question │
└───────────────────────┬─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ 4. SEMANTIC SEARCH & AUGMENTATION │
│ • Your question → Embedded → Compare to chunks │
│ • Top 5 most relevant chunks retrieved │
│ • Chunks added as context to your prompt │
│ • All models receive the augmented prompt │
└───────────────────────┬─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ 5. VIEW SOURCES │
│ • "Knowledge Base Sources" panel shows retrieved chunks │
│ • Document name, collection, similarity score │
│ • Preview of the chunk content │
└─────────────────────────────────────────────────────────────┘
| Type | Extension | Notes |
|---|---|---|
| PDF | `.pdf` | Text extraction via pdf-parse |
| Word | `.docx` | Modern Word format via mammoth |
| Text | `.txt` | Plain text files |
| Markdown | `.md` | Markdown files |
Organize your documents into themed collections:
┌─────────────────────────────────────────────────────────────┐
│ 📂 Collections │
├─────────────────────────────────────────────────────────────┤
│ 🏥 Medical Research │ 12 docs │ 342 chunks │
│ 💻 Tech Documentation │ 8 docs │ 215 chunks │
│ 📋 Company Policies │ 5 docs │ 89 chunks │
│ 📖 General │ 3 docs │ 47 chunks │
└─────────────────────────────────────────────────────────────┘
- Create collections with custom names, colors, and descriptions
- Filter searches to specific collections or search all
- Move documents between collections as needed
- Delete collections without losing documents (they go to "uncategorized")
- Ollama must be running with an embedding model:

  ```bash
  # Install the embedding model
  ollama pull qwen3-embedding:8b

  # Start Ollama server
  ollama serve
  ```

- Status Check: The Knowledge Panel shows embedding model status:
  - ✅ Green = Ready to process documents
  - ❌ Red = Embedding model not available
| Environment Variable | Default | Description |
|---|---|---|
| `EMBEDDING_MODEL` | `qwen3-embedding:8b` | Ollama embedding model to use |
| `KNOWLEDGE_TOP_K` | `5` | Max chunks to retrieve per query |
| `KNOWLEDGE_MIN_SIMILARITY` | `0.3` | Minimum similarity threshold |
| `KNOWLEDGE_MAX_TOKENS` | `8000` | Max tokens for context |
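The retrieval step governed by `KNOWLEDGE_TOP_K` and `KNOWLEDGE_MIN_SIMILARITY` can be sketched as standard cosine-similarity ranking (illustrative code with tiny 2-dimensional vectors; real embeddings have thousands of dimensions, and this is not the actual Consilium implementation):

```javascript
// Cosine similarity between two equal-length vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank chunks by similarity to the query embedding, drop those below the
// threshold, keep the top K — mirroring KNOWLEDGE_TOP_K and
// KNOWLEDGE_MIN_SIMILARITY above.
function retrieve(queryEmbedding, chunks, topK = 5, minSimilarity = 0.3) {
  return chunks
    .map((c) => ({ ...c, similarity: cosine(queryEmbedding, c.embedding) }))
    .filter((c) => c.similarity >= minSimilarity)
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, topK);
}

const hits = retrieve([1, 0], [
  { id: 1, embedding: [1, 0] },     // same direction: similarity 1
  { id: 2, embedding: [0.6, 0.8] }, // similarity 0.6
  { id: 3, embedding: [0, 1] },     // orthogonal: similarity 0, filtered out
], 5, 0.3);
```

The surviving chunks are then prepended to the prompt as context, which is the "augmentation" step in the workflow above.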
| Scenario | How to Use |
|---|---|
| Company Q&A Bot | Upload policy docs → Ask questions about procedures |
| Research Assistant | Upload papers → Ask for summaries and connections |
| Documentation Search | Upload tech docs → Query specific APIs or features |
| Study Helper | Upload course materials → Ask practice questions |
| Legal Research | Upload contracts → Query for specific clauses |
Knowledge Base works alongside other Consilium features:
| Combination | Result |
|---|---|
| Knowledge + Forum | Multiple models answer using your documents |
| Knowledge + Veritas | Fact-check claims against your own sources |
| Knowledge + Synthesis | Combine document insights from multiple models |
| Knowledge + Research | Use both your docs AND web search |
Polyform Noncommercial 1.0.0 — See LICENSE for details.
| Use Case | Allowed |
|---|---|
| Educators & Students | ✅ Free |
| Personal/Hobby Use | ✅ Free |
| Non-profit Organizations | ✅ Free |
| Research | ✅ Free |
| Commercial Use | ❌ Contact for license |
- OpenRouter for unified LLM API access
- Ollama for local model inference
- Vite + React + Tailwind CSS
🏹 A quiver of methods for seeking truth
Built with 🧠 — Seeking Truth Through AI Councils


