Collective AI Intelligence — Instead of asking one LLM, convene a council of AI models that deliberate, peer-review, and synthesize the best answer.
Instead of asking a single LLM (like ChatGPT or Claude) for an answer, LLM Council Plus assembles a council of multiple AI models that:
- Independently answer your question (Stage 1)
- Anonymously peer-review each other's responses (Stage 2)
- Synthesize a final answer through a Chairman model (Stage 3)
The result? More balanced, accurate, and thoroughly vetted responses that leverage the collective intelligence of multiple AI models.
┌─────────────────────────────────────────────────────────────────┐
│ YOUR QUESTION │
│ (+ optional web search for real-time info) │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ STAGE 1: DELIBERATION │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │ Claude │ │ GPT-4 │ │ Gemini │ │ Llama │ ... │
│ └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘ │
│ │ │ │ │ │
│ ▼ ▼ ▼ ▼ │
│ Response A Response B Response C Response D │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ STAGE 2: PEER REVIEW │
│ Each model reviews ALL responses (anonymized as A, B, C, D) │
│ and ranks them by accuracy, insight, and completeness │
│ │
│ Rankings are aggregated to identify the best responses │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ STAGE 3: SYNTHESIS │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ CHAIRMAN MODEL │ │
│ │ Reviews all responses + rankings + search context │ │
│ │ Synthesizes the council's collective wisdom │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ FINAL ANSWER │
└─────────────────────────────────────────────────────────────────┘
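The three-stage pipeline above can be sketched as a small async program. This is an illustrative sketch, not the project's actual code; `ask_model` is a hypothetical stand-in for a real provider call (OpenRouter, Ollama, etc.), and Stage 1 simply fans the question out to all council members in parallel:

```python
import asyncio

# Hypothetical stand-in for a real provider call (OpenRouter, Ollama, etc.)
async def ask_model(model: str, prompt: str) -> str:
    return f"{model}'s answer to: {prompt}"

async def run_council(models: list[str], question: str) -> dict[str, str]:
    # Stage 1: every council member answers independently, in parallel
    responses = await asyncio.gather(*(ask_model(m, question) for m in models))
    return dict(zip(models, responses))

answers = asyncio.run(run_council(["claude", "gpt-4", "gemini"], "Why is the sky blue?"))
for model, answer in answers.items():
    print(model, "->", answer)
```

Stages 2 and 3 would then feed these anonymized answers back to the same models for ranking, and hand everything to the Chairman for synthesis.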
Mix and match models from different sources in your council:
| Provider | Type | Description |
|---|---|---|
| OpenRouter | Cloud | 100+ models via single API (GPT-4, Claude, Gemini, Mistral, etc.) |
| Ollama | Local | Run open-source models locally (Llama, Mistral, Phi, etc.) |
| Groq | Cloud | Ultra-fast inference for Llama and Mixtral models |
| OpenAI Direct | Cloud | Direct connection to OpenAI API |
| Anthropic Direct | Cloud | Direct connection to Anthropic API |
| Google Direct | Cloud | Direct connection to Google AI API |
| Mistral Direct | Cloud | Direct connection to Mistral API |
| DeepSeek Direct | Cloud | Direct connection to DeepSeek API |
| Custom Endpoint | Any | Connect to any OpenAI-compatible API (Together AI, Fireworks, vLLM, LM Studio, GitHub Models, etc.) |
Choose how deeply the council deliberates:
| Mode | Stages | Best For |
|---|---|---|
| Chat Only | Stage 1 only | Quick responses, comparing model outputs |
| Chat + Ranking | Stages 1 & 2 | See how models rank each other |
| Full Deliberation | All 3 stages | Complete council synthesis (default) |
Ground your council's responses in real-time information:
| Provider | Type | Notes |
|---|---|---|
| DuckDuckGo | Free | News search, no API key needed |
| Tavily | API Key | Purpose-built for LLMs, rich content |
| Brave Search | API Key | Privacy-focused, 2,000 free queries/month |
Full Article Fetching: Uses Jina Reader to extract full article content from top search results (configurable 0-10 results).
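Jina Reader works by prefixing the target URL with `https://r.jina.ai/`, which returns the page as LLM-friendly plain text. A minimal helper (an illustrative sketch, not the project's code):

```python
def jina_reader_url(article_url: str) -> str:
    """Build a Jina Reader URL; a GET to it returns the page as readable text."""
    return "https://r.jina.ai/" + article_url

print(jina_reader_url("https://example.com/story"))
# -> https://r.jina.ai/https://example.com/story
```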
Fine-tune creativity vs consistency:
- Council Heat: Controls Stage 1 response creativity (default: 0.5)
- Chairman Heat: Controls final synthesis creativity (default: 0.4)
- Stage 2 Heat: Controls peer ranking consistency (default: 0.3)
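In chat-completions-style APIs, these "heat" values map to the `temperature` parameter of each stage's request. A hedged sketch of how the defaults above might be wired up (the dictionary keys and function name are illustrative, not the app's actual config keys):

```python
# Illustrative mapping of the default "heat" settings to API temperatures
STAGE_TEMPERATURES = {
    "stage1_deliberation": 0.5,  # Council Heat
    "stage2_ranking": 0.3,       # Stage 2 Heat: lower = more consistent rankings
    "stage3_synthesis": 0.4,     # Chairman Heat
}

def build_request(stage: str, model: str, messages: list[dict]) -> dict:
    # Lower temperature favors deterministic output (good for ranking);
    # higher temperature allows more varied, creative responses.
    return {"model": model, "messages": messages,
            "temperature": STAGE_TEMPERATURES[stage]}

req = build_request("stage2_ranking", "gpt-4", [{"role": "user", "content": "Rank A-D"}])
```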
- Live Progress Tracking: See each model respond in real-time
- Council Sizing: Adjust the council size from 2 to 8 models
- Abort Anytime: Cancel in-progress requests
- Conversation History: All conversations saved locally
- Customizable Prompts: Edit Stage 1, 2, and 3 system prompts
- Rate Limit Warnings: Alerts when your config may hit API limits (when >5 council members)
- "I'm Feeling Lucky": Randomize your council composition
- Import & Export: Back up and share your favorite council configurations, system prompts, and settings
- Python 3.10+
- Node.js 18+
- uv (Python package manager)
```bash
# Clone the repository
git clone https://github.com/yourusername/llm-council-plus.git
cd llm-council-plus

# Install backend dependencies
uv sync

# Install frontend dependencies
cd frontend
npm install
cd ..
```

Option 1: Use the start script (recommended)

```bash
./start.sh
```

Option 2: Run manually

Terminal 1 (Backend):

```bash
uv run python -m backend.main
```

Terminal 2 (Frontend):

```bash
cd frontend
npm run dev
```

Then open http://localhost:5173 in your browser.
The application is configured to be accessible from other devices on your local network.
Using start.sh (automatic):
The start script exposes both frontend and backend on the network automatically. Just run ./start.sh and access from any device.

Access URLs:
- Local: http://localhost:5173
- Network: http://YOUR_IP:5173 (e.g., http://192.168.1.100:5173)

Find your network IP:

```bash
# macOS/Linux
ifconfig | grep "inet " | grep -v 127.0.0.1
# Or use hostname
hostname -I
```

Manual setup (if not using start.sh):

```bash
# Backend already listens on 0.0.0.0:8001
# Frontend with network access
cd frontend
npm run dev -- --host
```

The frontend automatically detects the hostname and connects to the backend on the same IP. CORS is configured to allow requests from any hostname on ports 5173 and 3000.
On first launch, the Settings panel will open automatically. Configure at least one LLM provider:
- LLM API Keys tab: Enter API keys for your chosen providers
- Council Config tab: Select council members and chairman
- Save Changes
| Provider | Get API Key |
|---|---|
| OpenRouter | openrouter.ai/keys |
| Groq | console.groq.com/keys |
| OpenAI | platform.openai.com/api-keys |
| Anthropic | console.anthropic.com |
| Google AI | aistudio.google.com/apikey |
| Mistral | console.mistral.ai/api-keys |
| DeepSeek | platform.deepseek.com |
API keys are auto-saved when you click "Test" and the connection succeeds.
- Install Ollama
- Pull models: `ollama pull llama3.1`
- Start Ollama: `ollama serve`
- In Settings, enter your Ollama URL (default: `http://localhost:11434`)
- Click "Connect" to verify
Connect to any OpenAI-compatible API:
- Go to LLM API Keys → Custom OpenAI-Compatible Endpoint
- Enter:
  - Display Name: e.g., "Together AI", "My vLLM Server"
  - Base URL: e.g., `https://api.together.xyz/v1`
  - API Key: (optional for local servers)
- Click "Connect" to test and save
Compatible services: Together AI, Fireworks AI, vLLM, LM Studio, Ollama (if you prefer this method), GitHub Models (https://models.inference.ai.azure.com/v1), and more.
- Enable Model Sources: Toggle which providers appear in model selection
- Select Council Members: Choose 2-8 models for your council
- Select Chairman: Pick a model to synthesize the final answer
- Adjust Temperature: Use sliders for creativity control
Tips:
- Mix different model families for diverse perspectives
- Use faster models (Groq, Ollama) for large councils
- Free OpenRouter models have rate limits (20/min, 50/day)
| Provider | Setup |
|---|---|
| DuckDuckGo | Works out of the box, no setup needed |
| Tavily | Get key at tavily.com, enter in Search Providers tab |
| Brave | Get key at brave.com/search/api, enter in Search Providers tab |
Search Query Processing:
| Mode | Description | Best For |
|---|---|---|
| Direct (default) | Sends your exact query to the search engine | Short, focused questions. Works best with semantic search engines like Tavily and Brave. |
| Smart Keywords (YAKE) | Extracts key terms from your prompt before searching | Very long prompts or multi-paragraph context that might confuse the search engine. Uses YAKE keyword extraction. |
Tip: Start with Direct mode. Only switch to YAKE if you notice search results are irrelevant when pasting long documents or complex prompts.
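YAKE itself is an unsupervised statistical extractor; as a rough illustration of the idea only (this is a simple frequency-based stand-in, not the actual YAKE algorithm), reducing a long prompt to search keywords might look like:

```python
import re
from collections import Counter

# Tiny stopword list for illustration; real extractors use much richer ones
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "that", "for", "on", "about"}

def extract_keywords(text: str, top_n: int = 5) -> list[str]:
    words = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [w for w, _ in counts.most_common(top_n)]

query = extract_keywords(
    "A long multi-paragraph prompt about climate policy, climate models, "
    "and the economics of carbon pricing and carbon markets.")
print(" ".join(query))
```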
- Start a new conversation (+ button in sidebar)
- Type your question
- (Optional) Enable web search toggle for real-time info
- Press Enter or click Send
Stage 1 - Council Deliberation
- Tab view showing each model's individual response
- Live progress as models respond
Stage 2 - Peer Rankings
- Each model's evaluation and ranking of peers
- Aggregate scores showing consensus rankings
- De-anonymization reveals which model gave which response
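One common way to combine per-model rankings into a consensus order is a Borda-style count, where higher placement earns more points. This is an illustrative sketch, not necessarily the scoring rule the app uses:

```python
from collections import defaultdict

def aggregate_rankings(rankings: list[list[str]]) -> list[str]:
    """Each inner list ranks anonymized responses best-to-worst."""
    scores: dict[str, int] = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, response in enumerate(ranking):
            scores[response] += n - position  # 1st place earns the most points
    return sorted(scores, key=scores.get, reverse=True)

# Three reviewers rank four anonymized responses
consensus = aggregate_rankings([["A", "C", "B", "D"],
                                ["C", "A", "D", "B"],
                                ["A", "C", "B", "D"]])
print(consensus)  # -> ['A', 'C', 'B', 'D']
```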
Stage 3 - Chairman Synthesis
- Final, synthesized answer from the Chairman
- Incorporates best insights from all responses and rankings
| Key | Action |
|---|---|
| `Enter` | Send message |
| `Shift+Enter` | New line in input |
| Component | Technology |
|---|---|
| Backend | FastAPI, Python 3.10+, httpx (async HTTP) |
| Frontend | React 19, Vite, react-markdown |
| Styling | CSS with "Midnight Glass" dark theme |
| Storage | JSON files in data/ directory |
| Package Management | uv (Python), npm (JavaScript) |
All data is stored locally in the data/ directory:
```
data/
├── settings.json      # Your configuration (includes API keys)
└── conversations/     # Conversation history
    ├── {uuid}.json
    └── ...
```
Privacy: No data is sent to external servers except API calls to your configured LLM providers.
⚠️ Security Warning: API Keys Stored in Plain Text

In this build, API keys are stored in clear text in `data/settings.json`. The `data/` folder is included in `.gitignore` by default to prevent accidental exposure.

Important:
- Do NOT remove `data/` from `.gitignore`; this protects your API keys from being pushed to GitHub
- If you fork this repo or modify `.gitignore`, ensure `data/` remains ignored
- Never commit `data/settings.json` to version control
- If you accidentally expose your keys, rotate them immediately at each provider's dashboard
"Failed to load conversations"
- Backend might still be starting up
- App retries automatically (3 attempts with 1s, 2s, 3s delays)
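The retry behavior described above (three attempts with growing delays) can be sketched as follows. The actual app implements this in the frontend; this Python version is only illustrative, with the delays made parameterizable:

```python
import time

def fetch_with_retry(fetch, delays=(1, 2, 3)):
    """Attempt `fetch` up to len(delays) times, sleeping between attempts."""
    for attempt, delay in enumerate(delays, start=1):
        try:
            return fetch()
        except Exception:
            if attempt == len(delays):
                raise          # out of attempts: surface the error
            time.sleep(delay)  # back off, then retry

# Demo: a call that fails twice before succeeding (zero delays to keep it fast)
attempts = {"n": 0}
def flaky_fetch():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("backend still starting")
    return "conversations loaded"

print(fetch_with_retry(flaky_fetch, delays=(0, 0, 0)))  # succeeds on the 3rd attempt
```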
Models not appearing in dropdown
- Ensure the provider is enabled in Council Config
- Check that API key is configured and tested successfully
- For Ollama, verify connection is active
Jina Reader returns 451 errors
- HTTP 451 = site blocks AI scrapers (common with news sites)
- Try Tavily/Brave instead, or set `full_content_results` to 0
Rate limit errors (OpenRouter)
- Free models: 20 requests/min, 50/day
- Consider using Groq (14,400/day) or Ollama (unlimited)
- Reduce council size for free tier usage
Binary compatibility errors (node_modules)
- When syncing between Intel/Apple Silicon Macs: `rm -rf frontend/node_modules && cd frontend && npm install`
- Backend logs: Terminal running `uv run python -m backend.main`
- Frontend logs: Browser DevTools console
This project is a fork and enhancement of the original llm-council by Andrej Karpathy.
LLM Council Plus builds upon the original "vibe coded" foundation with:
- Multi-provider support (OpenRouter, Ollama, Groq, Direct APIs, Custom endpoints)
- Web search integration (DuckDuckGo, Tavily, Brave + Jina Reader)
- Execution modes (Chat Only, Chat + Ranking, Full Deliberation)
- Temperature controls for all stages
- Enhanced Settings UI with import/export
- Real-time streaming with progress tracking
- And much more...
We gratefully acknowledge Andrej Karpathy for the original inspiration and codebase.
MIT License - see LICENSE for details.
Contributions are welcome! This project embraces the spirit of "vibe coding" - feel free to fork and make it your own.
Built with the collective wisdom of AI
Ask the council. Get better answers.