# LLM Reflection Lab

An interactive research tool for exploring recursive reasoning in Large Language Models through iterative self-reflection. Watch as LLMs refine their thinking across multiple iterations without external feedback.
This project implements a thinking loop system where LLMs iteratively improve their responses through self-reflection. Unlike traditional single-shot prompting, this approach allows models to:
- Reflect on their previous reasoning
- Identify gaps and assumptions
- Refine their answers progressively
- Explore different reasoning paths
## Features
- Multi-Model Support: Works with Ollama, vLLM, and OpenRouter APIs
- Reasoning Extraction: Captures explicit reasoning from `<think>` tags or native reasoning fields
- YOLO Mode: Run iterations until convergence is detected automatically
  - Configurable convergence threshold (80-99%)
  - Choose similarity comparison mode: "Response Only" (default) or "Reasoning + Response"
- Prompt Templates: Pre-configured epistemic approaches
  - Socratic Method, Empirical-Scientific, Dialectical Synthesis, Systems Thinking, and more
  - Easy template switching via dropdown in prompt editor
- Customizable Prompts: Edit system prompts and reflection templates via UI
- Auto-Save: Experiments saved automatically in JSON format
- Export Options:
  - PDF Reports: Professional reports with visualizations, charts, and complete appendix
  - HTML Reports: Interactive web-based reports
  - Smart Filenames: AI-generated descriptive filenames based on question content
- Concept Evolution Graph: Network showing how concepts emerge and connect
- Similarity Heatmap: Matrix of iteration similarities to identify convergence
- Confidence Tracking: Evolution of certainty/uncertainty markers
- Complexity Metrics: Vocabulary diversity and logical connector usage
- Topic Flow Sankey: How topics persist or change between iterations
- Convergence Timeline: Tracks exploration vs exploitation phases
## Requirements

- Python 3.10+
- One of: Ollama, a vLLM server, or an OpenRouter API key
## Installation

Using uv:

```bash
# Clone the repository
git clone https://github.com/chheplo/llm-reflection-lab.git
cd llm-reflection-lab

# Install with uv
uv sync
```

Using pip:

```bash
# Clone the repository
git clone https://github.com/chheplo/llm-reflection-lab.git
cd llm-reflection-lab

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

## Running the App

```bash
# With uv
uv run streamlit run app.py

# With pip
streamlit run app.py
```

To launch with a minimal toolbar, headless server, and no usage stats:

```bash
# With uv
uv run streamlit run app.py --client.toolbarMode=minimal --server.headless true --browser.gatherUsageStats false

# With pip
streamlit run app.py --client.toolbarMode=minimal --server.headless true --browser.gatherUsageStats false
```

The app will open at http://localhost:8501.
## Model Setup

### Ollama (Local)

- Install Ollama
- Pull a model: `ollama pull gpt-oss:20b`
- Start Ollama: `ollama serve`
- Select "Ollama (Local)" in the app

### vLLM

- Start your vLLM server
- Enter the server URL and API key
- Click "Load Available Models"

### OpenRouter

- Get an API key from OpenRouter
- Select "OpenRouter" and enter your key
- Choose from available models
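All three backends are reached through an OpenAI-compatible chat interface (the project's LLM integration uses the OpenAI Python SDK; see Acknowledgments). The sketch below shows one way to point that SDK at each backend; the base URLs assume default local ports, and the model name reuses the example pulled above.

```python
# Minimal connection sketch (not the app's internal code).
import os
from openai import OpenAI

# Ollama exposes an OpenAI-compatible endpoint at /v1 by default.
ollama_client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

# vLLM's OpenAI-compatible server (adjust URL/key to your deployment).
vllm_client = OpenAI(base_url="http://localhost:8000/v1",
                     api_key=os.getenv("VLLM_API_KEY", "EMPTY"))

# OpenRouter uses the same chat-completions interface.
openrouter_client = OpenAI(base_url="https://openrouter.ai/api/v1",
                           api_key=os.getenv("OPENROUTER_API_KEY"))

response = ollama_client.chat.completions.create(
    model="gpt-oss:20b",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(response.choices[0].message.content)
```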
## Usage

- Enter a Question: Complex questions work best
- Set Iterations: Choose 3-10 iterations (or more!)
- Optional - Enable YOLO Mode:
  - Toggle "YOLO Mode" to run until convergence
  - Adjust the convergence threshold (0.80-0.99)
  - Iterations continue until consecutive responses are similar enough
- Start Loop: Click to begin the thinking process
- Watch Evolution: See reasoning improve in real time
- Explore Visualizations: Click the visualization buttons for insights
Click "โ๏ธ Prompts" to:
- Load Templates: Choose from epistemic approaches
- Default: Standard iterative reasoning
- Socratic Method: Question-driven inquiry
- Empirical-Scientific: Evidence-based analysis
- Dialectical Synthesis: Thesis-antithesis resolution
- Systems Thinking: Holistic interconnected analysis
- Iterative Refinement: Precision-focused improvement
- Edit Prompts: Customize system and reflection prompts
- Save Changes: Store your customizations
Click "๐ค Export" to generate reports:
- PDF Report: Professional document with:
- Colorful title page with research question
- Executive summary and key findings
- Visualization charts (token usage, convergence analysis)
- Detailed experiment results
- Complete appendix with all iterations
- Smart AI-generated filename based on question
- HTML Report: Web-based interactive report
## Project Structure

```
llm-reflection-lab/
├── app.py                        # Main Streamlit application
├── src/
│   ├── visualizations.py         # Visualization modules
│   ├── pdf_export.py             # PDF report generation
│   └── prompts.json              # Current active prompts (user customized)
├── templates/                    # Prompt template library
│   ├── default.json              # Standard reasoning template
│   ├── socratic-method.json
│   ├── empirical-scientific.json
│   ├── dialectical-synthesis.json
│   ├── systems-thinking.json
│   └── iterative-refinement.json
├── saves/                        # Auto-saved experiments
├── pyproject.toml                # Project dependencies (uv)
├── requirements.txt              # Project dependencies (pip)
└── README.md                     # This file
```
## How It Works

- Initial Response: Model answers the question
- Self-Reflection: Model reviews its previous answer
- Improvement: Model provides refined response
- Repeat: Process continues for N iterations (or until convergence in YOLO Mode)
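In code, the loop has roughly this shape. This is a sketch rather than the app's actual implementation: `client` is any OpenAI-compatible client, and the reflection wording is illustrative.

```python
# Sketch of the iterative self-reflection loop.
def thinking_loop(client, model, question, n_iterations=5):
    history = []
    messages = [{"role": "user", "content": question}]
    for _ in range(n_iterations):
        reply = client.chat.completions.create(model=model, messages=messages)
        answer = reply.choices[0].message.content
        history.append(answer)
        # Feed the previous answer back and ask the model to critique and refine it.
        messages = [{
            "role": "user",
            "content": (
                f"Question: {question}\n\n"
                f"Your previous answer:\n{answer}\n\n"
                "Reflect on gaps, assumptions, and errors in that answer, "
                "then provide an improved answer."
            ),
        }]
    return history
```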
### Reasoning Extraction

The system extracts reasoning through:

- Native `reasoning` fields (e.g., OpenAI o1 models)
- `<think>...</think>` tags in responses
- Configurable extraction patterns
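A minimal extraction sketch follows; the native `reasoning` attribute name and the fallback regex are assumptions based on the list above, not the app's exact patterns.

```python
import re

def extract_reasoning(message):
    """Split a model message into (reasoning, response).

    Prefers a native reasoning field when the API exposes one, otherwise
    falls back to <think>...</think> tags embedded in the content.
    """
    content = message.content or ""
    native = getattr(message, "reasoning", None)  # attribute name varies by API
    if native:
        return native, content
    match = re.search(r"<think>(.*?)</think>", content, flags=re.DOTALL)
    if match:
        reasoning = match.group(1).strip()
        response = re.sub(r"<think>.*?</think>", "", content, flags=re.DOTALL).strip()
        return reasoning, response
    return "", content
```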
### Observing Convergence

Through visualizations, you can observe:
- Convergence: Ideas stabilizing (high similarity)
- Divergence: Exploring new concepts (low similarity)
- Phase Transitions: Shifts between exploration/exploitation
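The Similarity Heatmap is essentially a pairwise similarity matrix over the iteration outputs. A minimal sketch, using Python's standard-library `difflib` as a stand-in for whatever similarity metric the app actually computes:

```python
from difflib import SequenceMatcher

def similarity_matrix(responses):
    """Pairwise text similarity between iteration responses (0.0-1.0)."""
    n = len(responses)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            matrix[i][j] = SequenceMatcher(None, responses[i], responses[j]).ratio()
    return matrix

# Blocks of high off-diagonal similarity indicate convergence clusters;
# sudden drops between consecutive iterations mark exploration phases.
```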
### YOLO Mode Details

When enabled, YOLO Mode provides:

- Automatic Stopping: Detects when consecutive iterations reach the similarity threshold
- Dynamic Duration: Runs as many iterations as needed (up to 100 for safety)
- Real-time Monitoring: Shows a convergence progress chart during execution
- Efficiency: Stops early when the model's responses stabilize
- Configurable Threshold: Adjust sensitivity from 80% to 99% similarity
- Comparison Modes:
  - "Reasoning + Response": Compare the full thought process
  - "Response Only": Focus on answer convergence
### A Typical Run

From a typical 10-iteration experiment:
- Iterations 1-2: Initial exploration
- Iterations 3-6: First convergence cluster
- Iteration 7: Divergence/pivot point
- Iterations 8-10: Final convergence
## Configuration

### Environment Variables

- `OPENROUTER_API_KEY`: Your OpenRouter API key
- `VLLM_API_KEY`: Your vLLM server API key
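A small sketch of how these might be read at runtime; the app's exact lookup may differ.

```python
import os

# Keys are expected in the environment; never hard-code them in source.
openrouter_key = os.getenv("OPENROUTER_API_KEY")
vllm_key = os.getenv("VLLM_API_KEY")

if openrouter_key is None:
    print("Set OPENROUTER_API_KEY before selecting the OpenRouter backend")
```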
### Prompts

Edit `prompts.json` or use the UI to modify:
- System prompts
- Reflection templates
- Reasoning extraction patterns
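For scripted edits, here is a sketch of loading and rewriting the active prompts file; the key name shown is hypothetical, so inspect your own `prompts.json` for the real schema.

```python
import json
from pathlib import Path

PROMPTS_PATH = Path("src/prompts.json")  # active prompts, per the project structure above

# Load, tweak, and write back. The key below is illustrative only.
prompts = json.loads(PROMPTS_PATH.read_text())
prompts["system_prompt"] = "You are a careful reasoner. Think step by step."
PROMPTS_PATH.write_text(json.dumps(prompts, indent=2))
```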
## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
```bash
# Clone your fork
git clone https://github.com/YOUR_USERNAME/llm-reflection-lab.git
cd llm-reflection-lab

# Install in development mode
uv sync --dev

# Create a feature branch
git checkout -b feature/your-feature
```

## License

This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments

- Built with Streamlit
- Visualizations powered by Plotly and PyVis
- LLM integration via OpenAI Python SDK
## Citation

If you use this tool in your research, please cite:

```bibtex
@software{llm_reflection_lab,
  title  = {LLM Reflection Lab},
  author = {Your Name},
  year   = {2024},
  url    = {https://github.com/chheplo/llm-reflection-lab}
}
```

Made with ❤️ for the AI research community