A terminal-based, open-source, self-hosted personal AI chatbot that anyone can run on their own data. No GUI, no cloud, no tracking.
- Streaming Responses - Token-by-token streaming for real-time output (see the sketch after this list)
- Conversation Memory - Maintains context across your chat session
- Source Citations - See which documents were used to generate answers
- Rich Terminal UI - Beautiful colored output with progress indicators
- Multi-Model Support - Switch between Ollama models mid-conversation
- Many File Formats - Supports .txt, .md, .pdf, .docx, .html, .json, .csv, .epub
- Incremental Ingestion - Only re-processes changed files
- Chat History Export - Export conversations to JSON or Markdown
- CLI Arguments - Full command-line interface with flags
- Retry Logic - Automatic retry with exponential backoff
- Input Validation - Security checks on user input
- Configurable - YAML config with environment variable overrides
- Unit Tests - Comprehensive test suite with pytest
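Streaming is the feature most worth seeing in code. A minimal sketch of token-by-token output, assuming the `ollama` Python client (an assumption; the project may call Ollama's HTTP API directly):

```python
# Minimal streaming example using the `ollama` Python client
# (an assumption -- the project may call Ollama's HTTP API directly).
import ollama

stream = ollama.chat(
    model="llama3",
    messages=[{"role": "user", "content": "Summarize my notes on RAG."}],
    stream=True,
)
for chunk in stream:
    # Each chunk carries a small slice of the reply; print it as it arrives.
    print(chunk["message"]["content"], end="", flush=True)
print()
```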
1. Install Python dependencies:

   ```bash
   pip install -r requirements.txt
   ```

2. Install Ollama (https://ollama.com/) and pull a model:

   ```bash
   # Pull a model (default is llama3)
   ollama pull llama3
   ```

3. (Optional) Install additional document loaders:

   ```bash
   # For .docx support
   pip install docx2txt

   # For .epub support
   pip install ebooklib
   ```

4. Add your personal documents to the `data/` directory.

5. Ingest your data:

   ```bash
   python ingest.py
   ```

6. Start chatting:

   ```bash
   python chat.py
   ```
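Under the hood, `chat.py` runs a standard retrieval-augmented generation loop: embed the question, fetch the most similar chunks from the vector database, and pass them to the model as context. A minimal sketch, assuming `chromadb` (the README only says "vector database") with the documented default embedding model:

```python
# Sketch of the retrieve-then-generate loop. chromadb is an assumption;
# the embedding model is the documented default.
import chromadb
import ollama
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")
collection = chromadb.PersistentClient(path="db").get_or_create_collection("docs")

def answer(question: str, top_k: int = 3) -> str:
    # Embed the question and fetch the top_k most similar chunks.
    query_vec = embedder.encode(question).tolist()
    hits = collection.query(query_embeddings=[query_vec], n_results=top_k)
    context = "\n\n".join(hits["documents"][0])
    # Ask the model to answer from the retrieved context.
    reply = ollama.chat(
        model="llama3",
        messages=[{
            "role": "user",
            "content": f"Context:\n{context}\n\nQuestion: {question}",
        }],
    )
    return reply["message"]["content"]
```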
The application uses a `config.yaml` file for settings; a default configuration is created automatically on first run. You can customize:
- Data paths: Where to find documents and store the database
- LLM settings: Model, temperature, max tokens
- Retrieval settings: Chunk size, overlap, number of results
- Embedding model: Which sentence transformer to use
- Supported extensions: File types to process
You can override any config setting using environment variables prefixed with `AI_`:

```bash
# Use a different model
AI_DEFAULT_MODEL=llama2 python chat.py

# Custom chunk size
AI_CHUNK_SIZE=1000 python ingest.py

# Multiple overrides
AI_DEFAULT_MODEL=mistral AI_TEMPERATURE=0.3 python chat.py
```
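The override mechanism is straightforward: an `AI_`-prefixed variable replaces the matching lowercased key from `config.yaml`. A sketch of such a loader (`load_settings` is a hypothetical helper, not the project's actual API):

```python
# Sketch: YAML settings with AI_-prefixed environment overrides.
# load_settings is a hypothetical helper, not the project's actual API.
import os
import yaml

def load_settings(path: str = "config.yaml") -> dict:
    with open(path) as f:
        settings = yaml.safe_load(f) or {}
    for key, value in os.environ.items():
        if key.startswith("AI_"):
            # AI_DEFAULT_MODEL -> default_model (values stay strings here;
            # a real loader would coerce numbers like AI_CHUNK_SIZE)
            settings[key[3:].lower()] = value
    return settings
```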
Usage: `python chat.py [OPTIONS]`

Options:

```
-m, --model MODEL   Ollama model to use (overrides config)
-c, --config FILE   Path to configuration file
-v, --verbose       Enable verbose output
--debug             Enable debug mode with full stack traces
--clear-db          Clear the vector database and exit
--no-stream         Disable streaming responses
--no-sources        Disable source citation display
```

Usage: `python ingest.py [OPTIONS]`
Options:

```
-f, --force    Force re-ingestion of all files (ignore hashes)
-v, --verbose  Enable verbose output
--debug        Enable debug mode with full stack traces
```

During a chat session, you can use these commands (a dispatch sketch follows the table):
| Command | Description |
|---|---|
| `/model <name>` | Switch to a different Ollama model |
| `/models` | List suggested models |
| `/export [json\|md]` | Export chat history to file |
| `/clear` | Clear conversation history |
| `/sources` | Toggle source citation display |
| `/help` | Show available commands |
| `exit`, `quit`, `q` | Exit the chat |
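Behind these commands is presumably a small dispatch loop; a sketch under that assumption (names are illustrative, and only a few commands are shown):

```python
# Sketch of the in-chat command dispatch; Session is a stand-in for
# whatever state object chat.py actually keeps.
from dataclasses import dataclass, field

@dataclass
class Session:
    model: str = "llama3"
    show_sources: bool = True
    history: list = field(default_factory=list)

def handle_command(line: str, session: Session) -> bool:
    """Return True if `line` was a command (and was handled)."""
    if line in ("exit", "quit", "q"):
        raise SystemExit
    if line.startswith("/model "):
        session.model = line.split(maxsplit=1)[1]
    elif line == "/clear":
        session.history.clear()
    elif line == "/sources":
        session.show_sources = not session.show_sources
    elif line.startswith("/"):
        print("Unknown command; try /help")
    else:
        return False  # plain chat input, not a command
    return True
```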
| Setting | Default | Description |
|---|---|---|
| `data_path` | `"data"` | Directory containing documents |
| `db_path` | `"db"` | Vector database storage location |
| `default_model` | `"llama3"` | Ollama model to use |
| `max_tokens` | `512` | Maximum response length |
| `temperature` | `0.1` | Response creativity (0.0-1.0) |
| `top_k` | `3` | Number of document chunks to retrieve |
| `chunk_size` | `500` | Text chunk size in characters |
| `chunk_overlap` | `50` | Overlap between chunks (characters) |
| `embedding_model` | `"all-MiniLM-L6-v2"` | Embedding model name |
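The two chunking settings interact simply: each chunk starts `chunk_size - chunk_overlap` characters after the previous one, so with the defaults consecutive chunks share 50 characters of context. A sketch of that sliding-window split (illustrative, not the project's exact splitter):

```python
# Sketch of sliding-window chunking with the documented defaults.
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    step = chunk_size - chunk_overlap  # consecutive chunks share chunk_overlap chars
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - chunk_overlap, 1), step)]
```

For example, a 1,200-character document yields chunks starting at offsets 0, 450, and 900.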
| Extension | Format | Notes |
|---|---|---|
| `.txt` | Plain text | Basic text files |
| `.md` | Markdown | Markdown-formatted files |
| `.pdf` | Portable Document Format | |
| `.docx` | Word | Requires `docx2txt` package |
| `.html` | HTML | Web pages |
| `.json` | JSON | Structured data |
| `.csv` | CSV | Comma-separated values |
| `.epub` | EPUB | Ebooks; requires `ebooklib` |
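Ingestion presumably dispatches on file extension; a sketch of that mapping, covering only a few formats (`docx2txt.process` is the optional package's real API; loaders for `.pdf` and `.epub` are omitted for brevity):

```python
# Sketch: choose a loader by extension. Loaders for .pdf and .epub
# are omitted here; docx2txt is the optional package named above.
from pathlib import Path

def load_document(path: Path) -> str:
    ext = path.suffix.lower()
    if ext in {".txt", ".md", ".html", ".json", ".csv"}:
        return path.read_text(encoding="utf-8", errors="ignore")
    if ext == ".docx":
        import docx2txt  # optional: pip install docx2txt
        return docx2txt.process(str(path))
    raise ValueError(f"Unsupported extension: {ext}")
```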
```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=. --cov-report=html

# Run a specific test file
pytest tests/test_config.py -v
```
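A test in that suite might look like this (illustrative; it assumes the hypothetical `load_settings` helper sketched earlier is importable, and uses pytest's built-in `monkeypatch` and `tmp_path` fixtures):

```python
# Illustrative test for the env-override behavior; assumes the
# hypothetical load_settings sketched above is importable.
def test_env_override(monkeypatch, tmp_path):
    cfg = tmp_path / "config.yaml"
    cfg.write_text("default_model: llama3\n")
    monkeypatch.setenv("AI_DEFAULT_MODEL", "mistral")
    settings = load_settings(str(cfg))
    assert settings["default_model"] == "mistral"
```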
"Vector database not found"
- Run `python ingest.py` first to create the database

"Failed to connect to Ollama"
- Ensure Ollama is installed: `ollama --version`
- Start the Ollama service: `ollama serve`
- Pull the required model: `ollama pull llama3`

"No supported files found"
- Check that files have supported extensions
- Ensure files are in the `data/` directory

Import errors
- Install dependencies: `pip install -r requirements.txt`

Debug mode
- Run with the `--debug` flag for full stack traces: `python chat.py --debug`
Conversations are exported to the `chat_history/` directory:

```
# Export as JSON
/export json

# Export as Markdown
/export md
```
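A sketch of what `/export` plausibly does (file naming and the history format are assumptions):

```python
# Sketch of /export: dump the in-memory history to chat_history/.
# File naming and the history format are assumptions.
import json
import time
from pathlib import Path

def export_history(history: list[dict], fmt: str = "json") -> Path:
    out_dir = Path("chat_history")
    out_dir.mkdir(exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    if fmt == "json":
        out = out_dir / f"chat-{stamp}.json"
        out.write_text(json.dumps(history, indent=2))
    else:
        out = out_dir / f"chat-{stamp}.md"
        out.write_text("\n\n".join(f"**{m['role']}**: {m['content']}" for m in history))
    return out
```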
The system tracks file changes using SHA256 hashes stored in `file_hashes.json`. Only modified or new files are re-ingested:

```bash
# Normal run (incremental)
python ingest.py

# Force full re-ingestion
python ingest.py --force
```
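A sketch of that hash check (the layout of `file_hashes.json` is an assumption):

```python
# Sketch of the incremental check: re-ingest only files whose SHA256
# differs from the stored hash. The JSON layout is an assumption.
import hashlib
import json
from pathlib import Path

HASH_FILE = Path("file_hashes.json")

def changed_files(data_dir: str = "data") -> list[Path]:
    old = json.loads(HASH_FILE.read_text()) if HASH_FILE.exists() else {}
    new, to_ingest = {}, []
    for path in sorted(p for p in Path(data_dir).rglob("*") if p.is_file()):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        new[str(path)] = digest
        if old.get(str(path)) != digest:
            to_ingest.append(path)  # new or modified since the last run
    HASH_FILE.write_text(json.dumps(new, indent=2))
    return to_ingest
```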
The application logs detailed information to help with troubleshooting; use the `--verbose` or `--debug` flags for more output.

MIT