A terminal-based, open-source, self-hosted personal AI chatbot that anyone can run on their own data. No GUI, no cloud, no tracking.
- Streaming Responses - Token-by-token streaming for real-time output (see the sketch after this list)
- Conversation Memory - Maintains context across your chat session
- Source Citations - See which documents were used to generate answers
- Rich Terminal UI - Beautiful colored output with progress indicators
- Multi-Model Support - Switch between Ollama models mid-conversation
- Many File Formats - Supports .txt, .md, .pdf, .docx, .html, .json, .csv, .epub
- Incremental Ingestion - Only re-processes changed files
- Chat History Export - Export conversations to JSON or Markdown
- CLI Arguments - Full command-line interface with flags
- Retry Logic - Automatic retry with exponential backoff
- Input Validation - Security checks on user input
- Configurable - YAML config with environment variable overrides
- Unit Tests - Comprehensive test suite with pytest
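Streaming is the feature most worth seeing in code. A minimal sketch of token-by-token output, assuming the `ollama` Python client (an assumption; the project may call Ollama's HTTP API directly):

```python
# Minimal streaming example using the `ollama` Python client
# (an assumption -- the project may call Ollama's HTTP API directly).
import ollama

stream = ollama.chat(
    model="llama3",
    messages=[{"role": "user", "content": "Summarize my notes on RAG."}],
    stream=True,
)
for chunk in stream:
    # Each chunk carries a small slice of the reply; print it as it arrives.
    print(chunk["message"]["content"], end="", flush=True)
print()
```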
1. Install Python dependencies:

   ```bash
   pip install -r requirements.txt
   ```

2. Install Ollama (https://ollama.com/) and pull a model:

   ```bash
   # Pull a model (default is llama3)
   ollama pull llama3
   ```

3. (Optional) Install additional document loaders:

   ```bash
   # For .docx support
   pip install docx2txt

   # For .epub support
   pip install ebooklib
   ```

4. Add your personal documents to the `data/` directory.

5. Ingest your data:

   ```bash
   python ingest.py
   ```

6. Start chatting:

   ```bash
   python chat.py
   ```
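Under the hood, `chat.py` runs a standard retrieval-augmented generation loop: embed the question, fetch the most similar chunks from the vector database, and pass them to the model as context. A minimal sketch, assuming `chromadb` (the README only says "vector database") with the documented default embedding model:

```python
# Sketch of the retrieve-then-generate loop. chromadb is an assumption;
# the embedding model is the documented default.
import chromadb
import ollama
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")
collection = chromadb.PersistentClient(path="db").get_or_create_collection("docs")

def answer(question: str, top_k: int = 3) -> str:
    # Embed the question and fetch the top_k most similar chunks.
    query_vec = embedder.encode(question).tolist()
    hits = collection.query(query_embeddings=[query_vec], n_results=top_k)
    context = "\n\n".join(hits["documents"][0])
    # Ask the model to answer from the retrieved context.
    reply = ollama.chat(
        model="llama3",
        messages=[{
            "role": "user",
            "content": f"Context:\n{context}\n\nQuestion: {question}",
        }],
    )
    return reply["message"]["content"]
```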
The application uses a `config.yaml` file for settings; a default configuration is created automatically on first run. You can customize:
- Data paths: Where to find documents and store the database
- LLM settings: Model, temperature, max tokens
- Retrieval settings: Chunk size, overlap, number of results
- Embedding model: Which sentence transformer to use
- Supported extensions: File types to process
You can override any config setting using environment variables prefixed with `AI_`:

```bash
# Use a different model
AI_DEFAULT_MODEL=llama2 python chat.py

# Custom chunk size
AI_CHUNK_SIZE=1000 python ingest.py

# Multiple overrides
AI_DEFAULT_MODEL=mistral AI_TEMPERATURE=0.3 python chat.py
```
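The override mechanism is straightforward: an `AI_`-prefixed variable replaces the matching lowercased key from `config.yaml`. A sketch of such a loader (`load_settings` is a hypothetical helper, not the project's actual API):

```python
# Sketch: YAML settings with AI_-prefixed environment overrides.
# load_settings is a hypothetical helper, not the project's actual API.
import os
import yaml

def load_settings(path: str = "config.yaml") -> dict:
    with open(path) as f:
        settings = yaml.safe_load(f) or {}
    for key, value in os.environ.items():
        if key.startswith("AI_"):
            # AI_DEFAULT_MODEL -> default_model (values stay strings here;
            # a real loader would coerce numbers like AI_CHUNK_SIZE)
            settings[key[3:].lower()] = value
    return settings
```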
Usage: `python chat.py [OPTIONS]`

Options:

```
-m, --model MODEL   Ollama model to use (overrides config)
-c, --config FILE   Path to configuration file
-v, --verbose       Enable verbose output
--debug             Enable debug mode with full stack traces
--clear-db          Clear the vector database and exit
--no-stream         Disable streaming responses
--no-sources        Disable source citation display
```

Usage: `python ingest.py [OPTIONS]`
Options:

```
-f, --force    Force re-ingestion of all files (ignore hashes)
-v, --verbose  Enable verbose output
--debug        Enable debug mode with full stack traces
```

During a chat session, you can use these commands (a dispatch sketch follows the table):
| Command | Description |
|---|---|
| `/model <name>` | Switch to a different Ollama model |
| `/models` | List suggested models |
| `/export [json\|md]` | Export chat history to file |
| `/clear` | Clear conversation history |
| `/sources` | Toggle source citation display |
| `/help` | Show available commands |
| `exit`, `quit`, `q` | Exit the chat |
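Behind these commands is presumably a small dispatch loop; a sketch under that assumption (names are illustrative, and only a few commands are shown):

```python
# Sketch of the in-chat command dispatch; Session is a stand-in for
# whatever state object chat.py actually keeps.
from dataclasses import dataclass, field

@dataclass
class Session:
    model: str = "llama3"
    show_sources: bool = True
    history: list = field(default_factory=list)

def handle_command(line: str, session: Session) -> bool:
    """Return True if `line` was a command (and was handled)."""
    if line in ("exit", "quit", "q"):
        raise SystemExit
    if line.startswith("/model "):
        session.model = line.split(maxsplit=1)[1]
    elif line == "/clear":
        session.history.clear()
    elif line == "/sources":
        session.show_sources = not session.show_sources
    elif line.startswith("/"):
        print("Unknown command; try /help")
    else:
        return False  # plain chat input, not a command
    return True
```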
| Setting | Default | Description |
|---|---|---|
| `data_path` | `"data"` | Directory containing documents |
| `db_path` | `"db"` | Vector database storage location |
| `default_model` | `"llama3"` | Ollama model to use |
| `max_tokens` | `512` | Maximum response length |
| `temperature` | `0.1` | Response creativity (0.0-1.0) |
| `top_k` | `3` | Number of document chunks to retrieve |
| `chunk_size` | `500` | Text chunk size in characters |
| `chunk_overlap` | `50` | Overlap between chunks (characters) |
| `embedding_model` | `"all-MiniLM-L6-v2"` | Embedding model name |
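The two chunking settings interact simply: each chunk starts `chunk_size - chunk_overlap` characters after the previous one, so with the defaults consecutive chunks share 50 characters of context. A sketch of that sliding-window split (illustrative, not the project's exact splitter):

```python
# Sketch of sliding-window chunking with the documented defaults.
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    step = chunk_size - chunk_overlap  # consecutive chunks share chunk_overlap chars
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - chunk_overlap, 1), step)]
```

For example, a 1,200-character document yields chunks starting at offsets 0, 450, and 900.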
| Extension | Format | Notes |
|---|---|---|
| `.txt` | Plain text | Basic text files |
| `.md` | Markdown | Markdown-formatted files |
| `.pdf` | Portable Document Format | |
| `.docx` | Word | Requires `docx2txt` package |
| `.html` | HTML | Web pages |
| `.json` | JSON | Structured data |
| `.csv` | CSV | Comma-separated values |
| `.epub` | EPUB | Ebooks; requires `ebooklib` |
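Ingestion presumably dispatches on file extension; a sketch of that mapping, covering only a few formats (`docx2txt.process` is the optional package's real API; loaders for `.pdf` and `.epub` are omitted for brevity):

```python
# Sketch: choose a loader by extension. Loaders for .pdf and .epub
# are omitted here; docx2txt is the optional package named above.
from pathlib import Path

def load_document(path: Path) -> str:
    ext = path.suffix.lower()
    if ext in {".txt", ".md", ".html", ".json", ".csv"}:
        return path.read_text(encoding="utf-8", errors="ignore")
    if ext == ".docx":
        import docx2txt  # optional: pip install docx2txt
        return docx2txt.process(str(path))
    raise ValueError(f"Unsupported extension: {ext}")
```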
```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=. --cov-report=html

# Run a specific test file
pytest tests/test_config.py -v
```
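A test in that suite might look like this (illustrative; it assumes the hypothetical `load_settings` helper sketched earlier is importable, and uses pytest's built-in `monkeypatch` and `tmp_path` fixtures):

```python
# Illustrative test for the env-override behavior; assumes the
# hypothetical load_settings sketched above is importable.
def test_env_override(monkeypatch, tmp_path):
    cfg = tmp_path / "config.yaml"
    cfg.write_text("default_model: llama3\n")
    monkeypatch.setenv("AI_DEFAULT_MODEL", "mistral")
    settings = load_settings(str(cfg))
    assert settings["default_model"] == "mistral"
```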
"Vector database not found"
- Run `python ingest.py` first to create the database

"Failed to connect to Ollama"
- Ensure Ollama is installed: `ollama --version`
- Start the Ollama service: `ollama serve`
- Pull the required model: `ollama pull llama3`

"No supported files found"
- Check that files have supported extensions
- Ensure files are in the `data/` directory

Import errors
- Install dependencies: `pip install -r requirements.txt`

Debug mode
- Run with the `--debug` flag for full stack traces: `python chat.py --debug`
Conversations are exported to the `chat_history/` directory:

```
# Export as JSON
/export json

# Export as Markdown
/export md
```
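A sketch of what `/export` plausibly does (file naming and the history format are assumptions):

```python
# Sketch of /export: dump the in-memory history to chat_history/.
# File naming and the history format are assumptions.
import json
import time
from pathlib import Path

def export_history(history: list[dict], fmt: str = "json") -> Path:
    out_dir = Path("chat_history")
    out_dir.mkdir(exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    if fmt == "json":
        out = out_dir / f"chat-{stamp}.json"
        out.write_text(json.dumps(history, indent=2))
    else:
        out = out_dir / f"chat-{stamp}.md"
        out.write_text("\n\n".join(f"**{m['role']}**: {m['content']}" for m in history))
    return out
```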
The system tracks file changes using SHA256 hashes stored in `file_hashes.json`. Only modified or new files are re-ingested:

```bash
# Normal run (incremental)
python ingest.py

# Force full re-ingestion
python ingest.py --force
```
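A sketch of that hash check (the layout of `file_hashes.json` is an assumption):

```python
# Sketch of the incremental check: re-ingest only files whose SHA256
# differs from the stored hash. The JSON layout is an assumption.
import hashlib
import json
from pathlib import Path

HASH_FILE = Path("file_hashes.json")

def changed_files(data_dir: str = "data") -> list[Path]:
    old = json.loads(HASH_FILE.read_text()) if HASH_FILE.exists() else {}
    new, to_ingest = {}, []
    for path in sorted(p for p in Path(data_dir).rglob("*") if p.is_file()):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        new[str(path)] = digest
        if old.get(str(path)) != digest:
            to_ingest.append(path)  # new or modified since the last run
    HASH_FILE.write_text(json.dumps(new, indent=2))
    return to_ingest
```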
The application logs detailed information to help with troubleshooting; use the `--verbose` or `--debug` flags for more output.

MIT