A multi-agent system that analyzes multiple documents on the same topic to identify discrepancies, contradictions, and inconsistencies.
- Multi-document analysis (3-5 documents)
- Cross-document reasoning and comparison
- Discrepancy classification (contradictions, inconsistencies, omissions)
- Alignment scoring (0-100 scale)
- Human-readable explanations
- REST API for web integration
- Streamlit web interface for easy document analysis
- Command-line interface (CLI)
- Checkpoint system for resumable processing
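To make the discrepancy categories and the 0-100 alignment scale concrete, here is a minimal sketch of how a result could be modeled. The names `DiscrepancyType`, `Discrepancy`, and `AlignmentReport` are hypothetical and are not taken from the project's code; the actual result objects may be shaped differently.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum


class DiscrepancyType(Enum):
    """The three categories named in the feature list."""
    CONTRADICTION = "contradiction"
    INCONSISTENCY = "inconsistency"
    OMISSION = "omission"


@dataclass(frozen=True)
class Discrepancy:
    """Hypothetical record for one detected issue."""
    kind: DiscrepancyType
    document_ids: tuple[str, ...]  # documents involved in the issue
    description: str


@dataclass
class AlignmentReport:
    """Hypothetical container for a score plus its supporting issues."""
    score: int
    discrepancies: list[Discrepancy] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Enforce the 0-100 scale stated in the feature list.
        if not 0 <= self.score <= 100:
            raise ValueError(f"score must be in 0-100, got {self.score}")

    def counts_by_type(self) -> dict[str, int]:
        """Return how many discrepancies fall into each category."""
        counts = Counter(d.kind.value for d in self.discrepancies)
        return {kind.value: counts.get(kind.value, 0) for kind in DiscrepancyType}
```

A report built this way rejects out-of-range scores at construction time, and `counts_by_type()` gives a quick per-category summary.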
# Install core dependencies
uv sync
# For API functionality, also install:
pip install -r requirements-api.txt

Start both the API server and Streamlit frontend:
# Start both services at once
./start_full_app.sh
# Or start them separately:
# Terminal 1 - Start API server
python -m src.doc_classifier.run_api
# Terminal 2 - Start Streamlit frontend
python src/doc_classifier/frontend/run_frontend.py

Then open your browser to http://localhost:8501 to use the web interface.
Features:
- Upload files or enter text directly
- View results with formatted output
- Track execution history
- Download results in multiple formats
- Monitor API status
See frontend/README.md for detailed frontend documentation.
Start the FastAPI server:
python src/run_api.py

Then use the API at http://localhost:8000:
import requests
# Process documents via API
response = requests.post("http://localhost:8000/process/content", json={
    "documents": [
        {"id": "doc1", "content": "Policy text 1..."},
        {"id": "doc2", "content": "Policy text 2..."},
        {"id": "doc3", "content": "Policy text 3..."}
    ]
})
result = response.json()
print(f"Alignment Score: {result['alignment_score']}/100")

API Documentation: See API_README.md for complete API documentation.
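For scripted use, it can help to validate the request before sending it and to fail loudly on HTTP errors. The sketch below wraps the same `/process/content` endpoint. It uses the standard library's `urllib` in place of `requests` so that it has no third-party dependency. The helper names `build_payload` and `analyze` are hypothetical, and the client-side 3-5 document check is an assumption based on the feature list, not documented API behavior.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/process/content"  # endpoint from the example above
MIN_DOCS, MAX_DOCS = 3, 5  # assumed from the "3-5 documents" feature note


def build_payload(texts: dict[str, str]) -> dict:
    """Turn {doc_id: content} into the request body used by the API example."""
    if not MIN_DOCS <= len(texts) <= MAX_DOCS:
        raise ValueError(
            f"expected {MIN_DOCS}-{MAX_DOCS} documents, got {len(texts)}"
        )
    return {
        "documents": [
            {"id": doc_id, "content": content}
            for doc_id, content in texts.items()
        ]
    }


def analyze(texts: dict[str, str], timeout: float = 60.0) -> dict:
    """POST the documents and return the decoded JSON response.

    urlopen raises urllib.error.HTTPError on 4xx/5xx responses, so a
    failed request surfaces as an exception rather than a bad result.
    """
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(texts)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the API server running, `analyze({"doc1": "...", "doc2": "...", "doc3": "..."})` would return the same JSON that the `requests` example reads `alignment_score` from.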
# Process documents from files
python -m src.doc_classifier process doc1.txt doc2.txt doc3.txt
# Get help
python -m src.doc_classifier --help

from src.doc_classifier import DocumentClassifier
# Initialize the classifier
classifier = DocumentClassifier()
# Process documents
result = classifier.process_documents([
    "document1.txt",
    "document2.txt",
    "document3.txt"
])
print(f"Alignment Score: {result.alignment_result.score}")
print(f"Explanation: {result.explanation}")

The system uses a multi-agent architecture with LangGraph orchestration:
- Ingestion Agent - Document reading and normalization
- Summarization Agent - Claim extraction
- Comparison Agent - Cross-document analysis
- Discrepancy Detection Agent - Issue classification
- Alignment Scoring Agent - Consistency scoring
- Explanation Generator - Human-readable output
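The flow through the six agents can be illustrated with a deliberately simplified sketch. It uses plain Python functions chained over a shared state dictionary as a stand-in for LangGraph orchestration; it is not the project's implementation. Every function name, state key, and the toy logic inside each stage (sentence splitting as "claim extraction", shared-claim ratio as the score) is a hypothetical placeholder.

```python
from typing import Callable

State = dict
Stage = Callable[[State], State]


def ingest(state: State) -> State:
    # Ingestion: normalize whitespace in each raw document.
    docs = {doc_id: " ".join(text.split()) for doc_id, text in state["raw"].items()}
    return {**state, "documents": docs}


def summarize(state: State) -> State:
    # Summarization (placeholder): treat each sentence as one claim.
    claims = {
        doc_id: {s.strip().lower() for s in text.split(".") if s.strip()}
        for doc_id, text in state["documents"].items()
    }
    return {**state, "claims": claims}


def compare(state: State) -> State:
    # Comparison: find claims that do not appear in every document.
    all_claims = set().union(*state["claims"].values())
    shared = set.intersection(*state["claims"].values())
    return {**state, "all_claims": all_claims, "unshared": sorted(all_claims - shared)}


def detect(state: State) -> State:
    # Discrepancy detection (placeholder): label each unshared claim an omission.
    issues = [{"type": "omission", "claim": claim} for claim in state["unshared"]]
    return {**state, "discrepancies": issues}


def score(state: State) -> State:
    # Alignment scoring (placeholder): percentage of claims shared by all documents.
    total = len(state["all_claims"])
    shared = total - len(state["unshared"])
    return {**state, "alignment_score": round(100 * shared / total) if total else 100}


def explain(state: State) -> State:
    # Explanation: produce a one-line human-readable summary.
    text = f"{len(state['discrepancies'])} discrepancies found; score {state['alignment_score']}/100"
    return {**state, "explanation": text}


PIPELINE: list[Stage] = [ingest, summarize, compare, detect, score, explain]


def run_pipeline(raw: dict[str, str]) -> State:
    """Run every stage in order, passing the accumulated state along."""
    if not raw:
        raise ValueError("at least one document is required")
    state: State = {"raw": raw}
    for stage in PIPELINE:
        state = stage(state)
    return state
```

Each stage only reads keys written by earlier stages and returns a new state, which is the property that makes it possible to persist the state between stages and resume from a checkpoint.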
# Install development dependencies
uv sync --dev
# Run tests
uv run pytest
# Run property-based tests
uv run pytest -k "property"
# Format code
uv run black .
uv run isort .
# Type checking
uv run mypy src/doc_classifier