A comprehensive financial analysis platform combining Multi-Agent RAG (Retrieval-Augmented Generation) for earnings report analysis with machine learning evaluation frameworks for stock price prediction.
Academic Project - CS7180: Special Topics in Generative AI, Northeastern University
This repository contains two integrated systems for automated financial analysis:
1. Multi-Agent RAG Earnings Analyzer (earnings-main/)
   - Processes earnings PDFs using AI agents
   - Extracts financial metrics with verification
   - Compares results to analyst estimates
   - Generates comprehensive analysis reports
2. Evaluation Framework (eval-main/)
   - Historical S&P 500 earnings & price data (1980-2025)
   - Benchmark datasets for model comparison
   - Gradient boost baseline model
   - Performance metrics (direction & regression)
- ✅ Multi-Agent Architecture: Research Agent + Verification Agent with LangGraph orchestration
- ✅ Hybrid Retrieval: BM25 (keyword) + Vector Search (semantic) with ChromaDB
- ✅ Table Preservation: Docling PDF processing maintains financial table structures
- ✅ Unit Normalization: Handles millions vs billions conversions automatically
- ✅ GAAP Classification: Distinguishes GAAP vs Non-GAAP figures
- ✅ Automated Verification: Cross-checks extracted metrics against source documents
- ✅ Market Data Integration: Real-time analyst estimates from Yahoo Finance
- ✅ Historical Data: 45 years of S&P 500 earnings and stock prices
- ✅ Multiple Test Sets: Symbol-based, time-based, and random sampling configurations
- ✅ Baseline Models: Gradient boost implementation for comparison
- ✅ Comprehensive Metrics: Direction accuracy and regression performance
- ✅ Reproducible: Standardized datasets and evaluation pipeline
- Python 3.9+
- OpenAI API key (GPT-5 access)
- ~5GB disk space for dependencies
- 8GB+ RAM recommended
cd earnings-main
# Setup environment
python3 -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
# Configure API key
cp .env_example .env
# Edit .env and add: OPENAI_API_KEY="sk-..."
# Run application
streamlit run app.py

Open http://localhost:8501 to use the web interface.
cd eval-main
# Evaluate predictions
python evaluate.py
# Train gradient boost model
python gradient_boost.py
# Regenerate datasets
python dataset.py

fs/
├── README.md                    # This file
├── genai_project_report.pdf     # Project requirements & design
│
├── earnings-main/               # Multi-Agent RAG Earnings Analyzer
│   ├── app.py                   # Streamlit web application
│   ├── agents/                  # AI agents (Research, Verification, Workflow)
│   ├── config/                  # Settings and constants
│   ├── document_processor/      # Docling PDF processing
│   ├── retriever/               # Hybrid BM25 + Vector retrieval
│   ├── tools/                   # Market data & calculation tools
│   ├── test_imports.py          # Diagnostic testing
│   ├── CLAUDE.md                # Developer guide for Claude Code
│   ├── MIGRATION_SUMMARY.md     # Technical migration docs
│   └── README.md                # Component documentation
│
└── eval-main/                   # Evaluation Framework
    ├── data/
    │   ├── combined_data.csv    # Historical earnings & prices
    │   ├── test/                # Test datasets (5 configurations)
    │   ├── predictions/         # Baseline predictions
    │   └── evaluation/          # Metrics results
    ├── dataset.py               # Dataset generation
    ├── evaluate.py              # Evaluation pipeline
    ├── gradient_boost.py        # ML baseline model
    └── README.md                # Evaluation docs
User Uploads PDFs
        ↓
Document Processing (Docling)
├── Markdown conversion
└── Table structure preservation
        ↓
Chunking (1500 chars, 200 overlap)
        ↓
Hybrid Retriever
├── BM25 (40%): Keyword matching
└── Vector Search (60%): Semantic similarity
        ↓
Research Agent (GPT-5)
└── Extract financial metrics
        ↓
Market Data Tools (yfinance)
└── Fetch analyst estimates
        ↓
Verification Agent (GPT-5)
└── Cross-check accuracy
        ↓
Conditional Routing
├── If verified → Generate Report
└── If issues → Re-extract metrics
        ↓
Final Report Display
├── Earnings analysis tables
├── Verification report
└── Download options
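The 40/60 weighting between BM25 and vector search shown above amounts to a weighted score fusion. As an illustration only (the project's actual retriever code is not shown here, and the function and variable names below are hypothetical), the combination step could look like:

```python
# Hypothetical sketch of 40/60 hybrid score fusion, not the project's actual code.
def hybrid_rank(bm25_scores, vector_scores, w_bm25=0.4, w_vector=0.6):
    """Combine per-document scores from two retrievers.

    Each input maps doc_id -> raw score. Scores are min-max normalized
    first so the two retrievers are on a comparable scale, then summed
    with the 40/60 weights. Returns doc_ids ranked best-first.
    """
    def normalize(scores):
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0  # avoid division by zero for a single doc
        return {d: (s - lo) / span for d, s in scores.items()}

    nb, nv = normalize(bm25_scores), normalize(vector_scores)
    combined = {d: w_bm25 * nb.get(d, 0.0) + w_vector * nv.get(d, 0.0)
                for d in set(nb) | set(nv)}
    return sorted(combined, key=combined.get, reverse=True)


# A chunk that both retrievers like ("b") should outrank keyword-only hits.
ranked = hybrid_rank({"a": 2.0, "b": 1.0}, {"b": 0.9, "c": 0.1})
```

In practice LangChain's EnsembleRetriever performs a similar fusion given `weights=[0.4, 0.6]`; the sketch just makes the weighting explicit.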
AI & ML:
- OpenAI GPT-5 (gpt-5) - Metric extraction & verification
- OpenAI text-embedding-3-small - Vector embeddings
- LangChain - RAG framework
- LangGraph - Workflow orchestration
- ChromaDB - Vector database
- Scikit-learn - Gradient boost models
Document Processing:
- Docling - PDF → Markdown with table preservation
- pypdf - PDF utilities
Data & Analysis:
- yfinance - Market data & analyst estimates
- pandas - Data manipulation
- numpy - Numerical computing
Web Interface:
- Streamlit - Interactive UI
- by_symbol_random_10.csv
  - Random 10 records per stock
  - Stocks: NVDA, GOOGL, AMZN, AAPL, MSFT, META, TSLA
  - Use case: Per-stock performance analysis
- by_symbol_time_10.csv
  - Latest 10 records per stock
  - Same tech stocks as above
  - Use case: Recent trend analysis
- by_random_100.csv
  - Random sample of 100 earnings events
  - Use case: Quick baseline testing
- by_time_100.csv
  - Latest 100 earnings events
  - Use case: Current market conditions
- by_random_1000.csv
  - Random sample of 1000 events
  - Use case: Comprehensive evaluation
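The symbol-based test sets above are straightforward to reproduce with pandas group-wise sampling. This is an illustrative recipe, not the project's dataset.py; the column names ("symbol", "date") and the toy DataFrame are assumptions:

```python
import pandas as pd

# Toy stand-in for combined_data.csv: 12 quarterly earnings events per stock.
df = pd.DataFrame({
    "symbol": ["NVDA"] * 12 + ["AAPL"] * 12,
    "date": list(pd.date_range("2022-01-01", periods=12, freq="QS")) * 2,
    "surprise_pct": range(24),
})

# by_symbol_random_10 style: random 10 records per stock (seeded for reproducibility)
by_symbol_random = df.groupby("symbol").sample(n=10, random_state=42)

# by_symbol_time_10 style: latest 10 records per stock
by_symbol_time = df.sort_values("date").groupby("symbol").tail(10)
```

The random variants measure average-case performance, while the time-based variants probe whether a model trained on history holds up under current market conditions.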
Direction Metrics:
- Accuracy of price movement direction prediction
- Precision, recall, F1-score
Regression Metrics:
- Mean Absolute Error (MAE)
- Root Mean Squared Error (RMSE)
- R² Score
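Both metric families can be computed with a few lines of NumPy. A minimal sketch (the function names are illustrative, not evaluate.py's actual API):

```python
import numpy as np

def direction_accuracy(y_true, y_pred):
    # Fraction of events where the predicted sign of the price move
    # matches the realized sign.
    return float(np.mean(np.sign(y_true) == np.sign(y_pred)))

def regression_metrics(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mae = float(np.mean(np.abs(err)))            # Mean Absolute Error
    rmse = float(np.sqrt(np.mean(err ** 2)))     # Root Mean Squared Error
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot                   # R² score
    return {"MAE": mae, "RMSE": rmse, "R2": r2}
```

Direction accuracy is the headline number for trading relevance; the regression metrics quantify how far off the predicted move magnitudes are.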
Edit earnings-main/config/settings.py:
GPT5_MODEL = "gpt-5" # AI model
EMBEDDING_MODEL = "text-embedding-3-small"
CHUNK_SIZE = 1500 # Optimized for tables
CHUNK_OVERLAP = 200
RETRIEVAL_K = 20 # Chunks to retrieve
RESEARCH_AGENT_MAX_TOKENS = 2500
VERIFICATION_AGENT_MAX_TOKENS = 1500

Edit earnings-main/config/constants.py:
FINANCIAL_METRICS = [
"EPS", "Revenue", "Operating Income",
"Net Income", "Gross Margin", "Operating Margin",
"Free Cash Flow"
]
UNIT_MULTIPLIERS = {"M": 1, "B": 1000, "K": 0.001}
GAAP_TYPES = ["GAAP", "Non-GAAP", "Adjusted"]

- Navigate to http://localhost:8501
- Enter company information:
  - Ticker: NVDA
  - Name: NVIDIA Corporation
- Upload PDF files:
  - Earnings press release
  - Earnings presentation
- Click "Analyze Earnings"
- Review results:
  - Earnings call summary table
  - Financial metrics comparison
  - Verification report
- Download the Markdown report
cd eval-main
# Evaluate all test datasets
python evaluate.py
# Results saved to data/evaluation/
# - direction_metrics_*.csv
# - regression_metrics_*.csv

cd eval-main
# Train gradient boost model
python gradient_boost.py
# Predictions saved to data/predictions/

The system handles mixed units (millions vs billions) in financial reports:
# tools/calculation_tools.py
@tool
def calculate_surprise_percentage(reported, expected,
                                  reported_unit="M",
                                  expected_unit="M"):
    # Normalize both to millions first
    reported_m = normalize_to_millions(reported, reported_unit)
    expected_m = normalize_to_millions(expected, expected_unit)
    surprise = ((reported_m - expected_m) / abs(expected_m)) * 100
    return round(surprise, 2)

Why critical: Prevents calculation errors when reported values are in billions but estimates are in millions.
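The snippet calls a normalize_to_millions helper that is not shown. A minimal sketch, assuming it applies the UNIT_MULTIPLIERS table from config/constants.py (the implementation details are a guess):

```python
UNIT_MULTIPLIERS = {"M": 1, "B": 1000, "K": 0.001}  # from config/constants.py

def normalize_to_millions(value: float, unit: str) -> float:
    """Convert a value expressed in K/M/B into millions.

    e.g. 4.13 with unit "B" -> 4130.0 (millions).
    """
    try:
        return value * UNIT_MULTIPLIERS[unit.upper()]
    except KeyError:
        raise ValueError(f"Unknown unit: {unit!r}")
```

With this in place, a reported revenue of 4.13B against a 3900M estimate yields a positive surprise rather than the nonsense result of comparing 4.13 to 3900 directly.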
LangGraph workflow includes verification with automatic retry:
# agents/financial_workflow.py
workflow.add_conditional_edges(
    "verify_metrics",
    self._decide_after_verification,
    {
        "reextract": "extract_metrics",   # Retry if verification fails
        "generate": "generate_report",    # Continue if verified
    },
)

Maximum 1 retry to prevent infinite loops.
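The routing function itself is not shown. A sketch of what a decision function with the single-retry cap could look like (the state keys verification_passed and retry_count are assumptions, not the project's actual schema):

```python
MAX_RETRIES = 1  # matches the "maximum 1 retry" rule above

def decide_after_verification(state: dict) -> str:
    """Return the next LangGraph edge label based on verification outcome.

    - verified           -> "generate" (proceed to the report)
    - failed, can retry  -> "reextract" (loop back to extraction once)
    - failed, retries up -> "generate" (report anyway, with caveats)
    """
    if state.get("verification_passed"):
        return "generate"
    if state.get("retry_count", 0) >= MAX_RETRIES:
        return "generate"
    return "reextract"
```

Capping retries this way guarantees the graph terminates even when the Verification Agent never approves the extraction.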
SHA-256 based caching with 7-day expiration:
# document_processor/financial_document_processor.py
cache_key = hashlib.sha256(pdf_bytes).hexdigest()
cache_path = CACHE_DIR / f"{cache_key}.md"
if cache_path.exists() and not _is_cache_expired(cache_path):
    return _load_from_cache(cache_path)

Avoids re-processing identical PDFs.
"Could not import rank_bm25"
cd earnings-main
source venv/bin/activate
pip install rank-bm25

"OPENAI_API_KEY not found"
cd earnings-main
ls -la .env # Check file exists
# Edit .env and add: OPENAI_API_KEY="sk-..."

Empty GPT-5 responses
- Increase token limits in config/settings.py
- Current: 2500 (research), 1500 (verification)
Table extraction issues
- Verify CHUNK_SIZE = 1500 in settings
- Check Docling has do_table_structure=True
Missing data files
cd eval-main
python dataset.py # Regenerate datasets

Import errors
pip install pandas numpy scikit-learn

- README.md (this file) - Project overview
- genai_project_report.pdf - Original requirements & design
- earnings-main/README.md - Earnings analyzer guide
- earnings-main/CLAUDE.md - Developer guide for Claude Code
- earnings-main/MIGRATION_SUMMARY.md - Migration details
- eval-main/README.md - Evaluation framework guide
- earnings-main/test_imports.py - Component initialization tests
- Ziqi Shao - ML method development
- Zhixiao Wu - Method refinement, dataset collection, evaluation
- Mingze Yuan - LLM method development
This project was developed as part of the CS7180: Special Topics in Generative AI course at Northeastern University.
This AI-powered financial analysis tool is for informational and educational purposes only. It should not be considered financial advice. Always conduct your own research and consult with qualified financial professionals before making investment decisions.
- ✅ Multi-Agent RAG System: Successfully migrated from Claude Files API to a comprehensive RAG architecture
- ✅ Table Preservation: Docling maintains financial table integrity during processing
- ✅ Automated Verification: Reduces manual checking with AI-powered cross-validation
- ✅ Unit Normalization: Prevents calculation errors in earnings surprise calculations
- ✅ Comprehensive Evaluation: 45 years of S&P 500 data for rigorous testing
- ✅ Baseline Comparison: Gradient boost model for performance benchmarking
- ✅ Production Ready: Streamlit interface with caching and error handling
- ✅ Well Documented: CLAUDE.md for future development, extensive README files
- Repository: https://github.com/Snorman-zzz/interview-prep-finsight
- OpenAI GPT-5: https://openai.com/
- LangChain: https://python.langchain.com/
- LangGraph: https://langchain-ai.github.io/langgraph/
- Docling: https://github.com/DS4SD/docling
- Streamlit: https://streamlit.io/
Built with: OpenAI GPT-5 • LangChain • LangGraph • Docling • Streamlit • ChromaDB • yfinance
Last Updated: 2025-11-05