This project is a fully debugged and optimized multi-agent financial document analysis system built using CrewAI and FastAPI.
I analyzed the issues in the existing codebase, identified deterministic failures and prompt inefficiencies, and restructured the system to ensure stable and predictable execution.
The final system is capable of:
- Verifying financial documents
- Performing structured financial analysis
- Assessing financial risks
- Providing a final investment recommendation (BUY / HOLD / SELL)
- Processing requests in the background
- Persisting results in a database
The system runs end-to-end successfully using a local LLM setup (Ollama + Llama3).
- Python 3.11.9
- FastAPI (Backend API)
- CrewAI (Multi-agent orchestration)
- Ollama (Local LLM runtime)
- Llama3 model
- SQLite (Persistent storage)
- LangChain PyPDFLoader (PDF extraction)
- Uvicorn (ASGI server)
The system follows a structured multi-agent execution pipeline:
User Upload (PDF) → FastAPI Endpoint → Background Task Execution → PDF Text Extraction → CrewAI Sequential Agents → Database Storage → Retrieval via API
- Financial Document Verification Specialist
- Senior Financial Analyst
- Corporate Risk Analyst
- Investment Strategy Advisor
Execution is strictly sequential using Process.sequential, with explicit task dependency linking to ensure deterministic behavior and proper context propagation.
```
financial-document-analyzer-debug/
│
├── main.py
├── agents.py
├── task.py
├── tools.py
├── database.py
├── requirements.txt
│
├── data/
│   └── TSLA-Q2-2025-Update.pdf
│
└── README.md
```
The original flow did not correctly pass extracted document content to the agents.
Fix:
Implemented PDF extraction using PyPDFLoader and passed trimmed document_text directly into the Crew kickoff payload.
The agent workflow was executed synchronously, blocking API responses.
Fix:
Converted execution to FastAPI BackgroundTasks, enabling non-blocking request handling and allowing concurrent request processing.
Analysis results were not stored persistently.
Fix:
Integrated SQLite database (analysis.db) with automatic initialization during application startup and structured result storage.
Large financial PDFs caused unstable LLM behavior and excessive token consumption.
Fix: Implemented controlled trimming of document text to 8000 characters before passing it to the LLM to ensure stable CPU inference using Ollama.
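The trimming step itself can be very small; a sketch follows (the function name is an assumption, while the 8000-character cap matches the fix described above):

```python
# Illustrative helper for the document-trimming step described above.
def trim_document(text: str, limit: int = 8000) -> str:
    """Cap document text before it is sent to the LLM."""
    return text if len(text) <= limit else text[:limit]
```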
Risk and investment tasks were not fully context-aware.
Fix: Explicitly linked task contexts:
- Risk assessment depends on financial analysis
- Investment recommendation depends on both financial analysis and risk assessment
This ensures logical sequencing and consistent reasoning flow.
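The dependency structure above can be sketched independently of CrewAI (in the actual code each Task declares its predecessors via its `context` list; the task names here are illustrative):

```python
# Task dependency graph matching the ordering described above.
# CrewAI expresses the same wiring via each Task's `context` list.
TASK_DEPENDENCIES = {
    "verification": [],
    "financial_analysis": ["verification"],
    "risk_assessment": ["financial_analysis"],
    "investment_recommendation": ["financial_analysis", "risk_assessment"],
}

def execution_order(deps: dict) -> list:
    """Return a task order in which every dependency runs first."""
    order, done = [], set()

    def visit(task):
        for dep in deps[task]:
            if dep not in done:
                visit(dep)
        if task not in done:
            done.add(task)
            order.append(task)

    for task in deps:
        visit(task)
    return order
```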
Agents produced inconsistent outputs and occasional meta commentary.
Fix: Rewrote agent goals with strict structured output enforcement, including:
- Clearly defined report sections
- Explicit formatting rules
- “Do NOT include thoughts or meta commentary” constraints
- Controlled iteration limits
- Delegation disabled to maintain predictable execution
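As an illustration of these constraints, an agent goal might read as follows (the wording and section names are assumptions, not the project's actual prompt from agents.py):

```python
# Hypothetical agent goal enforcing structured output, in the spirit of
# the constraints listed above; the real prompt wording is not shown here.
ANALYST_GOAL = (
    "Produce a financial analysis report with EXACTLY these sections:\n"
    "1. Executive Summary\n"
    "2. Revenue and Profitability\n"
    "3. Risks and Outlook\n"
    "Rules: use plain professional language, cite figures from the document, "
    "and do NOT include thoughts or meta commentary."
)
```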
Original prompts were loosely defined and allowed unnecessary meta reasoning.
Improvements performed:
- Enforced structured section headers
- Restricted output formatting explicitly
- Added financial terminology constraints
- Controlled reasoning verbosity using max_iter
- Disabled delegation to prevent uncontrolled agent chaining
These changes ensured consistent, professional-grade financial reports.
A reusable PDF ingestion tool was implemented using CrewAI’s @tool decorator:
```python
@tool
def read_data_tool(path: str) -> str:
    ...
```

The tool:
- Validates the file path type
- Verifies file existence
- Extracts structured PDF content using PyPDFLoader
- Cleans excessive newline characters
- Applies an 8000-character safety limit
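These steps can be sketched in plain Python; a plain-text read stands in for PyPDFLoader so the example stays dependency-free, and the function name is illustrative:

```python
import os
import re

# Dependency-free sketch of the ingestion steps above. A plain-text read
# substitutes for PyPDFLoader; names here are illustrative assumptions.
def read_data_sketch(path: str, limit: int = 8000) -> str:
    if not isinstance(path, str):          # validate path type
        raise TypeError("path must be a string")
    if not os.path.exists(path):           # verify file existence
        raise FileNotFoundError(path)
    with open(path, "r", encoding="utf-8") as f:
        text = f.read()                    # real tool: PyPDFLoader extraction
    text = re.sub(r"\n{2,}", "\n", text)   # clean excessive newlines
    return text[:limit]                    # 8000-character safety limit
```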
This ensures safe and deterministic document ingestion before multi-agent processing.
Health check endpoint.
Response:
```json
{
  "message": "Financial Document Analyzer API is running"
}
```

POST /analyze

Uploads a financial PDF for background analysis.
Request:
- Multipart form-data
- file (PDF)
- query (optional)
Response:
```json
{
  "status": "processing",
  "message": "Analysis started in background. Check /analyses endpoint for results."
}
```

Processing runs asynchronously in the background.
GET /analyses

Returns all stored analysis results from the SQLite database.
Response:
```json
{
  "analyses": [...]
}
```

Database: analysis.db
Table: analyses
Columns:
- id (Primary Key)
- filename
- query
- result
The database is initialized automatically when the application starts.
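Assuming standard `sqlite3` usage, the startup initialization and result storage might look like this (the table and column names match those above; SQL types and function names are assumptions):

```python
import sqlite3

# Sketch of the schema and startup initialization described above.
def init_db(db_path: str = "analysis.db") -> None:
    with sqlite3.connect(db_path) as conn:
        conn.execute(
            """CREATE TABLE IF NOT EXISTS analyses (
                   id INTEGER PRIMARY KEY AUTOINCREMENT,
                   filename TEXT,
                   query TEXT,
                   result TEXT
               )"""
        )

def save_analysis(db_path: str, filename: str, query: str, result: str) -> None:
    with sqlite3.connect(db_path) as conn:
        conn.execute(
            "INSERT INTO analyses (filename, query, result) VALUES (?, ?, ?)",
            (filename, query, result),
        )
```

`CREATE TABLE IF NOT EXISTS` makes initialization idempotent, so it is safe to run on every application start.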
```shell
git clone <your_repo_link>
cd financial-document-analyzer
```

Create and activate a virtual environment:

```shell
python -m venv venv
```

Windows:

```shell
venv\Scripts\activate
```

Install dependencies:

```shell
pip install -r requirements.txt
```

Download Ollama from: https://ollama.com

Pull and run Llama3:

```shell
ollama pull llama3
ollama run llama3
```

Ensure Ollama is running at:
http://localhost:11434

Start the API server:

```shell
uvicorn main:app --reload
```

Access Swagger UI at:
http://127.0.0.1:8000/docs
For demonstration and testing purposes, this repository includes a sample financial earnings document:
data/TSLA-Q2-2025-Update.pdf
This PDF was used to validate:
- PDF ingestion and text extraction
- Multi-agent financial analysis flow
- Risk assessment reasoning
- Final BUY / HOLD / SELL recommendation logic
You may replace this file with any other financial report PDF and analyze it using the /analyze endpoint.
To maintain a clean and production-ready repository, the following files and directories are excluded using .gitignore:
```
venv/
__pycache__/
analysis.db
.env
*.pyc
```
This prevents local artifacts and sensitive files from being committed.
- Background processing model for concurrent request handling
- SQLite database integration
- Structured task dependency management
- Custom CrewAI tool for document ingestion
- Controlled LLM token usage
- Modular, production-ready architecture
- Designed for local LLM inference using Ollama
- Modular agent architecture enables easy scaling
- Can be upgraded to Redis/Celery for distributed queue management
- Clear separation of concerns across agents and tasks
- Deterministic execution model
The system runs end-to-end successfully:
- Upload financial PDF
- Automatic background processing
- Multi-agent structured financial analysis
- Risk evaluation
- Final investment recommendation
- Persistent database storage
- Retrieval via API endpoint
All major deterministic bugs and inefficient prompts have been resolved.