🏦 FinSight RAG — AI-Powered Financial Document Intelligence

Endee Fork: https://github.com/pavankalyanperla/endee

Retrieval-Augmented Generation (RAG) + Semantic Search + Agentic AI for Banking & Finance. Built on Endee, a high-performance open-source vector database.

📌 Problem Statement

Financial institutions and analysts deal with enormous volumes of unstructured text — annual reports, earnings call transcripts, regulatory circulars (RBI/SEBI), loan agreements, and credit risk assessments. Extracting actionable insights from these documents manually is:

Slow: Analysts spend 60–70% of their time reading and summarising documents
Error-prone: Key risk indicators buried in 200-page reports are easily missed
Non-scalable: A single analyst cannot track regulatory changes across dozens of circulars simultaneously

FinSight RAG solves this by enabling natural language querying over large corpora of financial documents — powered by Endee as the vector store at its core.

🏗 System Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                         FinSight RAG                                │
│                                                                     │
│  ┌──────────────┐    ┌───────────────┐    ┌──────────────────────┐ │
│  │  Documents   │    │  Ingestion    │    │   Endee Vector DB    │ │
│  │  - PDFs      │───▶│  Pipeline     │───▶│  (HNSW Index)        │ │
│  │  - TXT/DOCX  │    │  - Extract    │    │  - Chunk vectors     │ │
│  │  - Circulars │    │  - Chunk      │    │  - Metadata store    │ │
│  └──────────────┘    │  - Embed      │    └──────────┬───────────┘ │
│                      └───────────────┘               │             │
│                                                       │ ANN Search  │
│  ┌──────────────┐    ┌───────────────┐    ┌──────────▼───────────┐ │
│  │  User Query  │───▶│  Retriever    │◀───│  Top-K Chunks        │ │
│  └──────────────┘    │  + RAG Engine │    └──────────────────────┘ │
│                      └──────┬────────┘                             │
│                             │                                       │
│                      ┌──────▼────────┐                             │
│                      │  Claude AI    │                             │
│                      │  (Anthropic)  │                             │
│                      └──────┬────────┘                             │
│                             │                                       │
│                      ┌──────▼────────┐                             │
│                      │  Grounded     │                             │
│                      │  Answer +     │                             │
│                      │  Citations    │                             │
│                      └───────────────┘                             │
└─────────────────────────────────────────────────────────────────────┘

Core Components

Component	Description
`src/utils/endee_client.py`	HTTP client wrapper for Endee REST API
`src/utils/embedder.py`	Local sentence-transformer embeddings (privacy-preserving)
`src/ingestion/ingest.py`	Document extraction, chunking, and vector upsert pipeline
`src/retrieval/retriever.py`	Semantic search, RAG Q&A, and risk clause detection
`src/agents/financial_analyst_agent.py`	Agentic multi-step analyst workflow
`cli.py`	Rich command-line interface
`demo.py`	End-to-end demo script

🔑 How Endee is Used

Endee serves as the central vector intelligence layer of FinSight RAG:

1. Index Creation

client.create_index("finsight_docs", dim=384, distance="cosine", index_type="hnsw")

Financial document chunks are stored in an HNSW index for fast approximate nearest-neighbour (ANN) search. Actual query latency depends on corpus size, index parameters, and hardware.

2. Document Upsert

client.upsert("finsight_docs", [
    {
        "id": "nicb_ar_chunk_00001",
        "vector": [0.0] * 384,  # placeholder; pass the chunk's real 384-dim embedding
        "metadata": {
            "text": "The GNPA ratio declined to 2.41%...",
            "source": "nicb_annual_report_fy2024.txt",
            "doc_type": "annual_report",
            "chunk_index": 12
        }
    }
])

Each chunk's dense embedding (generated locally via Sentence-Transformers) is stored alongside rich metadata.
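As a rough illustration of how chunk text and embeddings might be assembled into that payload, the sketch below builds records with the same fields as the example above. The helper name `build_records`, the `id_prefix` argument, and the zero-based id numbering are assumptions made for illustration; they are not taken from the project's code.

```python
from typing import Any, Dict, List, Sequence


def build_records(
    id_prefix: str,
    chunks: Sequence[str],
    vectors: Sequence[Sequence[float]],
    source: str,
    doc_type: str,
) -> List[Dict[str, Any]]:
    """Pair each chunk with its embedding and metadata (hypothetical helper)."""
    if len(chunks) != len(vectors):
        raise ValueError("chunks and vectors must have the same length")
    records = []
    for i, (text, vector) in enumerate(zip(chunks, vectors)):
        records.append(
            {
                # Zero-padded index keeps ids sortable (assumed id scheme).
                "id": f"{id_prefix}_chunk_{i:05d}",
                "vector": list(vector),
                "metadata": {
                    "text": text,
                    "source": source,
                    "doc_type": doc_type,
                    "chunk_index": i,
                },
            }
        )
    return records


records = build_records(
    "nicb_ar",
    ["first chunk text (placeholder)", "second chunk text (placeholder)"],
    [[0.1] * 384, [0.2] * 384],  # placeholder vectors, not real embeddings
    source="nicb_annual_report_fy2024.txt",
    doc_type="annual_report",
)
```

The resulting list could then be passed as the second argument to `client.upsert`.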

3. Semantic Retrieval

hits = client.search("finsight_docs", query_vector, top_k=5)

At query time, the user's question is embedded and the top-K most semantically similar document chunks are retrieved from Endee — forming the context window for the LLM.
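Conceptually, the search returns the stored vectors closest to the query. The sketch below shows the exact (brute-force) version of that ranking in plain Python; an HNSW index approximates the same result without scanning every vector. The function names and the toy 3-dimensional vectors are illustrative assumptions, not Endee's implementation.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_cosine(
    query: Sequence[float],
    vectors: Dict[str, Sequence[float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Score every stored vector against the query and keep the k best."""
    scored = [(vec_id, cosine(query, vec)) for vec_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy index with made-up vectors, for illustration only.
toy_index = {
    "chunk_a": [1.0, 0.0, 0.0],
    "chunk_b": [0.9, 0.1, 0.0],
    "chunk_c": [0.0, 1.0, 0.0],
}
toy_hits = top_k_cosine([1.0, 0.0, 0.0], toy_index, k=2)
```

Here `toy_hits` holds `chunk_a` first and `chunk_b` second, since they point in nearly the same direction as the query.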

4. Filtered Search

# Only search within RBI circulars
hits = client.search("finsight_docs", query_vector, top_k=5,
                     filters={"doc_type": "rbi_circular"})

Metadata filters enable targeted retrieval across different document categories.
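One plausible reading of the `filters` argument is an exact-match test on metadata fields. How Endee evaluates filters internally is not described here, so the sketch below illustrates the assumed semantics only; `matches` and `apply_filters` are hypothetical names.

```python
from typing import Any, Dict, List


def matches(metadata: Dict[str, Any], filters: Dict[str, Any]) -> bool:
    """True when every filter key exists in metadata with an equal value."""
    return all(
        key in metadata and metadata[key] == value
        for key, value in filters.items()
    )


def apply_filters(
    records: List[Dict[str, Any]], filters: Dict[str, Any]
) -> List[Dict[str, Any]]:
    """Keep only the records whose metadata satisfies all filters."""
    return [r for r in records if matches(r["metadata"], filters)]


# Made-up records for illustration.
sample_records = [
    {"id": "c1", "metadata": {"doc_type": "rbi_circular"}},
    {"id": "c2", "metadata": {"doc_type": "annual_report"}},
    {"id": "c3", "metadata": {"doc_type": "rbi_circular"}},
]
circulars = apply_filters(sample_records, {"doc_type": "rbi_circular"})
```

An empty filter dictionary matches every record, which mirrors an unfiltered search.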

✨ Key Features

🔍 Multi-Document Semantic Search

Find the most relevant passages across hundreds of financial documents using dense vector similarity — not keyword matching.

💬 Grounded RAG Q&A

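Ask natural language questions and receive cited, grounded answers backed by retrieved document passages. The system prompt instructs the LLM to answer only from the retrieved passages, which reduces the risk of hallucinated facts but does not remove the need to check the citations.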

⚠️ Risk Clause Detection

Automatically scan your document corpus for clauses semantically similar to known high-risk financial patterns:

Acceleration clauses on default
Cross-default provisions
Material Adverse Change (MAC) clauses
Debt covenant breach triggers
Interest rate step-up on credit downgrade
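A scan of this kind can be sketched as a loop that embeds each risk pattern, searches the index, and keeps hits above a similarity threshold. Everything named below is an assumption for illustration: the `scan_for_risk` helper, the `embed` and `search` callables, the `score` field on each hit, and the 0.6 threshold. The stand-in functions return made-up values so the example runs without a server.

```python
from typing import Any, Callable, Dict, List, Sequence

Hit = Dict[str, Any]


def scan_for_risk(
    patterns: Sequence[str],
    embed: Callable[[str], List[float]],
    search: Callable[[List[float], int], List[Hit]],
    min_score: float = 0.6,  # assumed threshold, tune per corpus
    top_k: int = 5,
) -> Dict[str, List[Hit]]:
    """Return, per pattern, the hits whose similarity meets the threshold."""
    findings: Dict[str, List[Hit]] = {}
    for pattern in patterns:
        hits = search(embed(pattern), top_k)
        flagged = [h for h in hits if h["score"] >= min_score]
        if flagged:
            findings[pattern] = flagged
    return findings


def fake_embed(text: str) -> List[float]:
    # Stand-in for a real embedding model.
    return [float(len(text))]


def fake_search(vector: List[float], top_k: int) -> List[Hit]:
    # Stand-in for a vector search; ids and scores are invented.
    hits = [
        {"id": "loan_chunk_00007", "score": 0.82},
        {"id": "loan_chunk_00031", "score": 0.41},
    ]
    return hits[:top_k]


findings = scan_for_risk(["cross-default provision"], fake_embed, fake_search)
```

Passing the embedder and search function as arguments keeps the scan logic testable without the vector database running.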

🤖 Agentic Financial Analyst

A multi-step agent that:

Decomposes a complex research task into sub-questions
Retrieves evidence for each sub-question from Endee
Synthesises a structured analyst report with risk ratings and investment recommendations

📋 Regulatory Compliance Q&A

Ingest RBI/SEBI circulars and query them in plain language: "What are the new provisioning requirements for MFI loans?"

🚀 Setup & Installation

Prerequisites

Python 3.10+
Docker (recommended for Endee)
Anthropic API key

Step 1: Fork & Clone

⭐ Mandatory: Star and fork the Endee repository first: https://github.com/endee-io/endee

# Clone YOUR fork of this project
git clone https://github.com/<your-username>/finsight-rag.git
cd finsight-rag

Step 2: Start Endee

Option A – Docker Hub (easiest):

mkdir endee-data
docker run -p 8080:8080 -v "$(pwd)/endee-data:/data" endeeio/endee-server:latest

Option B – Docker Compose:

# From the Endee repo root
docker compose up -d

Option C – Build from source (see Endee README):

./install.sh --release --avx2   # Intel/AMD
./run.sh

Verify Endee is running:

curl http://localhost:8080/api/v1/health

Step 3: Install Python Dependencies

pip install -r requirements.txt

Step 4: Configure Environment

cp .env.example .env
# Edit .env and add your ANTHROPIC_API_KEY

ANTHROPIC_API_KEY=sk-ant-...
ENDEE_HOST=http://localhost
ENDEE_PORT=8080

💻 Usage

Run the Full Demo

python demo.py

This will ingest the sample financial documents and walk through all features.

CLI Reference

Check Status

python cli.py status

Ingest Documents

# Ingest a single annual report
python cli.py ingest data/sample_docs/nicb_annual_report_fy2024.txt --type annual_report

# Ingest an entire directory
python cli.py ingest path/to/your/docs/ --type earnings_call

# Supported doc types:
# annual_report | earnings_call | loan_agreement | rbi_circular | general

Ask Questions (RAG)

# Ask anything about your ingested documents
python cli.py ask "What is the GNPA ratio and how has asset quality changed?"

# Filter by document type
python cli.py ask "What are the new provisioning norms?" --type rbi_circular

# Control retrieval depth
python cli.py ask "What is the capital adequacy ratio?" --top-k 8

Pure Semantic Search

python cli.py search "liquidity coverage ratio stress scenario"
python cli.py search "microfinance overleveraging default risk" --top-k 10

Risk Clause Detection

# List all built-in risky patterns
python cli.py risk --list

# Search for a built-in risky pattern
python cli.py risk --pattern "acceleration clause"
python cli.py risk --pattern "material adverse change"

# Search for custom clause text
python cli.py risk "borrower shall repay the entire outstanding principal immediately upon any covenant breach"

Agentic Analyst Report

# Generate a full analyst report on any topic
python cli.py agent "Perform a comprehensive credit risk analysis of NICB"
python cli.py agent "Assess regulatory compliance risk based on the RBI circular" --output report.md
python cli.py agent "Identify all financial risks mentioned across all documents"

📁 Project Structure

finsight-rag/
├── src/
│   ├── utils/
│   │   ├── endee_client.py      # Endee HTTP API wrapper
│   │   └── embedder.py          # Sentence-Transformers embeddings
│   ├── ingestion/
│   │   └── ingest.py            # Document ingestion pipeline
│   ├── retrieval/
│   │   └── retriever.py         # Semantic search, RAG, risk detection
│   └── agents/
│       └── financial_analyst_agent.py  # Agentic analyst workflow
├── data/
│   └── sample_docs/             # Demo financial documents (synthetic)
│       ├── nicb_annual_report_fy2024.txt
│       ├── nicb_q4_earnings_call_fy2024.txt
│       └── rbi_mfi_circular_2024_demo.txt
├── tests/
│   └── test_ingestion.py        # Unit tests
├── cli.py                       # Rich CLI interface
├── demo.py                      # End-to-end demo
├── requirements.txt
├── .env.example
└── README.md

🔬 Technical Approach

Embedding Strategy

Model: sentence-transformers/all-MiniLM-L6-v2 (384-dim)
Run locally: No embedding API calls — financial data stays on-premise
Normalised embeddings + cosine similarity for stable retrieval
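The point of normalising is that, for unit-length vectors, the dot product equals the cosine similarity, so scores stay in a fixed range regardless of vector magnitude. A minimal sketch in plain Python, with hypothetical helper names and toy 2-dimensional vectors:

```python
import math
from typing import List, Sequence


def l2_normalise(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


unit_a = l2_normalise([3.0, 4.0])  # [0.6, 0.8]
unit_b = l2_normalise([4.0, 3.0])  # [0.8, 0.6]
similarity = dot(unit_a, unit_b)   # cosine of the original vectors, 24/25
```

With Sentence-Transformers, one way to get unit-length vectors directly is `SentenceTransformer.encode(..., normalize_embeddings=True)`; whether this project uses that option is not stated here.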

Chunking Strategy

Window size: 512 words per chunk (configurable)
Overlap: 64 words between adjacent chunks (configurable), so context carries across chunk boundaries
The overlap reduces the chance that a financial figure is separated from the sentence that explains it
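A sliding word window with overlap can be sketched as follows. The function name `chunk_words` is hypothetical and the project's actual splitter may differ (for example in how it treats whitespace or the final short chunk); the defaults mirror the 512/64 values above.

```python
from typing import List


def chunk_words(text: str, size: int = 512, overlap: int = 64) -> List[str]:
    """Split text into word windows of `size`, each sharing `overlap` words
    with the previous window."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        # Stop once this window has reached the end of the text,
        # so no chunk consists solely of already-covered words.
        if start + size >= len(words):
            break
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
sample_chunks = chunk_words(sample, size=4, overlap=1)
```

With ten words, a window of 4, and an overlap of 1, each chunk starts on the last word of the previous one.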

RAG Design

Retrieved chunks are injected into Claude's context inside <financial_context> XML tags
System prompt requires citations and instructs the model to answer only from the supplied context
Conversation history support for multi-turn financial Q&A sessions
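The context assembly can be pictured as below. This is a sketch under stated assumptions: the hit shape (a `metadata` dict with `text`, `source`, and `chunk_index`), the numbered-citation layout, the wording of `SYSTEM_PROMPT`, and the helper names are all illustrative, and only the `<financial_context>` tag name comes from the description above.

```python
from typing import Any, Dict, List

# Illustrative wording only; the project's real system prompt is not shown here.
SYSTEM_PROMPT = (
    "Answer using only the material inside <financial_context>. "
    "Cite the bracketed passage numbers you relied on. "
    "If the context does not contain the answer, say so."
)


def build_context(hits: List[Dict[str, Any]]) -> str:
    """Wrap retrieved chunks in <financial_context> tags, numbered for citation."""
    parts = []
    for n, hit in enumerate(hits, start=1):
        meta = hit["metadata"]
        header = f"[{n}] source={meta['source']} chunk={meta['chunk_index']}"
        parts.append(f"{header}\n{meta['text']}")
    body = "\n\n".join(parts)
    return f"<financial_context>\n{body}\n</financial_context>"


def build_user_message(question: str, hits: List[Dict[str, Any]]) -> str:
    """Place the tagged context ahead of the user's question."""
    return f"{build_context(hits)}\n\nQuestion: {question}"


example_hits = [
    {
        "metadata": {
            "text": "The GNPA ratio declined to 2.41%...",
            "source": "nicb_annual_report_fy2024.txt",
            "chunk_index": 12,
        }
    }
]
message = build_user_message("What is the GNPA ratio?", example_hits)
```

Numbering each passage gives the model a stable handle to cite, and lets the caller map a citation back to its source file and chunk.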

Agentic Workflow

The agent follows a Decompose → Retrieve → Synthesise loop:

Decompose: LLM breaks the task into 3-6 precise sub-questions
Retrieve: Each sub-question independently queries Endee (iterative vector search)
Synthesise: All evidence is assembled into a structured analyst report

This demonstrates how vector search is a core capability invoked inside an LLM reasoning loop — not just a one-shot retrieval.
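The control flow of that loop can be sketched with the three stages passed in as callables. The name `run_analyst`, the callable signatures, and the `max_questions` cap are assumptions for illustration; the stand-in functions below replace the LLM and the vector search with fixed outputs so the example runs on its own.

```python
from typing import Callable, Dict, List


def run_analyst(
    task: str,
    decompose: Callable[[str], List[str]],
    retrieve: Callable[[str], List[str]],
    synthesise: Callable[[str, Dict[str, List[str]]], str],
    max_questions: int = 6,
) -> str:
    """Decompose the task, retrieve evidence per sub-question, then synthesise."""
    sub_questions = decompose(task)[:max_questions]
    # One independent retrieval per sub-question.
    evidence = {question: retrieve(question) for question in sub_questions}
    return synthesise(task, evidence)


def fake_decompose(task: str) -> List[str]:
    # Stand-in for an LLM call that proposes sub-questions.
    return ["What is the GNPA trend?", "What is the capital adequacy ratio?"]


def fake_retrieve(question: str) -> List[str]:
    # Stand-in for a vector search returning chunk texts.
    return [f"evidence for: {question}"]


def fake_synthesise(task: str, evidence: Dict[str, List[str]]) -> str:
    # Stand-in for an LLM call that writes the report.
    lines = [f"# {task}"]
    for question, chunks in evidence.items():
        lines.append(f"- {question} ({len(chunks)} chunk(s))")
    return "\n".join(lines)


report = run_analyst(
    "Credit risk analysis", fake_decompose, fake_retrieve, fake_synthesise
)
```

Keeping retrieval inside the loop is what distinguishes this from one-shot RAG: each sub-question gets its own query vector and its own set of chunks.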

🧪 Running Tests

python -m pytest tests/ -v

Or without pytest:

python tests/test_ingestion.py

🌐 Supported Document Types

Type	Flag	Examples
Annual Reports	`annual_report`	10-K, Annual Reports, DRHP
Earnings Calls	`earnings_call`	Q4 transcripts, investor day
Loan Agreements	`loan_agreement`	Term sheets, facility agreements
Regulatory	`rbi_circular`	RBI/SEBI circulars, master directions
General	`general`	Any financial narrative text

⚙️ Configuration

All configuration is via .env:

Variable	Default	Description
`ANTHROPIC_API_KEY`	—	Required for RAG generation
`ENDEE_HOST`	`http://localhost`	Endee server host
`ENDEE_PORT`	`8080`	Endee server port
`ENDEE_AUTH_TOKEN`	(empty)	Optional auth token
`EMBEDDING_MODEL`	`all-MiniLM-L6-v2`	Sentence-Transformers model
`RAG_TOP_K`	`5`	Chunks retrieved per query
`RAG_CHUNK_SIZE`	`512`	Words per chunk
`RAG_CHUNK_OVERLAP`	`64`	Word overlap between chunks
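For reference, reading these variables with the defaults from the table might look like the sketch below. The `Settings` class, the `load_settings` function, and the `endee_url` property are hypothetical; the code reads variables already present in the environment and does not parse the .env file itself.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    anthropic_api_key: Optional[str]
    endee_host: str
    endee_port: int
    endee_auth_token: str
    embedding_model: str
    rag_top_k: int
    rag_chunk_size: int
    rag_chunk_overlap: int

    @property
    def endee_url(self) -> str:
        """Base URL built from host and port."""
        return f"{self.endee_host}:{self.endee_port}"


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read configuration from a mapping, falling back to the documented defaults."""
    return Settings(
        anthropic_api_key=env.get("ANTHROPIC_API_KEY"),  # no default; required for RAG
        endee_host=env.get("ENDEE_HOST", "http://localhost"),
        endee_port=int(env.get("ENDEE_PORT", "8080")),
        endee_auth_token=env.get("ENDEE_AUTH_TOKEN", ""),
        embedding_model=env.get("EMBEDDING_MODEL", "all-MiniLM-L6-v2"),
        rag_top_k=int(env.get("RAG_TOP_K", "5")),
        rag_chunk_size=int(env.get("RAG_CHUNK_SIZE", "512")),
        rag_chunk_overlap=int(env.get("RAG_CHUNK_OVERLAP", "64")),
    )


# Passing a plain dict makes the override behaviour easy to see.
settings = load_settings({"ENDEE_PORT": "9090"})
```

Accepting a mapping instead of reading `os.environ` directly keeps the loader easy to test with a plain dictionary.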

🗺 Future Roadmap

Table-aware chunking — preserve financial tables from PDFs with structure
Hybrid search — combine dense + sparse (BM25) retrieval for better recall
Multi-modal — embed and search financial charts and graphs
Real-time ingestion — webhook for automatic ingestion of new filings
Streamlit dashboard — visual interface for portfolio risk monitoring
Multilingual support — Hindi and regional language financial documents

📄 Disclaimer

The sample documents in data/sample_docs/ are entirely synthetic and created for demonstration purposes only. They do not represent any real company, regulator, or financial institution. All figures, names, and scenarios are fictional.

📜 License

MIT License — see LICENSE

🙏 Acknowledgements

Endee — for the blazing-fast open-source vector database
Anthropic Claude — for the LLM powering FinSight's generation
Sentence-Transformers — for local, privacy-preserving embeddings
