Endee Fork: https://github.com/pavankalyanperla/endee
Retrieval-Augmented Generation (RAG) + Semantic Search + Agentic AI for Banking & Finance Built on Endee β a high-performance open-source vector database
Financial institutions and analysts deal with enormous volumes of unstructured text β annual reports, earnings call transcripts, regulatory circulars (RBI/SEBI), loan agreements, and credit risk assessments. Extracting actionable insights from these documents manually is:
- Slow: Analysts spend 60β70% of their time reading and summarising documents
- Error-prone: Key risk indicators buried in 200-page reports are easily missed
- Non-scalable: A single analyst cannot track regulatory changes across dozens of circulars simultaneously
FinSight RAG solves this by enabling natural language querying over large corpora of financial documents β powered by Endee as the vector store at its core.
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β FinSight RAG β
β β
β ββββββββββββββββ βββββββββββββββββ ββββββββββββββββββββββββ β
β β Documents β β Ingestion β β Endee Vector DB β β
β β - PDFs βββββΆβ Pipeline βββββΆβ (HNSW Index) β β
β β - TXT/DOCX β β - Extract β β - Chunk vectors β β
β β - Circulars β β - Chunk β β - Metadata store β β
β ββββββββββββββββ β - Embed β ββββββββββββ¬ββββββββββββ β
β βββββββββββββββββ β β
β β ANN Search β
β ββββββββββββββββ βββββββββββββββββ ββββββββββββΌββββββββββββ β
β β User Query βββββΆβ Retriever ββββββ Top-K Chunks β β
β ββββββββββββββββ β + RAG Engine β ββββββββββββββββββββββββ β
β ββββββββ¬βββββββββ β
β β β
β ββββββββΌβββββββββ β
β β Claude AI β β
β β (Anthropic) β β
β ββββββββ¬βββββββββ β
β β β
β ββββββββΌβββββββββ β
β β Grounded β β
β β Answer + β β
β β Citations β β
β βββββββββββββββββ β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
| Component | Description |
|---|---|
src/utils/endee_client.py |
HTTP client wrapper for Endee REST API |
src/utils/embedder.py |
Local sentence-transformer embeddings (privacy-preserving) |
src/ingestion/ingest.py |
Document extraction, chunking, and vector upsert pipeline |
src/retrieval/retriever.py |
Semantic search, RAG Q&A, and risk clause detection |
src/agents/financial_analyst_agent.py |
Agentic multi-step analyst workflow |
cli.py |
Rich command-line interface |
demo.py |
End-to-end demo script |
Endee serves as the central vector intelligence layer of FinSight RAG:
client.create_index("finsight_docs", dim=384, distance="cosine", index_type="hnsw")Financial document chunks are stored in an HNSW index for sub-millisecond approximate nearest-neighbour (ANN) search.
client.upsert("finsight_docs", [
{
"id": "nicb_ar_chunk_00001",
"vector": [...384 floats...],
"metadata": {
"text": "The GNPA ratio declined to 2.41%...",
"source": "nicb_annual_report_fy2024.txt",
"doc_type": "annual_report",
"chunk_index": 12
}
}
])Each chunk's dense embedding (generated locally via Sentence-Transformers) is stored alongside rich metadata.
hits = client.search("finsight_docs", query_vector, top_k=5)At query time, the user's question is embedded and the top-K most semantically similar document chunks are retrieved from Endee β forming the context window for the LLM.
# Only search within RBI circulars
hits = client.search("finsight_docs", query_vector, top_k=5,
filters={"doc_type": "rbi_circular"})Metadata filters enable targeted retrieval across different document categories.
Find the most relevant passages across hundreds of financial documents using dense vector similarity β not keyword matching.
Ask natural language questions and receive cited, grounded answers backed by retrieved document passages. The LLM cannot hallucinate facts not present in the source documents.
Automatically scan your document corpus for clauses semantically similar to known high-risk financial patterns:
- Acceleration clauses on default
- Cross-default provisions
- Material Adverse Change (MAC) clauses
- Debt covenant breach triggers
- Interest rate step-up on credit downgrade
A multi-step agent that:
- Decomposes a complex research task into sub-questions
- Retrieves evidence for each sub-question from Endee
- Synthesises a structured analyst report with risk ratings and investment recommendations
Ingest RBI/SEBI circulars and query them in plain language: "What are the new provisioning requirements for MFI loans?"
- Python 3.10+
- Docker (recommended for Endee)
- Anthropic API key
β Mandatory: Star and fork the Endee repository first: https://github.com/endee-io/endee
# Clone YOUR fork of this project
git clone https://github.com/<your-username>/finsight-rag.git
cd finsight-ragOption A β Docker Hub (easiest):
mkdir endee-data
docker run -p 8080:8080 -v $(pwd)/endee-data:/data endeeio/endee-server:latestOption B β Docker Compose:
# From the Endee repo root
docker compose up -dOption C β Build from source (see Endee README):
./install.sh --release --avx2 # Intel/AMD
./run.shVerify Endee is running:
curl http://localhost:8080/api/v1/healthpip install -r requirements.txtcp .env.example .env
# Edit .env and add your ANTHROPIC_API_KEYANTHROPIC_API_KEY=sk-ant-...
ENDEE_HOST=http://localhost
ENDEE_PORT=8080python demo.pyThis will ingest the sample financial documents and walk through all features.
python cli.py status# Ingest a single annual report
python cli.py ingest data/sample_docs/nicb_annual_report_fy2024.txt --type annual_report
# Ingest an entire directory
python cli.py ingest path/to/your/docs/ --type earnings_call
# Supported doc types:
# annual_report | earnings_call | loan_agreement | rbi_circular | general# Ask anything about your ingested documents
python cli.py ask "What is the GNPA ratio and how has asset quality changed?"
# Filter by document type
python cli.py ask "What are the new provisioning norms?" --type rbi_circular
# Control retrieval depth
python cli.py ask "What is the capital adequacy ratio?" --top-k 8python cli.py search "liquidity coverage ratio stress scenario"
python cli.py search "microfinance overleveraging default risk" --top-k 10# List all built-in risky patterns
python cli.py risk --list
# Search for a built-in risky pattern
python cli.py risk --pattern "acceleration clause"
python cli.py risk --pattern "material adverse change"
# Search for custom clause text
python cli.py risk "borrower shall repay the entire outstanding principal immediately upon any covenant breach"# Generate a full analyst report on any topic
python cli.py agent "Perform a comprehensive credit risk analysis of NICB"
python cli.py agent "Assess regulatory compliance risk based on the RBI circular" --output report.md
python cli.py agent "Identify all financial risks mentioned across all documents"finsight-rag/
βββ src/
β βββ utils/
β β βββ endee_client.py # Endee HTTP API wrapper
β β βββ embedder.py # Sentence-Transformers embeddings
β βββ ingestion/
β β βββ ingest.py # Document ingestion pipeline
β βββ retrieval/
β β βββ retriever.py # Semantic search, RAG, risk detection
β βββ agents/
β βββ financial_analyst_agent.py # Agentic analyst workflow
βββ data/
β βββ sample_docs/ # Demo financial documents (synthetic)
β βββ nicb_annual_report_fy2024.txt
β βββ nicb_q4_earnings_call_fy2024.txt
β βββ rbi_mfi_circular_2024_demo.txt
βββ tests/
β βββ test_ingestion.py # Unit tests
βββ cli.py # Rich CLI interface
βββ demo.py # End-to-end demo
βββ requirements.txt
βββ .env.example
βββ README.md
- Model:
sentence-transformers/all-MiniLM-L6-v2(384-dim) - Run locally: No embedding API calls β financial data stays on-premise
- Normalised embeddings + cosine similarity for stable retrieval
- Window size: 512 words per chunk (configurable)
- Overlap: 64 words between adjacent chunks (preserves context across boundaries)
- Overlap prevents important financial figures from being split across chunks
- Retrieved chunks are injected into Claude's context as
<financial_context>XML tags - System prompt enforces citation and prevents hallucination outside context
- Conversation history support for multi-turn financial Q&A sessions
The agent follows a Decompose β Retrieve β Synthesise loop:
- Decompose: LLM breaks the task into 3-6 precise sub-questions
- Retrieve: Each sub-question independently queries Endee (iterative vector search)
- Synthesise: All evidence assembled into structured analyst report
This demonstrates how vector search is a core capability invoked inside an LLM reasoning loop β not just a one-shot retrieval.
python -m pytest tests/ -vOr without pytest:
python tests/test_ingestion.py
| Type | Flag | Examples |
|---|---|---|
| Annual Reports | annual_report |
10-K, Annual Reports, DRHP |
| Earnings Calls | earnings_call |
Q4 transcripts, investor day |
| Loan Agreements | loan_agreement |
Term sheets, facility agreements |
| Regulatory | rbi_circular |
RBI/SEBI circulars, master directions |
| General | general |
Any financial narrative text |
All configuration is via .env:
| Variable | Default | Description |
|---|---|---|
ANTHROPIC_API_KEY |
β | Required for RAG generation |
ENDEE_HOST |
http://localhost |
Endee server host |
ENDEE_PORT |
8080 |
Endee server port |
ENDEE_AUTH_TOKEN |
`` | Optional auth token |
EMBEDDING_MODEL |
all-MiniLM-L6-v2 |
Sentence-Transformers model |
RAG_TOP_K |
5 |
Chunks retrieved per query |
RAG_CHUNK_SIZE |
512 |
Words per chunk |
RAG_CHUNK_OVERLAP |
64 |
Word overlap between chunks |
- Table-aware chunking β preserve financial tables from PDFs with structure
- Hybrid search β combine dense + sparse (BM25) retrieval for better recall
- Multi-modal β embed and search financial charts and graphs
- Real-time ingestion β webhook for automatic ingestion of new filings
- Streamlit dashboard β visual interface for portfolio risk monitoring
- Multilingual support β Hindi and regional language financial documents
The sample documents in data/sample_docs/ are entirely synthetic and created for demonstration purposes only. They do not represent any real company, regulator, or financial institution. All figures, names, and scenarios are fictional.
MIT License β see LICENSE
- Endee β for the blazing-fast open-source vector database
- Anthropic Claude β for the LLM powering FinSight's generation
- Sentence-Transformers β for local, privacy-preserving embeddings