Andromeda is a tools-first financial QA system over SEC filings. It combines planner-routed tool calls, hybrid retrieval, reranking, and eval-governed iteration so numeric and narrative answers are grounded in explicit evidence.
Recent repo changes and benchmark results to know first:
- Planner characteristics quality improved substantially in the latest planner eval run:
  - exact match: 0.98
  - macro F1: 0.9960
  - micro F1: 0.9919
  - see BENCHMARK_PLANNER_v3.md
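For readers unfamiliar with the three planner metrics, the sketch below shows one common way to compute them for multi-label predictions such as planner characteristics. It is an illustration only: the function names are hypothetical, and the repo's actual scorer (scripts/score_planner_eval.py) may define the metrics differently.

```python
from typing import Dict, List, Set


def _f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def label_counts(gold: List[Set[str]], pred: List[Set[str]]) -> Dict[str, Dict[str, int]]:
    """Per-label true-positive / false-positive / false-negative counts."""
    labels = set().union(*gold, *pred)
    counts = {label: {"tp": 0, "fp": 0, "fn": 0} for label in labels}
    for g, p in zip(gold, pred):
        for label in g & p:
            counts[label]["tp"] += 1
        for label in p - g:
            counts[label]["fp"] += 1
        for label in g - p:
            counts[label]["fn"] += 1
    return counts


def exact_match(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Fraction of cases whose predicted label set equals the gold set."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Unweighted mean of per-label F1 scores."""
    counts = label_counts(gold, pred)
    return sum(_f1(**c) for c in counts.values()) / len(counts)


def micro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """F1 over counts pooled across all labels."""
    counts = label_counts(gold, pred)
    tp = sum(c["tp"] for c in counts.values())
    fp = sum(c["fp"] for c in counts.values())
    fn = sum(c["fn"] for c in counts.values())
    return _f1(tp, fp, fn)
```

Exact match is the strictest of the three because a single missing or extra label fails the whole case, which is one reason it can sit below both F1 scores.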
The main UI combines conversation history, planner stage timings, tool results, reranked evidence, and a synchronized source viewer.
Tool cards surface profile/valuation, recent news, price history, and SEC financial metrics before final synthesis.
The retrieval panel shows ranked chunks, filing metadata, and direct "open source / view in app" links for inspection.
Answers are organized as thesis points with direct quotes and additional context tied to evidence.
Each response includes a consolidated cited-sources block with filing tags and tool outputs.
The review UI supports case filtering, pass/fail labeling, timing inspection, and targeted export/reload workflows.
flowchart TD
A[Client / UI] --> B[/POST /query or /query_stream/]
B --> C[Conversation resolution]
C --> D[Planner: structured decision]
D -->|action=clarification_required| E[Return clarifying question]
D -->|action=refused| F[Return refusal message]
D -->|action=answer| G[Finance tools stage]
G --> H{use_rag}
H -->|no| I[Synthesis prompt with tool context]
H -->|yes| J[Hybrid retrieval]
J --> K[Cross-encoder rerank]
K --> L{multi-ticker briefs}
L -->|yes| M[Per-ticker briefs + synthesis]
L -->|no| I
M --> I
I --> N[Draft generation]
N --> O{enable_refine}
O -->|yes| P[Refine generation]
O -->|no| Q[Finalize]
P --> Q
Q --> R[Persist history + return response]
flowchart LR
Q[User query] --> E[Dense embedding]
Q --> S[Sparse query branch]
E --> D[(pgvector dense rank)]
S --> T[(BM25 or FTS sparse rank)]
D --> U[Candidate union]
T --> U
U --> V[RRF fusion]
V --> W[Top-k hybrid chunks]
W --> X[Cross-encoder reranker]
X --> Y[Top-k reranked chunks]
Z[Chunk metadata]
Z -->|retrieval_text / retrieval_context| D
Z -->|text_for_rerank| X
flowchart TD
A[Tickers + profile config] --> B[scripts/download.py]
B --> C[scripts/process_html_to_markdown.py]
C --> D[scripts/chunk.py]
D --> E[Chunk postprocessors]
E --> F[scripts/build_index.py]
F --> G[Optional context strategy]
G --> H[Embedding + retrieval text assembly]
H --> I[(PostgreSQL schema: documents + chunks)]
J[/POST /ingest/] --> K[TickerIngestionJobManager]
K --> B
flowchart TD
A[Eval query sets JSONL] --> B[scripts/run_eval.py]
B --> C[generations.jsonl]
C --> D[scripts/score_eval.py]
D --> E[score_summary.json + review.csv]
F[Planner eval set] --> G[scripts/run_planner_eval.py]
G --> H[scripts/score_planner_eval.py]
E --> I[Benchmark reports]
H --> I
I --> J[Prompt/runtime/index changes]
J --> A
- Planner outputs:
  - action: answer, clarification_required, refused
  - characteristics: comparison, market_data, financial_metrics, filing_narrative
  - tool/rag routing hints (use_rag, use_finance_tools, etc.)
- Clarification vs refusal boundary is explicit and tracked in planner eval artifacts.
- Planner-first routing removed heuristic ticker-inference refusal fallback from the hot path.
Primary module:
src/andromeda/query/runtime.py
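The planner outputs listed above can be pictured as a small structured record that decides which stages run next. The sketch below is a minimal illustration, not the code in src/andromeda/query/runtime.py: `PlannerDecision`, `next_stages`, and the stage names are hypothetical, and gating the tools stage on `use_finance_tools` is an assumption.

```python
from dataclasses import dataclass, field
from typing import List

# Allowed values taken from the planner output description above.
ACTIONS = {"answer", "clarification_required", "refused"}
CHARACTERISTICS = {"comparison", "market_data", "financial_metrics", "filing_narrative"}


@dataclass
class PlannerDecision:
    """Hypothetical container for one structured planner decision."""
    action: str
    characteristics: List[str] = field(default_factory=list)
    use_rag: bool = False
    use_finance_tools: bool = False

    def __post_init__(self) -> None:
        # Reject values outside the documented vocabularies.
        if self.action not in ACTIONS:
            raise ValueError(f"unknown action: {self.action!r}")
        unknown = set(self.characteristics) - CHARACTERISTICS
        if unknown:
            raise ValueError(f"unknown characteristics: {sorted(unknown)}")


def next_stages(decision: PlannerDecision) -> List[str]:
    """Map a decision to an ordered list of (hypothetical) stage names."""
    if decision.action == "clarification_required":
        return ["return_clarifying_question"]
    if decision.action == "refused":
        return ["return_refusal_message"]
    stages: List[str] = []
    if decision.use_finance_tools:  # assumption: the hint gates the tools stage
        stages.append("finance_tools")
    if decision.use_rag:
        stages += ["hybrid_retrieval", "rerank"]
    stages.append("synthesis")
    return stages
```

Validating the record at construction time keeps malformed planner output from reaching the tools or retrieval stages.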
- Finance tool adapters (yfinance, edgartools) run before optional RAG.
- Tool outputs are fed into synthesis prompts and returned in structured payloads.
- Streaming path (/query_stream) shares the same planner/tools/retrieval pipeline with stage events.
Primary modules:
- src/andromeda/finance_tools.py
- src/andromeda/query/runtime.py
- src/andromeda/query/streaming.py
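One way a streaming and a non-streaming path can share a single pipeline is to have the pipeline yield stage events and let the blocking path drain them. The sketch below illustrates that idea only. The event names, the `elapsed_ms` field, and the newline-delimited JSON framing are assumptions, not the actual /query_stream wire format.

```python
import json
import time
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

# A stage is a (name, function) pair; the function reads the shared context
# and returns the keys it wants to add to it.
Stage = Tuple[str, Callable[[Dict], Dict]]


def run_stages(stages: List[Stage], context: Dict) -> Iterator[Dict]:
    """Run stages in order, yielding start/end events and one final event."""
    for name, fn in stages:
        yield {"event": "stage_start", "stage": name}
        started = time.perf_counter()
        context.update(fn(context))
        elapsed_ms = (time.perf_counter() - started) * 1000
        yield {"event": "stage_end", "stage": name, "elapsed_ms": round(elapsed_ms, 2)}
    yield {"event": "final", "answer": context.get("answer")}


def to_ndjson(events: Iterable[Dict]) -> Iterator[str]:
    """Serialize events as newline-delimited JSON (assumed framing)."""
    for event in events:
        yield json.dumps(event) + "\n"


def run_blocking(stages: List[Stage], context: Dict) -> Dict:
    """Non-streaming path: drain the same event stream, keep the final event."""
    final: Dict = {}
    for event in run_stages(stages, context):
        final = event
    return final
```

Because both paths consume the same generator, stage ordering (tools before optional RAG) only has to be defined once.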
- Retrieval backend is PostgreSQL-only (pgvector + sparse branch).
- Hybrid search fuses dense and sparse candidates using weighted reciprocal rank fusion.
- Reranker is a cross-encoder over retrieved candidates.
- Metadata-aware retrieval text/context is preserved through chunk export -> indexing -> reranking.
Primary modules:
- src/andromeda/retrieval/db.py
- src/andromeda/retrieval/retriever.py
- src/andromeda/processing/metadata_models.py
- src/andromeda/processing/context_support.py
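Weighted reciprocal rank fusion scores each candidate as the sum, over the rankings it appears in, of weight / (k + rank). The sketch below is a generic illustration of that formula, not the code in src/andromeda/retrieval/retriever.py. The function name, the default weights, and k = 60 (a commonly used constant) are assumptions.

```python
from typing import Dict, List


def weighted_rrf(
    dense: List[str],
    sparse: List[str],
    dense_weight: float = 1.0,
    sparse_weight: float = 1.0,
    k: int = 60,
    top_k: int = 10,
) -> List[str]:
    """Fuse two ranked lists of chunk ids with weighted reciprocal rank fusion.

    A chunk missing from one ranking simply gets no contribution from it.
    """
    scores: Dict[str, float] = {}
    for weight, ranking in ((dense_weight, dense), (sparse_weight, sparse)):
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + weight / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda c: (-scores[c], c))[:top_k]
```

Usage, with hypothetical chunk ids:

```python
fused = weighted_rrf(["a", "b", "c"], ["b", "d", "a"])
print(fused)  # ['b', 'a', 'd', 'c']
```

Because the fusion uses ranks rather than raw scores, dense similarity values and sparse BM25/FTS scores never need to be put on a common scale.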
- Ingestion defaults to profile-scoped paths under data/ingest_profiles/<profile>/... .
- PostgreSQL schema defaults to ingest profile unless explicitly overridden.
- Eval launchers now enforce ingest-profile/doc-index consistency by default.
Primary modules/scripts:
- src/andromeda/ingestion/ingest_profile.py
- scripts/download.py, scripts/process_html_to_markdown.py, scripts/chunk.py, scripts/build_index.py
- scripts/run_full_eval_suite.sh, scripts/_env.sh
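The profile defaults above amount to three small rules: derive paths from the profile, default the schema to the profile, and fail fast when an eval run points at a different index. The sketch below illustrates those rules under stated assumptions. All function names are hypothetical, and treating "doc-index consistency" as a schema-name comparison is a guess at what the launchers actually check.

```python
from pathlib import Path
from typing import Optional

# Root taken from the path convention above; subdirectory layout is not assumed.
INGEST_ROOT = Path("data/ingest_profiles")


def profile_root(profile: str) -> Path:
    """Directory that holds all artifacts for one ingest profile."""
    return INGEST_ROOT / profile


def resolve_schema(profile: str, override: Optional[str] = None) -> str:
    """Schema defaults to the profile name unless explicitly overridden."""
    return override or profile


def check_consistency(ingest_profile: str, index_schema: str, allow_mismatch: bool = False) -> None:
    """Raise if the index schema does not match the ingest profile's default."""
    expected = resolve_schema(ingest_profile)
    if index_schema != expected and not allow_mismatch:
        raise ValueError(
            f"index schema {index_schema!r} does not match ingest profile {ingest_profile!r}"
        )
```

Making the strict check the default means a mismatched eval run fails before any generations are produced rather than yielding scores against the wrong index.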
Primary endpoints in src/andromeda/main.py:
- GET /health
- GET /generation_presets
- POST /query
- POST /query_stream
- POST /cancel
- POST /ingest
- GET /ingest/{job_id}
- GET /ingested_companies
- GET /source
- GET /source_text
- GET /history
- GET /history_entry
- DELETE /history
- GET / (main UI)
- GET /review (review UI, via review router)
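A client call to the query endpoints might be built as follows using only the standard library. This is a sketch: the JSON body shape, in particular the `query` field name, is an assumption and should be checked against the request model in src/andromeda/main.py.

```python
import json
from urllib import request


def build_query_request(base_url: str, question: str, stream: bool = False) -> request.Request:
    """Build (but do not send) a POST request for /query or /query_stream."""
    path = "/query_stream" if stream else "/query"
    # Assumed payload: a single "query" field carrying the user question.
    body = json.dumps({"query": question}).encode("utf-8")
    return request.Request(
        base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the app running locally, the request can be sent with `request.urlopen(build_query_request("http://localhost:8000", "..."))` and the response body decoded as JSON.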
- API wiring:
  - src/andromeda/main.py
- Query runtime:
  - src/andromeda/query/runtime.py
  - src/andromeda/query/streaming.py
  - src/andromeda/query/conversation.py
- Runtime builders/config:
  - src/andromeda/runtime/builders.py
- Retrieval:
  - src/andromeda/retrieval/db.py
  - src/andromeda/retrieval/retriever.py
- Prompting and LLM clients:
  - src/andromeda/llm/qa.py
  - src/andromeda/llm/clients.py
- Ingestion:
  - src/andromeda/ingestion/*
  - scripts/download.py, scripts/process_html_to_markdown.py, scripts/chunk.py, scripts/build_index.py
- Evaluation:
  - src/andromeda/eval/*
  - scripts/run_eval.py, scripts/score_eval.py
  - scripts/run_planner_eval.py, scripts/score_planner_eval.py
cp .env.example .env
source .venv/bin/activate
pip install -e ".[dev]"
npm install

Required env examples:
- POSTGRES_DSN (or DATABASE_URL)
- chat/embed model endpoint variables (OpenAI-compatible or provider-specific)
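The "POSTGRES_DSN (or DATABASE_URL)" rule can be read as a simple fallback lookup. The sketch below shows one plausible resolution order. The helper name is hypothetical, and the assumption that POSTGRES_DSN takes precedence over DATABASE_URL is not confirmed by the repo.

```python
import os
from typing import Mapping, Optional


def resolve_postgres_dsn(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the first non-empty DSN among POSTGRES_DSN and DATABASE_URL."""
    env = os.environ if env is None else env
    for name in ("POSTGRES_DSN", "DATABASE_URL"):  # assumed precedence
        value = env.get(name, "").strip()
        if value:
            return value
    raise RuntimeError("Set POSTGRES_DSN (or DATABASE_URL) before starting the app")
```

Failing at startup with a clear message is usually preferable to a connection error surfacing on the first query.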
Run app:
source .venv/bin/activate
python -m uvicorn andromeda.main:app --host 0.0.0.0 --port 8000 --reload

UI:
- http://localhost:8000/
- http://localhost:8000/review
Ingestion/indexing (profile-driven):
source .venv/bin/activate
bash scripts/download.sh
bash scripts/process_html_to_markdown.sh
bash scripts/chunk.sh
bash scripts/build_index.sh

Run full eval suite:
source .venv/bin/activate
bash scripts/run_full_eval_suite.sh

Run planner eval suite:
source .venv/bin/activate
bash scripts/run_planner_eval_suite.sh

Detailed eval runbook:
README_EVAL.md





