
EvidenceAI

███████╗██╗   ██╗██╗██████╗ ███████╗███╗   ██╗ ██████╗███████╗ █████╗ ██╗
██╔════╝██║   ██║██║██╔══██╗██╔════╝████╗  ██║██╔════╝██╔════╝██╔══██╗██║
█████╗  ██║   ██║██║██║  ██║█████╗  ██╔██╗ ██║██║     █████╗  ███████║██║
██╔══╝  ╚██╗ ██╔╝██║██║  ██║██╔══╝  ██║╚██╗██║██║     ██╔══╝  ██╔══██║██║
███████╗ ╚████╔╝ ██║██████╔╝███████╗██║ ╚████║╚██████╗███████╗██║  ██║██║
╚══════╝  ╚═══╝  ╚═╝╚═════╝ ╚══════╝╚═╝  ╚═══╝ ╚═════╝╚══════╝╚═╝  ╚═╝╚═╝

AI-powered clinical evidence synthesis for FDA submissions

From PubMed search to regulatory-ready dossier in days, not months.

Python 3.11+ License: MIT FastAPI Code style: ruff Tests

| Metric | EvidenceAI impact |
|---|---|
| Cost | $141K → under $10K |
| Time | 18 months → weeks |
| Accuracy | AI screening reduces costs 73–91% |

What Is EvidenceAI?

EvidenceAI automates systematic literature reviews (SLRs) for biotech and pharmaceutical companies preparing FDA submissions. A traditional SLR performed by a contract research organization (CRO) costs $141,000–$500,000 and takes 6–18 months (Michelson & Reuter, 2019). EvidenceAI performs the same workflow — literature search, PICO extraction, evidence triangulation, narrative generation, and regulatory formatting — in days for under $10,000.

Each month of regulatory acceleration represents $25–40M in NPV for a drug development program (McKinsey, 2022).

Built on Academic Excellence

EvidenceAI integrates six MIT-licensed research repositories into a production-grade pipeline:

| Source | Institution | Publication | Role in EvidenceAI |
|---|---|---|---|
| PICOX | Columbia University | JAMIA 2024 | PICO entity extraction |
| EvidenceOutcomes | Columbia University | Dataset 2024 | Clinical outcome extraction |
| MedReview | Columbia University | npj Digital Medicine 2024 | Evidence summarization |
| llm-evidence-triangulation | Peking University | medRxiv 2024 | Causal evidence triangulation |
| TrialSynth | Georgia Tech | NeurIPS 2024 Workshop | Synthetic trial data generation |
| awesome-nlp-in-ebm | Columbia University | Curated list | NLP in evidence-based medicine |

Architecture

EvidenceAI runs a deterministic 6-stage pipeline from raw literature to FDA-formatted evidence packages:

┌─────────────────────────────────────────────────────────────────────────┐
│                              EVIDENCEAI                                  │
│                    Clinical Evidence Synthesis Platform                  │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  ┌──────────────┐    ┌──────────────┐    ┌───────────────┐              │
│  │  1. INGEST   │───▶│  2. EXTRACT  │───▶│ 3. TRIANGULATE│              │
│  │              │    │              │    │               │              │
│  │ PubMed API   │    │ PICOX P/I/O  │    │ LLM two-step  │              │
│  │ ClinTrials   │    │ EvidOutcomes │    │ extraction    │              │
│  │ FDA databases│    │ PubMedBERT   │    │ CoE/LoE score │              │
│  └──────────────┘    └──────────────┘    └───────┬───────┘              │
│                                                   │                      │
│  ┌──────────────┐    ┌──────────────┐    ┌───────▼───────┐              │
│  │  6. DELIVER  │◀───│ 5. SUMMARIZE │◀───│  4. AUGMENT   │              │
│  │              │    │              │    │               │              │
│  │ FDA packages │    │ MedReview    │    │ TrialSynth    │              │
│  │ eCTD modules │    │ fine-tuned   │    │ VAE+Hawkes    │              │
│  │ Dashboard    │    │ narrative    │    │ synthetic     │              │
│  │ REST API     │    │ generation   │    │ trial data    │              │
│  └──────────────┘    └──────────────┘    └───────────────┘              │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
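The fixed stage ordering above is what makes runs reproducible. A minimal sketch of the sequential runner (illustrative only — the stage names mirror the diagram, but `PipelineState`, `run_pipeline`, and the handler signature are assumptions, not the actual EvidenceAI internals):

```python
from dataclasses import dataclass, field
from typing import Callable

# Stage order taken from the diagram above; each handler receives the
# accumulated state and returns an artifact that downstream stages can read.
STAGES = ["ingest", "extract", "triangulate", "augment", "summarize", "deliver"]

@dataclass
class PipelineState:
    question: str
    artifacts: dict = field(default_factory=dict)

def run_pipeline(question: str, handlers: dict[str, Callable]) -> PipelineState:
    """Run the six stages in a fixed order, recording each output for provenance."""
    state = PipelineState(question=question)
    for stage in STAGES:
        state.artifacts[stage] = handlers[stage](state)
    return state
```

Because every stage output is kept in `artifacts`, the full run can be replayed or audited after the fact.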

Infrastructure

┌─────────────────────────────────────────────────────────────────┐
│                   FastAPI Application Layer                     │
│   POST /api/v1/reviews    GET /api/v1/evidence/{id}             │
│   POST /api/v1/search     GET /api/v1/status/{job_id}           │
└───────────────────┬────────────────────┬────────────────────────┘
                    │                    │
        ┌───────────▼──────┐   ┌─────────▼───────────┐
        │  Celery Workers  │   │  ML Inference Layer │
        │  (async jobs)    │   │  PICOX + MedReview  │
        └───────────┬──────┘   └─────────┬───────────┘
                    │                    │
        ┌───────────▼────────────────────▼───────────┐
        │                 Data Layer                  │
        │  PostgreSQL (evidence)  Redis (cache/queue) │
        └────────────────────────────────────────────┘

Comparison

| | EvidenceAI | Traditional CRO | Manual Review |
|---|---|---|---|
| Cost | $5K–$15K | $141K–$500K | $50K–$200K |
| Time | Days–weeks | 6–18 months | 3–12 months |
| Reproducibility | ✅ Deterministic pipeline | ❌ Varies by team | ❌ Human error |
| Audit trail | ✅ Full provenance | ⚠️ Partial | ❌ None |
| PRISMA compliance | ✅ Automated | ✅ Manual | ✅ Manual |
| eCTD formatting | ✅ Module 2.5/2.7 | ✅ Manual | ❌ Extra work |
| Real-time updates | ✅ Continuous monitoring | ❌ Static deliverable | ❌ Static |
| FDA Plausible Mechanism | ✅ Built-in (Feb 2026) | ⚠️ Ad hoc | ❌ Not supported |

Quick Start

Prerequisites

  • Python 3.11+
  • PostgreSQL 15+
  • Redis 7+
  • OpenAI API key (for triangulation LLM calls)

Installation

pip install evidence-ai

Or from source:

git clone https://github.com/your-org/evidence-ai.git
cd evidence-ai
pip install -e ".[dev]"

Minimal usage

import asyncio

from evidence_ai import EvidenceAI

async def main() -> None:
    client = EvidenceAI(openai_api_key="sk-...")

    review = await client.synthesize(
        question="Does semaglutide reduce cardiovascular mortality in T2DM patients?",
        max_papers=500,
    )

    print(review.level_of_evidence)   # 0.91
    print(review.summary)             # "Strong evidence (9 RCTs, n=47,382) shows..."
    review.export_fda_package("./output/")

asyncio.run(main())

Docker (full stack)

cp .env.example .env
# edit .env with your API keys

docker compose up -d

# API available at http://localhost:8000
# Docs at http://localhost:8000/docs

API Reference

Start a Systematic Review

curl -X POST http://localhost:8000/api/v1/reviews \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "question": "Does SGLT2 inhibition reduce hospitalization for heart failure?",
    "databases": ["pubmed", "clinicaltrials", "fda"],
    "date_range": {"start": "2015-01-01", "end": "2025-12-31"},
    "study_designs": ["RCT", "META", "SR"],
    "max_results": 1000
  }'

Response:

{
  "review_id": "rev_a1b2c3d4",
  "status": "queued",
  "estimated_minutes": 12,
  "job_url": "https://api.evidenceai.com/api/v1/status/rev_a1b2c3d4"
}

Poll Status

curl http://localhost:8000/api/v1/status/rev_a1b2c3d4 \
  -H "Authorization: Bearer $API_KEY"

Response:

{
  "review_id": "rev_a1b2c3d4",
  "status": "completed",
  "stages_completed": ["ingest", "extract", "triangulate", "augment", "summarize"],
  "papers_found": 847,
  "papers_included": 124,
  "level_of_evidence": 0.88,
  "effect_direction": "inhibitory",
  "confidence": "high"
}
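A client can poll this endpoint until the job leaves the queue. A minimal stdlib-only sketch (the URL, bearer header, and response fields follow the examples above; the function names, retry cadence, and the injectable `fetch` parameter are assumptions for illustration):

```python
import json
import time
import urllib.request

API_BASE = "http://localhost:8000"
TOKEN = "YOUR_API_KEY"  # placeholder

def get_status(review_id: str) -> dict:
    """Fetch /api/v1/status/{review_id} with the bearer token shown above."""
    req = urllib.request.Request(
        f"{API_BASE}/api/v1/status/{review_id}",
        headers={"Authorization": f"Bearer {TOKEN}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

def wait_for_review(review_id: str, poll_seconds: float = 30,
                    timeout: float = 3600, fetch=None) -> dict:
    """Poll until the review completes or fails, then return the final status."""
    fetch = fetch or get_status
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch(review_id)
        if status.get("status") in ("completed", "failed"):
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"review {review_id} did not finish within {timeout}s")
```

Passing `fetch` explicitly keeps the loop testable without a live server; in production the default `get_status` is used.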

Retrieve Evidence Package

curl http://localhost:8000/api/v1/reviews/rev_a1b2c3d4/package \
  -H "Authorization: Bearer $API_KEY" \
  --output evidence_package.zip

Search Evidence Base

curl "http://localhost:8000/api/v1/evidence?q=KRAS+G12D+neoantigen+immune+response&limit=20" \
  -H "Authorization: Bearer $API_KEY"

Python SDK

Full Systematic Review

import asyncio
from evidence_ai import EvidenceAI
from evidence_ai.config import Settings

settings = Settings(
    openai_api_key="sk-...",
    database_url="postgresql+asyncpg://user:pass@localhost/evidenceai",
    redis_url="redis://localhost:6379/0",
)

async def main():
    client = EvidenceAI(settings=settings)

    # Run a complete review
    review = await client.synthesize(
        question="Does checkpoint inhibitor therapy improve OS in NSCLC?",
        databases=["pubmed", "clinicaltrials"],
        max_papers=1000,
        include_augmentation=True,
    )

    # Access structured results
    print(f"Papers screened: {review.papers_screened}")
    print(f"Papers included: {review.papers_included}")
    print(f"Level of Evidence: {review.level_of_evidence:.2f}")
    print(f"Effect direction: {review.effect_direction}")
    print(f"\nEvidence summary:\n{review.summary}")

    # Export FDA evidence package (eCTD Module 2.5/2.7 format)
    review.export_fda_package(
        output_dir="./fda_package/",
        format="ectd",
        include_bibliography=True,
        include_prisma_diagram=True,
    )

asyncio.run(main())

PICO Entity Extraction

from evidence_ai.extract import PICOExtractor

extractor = PICOExtractor.from_pretrained()

abstract = """
Background: Patients with type 2 diabetes and established cardiovascular disease
were randomized to semaglutide 0.5mg weekly or placebo. The primary endpoint
was a composite of cardiovascular death, non-fatal MI, or non-fatal stroke.
Results: After 2 years, semaglutide reduced the primary endpoint by 26%
(HR 0.74, 95% CI 0.58-0.95) vs placebo (n=3,297).
"""

entities = extractor.extract(abstract)
print(entities.population)      # ["type 2 diabetes", "cardiovascular disease"]
print(entities.intervention)    # ["semaglutide 0.5mg weekly"]
print(entities.comparator)      # ["placebo"]
print(entities.outcomes)        # ["cardiovascular death", "non-fatal MI", ...]

Evidence Triangulation

from evidence_ai.triangulate import TriangulationEngine

engine = TriangulationEngine(llm_model="gpt-4o-mini")

results = await engine.triangulate(  # run inside an async function
    pmids=["37123456", "36987654", "35876543", ...],
    exposure="semaglutide",
    outcome="cardiovascular mortality",
)

print(f"p_inhibitory: {results.coe_scores.p_inhibitory:.3f}")   # 0.891
print(f"p_no_change:  {results.coe_scores.p_no_change:.3f}")    # 0.072
print(f"p_excitatory: {results.coe_scores.p_excitatory:.3f}")   # 0.037
print(f"Level of Evidence: {results.loe:.3f}")                   # 0.837
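The three CoE scores above form a probability distribution over effect directions. A minimal sketch of how a client might validate them and recover the reported `effect_direction` (the container and the argmax selection rule are illustrative assumptions, not the published triangulation algorithm):

```python
from dataclasses import dataclass

@dataclass
class CoEScores:
    """Convergence-of-evidence scores; field names mirror the printed attributes."""
    p_inhibitory: float
    p_no_change: float
    p_excitatory: float

    def effect_direction(self) -> str:
        """Return the dominant direction after checking the scores sum to 1."""
        probs = {
            "inhibitory": self.p_inhibitory,
            "no_change": self.p_no_change,
            "excitatory": self.p_excitatory,
        }
        assert abs(sum(probs.values()) - 1.0) < 1e-6, "CoE scores must sum to 1"
        return max(probs, key=probs.get)
```

For the example values above (0.891 / 0.072 / 0.037), this returns `"inhibitory"`, matching the heart-failure review response shown earlier.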

Regulatory Outputs

EvidenceAI generates evidence packages formatted for:

  • FDA eCTD Module 2.5 — Clinical Overview
  • FDA eCTD Module 2.7.3 — Summary of Clinical Efficacy
  • FDA eCTD Module 2.7.4 — Summary of Clinical Safety
  • FDA Plausible Mechanism Framework (February 2026) — for individualized therapies
  • EMA CHMP/EWP guidance — European regulatory submissions
  • PRISMA 2020 — Preferred Reporting Items for Systematic Reviews
  • Cochrane Handbook compliant evidence tables

Development

Setup

git clone https://github.com/your-org/evidence-ai.git
cd evidence-ai
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

# Copy and configure environment
cp .env.example .env

# Run database migrations
make db-migrate

# Start development server
make dev

Commands

make dev          # Start FastAPI with hot reload
make test         # Run test suite with coverage
make lint         # Run ruff + mypy
make format       # Auto-format with ruff
make docker-up    # Start full stack in Docker
make docker-down  # Stop Docker stack
make db-migrate   # Run Alembic migrations
make docs         # Generate API docs

Testing

# Run all tests
pytest

# Run specific module
pytest tests/test_triangulate/ -v

# With coverage
pytest --cov=evidence_ai --cov-report=html
open htmlcov/index.html

Configuration

All settings are managed via environment variables or .env file. See .env.example for a full list.

| Variable | Required | Description |
|---|---|---|
| `OPENAI_API_KEY` | Yes | OpenAI API key (triangulation + summarization) |
| `DATABASE_URL` | Yes | PostgreSQL connection string |
| `REDIS_URL` | Yes | Redis connection string |
| `NCBI_API_KEY` | Recommended | NCBI API key (10 req/s vs 3 req/s without) |
| `ANTHROPIC_API_KEY` | Optional | Claude fallback for LLM calls |
| `SECRET_KEY` | Yes | JWT signing key for API auth |
| `LOG_LEVEL` | No | Logging level (default: INFO) |
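Put together, a minimal local `.env` might look like this (values are placeholders; the connection-string formats follow the SDK example, and `.env.example` remains the authoritative template):

```
# Required
OPENAI_API_KEY=sk-...
DATABASE_URL=postgresql+asyncpg://user:pass@localhost/evidenceai
REDIS_URL=redis://localhost:6379/0
SECRET_KEY=change-me-in-production
# Recommended: lifts PubMed rate limit from 3 req/s to 10 req/s
NCBI_API_KEY=
# Optional
ANTHROPIC_API_KEY=
LOG_LEVEL=INFO
```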

Academic Foundations

EvidenceAI's pipeline is built on peer-reviewed academic work. If you use EvidenceAI in research, please cite:

@article{picox2024,
  title={PICOX: Overlapping Span Extraction for PICO Entities},
  author={Nye, Benjamin and others},
  journal={Journal of the American Medical Informatics Association},
  year={2024},
  publisher={Oxford University Press}
}

@inproceedings{trialsynth2024,
  title={TrialSynth: Generating Synthetic Sequential Clinical Trial Data},
  author={Gao, Chufan and others},
  booktitle={NeurIPS 2024 Workshop},
  year={2024}
}

@article{medreview2024,
  title={MedReview: A Benchmark for Automatic Medical Evidence Summarization},
  journal={npj Digital Medicine},
  year={2024}
}

@article{triangulation2024,
  title={LLM-based Evidence Triangulation across Study Designs},
  author={Shi, Xuanyu and others},
  journal={medRxiv},
  year={2024}
}

Roadmap

  • PubMed ingestion with rate limiting and caching
  • ClinicalTrials.gov API v2 connector
  • PICO entity extraction (PubMedBERT-based)
  • CoE/LoE evidence triangulation algorithm
  • FastAPI REST API
  • PostgreSQL + Redis data layer
  • Docker deployment
  • eCTD Module 2.5/2.7 formatted exports
  • Next.js evidence dashboard
  • PRISMA 2020 diagram generation
  • Fine-tuned MedReview summarization model
  • Differential privacy for TrialSynth
  • EMA/CHMP regulatory format support
  • Continuous literature monitoring (webhooks)
  • NeoVax AI integration (neoantigen validation)

Contributing

Contributions are welcome. Please read the contributing guide first.

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/my-feature
  3. Write tests for your changes
  4. Ensure make test and make lint pass
  5. Submit a pull request

Code Standards

  • Type hints required on all public functions
  • Docstrings required on all classes and public methods
  • Test coverage must not decrease
  • Ruff formatting enforced in CI

License

MIT License — see LICENSE for full text.

Built on MIT-licensed work from Columbia University ebmlab, Peking University, and Georgia Tech.


Citation

If you use EvidenceAI in published research:

@software{evidenceai2025,
  title={EvidenceAI: AI-Powered Clinical Evidence Synthesis},
  year={2025},
  url={https://github.com/your-org/evidence-ai},
  license={MIT}
}

Built with ❤️ for biotech teams trying to get life-saving therapies approved faster.
