PrescriptoAI

AI-powered prescription intelligence system — upload a handwritten or printed prescription image and get structured medication data, plain-English interpretation, and risk warnings in seconds.

ScreenShots

Overview

PrescriptoAI transforms medical prescriptions into structured, actionable insights using a multi-agent AI pipeline. Instead of a single LLM call, it models prescription understanding as a stateful workflow with decision gates, retry logic, parallel analysis, and severity-based routing — all orchestrated by LangGraph.

The React frontend provides a clean interface for uploading prescriptions, viewing results, and asking follow-up questions via a built-in chat assistant.

Disclaimer: This is an assistive tool only. It does not provide medical diagnoses or clinical decisions.

Features

AI & Backend

Vision OCR — Gemini Flash reads handwritten and printed prescriptions directly from images
Confidence-aware retries — low-confidence OCR results trigger a targeted re-extraction attempt
Structured extraction — medications, dosage, frequency, duration, route, and doctor instructions parsed into validated JSON
Plain-English interpretation — medication purposes and dosing schedules explained in accessible language
Risk assessment — flags missing dosages, ambiguous instructions, and low-confidence OCR areas with severity levels (high / medium / low)
Parallel analysis — interpretation and risk detection run concurrently for faster response
Severity routing — high-risk prescriptions trigger critical alert escalation; medium-risk produces advisory warnings
Chat Q&A — multi-turn conversation about any analysed prescription, with full conversation memory per session

Frontend

Drag-and-drop upload — drop an image or click to browse; live preview before submission
Text input tab — paste raw prescription text to skip OCR and go straight to analysis
Medication cards — each medication displayed with its structured data and AI interpretation side by side
Color-coded risk panel — warnings and alerts rendered with severity badges (red / yellow / green)
Processing timeline — step-by-step log of what the pipeline did, great for understanding the AI's reasoning
Chat interface — ask follow-up questions like "What is this medication for?" or "When should I take it?" with streaming-style responses

Tech Stack

Layer	Technology
Backend	Python, FastAPI
LLM	Google Gemini 2.5 Flash (free tier)
LLM Framework	LangChain (`langchain-google-genai`)
Agent Orchestration	LangGraph (stateful graph with cycles, conditional edges, parallel branches)
Structured Output	Pydantic v2 + `.with_structured_output()`
Settings	pydantic-settings
Image Handling	Pillow
Frontend	React + Vite
HTTP Client	Axios
Chat Memory	LangGraph MemorySaver (in-memory, per thread_id)

System Architecture

┌─────────────────────────────────────────────────────────────┐
│                        React Frontend                        │
│  UploadForm │ ResultDisplay │ MedicationCard │ ChatInterface │
└──────────────────────┬──────────────────────────────────────┘
                       │  HTTP (Axios)
┌──────────────────────▼──────────────────────────────────────┐
│                       FastAPI Backend                        │
│                                                             │
│   POST /upload-prescription    POST /analyze    POST /chat  │
│              │                      │                │      │
│              └──────────┬───────────┘          LangGraph   │
│                         │                      chat_graph  │
│                ┌────────▼────────┐                         │
│                │  LangGraph      │                         │
│                │  StateGraph     │                         │
│                │  (13 nodes)     │                         │
│                └────────┬────────┘                         │
│                         │                                  │
│                ┌────────▼────────┐                         │
│                │  Gemini Flash   │  ← vision + text LLM   │
│                └─────────────────┘                         │
└─────────────────────────────────────────────────────────────┘

Processing Pipeline

[Image upload]          [Text input]
      │                      │
      ▼                      │
   OCR Node ◄── retry ──┐   │
      │                  │   │
  [confidence check]     │   │
      │                  │   │
  low → Enhance OCR ─────┘   │
  ok  → Clean Text ◄─────────┘
            │
      Structure Node ◄── retry ──┐
            │                    │
      [completeness check]       │
            │                    │
     gaps → Recover Fields ──────┘
     ok  → Dispatch Analysis
                │
       ┌────────┴────────┐
       │   PARALLEL      │
  Reasoning Node    Risk Node
       │                 │
       └────────┬────────┘
            Merge
              │
        [severity routing]
              │
    high → Critical Alert
    med  → Advisory
    low  → Format Output
              │
            END

Each stage produces structured, typed output. If any node fails, the pipeline returns a graceful fallback with error details rather than crashing.

Frontend

The React frontend is built with Vite and communicates with the FastAPI backend via Axios.

Components:

Component	Purpose
`UploadForm`	Drag-and-drop image upload with live preview; text input tab for raw paste
`ResultDisplay`	Orchestrates the full results layout
`MedicationCard`	Displays one medication with its structured fields and AI interpretation
`WarningsPanel`	Color-coded risk level badge and warning/alert list
`ChatInterface`	Multi-turn chat tied to a session `thread_id`; sends prescription context to ground answers

Running the frontend:

cd frontend
npm install
npm run dev        # dev server at http://localhost:5173
npm run build      # production build → dist/

API Reference

`POST /api/upload-prescription`

Upload a prescription image. Runs the full pipeline including OCR.

Request: multipart/form-data

file: <image>    # JPEG, PNG, WEBP, HEIC

Response:

{
  "structured_data": {
    "patient_name": "John Doe",
    "doctor_name": "Dr. Smith",
    "date": "2026-03-22",
    "medications": [
      {
        "name": "Amoxicillin",
        "dosage": "500mg",
        "frequency": "TDS",
        "duration": "7 days",
        "route": "oral",
        "instructions": "Take with food"
      }
    ],
    "notes": "Follow up in 1 week",
    "missing_fields": []
  },
  "interpretation": {
    "medication_interpretations": [...],
    "instruction_summary": "Take Amoxicillin three times daily with food for one week.",
    "abbreviations_decoded": { "TDS": "Three times daily" },
    "disclaimer": "..."
  },
  "risk_assessment": { "risk_level": "low", "flags": [], "warnings": [] },
  "risk_level": "low",
  "warnings": [],
  "critical_alerts": [],
  "ocr_confidence": 0.91,
  "processing_steps": ["[OCR] ...", "[Structuring] ...", "..."],
  "errors": []
}

`POST /api/analyze`

Analyze raw prescription text (no image needed — skips OCR).

Request:

{ "text": "Amoxicillin 500mg TDS x 7 days\nTake with food\nDr. Smith" }

Response: Same schema as /upload-prescription (without ocr_confidence).

`POST /api/chat`

Multi-turn Q&A about an analysed prescription.

Request:

{
  "message": "What is Amoxicillin commonly used for?",
  "thread_id": "session-abc123",
  "prescription_context": {}
}

Response:

{
  "reply": "Amoxicillin is commonly prescribed for bacterial infections...",
  "thread_id": "session-abc123"
}

Project Structure

PrescriptoAI/
├── backend/
│   ├── main.py                         # FastAPI app, CORS, router registration
│   ├── config.py                       # Settings from .env (pydantic-settings)
│   ├── api/
│   │   └── routes/
│   │       ├── prescription.py         # /upload-prescription, /analyze
│   │       └── chat.py                 # /chat
│   ├── pipeline/
│   │   ├── state.py                    # PrescriptionState TypedDict
│   │   ├── graph.py                    # Graph assembly and compile
│   │   ├── edges.py                    # Conditional edge functions
│   │   └── nodes/
│   │       ├── ocr_node.py
│   │       ├── ocr_enhancement_node.py
│   │       ├── flag_unreadable_node.py
│   │       ├── cleaning_node.py
│   │       ├── structuring_node.py
│   │       ├── field_recovery_node.py
│   │       ├── dispatch_analysis_node.py
│   │       ├── reasoning_node.py
│   │       ├── risk_node.py
│   │       ├── merge_node.py
│   │       ├── critical_alert_node.py
│   │       ├── advisory_node.py
│   │       └── output_formatter_node.py
│   ├── schemas/
│   │   └── models.py                   # Pydantic models (LLM output + API I/O)
│   └── utils/
│       └── prompts.py                  # LangChain ChatPromptTemplates
│
├── frontend/
│   ├── src/
│   │   ├── App.jsx                     # Root layout
│   │   ├── api.js                      # Axios API calls
│   │   └── components/
│   │       ├── UploadForm.jsx
│   │       ├── ResultDisplay.jsx
│   │       ├── MedicationCard.jsx
│   │       ├── WarningsPanel.jsx
│   │       └── ChatInterface.jsx
│   ├── package.json
│   └── vite.config.js
│
├── .env                                # GEMINI_API_KEY (not committed)
├── .env.example
├── requirements.txt
└── README.md

Setup & Running

Prerequisites

Python 3.11+
Node.js 18+
A free Google AI Studio API key

Backend

# Create and activate virtual environment
python -m venv venv
venv\Scripts\activate        # Windows
# source venv/bin/activate   # macOS/Linux

# Install dependencies
pip install -r requirements.txt

# Configure environment
cp .env.example .env
# Edit .env and set GEMINI_API_KEY=your_actual_key

# Start the backend
uvicorn backend.main:app --reload --host 0.0.0.0 --port 8000

API docs available at http://localhost:8000/docs

Frontend

cd frontend
npm install
npm run dev       # runs at http://localhost:5173

The frontend proxies API requests to http://localhost:8000 — both must be running for full functionality.

Design Decisions

Multi-agent pipeline over a single prompt — breaking processing into discrete nodes (OCR → clean → structure → reason → risk) makes each step independently debuggable, retryable, and replaceable without touching the rest of the pipeline.

Gemini Flash for OCR — instead of a separate OCR library (Tesseract, etc.), Gemini's vision capability handles both extraction and initial interpretation in one call, with self-reported confidence that drives the retry logic.

Structured output everywhere — .with_structured_output() with Pydantic models eliminates fragile JSON parsing. Every LLM response is type-validated before entering state.

Graceful degradation — every node wraps its LLM call in try/except and returns a safe fallback with descriptive errors/warnings. The API never crashes with a 500; it always returns a usable (if partial) response.

Parallel analysis — reasoning and risk detection are independent operations, so they run as concurrent LangGraph branches and merge before severity routing. This cuts latency roughly in half for the analysis phase.

React + Vite frontend — Vite's fast HMR and component-based architecture pair well with the structured JSON responses from the backend, making it straightforward to map API fields to UI components.

Known Limitations

Limitation	Detail
OCR on very degraded images	Even with retries, severely damaged or blurry images may not extract reliably
No drug database	Medication interpretations come from Gemini's training data, not a validated drug database
LLM hallucination risk	Structured output reduces but does not eliminate hallucination
No persistent storage	Results are not stored between sessions
English-optimised	Non-English prescriptions may have reduced accuracy
Free-tier rate limits	Gemini free tier has per-minute request limits

Future Roadmap

Reliability

OpenFDA API integration for validated medication lookup
Per-field confidence scoring (not just overall OCR confidence)
Cross-field consistency validation (e.g., flag impossible dosages)

Production

PostgreSQL persistence for prescriptions, results, and chat history
User authentication (JWT)
Containerisation (Docker + docker-compose)
Deployment: backend on Railway/Render, frontend on Vercel

Intelligence

Multi-language prescription support
Drug interaction detection across medications on the same prescription
Pharmacist feedback loop to improve prompts over time

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

PrescriptoAI

ScreenShots

Table of Contents

Overview

Features

AI & Backend

Frontend

Tech Stack

System Architecture

Processing Pipeline

Frontend

API Reference

`POST /api/upload-prescription`

`POST /api/analyze`

`POST /api/chat`

Project Structure

Setup & Running

Prerequisites

Backend

Frontend

Design Decisions

Known Limitations

Future Roadmap

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
backend		backend
frontend		frontend
.env.example		.env.example
.gitignore		.gitignore
README.md		README.md
flow_guide.md		flow_guide.md
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

PrescriptoAI

ScreenShots

Table of Contents

Overview

Features

AI & Backend

Frontend

Tech Stack

System Architecture

Processing Pipeline

Frontend

API Reference

POST /api/upload-prescription

POST /api/analyze

POST /api/chat

Project Structure

Setup & Running

Prerequisites

Backend

Frontend

Design Decisions

Known Limitations

Future Roadmap

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

`POST /api/upload-prescription`

`POST /api/analyze`

`POST /api/chat`

Packages