jdelaire/verifai

VerifAI

AI-generated image detection as a service. Upload an image and receive a report estimating the likelihood it was created by an AI model, with supporting evidence, metadata analysis, and provenance checks.

Architecture

┌──────────┐     ┌──────────────────┐     ┌───────────────────┐
│ Vue SPA  │────>│ Cloudflare Worker│────>│ FastAPI Inference │
│ (Vite)   │<────│ D1 + R2          │<────│ (ViT detector)    │
└──────────┘     └──────────────────┘     └───────────────────┘
  • Frontend (apps/web) — Vue 3 + Vite + TailwindCSS v4. Drag-and-drop upload, polling, and report rendering.
  • API Worker (apps/worker) — Cloudflare Worker with D1 (SQLite) for job/report storage, R2 for temporary image storage. Dispatches inference via ctx.waitUntil().
  • Inference Service (services/inference) — Python FastAPI service. Extracts EXIF metadata, checks C2PA provenance, and optionally runs a ViT-based AI image detector (umm-maybe/AI-image-detector). The ML detector requires requirements-ml.txt (~1 GB RAM); without it, reports still include metadata and provenance analysis.
  • Shared Types (packages/shared) — TypeScript type definitions shared between frontend and worker.

Prerequisites

  • Node.js >= 18
  • Python 3.11
  • npm (ships with Node)

Local Development

1. Install dependencies

# From repo root — installs all workspace packages
npm install

# Python inference service
cd services/inference
python3.11 -m venv .venv
source .venv/bin/activate

# Base dependencies (metadata + provenance only, no ML detector)
pip install -r requirements.txt

# With ML detector (requires ~1.5 GB disk, ~1 GB RAM for PyTorch + ViT model)
pip install -r requirements-ml.txt

2. Configure environment

The Worker needs these [vars] in wrangler.toml (already set with defaults):

Variable          Description
REPORT_TTL_HOURS  Hours before reports expire (default: 24)
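In wrangler.toml this takes the usual [vars] form; a minimal sketch using the documented default (the repo's actual file will also declare D1 and R2 bindings, which are omitted here):

```toml
[vars]
REPORT_TTL_HOURS = "24"   # hours before reports expire
```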

The Worker also needs secrets. For local dev, create apps/worker/.dev.vars:

INFERENCE_SERVICE_URL=http://localhost:8001
INFERENCE_SHARED_SECRET=your-secret-here

The inference service reads environment variables. Create services/inference/.env:

SHARED_SECRET=your-secret-here
CALLBACK_AUTH_SECRET=your-secret-here

SHARED_SECRET must match the Worker's INFERENCE_SHARED_SECRET.
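The inference service can check the Worker's secret with a constant-time comparison; a minimal sketch (how the secret is transported, e.g. which header carries it, is an assumption and not taken from the repo):

```python
import hmac
import os

def is_authorized(request_secret: str) -> bool:
    """Compare a caller-supplied secret against SHARED_SECRET.

    hmac.compare_digest avoids the timing side channel that a plain
    `==` string comparison could leak.
    """
    expected = os.environ.get("SHARED_SECRET", "")
    return bool(expected) and hmac.compare_digest(request_secret, expected)
```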

3. Run the D1 migration

npm run db:migrate:local -w apps/worker

4. Start all services

In three separate terminals:

# Terminal 1 — Frontend (http://localhost:5173)
npm run dev:web

# Terminal 2 — Worker (http://localhost:8787)
npm run dev:worker

# Terminal 3 — Inference (http://localhost:8001)
cd services/inference
source .venv/bin/activate
uvicorn app.main:app --reload --port 8001

API Routes

Method  Path                   Description
POST    /api/upload/token      Request a job ID and upload URL
PUT     /api/upload/:jobId     Upload image bytes (proxied to R2)
POST    /api/upload/finalize   Validate upload, hash, dedup, enqueue
GET     /api/report/:jobId     Poll for report status/results
POST    /api/internal/report   Inference callback (internal only)
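A client drives these routes in sequence: request a token, PUT the image bytes, finalize, then poll GET /api/report/:jobId. A minimal polling sketch with an injectable transport, so the HTTP layer stays pluggable (the `status` field name and `"pending"` value are assumptions about the report payload, not confirmed from the repo):

```python
import time
from typing import Callable, Dict

def poll_report(get_report: Callable[[], Dict], *, interval: float = 1.0,
                timeout: float = 60.0) -> Dict:
    """Poll until the report leaves the assumed 'pending' state.

    `get_report` is any callable returning the decoded JSON body of
    GET /api/report/:jobId — real HTTP client or a test stub.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        report = get_report()
        if report.get("status") != "pending":  # assumed status field
            return report
        time.sleep(interval)
    raise TimeoutError("report did not complete in time")
```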

Testing

# Python tests (45 tests — scoring, evidence, contract, integration)
cd services/inference
source .venv/bin/activate
pytest

# Worker typecheck
npm run typecheck -w apps/worker

# Frontend typecheck + build
npm run build:web

Deployment

  • Frontend: Cloudflare Pages — npm run build:web, deploy apps/web/dist
  • Worker: npm run deploy -w apps/worker (requires wrangler login)
  • Inference: Deploy to Render or Fly.io with services/inference as root, Python 3.11 runtime

Key Design Decisions

  • Proxy upload: The Workers R2 binding can't generate pre-signed URLs, so the Worker proxies uploads via PUT /api/upload/:jobId.
  • Base64 image transfer: The Worker reads from R2, base64-encodes the image, and sends it as a data URL to the inference service.
  • Lazy model loading: The ViT detector loads on first request to keep FastAPI startup fast. If the model is unavailable, reports fall back to null detector scores rather than failing.
  • Rate limiting: IP-based, backed by D1. 50 requests/day, 10-second burst limit.
  • File dedup: SHA-256 hash on finalize. If a matching non-expired report exists, it's returned immediately.
  • Auto-cleanup: Hourly cron deletes expired jobs, reports, and stale rate-limit rows.
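The lazy-loading and null-score behavior can be sketched as a cached loader that attempts the expensive load once and degrades gracefully (class and method names here are illustrative, not the repo's actual API; in the real service the loader would import PyTorch and the ViT model):

```python
from typing import Callable, Optional

class LazyDetector:
    """Load an expensive model on first use; fall back to None scores."""

    def __init__(self, loader: Callable[[], object]):
        self._loader = loader
        self._model = None
        self._attempted = False

    def _get(self):
        # Attempt the load exactly once; remember failures so every
        # subsequent request skips the doomed import/download.
        if not self._attempted:
            self._attempted = True
            try:
                self._model = self._loader()
            except Exception:
                self._model = None  # keep serving metadata/provenance only
        return self._model

    def score(self, image_bytes: bytes) -> Optional[float]:
        model = self._get()
        if model is None:
            return None  # report still carries EXIF + C2PA sections
        return model(image_bytes)
```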

License

Private — all rights reserved.
