Paperwise is an open-source, self-hosted document intelligence app for OCR, metadata extraction, document organization, and grounded Q&A.
Website: paperwise.dev
Docs:
Prerequisites:

- Docker Engine or Docker Desktop with Compose support
- At least one model provider key for metadata extraction and grounded Q&A
- Optional: `tesseract`, if you want OCR to stay local instead of using an LLM for OCR
The GitHub Actions publish workflow pushes images to `ghcr.io/zellux/paperwise`.
Create a docker-compose.yml file:

```yaml
services:
  api:
    image: ghcr.io/zellux/paperwise:latest
    environment:
      PAPERWISE_ENV: docker
      PAPERWISE_LOG_LEVEL: INFO
      PAPERWISE_API_HOST: 0.0.0.0
      PAPERWISE_API_PORT: 8000
      PAPERWISE_REDIS_URL: redis://redis:6379/0
      PAPERWISE_REPOSITORY_BACKEND: postgres
      PAPERWISE_POSTGRES_URL: postgresql+psycopg://paperwise:paperwise@postgres:5432/paperwise
      PAPERWISE_OBJECT_STORE_ROOT: /data/object-store
      PAPERWISE_AUTH_SECRET: replace-with-a-strong-secret
      PAPERWISE_AUTH_TOKEN_TTL_SECONDS: "43200"
    depends_on:
      redis:
        condition: service_healthy
      postgres:
        condition: service_healthy
    ports:
      - "8080:8000"
    volumes:
      - paperwise_data:/data
    healthcheck:
      test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://127.0.0.1:8000/healthz', timeout=3)"]
      interval: 15s
      timeout: 5s
      retries: 5
      start_period: 20s
    restart: unless-stopped

  worker:
    image: ghcr.io/zellux/paperwise:latest
    command: ["celery", "-A", "paperwise.workers.celery_app.celery_app", "worker", "--loglevel=INFO"]
    environment:
      PAPERWISE_ENV: docker
      PAPERWISE_LOG_LEVEL: INFO
      PAPERWISE_REDIS_URL: redis://redis:6379/0
      PAPERWISE_REPOSITORY_BACKEND: postgres
      PAPERWISE_POSTGRES_URL: postgresql+psycopg://paperwise:paperwise@postgres:5432/paperwise
      PAPERWISE_OBJECT_STORE_ROOT: /data/object-store
      PAPERWISE_AUTH_SECRET: replace-with-a-strong-secret
      PAPERWISE_AUTH_TOKEN_TTL_SECONDS: "43200"
    depends_on:
      redis:
        condition: service_healthy
      postgres:
        condition: service_healthy
    volumes:
      - paperwise_data:/data
    restart: unless-stopped

  redis:
    image: redis:7-alpine
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 3s
      retries: 5
    restart: unless-stopped

  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: paperwise
      POSTGRES_PASSWORD: paperwise
      POSTGRES_DB: paperwise
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U paperwise -d paperwise"]
      interval: 10s
      timeout: 3s
      retries: 5
    restart: unless-stopped

volumes:
  paperwise_data:
  postgres_data:
```

Before starting, replace `replace-with-a-strong-secret` with your own secret in both the `api` and `worker` services. Then start the stack:

```shell
docker compose up -d
```
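The auth secret can be any high-entropy string. One way to generate one, assuming `openssl` is available on your machine, is:

```shell
# Print a random 64-character hex string suitable for PAPERWISE_AUTH_SECRET
openssl rand -hex 32
```

Paste the same value into both `PAPERWISE_AUTH_SECRET` entries (`api` and `worker`).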
To pin a specific release tag, replace the `image` value, for example:

```yaml
image: ghcr.io/zellux/paperwise:v0.1.1
```

If the GHCR package is private, make it public in the GitHub package settings before sharing it with other users.
Open Paperwise at http://localhost:8080.
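If the page does not load, you can probe the API directly. `/healthz` is the same path the container healthcheck polls, reached here through the `8080:8000` port mapping from the compose file:

```shell
# Probe the API health endpoint; print a hint if it is not reachable yet
curl -fsS --max-time 3 http://localhost:8080/healthz \
  || echo "API not ready yet - check: docker compose logs -f api"
```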
Before you hand this off to other users, it is worth checking five things on a clean deploy:
- Sign up and sign in both work.
- Saving Settings > Model Config succeeds.
- Uploading a document moves it out of `processing`.
- The worker is running and document jobs complete.
- Ask My Docs returns an answer for at least one test document.
New images are published automatically when changes land on main.
To update a running server that uses `ghcr.io/zellux/paperwise:latest`:

```shell
docker compose pull
docker compose up -d
```

To pin a release instead of tracking `latest`, set the image tag explicitly, for example:

```yaml
image: ghcr.io/zellux/paperwise:v0.1.1
```

Then update by changing the tag in your docker-compose.yml and running:

```shell
docker compose pull
docker compose up -d
```

Back up both persistent data locations:

- `postgres_data` for users, preferences, document records, and search metadata
- `paperwise_data` for uploaded files and extracted object-store data
If you only back up one of them, restore will be incomplete.
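One way to snapshot both named volumes is to tar each one from a throwaway container. This is a sketch, not part of Paperwise itself: the backup directory name is arbitrary, the `alpine` helper image is an assumption, and Compose usually prefixes volume names with the project name, so check `docker volume ls` for the real names first.

```shell
# Sketch: export both data volumes to dated archives in the current directory
backup_dir="paperwise-backup-$(date +%Y%m%d)"
mkdir -p "$backup_dir"
if docker info >/dev/null 2>&1; then
  for vol in postgres_data paperwise_data; do
    # Mount the volume read-only and tar its contents into the backup dir
    docker run --rm -v "$vol":/from:ro -v "$PWD/$backup_dir":/to alpine \
      tar czf "/to/$vol.tgz" -C /from . \
      || echo "could not back up $vol"
  done
fi
```

For a consistent Postgres snapshot, stop the stack (or at least the `api` and `worker` services) before archiving.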
For a safe upgrade path:
- Back up `postgres_data`.
- Back up `paperwise_data`.
- Run `docker compose pull`.
- Run `docker compose up -d`.
After the app is running:
- Create your first account.
- Open Settings > Model Config.
- Add an OpenAI, Gemini, or Custom (OpenAI-compatible) connection.
- Assign models for:
  - Metadata Extraction
  - Grounded Q&A
  - OCR
- Upload a few documents and test extraction or Ask My Docs.
Suggested model choices:

- Metadata extraction: a fast general-purpose model
- Grounded Q&A: a stronger reasoning model if you need cross-document answers
- OCR:
  - use LLM OCR for scans, forms, or image-heavy PDFs
  - use Local Tesseract if you want OCR to stay on your machine
For more detail, use the docs. Day-to-day operations come down to a few commands:

```shell
docker compose ps
docker compose logs -f api worker
docker compose logs -f postgres redis
docker compose pull
docker compose down
```

If uploads stay stuck in `processing`:
- check that the `worker` container is running
- inspect `docker compose logs -f api worker`
- make sure Redis is healthy and reachable
If a new image was published but your server still looks old:
- run `docker compose pull`
- then run `docker compose up -d`
- if you pinned a version tag, update the tag in `docker-compose.yml`
If upload works but extraction or Ask My Docs fails:
- open Settings > Model Config
- confirm the required task models are assigned
- check the API logs for provider timeout or auth errors
See LICENSE.