PaperSurveyor

Agentic Literature Survey Engine for Researchers
Domain-aware paper retrieval, importance-first ranking, cross-domain exploration, and survey report generation.


Why This Project Exists

Most paper tools help you retrieve literature. PaperSurveyor is built to help you finish the survey workflow:

understand what a query is really asking
route it through domain-aware source strategies
rank papers by importance, not only keyword similarity
cluster the topic and surface research lineage
generate an editable survey report

It is not a generic chat wrapper. It is a workflow-first research engine.

What Makes It Different

Capability	Typical Paper Search	PaperSurveyor
Ranking objective	Relevance-first	Importance-first
Domain awareness	Weak	Built-in venue/source profiles
Cross-domain surveying	Manual	Native
Survey workflow	Fragmented	End-to-end
Explainability	Minimal	Feature-level ranking breakdown
Report generation	External	Built-in async pipeline

Current MVP

Real retrieval from OpenAlex and Crossref
PostgreSQL-backed paper cache, venue registry, search history, ranking features, report tasks, and report outputs
Built-in domain profiles for:
- Computer Science
- Medicine
- Biology
- Materials / Physics / Chemistry
- Economics / Management / Social Science
Explainable importance ranking
Search UI wired to the live API
Redis-backed async report generation queue
Docker Compose and Render deployment config

Product Flow

flowchart LR
  A["User Query"] --> B["Query Understanding"]
  B --> C["Domain Router"]
  C --> D["Source Strategy"]
  D --> E["OpenAlex + Crossref Retrieval"]
  E --> F["Importance Ranking"]
  F --> G["Theme Clustering"]
  G --> H["Insight Extraction"]
  H --> I["Survey Report"]

Architecture

apps/
  web/          Next.js frontend
  api/          FastAPI + SQLAlchemy + Dramatiq
packages/
  agents/       Agent prompts and pipeline definitions
  config/       Domain profiles, venue priorities, source strategies
  core/         Shared ranking logic
docs/           Product blueprint and deployment docs

UI Preview

High-end minimal frontend instead of an admin dashboard look
Search results optimized for “what should I read first”
Importance score, reading level, recommendation reasons, and source provenance shown inline

Built-In Domain Strategy

The MVP ships with editable seed profiles in packages/config/domains/default.json.

Examples:

Computer Science: CCF-style top conferences and journals such as NeurIPS, ICML, ICLR, CVPR, KDD
Medicine: NEJM, The Lancet, JAMA, BMJ, Nature Medicine, The Lancet Digital Health
Biology: Cell, Nature, Science, Nature Biotechnology, Nature Genetics, Genome Biology

These profiles are configuration, not hardcoded logic, so the community can keep extending them by domain.
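As a rough illustration of what "configuration, not hardcoded logic" can mean in practice, the sketch below checks a venue against a per-domain list. The profile shape (a `venues` list under each domain key), the `EXAMPLE_PROFILES` name, and the `is_priority_venue` helper are all assumptions made for this example; the actual schema of packages/config/domains/default.json may differ.

```python
import json

# Hypothetical profile shape for illustration only; the real schema of
# default.json may differ. Venue names are taken from the examples above.
EXAMPLE_PROFILES = json.loads("""
{
  "medicine": {"venues": ["NEJM", "The Lancet", "JAMA", "BMJ", "Nature Medicine"]},
  "biology": {"venues": ["Cell", "Nature", "Science", "Genome Biology"]}
}
""")


def is_priority_venue(profiles, domain, venue):
    """Case-insensitive check of whether a venue is listed for a domain."""
    listed = {v.casefold() for v in profiles.get(domain, {}).get("venues", [])}
    return venue.casefold() in listed


print(is_priority_venue(EXAMPLE_PROFILES, "medicine", "the lancet"))
```

Because the lookup reads only from the loaded data, adding a venue or a whole domain would be a JSON edit rather than a code change.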

Importance Ranking

The MVP ranking is explicit and inspectable:

importance_score =
100 * (
  0.30 * relevance_score +
  0.22 * venue_score +
  0.16 * citation_score +
  0.10 * recency_score +
  0.10 * survey_foundation_score +
  0.07 * cross_domain_score +
  0.05 * domain_profile_boost
)
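
The weighted sum above can be sketched in a few lines of Python. This is a minimal illustration, not the project's ranking code: it assumes every feature is already normalized to [0, 1] and treats missing features as 0.0, neither of which is stated here. The `importance_breakdown` name is hypothetical. Returning the per-feature contributions alongside the total shows how a feature-level breakdown could be exposed.

```python
# Weights copied from the formula above; they sum to 1.0.
WEIGHTS = {
    "relevance_score": 0.30,
    "venue_score": 0.22,
    "citation_score": 0.16,
    "recency_score": 0.10,
    "survey_foundation_score": 0.10,
    "cross_domain_score": 0.07,
    "domain_profile_boost": 0.05,
}


def importance_breakdown(features):
    """Return (importance_score, per-feature contributions on a 0-100 scale).

    Assumption: each feature value lies in [0, 1]; missing features count as 0.0.
    """
    contributions = {
        name: 100 * weight * features.get(name, 0.0)
        for name, weight in WEIGHTS.items()
    }
    return sum(contributions.values()), contributions


score, parts = importance_breakdown(
    {"relevance_score": 0.9, "venue_score": 1.0, "citation_score": 0.5}
)
for name, value in parts.items():
    print(f"{name}: {value:.1f}")
print(f"importance_score: {score:.1f}")
```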

Implementation:

Database Model

Key tables already wired into the MVP:

papers
authors
paper_authors
venues
domains
domain_source_profiles
search_history
ranking_features
report_tasks
report_outputs

Database seed bootstrap:

cd apps/api
python -m venv .venv
source .venv/bin/activate
pip install -e .
python -m app.api_cli

Quickstart

Option 1: Docker Demo

docker compose up --build

Open:

Web: http://localhost:3000
API: http://localhost:8000

Option 2: Manual Development

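# Terminal 1: web frontend
pnpm install
pnpm dev:web

# Terminal 2: API (seed the database, then start the server)
cd apps/api
python -m venv .venv
source .venv/bin/activate
pip install -e .
python -m app.api_cli
uvicorn app.main:app --reload --port 8000

# Terminal 3: report worker
cd apps/api
source .venv/bin/activate
dramatiq app.worker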

Example API Calls

curl "http://localhost:8000/search?q=multimodal%20clinical%20decision%20support&domains=computer_science&domains=medicine&year_from=2021&year_to=2026"

curl -X POST http://localhost:8000/report/generate \
  -H "Content-Type: application/json" \
  -d '{"query":"multimodal clinical decision support","paper_ids":["<paper-uuid>"]}'
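
The same two calls can be prepared from Python with only the standard library. This is a sketch under the assumption that the API is running locally as in the Quickstart. The helper names `build_search_url` and `build_report_request` are made up for this example, and the response format is not described here, so the code only builds the requests and does not parse any result.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # local API address from the Quickstart


def build_search_url(query, domains, year_from, year_to):
    """Build the /search URL; `domains` is repeated once per value."""
    params = {
        "q": query,
        "domains": domains,
        "year_from": year_from,
        "year_to": year_to,
    }
    # doseq=True expands the list into repeated keys; quote_via=quote
    # encodes spaces as %20, matching the curl example.
    return f"{BASE_URL}/search?" + urlencode(params, doseq=True, quote_via=quote)


def build_report_request(query, paper_ids):
    """Build (but do not send) the POST request for /report/generate."""
    body = json.dumps({"query": query, "paper_ids": paper_ids}).encode("utf-8")
    return Request(
        f"{BASE_URL}/report/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


print(build_search_url(
    "multimodal clinical decision support",
    ["computer_science", "medicine"],
    2021,
    2026,
))
```

With the stack running, either object can be passed to `urllib.request.urlopen` to send the request.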

Online Demo Deployment

This repo includes:

docker-compose.yml for local full-stack demo
render.yaml for hosted demo deployment
docs/DEPLOYMENT.md for setup details

Recommended hosted topology:

papersurveyor-web on Render
papersurveyor-api on Render
papersurveyor-worker on Render
managed PostgreSQL
managed Redis

Source References For Initial Seed Profiles

Initial high-authority venue seeds were curated from official or primary sources, including:

Roadmap

Replace heuristic query understanding with structured LLM planning
Add OpenAlex citation graph exploration
Add PubMed and Semantic Scholar adapters
Add workspace persistence and collaborative editing
Add PDF parsing and evidence extraction
Add benchmark datasets for ranking evaluation

Contributing

If you want to extend domain profiles, provider adapters, ranking features, or UI workflows, open an issue or PR. Good first contribution areas:

new domain profiles
venue authority tuning
provider adapters
report templates
frontend interaction polish

Star This Repo

If you care about open-source tooling for serious literature survey work rather than another generic research chatbot, this project is worth watching.
