P2N Monorepo

The Paper-to-Notebook (P2N) project turns a research paper PDF into a deterministic reproduction plan, notebook, and run pipeline. The current implementation covers the first three stages of the roadmap (ingest → extract → plan/materialize) and exposes deterministic stubs for notebook execution.

What’s Implemented Today

| Stage | Description | Key Endpoints | Artifacts |
| --- | --- | --- | --- |
| Ingest | Upload a PDF, store it in Supabase Storage, attach it to an OpenAI File Search vector store, and persist paper metadata. | `POST /api/v1/papers/ingest` | Supabase storage path, vector store id |
| Verify | Confirm the paper’s storage/vector store references without exposing secrets. | `GET /api/v1/papers/{paper_id}/verify` | Boolean status report |
| Extract | Stream claim extraction via SSE. The extractor agent uses File Search, enforces tool caps, and emits structured `claims[]{dataset, split, metric, value, units, citation, confidence}`. | `POST /api/v1/papers/{paper_id}/extract` | SSE stages, typed errors (e.g., `E_EXTRACT_LOW_CONFIDENCE`) |
| Plan | Planner agent returns a Plan JSON v1.1 document with dataset/model/config/metrics/visualizations/explain/justifications. Stored in the `plans` table (schema v0: only `PRIMARY KEY(id)`). | `POST /api/v1/papers/{paper_id}/plan` | Plan JSON persisted with logical `paper_id` reference |
| Materialize | Convert a stored plan into a deterministic notebook + requirements set. Notebook cells seed RNGs, log SSE-style events, and produce `metrics.json` when executed. | `POST /api/v1/plans/{plan_id}/materialize` | Supabase assets under `plans/{plan_id}/` |
| Verify Plan Assets | Issue short-lived signed URLs for the notebook and requirements artifacts. | `GET /api/v1/plans/{plan_id}/assets` | Signed URLs (120 s TTL) with tokens redacted in logs |
| Run Stub | Launch a simulated run, stream SSE via `/api/v1/runs/{run_id}/events`, and persist metrics/events/log artifacts. This will be replaced by a sandbox worker in C-RUN-01. | `POST /api/v1/plans/{plan_id}/run` | `runs/{run_id}/metrics.json`, `events.jsonl`, `logs.txt` |
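
The claim fields listed in the Extract row can be pictured as a small typed record. The sketch below is illustrative only: the `Claim` class, the `parse_claim` helper, and the assumption that confidence lies between 0 and 1 are not taken from the repository, whose real pydantic models live under app/data/. It uses stdlib dataclasses to stay dependency-free.

```python
from dataclasses import dataclass

# Field names follow the claims[] shape listed for the Extract stage.
# The class, the helper, and the 0..1 confidence range are assumptions.
CLAIM_FIELDS = ("dataset", "split", "metric", "value", "units", "citation", "confidence")


@dataclass(frozen=True)
class Claim:
    dataset: str
    split: str
    metric: str
    value: float
    units: str
    citation: str
    confidence: float


def parse_claim(raw: dict) -> Claim:
    """Build a Claim from a decoded JSON object, rejecting incomplete input."""
    missing = [name for name in CLAIM_FIELDS if name not in raw]
    if missing:
        raise ValueError(f"claim is missing fields: {missing}")
    claim = Claim(
        dataset=str(raw["dataset"]),
        split=str(raw["split"]),
        metric=str(raw["metric"]),
        value=float(raw["value"]),
        units=str(raw["units"]),
        citation=str(raw["citation"]),
        confidence=float(raw["confidence"]),
    )
    if not 0.0 <= claim.confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return claim


# Same sample claim as the one posted to the plan endpoint in the manual flow.
example = parse_claim({
    "dataset": "demo", "split": "test", "metric": "accuracy",
    "value": 0.8, "units": "fraction", "citation": "p.2", "confidence": 0.9,
})
```

Rejecting incomplete claims before they reach the planner keeps the failure close to its cause, which matters when the database itself enforces nothing.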

All persistence adheres to the “No-Rules v0” posture: tables define PRIMARY KEY(id) only, no foreign keys, defaults, RLS, or advanced constraints. Application code supplies every value explicitly (including timestamps and UUIDs).
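
Because the database supplies no defaults, a row has to be complete before it is inserted. A minimal sketch of what that might look like for a plan row follows; the column names (`plan_json`, `created_at`, `updated_at`) and the `new_plan_row` helper are hypothetical and may not match the actual schema in sql/.

```python
import uuid
from datetime import datetime, timezone


def new_plan_row(paper_id: str, plan_json: dict) -> dict:
    """Return a fully populated row; nothing is left for the database to default."""
    now = datetime.now(timezone.utc).isoformat()
    return {
        "id": str(uuid.uuid4()),   # primary key generated by the app
        "paper_id": paper_id,      # logical reference only, no foreign key
        "plan_json": plan_json,    # hypothetical column name
        "created_at": now,         # hypothetical column name
        "updated_at": now,         # hypothetical column name
    }


row = new_plan_row("00000000-0000-0000-0000-000000000001", {"version": "1.1"})
```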

Directory Map

api/ – FastAPI service housing agents, routes, Supabase wrappers, and SSE streaming utilities.
- app/materialize/ – Notebook/env generation utilities (Plan JSON → notebook.ipynb + requirements.txt).
- app/runs/ – In-memory SSE manager for run streaming.
- app/routers/ – Public HTTP surface (papers, plans, runs, internal).
- app/data/ – Typed Supabase wrappers and pydantic models (PaperCreate, PlanCreate, RunCreate, etc.).
- tests/ – Pytest suites for ingest, extract, plan, materialize, and run stubs.
worker/ – Placeholder for the future sandbox runner (to be filled in during C-RUN-01).
web/ – Placeholder React client.
sql/ – Schema v0 reference files and reset scripts (documentation only; no migrations applied automatically).
infra/ – Skeleton for IaC and deployment workflows.

Environment & Dependencies

Create .env from .env.example and populate:

OPENAI_API_KEY=sk-...
SUPABASE_URL=https://<project>.supabase.co
SUPABASE_SERVICE_ROLE_KEY=...
SUPABASE_ANON_KEY=...
FILE_SEARCH_PER_RUN=10
WEB_SEARCH_PER_RUN=5
OPENAI_TRACING_ENABLED=true
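
As an illustration of how these variables might be read at startup, here is a minimal sketch. The `Settings` class and `load_settings` function are hypothetical, and falling back to 10, 5, and true when the optional variables are unset is an assumption based on the sample values above, not documented behaviour.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Variables treated as mandatory in this sketch (an assumption).
REQUIRED = (
    "OPENAI_API_KEY",
    "SUPABASE_URL",
    "SUPABASE_SERVICE_ROLE_KEY",
    "SUPABASE_ANON_KEY",
)


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    supabase_url: str
    supabase_service_role_key: str
    supabase_anon_key: str
    file_search_per_run: int
    web_search_per_run: int
    openai_tracing_enabled: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping and fail fast on missing secrets."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing required settings: {', '.join(missing)}")
    return Settings(
        openai_api_key=env["OPENAI_API_KEY"],
        supabase_url=env["SUPABASE_URL"],
        supabase_service_role_key=env["SUPABASE_SERVICE_ROLE_KEY"],
        supabase_anon_key=env["SUPABASE_ANON_KEY"],
        # Fallback values mirror the sample .env; treating them as defaults is an assumption.
        file_search_per_run=int(env.get("FILE_SEARCH_PER_RUN", "10")),
        web_search_per_run=int(env.get("WEB_SEARCH_PER_RUN", "5")),
        openai_tracing_enabled=env.get("OPENAI_TRACING_ENABLED", "true").strip().lower() == "true",
    )
```

Passing the mapping as a parameter lets tests supply a plain dict instead of mutating the process environment.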

Install API requirements (Python 3.12.5):

python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r api/requirements.txt

Developer extras are optional; nbformat is already pinned for notebook work.

Running the Services

# start API with deterministic SSE (reload disabled keeps streams stable)
python -m uvicorn app.main:app --app-dir api --log-level info

# run worker stub (currently no-op)
make worker

# serve example web client shell
make web

End-to-End Manual Flow (PowerShell)

# 1. Ingest a paper PDF
$PDF = "C:\path\to\paper.pdf"
$ingest = curl.exe -sS -X POST http://127.0.0.1:8000/api/v1/papers/ingest `
  -F "title=Demo Paper" `
  -F ("file=@`"$PDF`";type=application/pdf")
$paper_id = ($ingest | ConvertFrom-Json).paper_id

# 2. Verify storage / vector store references (no secrets returned)
curl.exe -sS "http://127.0.0.1:8000/api/v1/papers/$paper_id/verify"

# 3. Stream extractor SSE (claims + guardrails)
curl.exe -N "http://127.0.0.1:8000/api/v1/papers/$paper_id/extract"

# 4. Generate a Plan JSON v1.1
New-Item -ItemType File -Path plan.json -Force | Out-Null  # optional local capture
curl.exe -sS -X POST "http://127.0.0.1:8000/api/v1/papers/$paper_id/plan" `
  -H "Content-Type: application/json" `
  -d '{"claims":[{"dataset":"demo","split":"test","metric":"accuracy","value":0.8,"units":"fraction","citation":"p.2","confidence":0.9}]}'

# 5. Materialize notebook + requirements
$plan_id = "<plan uuid from previous response>"  # paste the plan UUID returned by step 4
curl.exe -sS -X POST "http://127.0.0.1:8000/api/v1/plans/$plan_id/materialize"

# 6. Fetch signed asset URLs (120-second TTL)
curl.exe -sS "http://127.0.0.1:8000/api/v1/plans/$plan_id/assets"

# 7. Kick off the run stub and watch SSE
$run = curl.exe -sS -X POST "http://127.0.0.1:8000/api/v1/plans/$plan_id/run" | ConvertFrom-Json
curl.exe -N "http://127.0.0.1:8000/api/v1/runs/$($run.run_id)/events"

All SSE streams follow the vocabulary: stage_update, log_line, token (delta output), metric_update, sample_pred, and final result payloads. Typed errors (e.g., E_POLICY_CAP_EXCEEDED, E_PLAN_NOT_FOUND, E_RUN_TIMEOUT) surface structured remediation messages.
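
For clients that consume these streams outside of curl, the sketch below parses standard Server-Sent Events framing (`event:` and `data:` lines, with a blank line ending each event). It assumes that each `data:` payload is a JSON document and that the event names above are sent in the `event:` field; neither is confirmed here, and the payload keys in the sample are made up.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, object]]:
    """Yield (event_name, decoded_payload) pairs from raw SSE lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the buffered event, if it carried data.
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment / keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            # SSE drops one optional space after the colon.
            data.append(line[len("data:"):].removeprefix(" "))
    # An event not terminated by a blank line is discarded, as in the SSE spec.


# Hypothetical sample; the payload keys are illustrative only.
sample = [
    "event: stage_update\n",
    'data: {"stage": "extract_start"}\n',
    "\n",
    ": keep-alive\n",
    "event: log_line\n",
    'data: {"message": "hello"}\n',
    "\n",
]
events = list(parse_sse(sample))
```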

Testing & Quality Gates

Run the full API suite: .\.venv\Scripts\python.exe -m pytest -q
Key targeted tests:
- test_papers_ingest.py – ingest / verify flows and negative paths.
- test_papers_extract.py – SSE, guardrail enforcement, policy caps.
- test_plans_materialize.py – notebook/env generation and signed URL verification.
- test_runs_stub.py – run initiation + SSE stub behaviours.
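
To give a feel for what a policy-cap test pair could look like, here is a self-contained sketch in pytest style. `ToolBudget` and `PolicyCapExceeded` are hypothetical stand-ins written for this example; the real guardrail code and the tests in test_papers_extract.py may be structured quite differently.

```python
class PolicyCapExceeded(Exception):
    """Raised when a tool is called more often than its per-run cap allows."""
    code = "E_POLICY_CAP_EXCEEDED"


class ToolBudget:
    """Counts tool calls per run against fixed caps (hypothetical helper)."""

    def __init__(self, caps: dict[str, int]) -> None:
        self._caps = dict(caps)
        self._used = {name: 0 for name in caps}

    def charge(self, tool: str) -> int:
        """Record one call and return how many calls remain for that tool."""
        if tool not in self._caps:
            raise KeyError(tool)
        if self._used[tool] >= self._caps[tool]:
            raise PolicyCapExceeded(
                f"{tool} cap of {self._caps[tool]} calls per run exceeded"
            )
        self._used[tool] += 1
        return self._caps[tool] - self._used[tool]


def test_calls_within_cap_succeed():
    # Happy path: both calls fit inside the cap.
    budget = ToolBudget({"file_search": 2})
    assert budget.charge("file_search") == 1
    assert budget.charge("file_search") == 0


def test_call_past_cap_is_rejected():
    # Negative path: the second call exceeds a cap of one.
    budget = ToolBudget({"file_search": 1})
    budget.charge("file_search")
    try:
        budget.charge("file_search")
    except PolicyCapExceeded as exc:
        assert exc.code == "E_POLICY_CAP_EXCEEDED"
    else:
        raise AssertionError("expected PolicyCapExceeded")
```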

CI expectations for every PR:

- Provide a diff summary of files touched.
- Include happy-path + at least one negative-path test.
- Document manual verification steps in the PR body.
- Never leak secrets in logs or responses (vector_store IDs must be redacted to abcd1234***).
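
The redaction rule can be sketched as a small helper. This assumes the `abcd1234***` pattern means "keep the first eight characters and mask the rest"; the `redact_id` name and the choice to mask short values entirely are assumptions made for this example.

```python
def redact_id(value: str, keep: int = 8) -> str:
    """Keep the first `keep` characters of an identifier and mask the rest."""
    if len(value) <= keep:
        # Too short to show a prefix without revealing the whole value.
        return "***"
    return value[:keep] + "***"


redacted = redact_id("abcd1234efgh5678")
```

Applying the helper at the logging boundary, rather than at each call site, makes it harder to forget.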

Observability & Tracing

app/config/llm.py wraps OpenAI client calls with traced_run / traced_subspan. Spans currently used:
- p2n.extractor.run
- p2n.planner.run
- p2n.materialize.codegen
- p2n.materialize.persist
Run stub spans (p2n.run.*) will evolve in C-RUN-01.
To disable tracing, set OPENAI_TRACING_ENABLED=false in .env.

Data & Storage Layout

Supabase Storage keys:
- papers/dev/YYYY/MM/DD/{paper_id}.pdf
- plans/{plan_id}/notebook.ipynb
- plans/{plan_id}/requirements.txt
- runs/{run_id}/metrics.json, events.jsonl, logs.txt
Database (schema v0): tables such as papers, plans, runs, run_events, run_series only define id primary keys. No default values, no referential integrity – every field must be populated from the app.
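
The key layout above can be captured in a few pure functions, which keeps path construction in one place. The function names are hypothetical, and which date goes into the paper key (for example the UTC ingest date) is an assumption, so the caller passes it in explicitly.

```python
from datetime import date


def paper_key(paper_id: str, day: date) -> str:
    """Storage key for an ingested PDF, partitioned by date."""
    return f"papers/dev/{day.strftime('%Y/%m/%d')}/{paper_id}.pdf"


def plan_keys(plan_id: str) -> dict[str, str]:
    """Storage keys for the materialized plan assets."""
    return {
        "notebook": f"plans/{plan_id}/notebook.ipynb",
        "requirements": f"plans/{plan_id}/requirements.txt",
    }


def run_keys(run_id: str) -> dict[str, str]:
    """Storage keys for the artifacts a run persists."""
    return {
        "metrics": f"runs/{run_id}/metrics.json",
        "events": f"runs/{run_id}/events.jsonl",
        "logs": f"runs/{run_id}/logs.txt",
    }
```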

Roadmap Snapshot

Upcoming prompts (see docs/FUTURE_CODEX_PROMPTS.md for full detail):

- C-RUN-01 – turn the run stub into a deterministic sandbox executor with artifact persistence.
- C-RUN-02 – enforce deterministic seeds, CPU-only execution, and policy caps.
- C-REPORT-01 – compute reproduction gap reports per paper.
- C-KID-01 – generate/update Kid-Mode storybooks alongside run completion.

Reference Commands

# OpenAI config doctor
curl.exe -sS http://127.0.0.1:8000/internal/config/doctor

# Health checks
curl.exe -sS http://127.0.0.1:8000/health
curl.exe -sS http://127.0.0.1:8000/health/live

Remember: stay on openai==1.109.1 / openai-agents==0.3.3 until the dedicated migration prompt lands, and keep the database in “No-Rules v0” until the Schema Hardening milestone explicitly authorizes upgrades.
