diff --git a/dashboards/open-brain-dashboard-next/README.md b/dashboards/open-brain-dashboard-next/README.md index 87e19fb5..4b829931 100644 --- a/dashboards/open-brain-dashboard-next/README.md +++ b/dashboards/open-brain-dashboard-next/README.md @@ -34,6 +34,8 @@ Provides 9 pages for managing your thoughts: - **Node.js 18+** installed - A **Vercel account** (free tier works) or any Node.js hosting +> **Need a backend?** [`integrations/cloudflare-rest-worker/`](../../integrations/cloudflare-rest-worker/) implements the `open-brain-rest` REST API this dashboard expects. Deploy it as a Cloudflare Worker, set `NEXT_PUBLIC_API_URL` to the Worker URL, and the four core pages (Dashboard, Browse, Detail, Search) work end-to-end. See that integration's README for setup + known limitations. + ### Credential Tracker | Credential | Where to get it | Where it goes | diff --git a/integrations/cloudflare-rest-worker/.gitignore b/integrations/cloudflare-rest-worker/.gitignore new file mode 100644 index 00000000..265b4963 --- /dev/null +++ b/integrations/cloudflare-rest-worker/.gitignore @@ -0,0 +1,6 @@ +node_modules/ +.wrangler/ +wrangler.toml +package-lock.json +*.log +.DS_Store diff --git a/integrations/cloudflare-rest-worker/README.md b/integrations/cloudflare-rest-worker/README.md new file mode 100644 index 00000000..53612751 --- /dev/null +++ b/integrations/cloudflare-rest-worker/README.md @@ -0,0 +1,272 @@ +# Open Brain REST Gateway (Cloudflare Worker) + +A small Cloudflare Worker that implements the REST API the [Next.js +dashboard](../../dashboards/open-brain-dashboard-next/) expects — `open-brain-rest`. The +dashboard's README references this service but no implementation ships in +the repo; this Worker fills that gap so the four core dashboard pages +(Dashboard, Browse, Detail, Search) work end-to-end. 
+
+## What It Does
+
+Exposes a REST-shaped surface over your existing Open Brain Supabase project:
+
+| Method | Path | Backed by |
+|---|---|---|
+| `GET` | `/health` | unauthenticated; used by the dashboard's login page to validate the API URL |
+| `GET` | `/thoughts` | paginated `SELECT` with whitelisted `sort` + filters (type, source_type, importance_min, quality_score_max, status, exclude_restricted) |
+| `GET` | `/thought/:id` | single-row read |
+| `PUT` | `/thought/:id` | partial update of `{ content, type, importance, status }` (updating `status` also bumps `status_updated_at`) |
+| `DELETE` | `/thought/:id` | hard delete |
+| `POST` | `/search` | semantic (embedding → `match_thoughts` RPC → re-fetch full rows) or text mode (`search_thoughts_text` RPC) |
+| `GET` | `/stats` | reshapes the existing `brain_stats_aggregate` RPC into the dashboard's StatsResponse shape |
+| `POST` | `/capture` | extracts metadata + embeds in parallel, calls `upsert_thought`, returns `{thought_id, action, type, sensitivity_tier, content_fingerprint, message}` |
+| `GET` | `/ingestion-jobs` | empty stub (smart-ingest is out of scope for v1) |
+| `GET` | `/ingestion-jobs/:id` | 404 stub (smart-ingest not deployed) |
+| `POST` | `/ingest`, `/ingestion-jobs/:id/execute` | 501 Not Implemented |
+
+Auth: the same `MCP_ACCESS_KEY` your `open-brain-mcp` already uses, sent as the
+`x-brain-key` header (or `Authorization: Bearer …` / `?key=…`).
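+The three auth shapes are interchangeable. As a quick illustration (not shipped code — `buildAuth` is a hypothetical helper, shown here only to visualize the precedence the Worker checks: header first, then bearer, then query param):

```typescript
// Hypothetical helper, for illustration only: the three equivalent ways a
// client can present MCP_ACCESS_KEY, in the order the Worker checks them.
type AuthStyle = "header" | "bearer" | "query";

function buildAuth(
  key: string,
  style: AuthStyle,
  url: string,
): { url: string; headers: Record<string, string> } {
  if (style === "header") return { url, headers: { "x-brain-key": key } };
  if (style === "bearer") return { url, headers: { Authorization: `Bearer ${key}` } };
  // Query param works but is discouraged: the key can end up in proxy logs.
  const u = new URL(url);
  u.searchParams.set("key", key);
  return { url: u.toString(), headers: {} };
}

const req = buildAuth("my-key", "bearer", "https://ob-rest.example.workers.dev/thoughts");
console.log(req.headers.Authorization); // "Bearer my-key"
```

+All three resolve to the same server-side check; prefer the `x-brain-key`
+header, which is what the dashboard itself sends.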
+
+## Architecture
+
+```
+Browser
+  │
+  │ HTTPS (iron-session cookie set on dashboard /login)
+  ▼
+Cloudflare Pages: open-brain-dashboard-next
+  │
+  │ HTTPS, server-side, x-brain-key from session cookie
+  ▼
+Cloudflare Worker: open-brain-rest   ← THIS WORKER
+  │
+  │ HTTPS, service-role JWT
+  ▼
+Supabase (thoughts table + RPCs)
+```
+
+## Prerequisites
+
+- Working Open Brain setup ([guide](../../docs/01-getting-started.md)) — gives
+  you `thoughts`, `match_thoughts`, `upsert_thought`
+- [`schemas/enhanced-thoughts/`](../../schemas/enhanced-thoughts/) applied —
+  required for `/stats`, `/search?mode=text`, and the
+  `type / sensitivity_tier / importance / quality_score / source_type` columns
+- A Cloudflare account (free tier works) — sign up at
+  [dash.cloudflare.com](https://dash.cloudflare.com)
+- `wrangler` CLI installed (`npm install -g wrangler`) and authenticated
+  (`wrangler login`)
+- Node.js 20+
+
+## Credential Tracker
+
+```text
+OPEN BRAIN REST -- CREDENTIAL TRACKER
+--------------------------------------
+
+SUPABASE (from your Open Brain setup)
+  Project URL:             ____________
+  Service role key:        ____________
+  MCP access key (reused): ____________
+  OpenRouter API key:      ____________
+
+WORKER (filled in after deploy)
+  Worker URL:              ____________
+
+--------------------------------------
+```
+
+## Setup
+
+### Step 1 — Configure
+
+```bash
+cd integrations/cloudflare-rest-worker
+cp wrangler.toml.example wrangler.toml
+```
+
+The default `wrangler.toml` deploys the Worker as `ob-rest`. Change the
+top-level `name` key if you want a different workers.dev hostname.
+
+### Step 2 — Install
+
+```bash
+npm install
+```
+
+### Step 3 — Set secrets
+
+`wrangler secret put` is interactive — it prompts for the value, so nothing
+lands in your shell history.
Set all four:
+
+```bash
+wrangler secret put SUPABASE_URL
+wrangler secret put SUPABASE_SERVICE_ROLE_KEY
+wrangler secret put MCP_ACCESS_KEY
+wrangler secret put OPENROUTER_API_KEY
+```
+
+`MCP_ACCESS_KEY` is the same value already set on your `open-brain-mcp`
+function — the dashboard reuses it. `OPENROUTER_API_KEY` powers the
+`/search?mode=semantic` and `/capture` endpoints; same key as core.
+
+### Step 4 — Deploy
+
+```bash
+wrangler deploy
+```
+
+Wrangler prints the published URL: `https://ob-rest.<your-subdomain>.workers.dev`.
+Save it as `WORKER_URL` in the credential tracker.
+
+### Step 5 — Verify
+
+```bash
+# Unauthenticated health check
+curl -sS "${WORKER_URL}/health"
+# → {"status":"ok","service":"open-brain-rest","version":"0.1.0"}
+
+# Auth enforcement
+curl -sS -X GET "${WORKER_URL}/thoughts"
+# → {"error":"Unauthorized"} 401
+
+# Authenticated list
+curl -sS "${WORKER_URL}/thoughts?per_page=3" \
+  -H "x-brain-key: ${MCP_ACCESS_KEY}"
+# → {"data":[…],"total":N,"page":1,"per_page":3}
+
+# Stats
+curl -sS "${WORKER_URL}/stats?days=7" \
+  -H "x-brain-key: ${MCP_ACCESS_KEY}"
+
+# Semantic search
+curl -sS -X POST "${WORKER_URL}/search" \
+  -H "x-brain-key: ${MCP_ACCESS_KEY}" \
+  -H "content-type: application/json" \
+  -d '{"query":"my thoughts on X","mode":"semantic","limit":5,"page":1,"exclude_restricted":true}'
+
+# Capture
+curl -sS -X POST "${WORKER_URL}/capture" \
+  -H "x-brain-key: ${MCP_ACCESS_KEY}" \
+  -H "content-type: application/json" \
+  -d '{"content":"Test thought from REST gateway"}'
+```
+
+## Wiring the Dashboard
+
+In the dashboard's `.env` (or Cloudflare Pages env vars):
+
+```
+NEXT_PUBLIC_API_URL=https://ob-rest.<your-subdomain>.workers.dev
+SESSION_SECRET=<long-random-string-32-chars-or-more>
+```
+
+Then run the dashboard locally (`npm run dev` from
+`dashboards/open-brain-dashboard-next/`) or deploy it (next section).
At the
+login page, paste your `MCP_ACCESS_KEY` — the dashboard first hits this
+Worker's `/health` to confirm the URL is reachable, validates the key with an
+authenticated call, then encrypts it into an HTTP-only session cookie for the
+rest of the session.
+
+## Deploying the Dashboard to Cloudflare
+
+The dashboard is a Next.js app. Cloudflare's current path for Next.js 15+ is
+the [`@opennextjs/cloudflare`](https://opennext.js.org/cloudflare) adapter,
+which deploys the app as a Cloudflare Worker (with static assets attached).
+The older `@cloudflare/next-on-pages` adapter caps at Next 15.5.x and won't
+work with this dashboard's Next 16.x.
+
+```bash
+cd dashboards/open-brain-dashboard-next
+npm install
+npm install -D @opennextjs/cloudflare wrangler
+
+# Scaffold open-next.config.ts and wrangler.jsonc — the dashboard's
+# README has the templates and walks through the deploy.
+
+npx opennextjs-cloudflare build
+npx opennextjs-cloudflare deploy
+wrangler secret put SESSION_SECRET --name <your-dashboard-worker-name>
+```
+
+`NEXT_PUBLIC_API_URL` is read from the dashboard's `.env` at *build* time and
+baked into the bundle, so set it before running `build`. `SESSION_SECRET` is
+a *runtime* secret set on the Worker after first deploy.
+
+The dashboard ends up at
+`https://<your-dashboard-worker-name>.<your-subdomain>.workers.dev`. A custom
+domain is optional.
+
+## Known Limitations (v1)
+
+These are real impedance mismatches between the dashboard's expectations and
+the upstream schema. The Worker is correct; resolving these requires upstream
+changes that are out of scope for this PR:
+
+1. **`Thought.id` type mismatch.** The dashboard's TypeScript types declare
+   `id: number` and call `parseInt(id, 10)` on URL params
+   (`app/thoughts/[id]/page.tsx:29`). The actual `thoughts.id` column is
+   `UUID`. The Worker returns UUIDs as strings. Until the dashboard's `id`
+   type is widened to `string | number`, the Detail page won't navigate to
+   individual rows. **A separate small follow-up PR can fix the dashboard
+   types.**
+
+2.
**`importance` scale mismatch.** The dashboard's `PRIORITY_LEVELS`
+   expects 0–100 (Critical = 80+). The `enhanced-thoughts` schema defaults
+   `importance` to 3 with no documented upper bound; the entity-extraction
+   worker emits 0–6. Existing data will render as "Low" priority in the
+   dashboard. Not a Worker bug.
+
+3. **No `reflections` table.** The dashboard's Detail page calls
+   `/thought/:id/reflection`. No schema in the repo creates a `reflections`
+   table. The Worker doesn't implement this endpoint; the page will surface
+   an error, but the rest of the dashboard works.
+
+4. **No smart-ingest integration.** `/ingest` and
+   `/ingestion-jobs/:id/execute` return 501, and `/ingestion-jobs/:id`
+   returns 404. The dashboard's Add to Brain "extract" mode and the
+   Ingestion Jobs detail view will surface errors; single-thought capture
+   via `/capture` works.
+
+## Troubleshooting
+
+**`wrangler deploy` errors with "not authenticated"**
+Run `wrangler login`. The CLI opens a browser window for OAuth.
+
+**Health check returns 200 but `/thoughts` returns 401**
+You sent a key that doesn't match the Worker's `MCP_ACCESS_KEY` secret.
+Verify with `wrangler secret list` (it shows names + when each was set, not
+values). If you rotated the key on Supabase, also run
+`wrangler secret put MCP_ACCESS_KEY` to keep them in sync.
+
+**`/search?mode=semantic` returns 500 with "OpenRouter embedding failed"**
+The `OPENROUTER_API_KEY` secret is missing, expired, or out of credits. Run
+`wrangler secret put OPENROUTER_API_KEY` to refresh it.
+
+**`/search?mode=text` returns 500 with "function search_thoughts_text does not exist"**
+The `enhanced-thoughts` schema isn't applied. Run
+`schemas/enhanced-thoughts/schema.sql` in your Supabase SQL Editor.
+
+**`/stats` returns 500 with "function brain_stats_aggregate does not exist"**
+Same fix as above — apply `schemas/enhanced-thoughts/schema.sql`.
+
+**Dashboard logs in successfully but Browse shows zero rows**
+Check that `NEXT_PUBLIC_API_URL` is the Worker URL, not the Supabase MCP
+function URL. The MCP function speaks JSON-RPC, not REST, and won't return
+`{ data: [...] }`.
+
+**Capture returns 500 but the row exists with `embedding` null**
+The Worker calls `upsert_thought` (which writes the row) and then a
+follow-up `UPDATE` (which writes the embedding). If your `service_role` is
+missing `UPDATE` grants on `thoughts`, the row is written but the follow-up
+fails and the request returns a 500 — check Worker Logs in the Cloudflare
+dashboard for the underlying Postgres error.
+
+## What This Worker Doesn't Do
+
+- **Workflow kanban endpoints** (P1) — needs the `workflow-status` schema
+  plus status-transition handling; can be a follow-up.
+- **Audit bulk delete, Duplicates** (P2) — `quality_score`-based bulk
+  operations.
+- **Reflections** (P2) — needs a `reflections` table that no schema
+  currently creates.
+- **Smart ingest extract / execute** (P2) — large feature; needs its own
+  integration.
+
+Future PRs can add these incrementally.
diff --git a/integrations/cloudflare-rest-worker/metadata.json b/integrations/cloudflare-rest-worker/metadata.json
new file mode 100644
index 00000000..f4029c94
--- /dev/null
+++ b/integrations/cloudflare-rest-worker/metadata.json
@@ -0,0 +1,17 @@
+{
+  "name": "Open Brain REST Gateway (Cloudflare Worker)",
+  "description": "REST API gateway that backs dashboards/open-brain-dashboard-next. Implements the endpoints the dashboard's lib/api.ts expects (thoughts CRUD, search, stats, capture) on top of the existing Supabase schema and RPCs.
Deploys as a Cloudflare Worker.", + "category": "integrations", + "author": { "name": "Travis Swicegood", "github": "tswicegood" }, + "version": "0.1.0", + "requires": { + "open_brain": true, + "services": ["Cloudflare Workers", "Supabase"], + "tools": ["wrangler", "Node.js 20+"] + }, + "tags": ["rest", "dashboard", "cloudflare", "worker", "gateway"], + "difficulty": "intermediate", + "estimated_time": "20 minutes", + "created": "2026-04-25", + "updated": "2026-04-25" +} diff --git a/integrations/cloudflare-rest-worker/package.json b/integrations/cloudflare-rest-worker/package.json new file mode 100644 index 00000000..d69a078a --- /dev/null +++ b/integrations/cloudflare-rest-worker/package.json @@ -0,0 +1,21 @@ +{ + "name": "open-brain-rest", + "version": "0.1.0", + "private": true, + "description": "REST gateway Worker that backs dashboards/open-brain-dashboard-next.", + "type": "module", + "scripts": { + "dev": "wrangler dev", + "deploy": "wrangler deploy", + "types": "wrangler types" + }, + "devDependencies": { + "@cloudflare/workers-types": "^4.20250420.0", + "typescript": "^5.6.0", + "wrangler": "^3.95.0" + }, + "dependencies": { + "@supabase/supabase-js": "^2.47.10", + "hono": "^4.9.2" + } +} diff --git a/integrations/cloudflare-rest-worker/src/index.ts b/integrations/cloudflare-rest-worker/src/index.ts new file mode 100644 index 00000000..9da01d6d --- /dev/null +++ b/integrations/cloudflare-rest-worker/src/index.ts @@ -0,0 +1,63 @@ +/** + * open-brain-rest — Cloudflare Worker REST gateway. + * + * Backs the Next.js dashboard at dashboards/open-brain-dashboard-next/ by + * implementing the endpoints its lib/api.ts expects. Reads from / writes to + * the existing Open Brain Supabase schema using the service-role key. + * + * Tech: Hono on Cloudflare Workers. Auth: x-brain-key (or Authorization: + * Bearer / ?key=) — same shared secret used by open-brain-mcp. 
+ */
+
+import { Hono } from "hono";
+import { cors } from "hono/cors";
+import { health } from "./routes/health";
+import { thoughts } from "./routes/thoughts";
+import { search } from "./routes/search";
+import { stats } from "./routes/stats";
+import { capture } from "./routes/capture";
+import { ingestionJobs } from "./routes/ingestion-jobs";
+import { requireApiKey } from "./lib/auth";
+import type { Env } from "./lib/types";
+
+const app = new Hono<{ Bindings: Env }>();
+
+// Open CORS — the dashboard's server-side fetches don't need it (they go
+// from Worker to Worker), but hitting the Worker directly from a browser
+// page during local development does, and other clients (e.g. Insomnia)
+// also benefit. Allow the headers we actually accept; reject by default
+// at the auth layer instead.
+app.use(
+  "*",
+  cors({
+    origin: "*",
+    allowMethods: ["GET", "POST", "PUT", "DELETE", "OPTIONS"],
+    allowHeaders: ["Content-Type", "Authorization", "x-brain-key"],
+    maxAge: 86400,
+  }),
+);
+
+// Mount /health pre-auth. The dashboard's login validates the API URL with
+// an unauthenticated GET, then makes a second authenticated call to confirm
+// the key — so /health must respond regardless of credentials.
+app.route("/", health);
+
+// Everything below this point requires a valid x-brain-key.
+app.use("*", requireApiKey);
+
+app.route("/", thoughts);
+app.route("/", search);
+app.route("/", stats);
+app.route("/", capture);
+app.route("/", ingestionJobs);
+
+// Catch-all: anything unmatched returns 404. Hono's built-in not-found
+// response is plain text; returning JSON here keeps every error shape
+// consistent for the dashboard's fetch wrapper.
+app.notFound((c) => c.json({ error: "Not Found" }, 404));
+
+app.onError((err, c) => {
+  console.error("Unhandled error:", err);
+  return c.json({ error: err.message || "Internal error" }, 500);
+});
+
+export default app;
diff --git a/integrations/cloudflare-rest-worker/src/lib/auth.ts b/integrations/cloudflare-rest-worker/src/lib/auth.ts
new file mode 100644
index 00000000..be4c4301
--- /dev/null
+++ b/integrations/cloudflare-rest-worker/src/lib/auth.ts
@@ -0,0 +1,38 @@
+import type { MiddlewareHandler } from "hono";
+import type { Env } from "./types";
+
+// Three accepted auth shapes, matching the rest of the OB1 ecosystem
+// (open-brain-mcp, entity-extraction-worker):
+//   1. `x-brain-key: <key>` header
+//   2. `Authorization: Bearer <key>`
+//   3. `?key=<key>` query param (last resort — discouraged because the
+//      key lands in proxy logs and Referer headers, but supported for
+//      parity with existing patterns)
+export function readClientKey(req: Request): string {
+  const headerKey = req.headers.get("x-brain-key")?.trim();
+  if (headerKey) return headerKey;
+
+  const auth = req.headers.get("authorization") ?? "";
+  const match = auth.match(/^Bearer\s+(.+)$/i);
+  if (match) return match[1].trim();
+
+  return new URL(req.url).searchParams.get("key")?.trim() ?? "";
+}
+
+// Auth middleware. Mounted at the app level after the /health pass-through.
+// Constant-time compare isn't strictly necessary against an opaque shared
+// secret over TLS — comparing with === works fine for our threat model — but
+// we still avoid logging the comparison values.
+export const requireApiKey: MiddlewareHandler<{ Bindings: Env }> = async (c, next) => {
+  const expected = c.env.MCP_ACCESS_KEY;
+  if (!expected) {
+    // Misconfiguration — refuse all requests instead of silently allowing.
+    console.error("MCP_ACCESS_KEY is not set on the Worker.");
+    return c.json({ error: "Server misconfigured" }, 500);
+  }
+  const provided = readClientKey(c.req.raw);
+  if (!provided || provided !== expected) {
+    return c.json({ error: "Unauthorized" }, 401);
+  }
+  await next();
+};
diff --git a/integrations/cloudflare-rest-worker/src/lib/embedding.ts b/integrations/cloudflare-rest-worker/src/lib/embedding.ts
new file mode 100644
index 00000000..6a31c3e0
--- /dev/null
+++ b/integrations/cloudflare-rest-worker/src/lib/embedding.ts
@@ -0,0 +1,32 @@
+import type { Env } from "./types";
+
+// Same model the core open-brain-mcp uses for `search_thoughts` (see
+// server/index.ts in the upstream MCP). Keeping them in sync means
+// query-time embeddings match the dimensionality of the stored vectors
+// (1536) without any extra projection.
+const EMBEDDING_MODEL = "openai/text-embedding-3-small";
+const OPENROUTER_BASE = "https://openrouter.ai/api/v1";
+
+export async function generateEmbedding(env: Env, text: string): Promise<number[]> {
+  if (!env.OPENROUTER_API_KEY) {
+    throw new Error("OPENROUTER_API_KEY not configured on Worker");
+  }
+  const res = await fetch(`${OPENROUTER_BASE}/embeddings`, {
+    method: "POST",
+    headers: {
+      Authorization: `Bearer ${env.OPENROUTER_API_KEY}`,
+      "Content-Type": "application/json",
+    },
+    body: JSON.stringify({ model: EMBEDDING_MODEL, input: text }),
+  });
+  if (!res.ok) {
+    const detail = await res.text().catch(() => "");
+    throw new Error(`OpenRouter embedding failed (${res.status}): ${detail}`);
+  }
+  const payload = (await res.json()) as { data?: Array<{ embedding?: number[] }> };
+  const vec = payload?.data?.[0]?.embedding;
+  if (!Array.isArray(vec)) {
+    throw new Error("OpenRouter response missing embedding vector");
+  }
+  return vec;
+}
diff --git a/integrations/cloudflare-rest-worker/src/lib/responses.ts b/integrations/cloudflare-rest-worker/src/lib/responses.ts
new file mode 100644
index 00000000..8b0f4102
--- /dev/null
+++
b/integrations/cloudflare-rest-worker/src/lib/responses.ts
@@ -0,0 +1,36 @@
+import type { Context } from "hono";
+
+// Consistent JSON error shape used across every route. The dashboard's
+// fetch wrapper (lib/api.ts → ApiError) reads `error` from the body — no
+// other field is required by the consumer. Status codes follow standard
+// REST conventions: 400 for malformed input, 401 for auth failures, 404
+// for missing rows, 500 for server-side faults, 501 for unimplemented
+// endpoints.
+export function fail(c: Context, status: number, message: string) {
+  return c.json({ error: message }, status as 400 | 401 | 404 | 500 | 501);
+}
+
+// Wrap a thrown error into a 500. Squashes the stack out of the response
+// because this is a public API; the actual stack lands in `console.error`
+// where Workers Logs can pick it up.
+export function fromError(c: Context, err: unknown, fallback = "Internal error") {
+  const msg = err instanceof Error ? err.message : String(err ?? fallback);
+  console.error("rest-gateway error:", msg, err);
+  return fail(c, 500, msg || fallback);
+}
+
+// Parse a non-negative integer query param with a default. Returns the
+// default for missing/blank/NaN/negative input. Used for pagination +
+// window sizes.
+export function intParam(value: string | undefined, fallback: number, max?: number): number {
+  if (!value) return fallback;
+  const n = Number.parseInt(value, 10);
+  if (!Number.isFinite(n) || n < 0) return fallback;
+  return max && n > max ? max : n;
+}
+
+// Parse a boolean-ish query param. Missing returns the fallback; otherwise
+// "true"/"1"/"yes" (case-insensitive) parse as true and any other value as
+// false. Used for `exclude_restricted`, `dry_run`, etc.
+export function boolParam(value: string | undefined, fallback: boolean): boolean {
+  if (value === undefined) return fallback;
+  return /^(true|1|yes)$/i.test(value);
+}
diff --git a/integrations/cloudflare-rest-worker/src/lib/supabase.ts b/integrations/cloudflare-rest-worker/src/lib/supabase.ts
new file mode 100644
index 00000000..35f41c2a
--- /dev/null
+++ b/integrations/cloudflare-rest-worker/src/lib/supabase.ts
@@ -0,0 +1,12 @@
+import { createClient, type SupabaseClient } from "@supabase/supabase-js";
+import type { Env } from "./types";
+
+// Build a service-role client per request. We don't cache in module scope:
+// Workers may reuse an isolate across requests, but the client is cheap to
+// construct (no connection pool; PostgREST calls are plain HTTPS), so
+// per-request construction keeps things simple.
+export function supabaseFor(env: Env): SupabaseClient {
+  return createClient(env.SUPABASE_URL, env.SUPABASE_SERVICE_ROLE_KEY, {
+    auth: { autoRefreshToken: false, persistSession: false },
+  });
+}
diff --git a/integrations/cloudflare-rest-worker/src/lib/types.ts b/integrations/cloudflare-rest-worker/src/lib/types.ts
new file mode 100644
index 00000000..8698a720
--- /dev/null
+++ b/integrations/cloudflare-rest-worker/src/lib/types.ts
@@ -0,0 +1,8 @@
+// Worker bindings injected by Wrangler. These are set as secrets via
+// `wrangler secret put` — see README.
+export interface Env { + SUPABASE_URL: string; + SUPABASE_SERVICE_ROLE_KEY: string; + MCP_ACCESS_KEY: string; + OPENROUTER_API_KEY: string; +} diff --git a/integrations/cloudflare-rest-worker/src/routes/capture.ts b/integrations/cloudflare-rest-worker/src/routes/capture.ts new file mode 100644 index 00000000..e4bcd5a4 --- /dev/null +++ b/integrations/cloudflare-rest-worker/src/routes/capture.ts @@ -0,0 +1,132 @@ +import { Hono } from "hono"; +import type { Env } from "../lib/types"; +import { supabaseFor } from "../lib/supabase"; +import { fail, fromError } from "../lib/responses"; +import { generateEmbedding } from "../lib/embedding"; + +// /capture mirrors the open-brain-mcp `capture_thought` MCP tool: extract +// metadata via LLM (gpt-4o-mini JSON mode), generate the embedding, then +// call the existing upsert_thought RPC (defined in docs/01-getting-started.md +// Step 2.6) which handles dedup via the content fingerprint. Embedding is +// written back in a follow-up UPDATE since upsert_thought's signature only +// accepts content + metadata, not the vector. + +const OPENROUTER_BASE = "https://openrouter.ai/api/v1"; +const METADATA_MODEL = "openai/gpt-4o-mini"; + +const METADATA_PROMPT = [ + 'Extract metadata from the user\'s captured thought. 
Return STRICT JSON with keys:',
+  '- "people": array of people mentioned (empty if none)',
+  '- "action_items": array of implied to-dos (empty if none)',
+  '- "dates_mentioned": array of dates YYYY-MM-DD (empty if none)',
+  '- "topics": array of 1-3 short topic tags (always at least one)',
+  '- "type": one of "observation", "task", "idea", "reference", "person_note"',
+  "Only extract what's explicitly there.",
+].join("\n");
+
+interface ExtractedMetadata {
+  people?: string[];
+  action_items?: string[];
+  dates_mentioned?: string[];
+  topics?: string[];
+  type?: string;
+}
+
+async function extractMetadata(env: Env, text: string): Promise<ExtractedMetadata> {
+  const res = await fetch(`${OPENROUTER_BASE}/chat/completions`, {
+    method: "POST",
+    headers: {
+      Authorization: `Bearer ${env.OPENROUTER_API_KEY}`,
+      "Content-Type": "application/json",
+    },
+    body: JSON.stringify({
+      model: METADATA_MODEL,
+      response_format: { type: "json_object" },
+      messages: [
+        { role: "system", content: METADATA_PROMPT },
+        { role: "user", content: text },
+      ],
+    }),
+  });
+  if (!res.ok) {
+    // Treat metadata extraction as best-effort: a model failure shouldn't
+    // block the capture. The RPC will accept an empty metadata object.
+    console.warn("metadata extraction failed:", res.status);
+    return { topics: ["uncategorized"], type: "observation" };
+  }
+  const payload = (await res.json()) as {
+    choices?: Array<{ message?: { content?: string } }>;
+  };
+  const content = payload?.choices?.[0]?.message?.content ?? "";
+  try {
+    return JSON.parse(content) as ExtractedMetadata;
+  } catch {
+    return { topics: ["uncategorized"], type: "observation" };
+  }
+}
+
+const VALID_TYPES = new Set([
+  "observation",
+  "task",
+  "idea",
+  "reference",
+  "person_note",
+]);
+
+export const capture = new Hono<{ Bindings: Env }>();
+
+capture.post("/capture", async (c) => {
+  try {
+    const body = (await c.req.json().catch(() => ({}))) as { content?: string };
+    const content = (body.content ??
"").trim(); + if (!content) return fail(c, 400, "content is required"); + + // Run metadata extraction and embedding in parallel — they don't depend + // on each other and both call OpenRouter, so this saves one round-trip + // worth of latency. + const [metadata, embedding] = await Promise.all([ + extractMetadata(c.env, content), + generateEmbedding(c.env, content), + ]); + + const type = VALID_TYPES.has(metadata.type ?? "") + ? (metadata.type as string) + : "observation"; + + const sb = supabaseFor(c.env); + const { data: rpcResult, error: rpcErr } = await sb.rpc("upsert_thought", { + p_content: content, + p_payload: { metadata: { ...metadata, source: "rest-gateway" } }, + }); + if (rpcErr) return fail(c, 500, rpcErr.message); + + const result = (rpcResult ?? {}) as { id?: string; fingerprint?: string }; + if (!result.id) return fail(c, 500, "upsert_thought returned no id"); + + // Backfill the embedding + the new top-level columns the dashboard reads + // (type, source_type). sensitivity_tier defaults to 'standard' via the + // schema definition; we don't classify here in v1. + const { data: row, error: updateErr } = await sb + .from("thoughts") + .update({ + embedding, + type, + source_type: "rest-gateway", + }) + .eq("id", result.id) + .select("id, type, sensitivity_tier, content_fingerprint") + .single(); + if (updateErr) return fail(c, 500, updateErr.message); + + return c.json({ + thought_id: row.id, + action: "captured", + type: row.type ?? type, + sensitivity_tier: row.sensitivity_tier ?? "standard", + content_fingerprint: row.content_fingerprint ?? result.fingerprint ?? 
"",
+      message: "Thought captured",
+    });
+  } catch (err) {
+    return fromError(c, err);
+  }
+});
diff --git a/integrations/cloudflare-rest-worker/src/routes/health.ts b/integrations/cloudflare-rest-worker/src/routes/health.ts
new file mode 100644
index 00000000..fc68b2a0
--- /dev/null
+++ b/integrations/cloudflare-rest-worker/src/routes/health.ts
@@ -0,0 +1,13 @@
+import { Hono } from "hono";
+import type { Env } from "../lib/types";
+
+// Pre-auth route. /health only proves the API URL is reachable — the
+// dashboard's login flow calls it unauthenticated, then confirms the key
+// the user pasted with a separate authed request. Returning a small static
+// JSON body matches the pattern other Open Brain functions use.
+export const health = new Hono<{ Bindings: Env }>();
+
+health.get("/health", (c) =>
+  c.json({ status: "ok", service: "open-brain-rest", version: "0.1.0" }),
+);
diff --git a/integrations/cloudflare-rest-worker/src/routes/ingestion-jobs.ts b/integrations/cloudflare-rest-worker/src/routes/ingestion-jobs.ts
new file mode 100644
index 00000000..7931d4b4
--- /dev/null
+++ b/integrations/cloudflare-rest-worker/src/routes/ingestion-jobs.ts
@@ -0,0 +1,46 @@
+import { Hono } from "hono";
+import type { Env } from "../lib/types";
+import { fail } from "../lib/responses";
+
+// /ingest and /ingestion-jobs/* are P2 endpoints used by the dashboard's
+// "Add to Brain" smart-extraction flow. The upstream repo doesn't ship a
+// smart-ingest integration yet, so we stub these conservatively:
+//
+//   GET  /ingestion-jobs              → empty list (the Ingest page renders
+//                                       a clean empty state when no jobs
+//                                       exist, so this is non-disruptive)
+//   POST /ingest                      → 501 Not Implemented
+//   GET  /ingestion-jobs/:id          → 404
+//   POST /ingestion-jobs/:id/execute  → 501 Not Implemented
+//
+// A future PR can replace these with real implementations once a
+// smart-ingest integration lands upstream.
+export const ingestionJobs = new Hono<{ Bindings: Env }>(); + +ingestionJobs.get("/ingestion-jobs", (c) => { + return c.json({ jobs: [], count: 0 }); +}); + +ingestionJobs.get("/ingestion-jobs/:id", (c) => + fail( + c, + 404, + "Ingestion job not found (smart-ingest integration not deployed)", + ), +); + +ingestionJobs.post("/ingestion-jobs/:id/execute", (c) => + fail( + c, + 501, + "Smart-ingest not implemented in this Worker. See the integration's README under 'Known limitations'.", + ), +); + +ingestionJobs.post("/ingest", (c) => + fail( + c, + 501, + "Smart-ingest not implemented in this Worker. Use /capture for single-thought writes.", + ), +); diff --git a/integrations/cloudflare-rest-worker/src/routes/search.ts b/integrations/cloudflare-rest-worker/src/routes/search.ts new file mode 100644 index 00000000..1a90692f --- /dev/null +++ b/integrations/cloudflare-rest-worker/src/routes/search.ts @@ -0,0 +1,163 @@ +import { Hono } from "hono"; +import type { Env } from "../lib/types"; +import { supabaseFor } from "../lib/supabase"; +import { fail, fromError } from "../lib/responses"; +import { generateEmbedding } from "../lib/embedding"; + +// match_thoughts (defined in docs/01-getting-started.md Step 2.3) returns the +// minimum vector-similarity payload — id, content, metadata, similarity, +// created_at — but NOT sensitivity_tier or the other enhanced columns the +// dashboard's Thought type expects. So semantic mode does two queries: +// 1. match_thoughts(emb, threshold, count, filter) → candidate IDs + +// similarity scores +// 2. SELECT * FROM thoughts WHERE id IN (...) [AND sensitivity_tier != +// 'restricted'] → full row payloads +// Then we stitch similarity back onto the rows by id and order. + +const RESTRICTED_TIER = "restricted"; + +// We over-fetch from match_thoughts so post-filtering restricted rows still +// leaves enough survivors to return `limit` results. 
3x is generous for typical
+// "10% restricted" data; users with very high restricted ratios may see fewer
+// results than `limit` — not incorrect, just lossy. Acceptable for v1.
+const SEMANTIC_OVERFETCH = 3;
+// text-embedding-3-small produces cosine similarities in the ~0.2–0.5 range
+// for clearly related content; only near-paraphrases climb above 0.5. A 0.2
+// default is conservative enough to drop unrelated noise without starving
+// real queries. Callers can override via body.threshold.
+const DEFAULT_MATCH_THRESHOLD = 0.2;
+
+interface SearchBody {
+  query?: string;
+  mode?: "semantic" | "text";
+  limit?: number;
+  page?: number;
+  exclude_restricted?: boolean;
+  threshold?: number;
+}
+
+export const search = new Hono<{ Bindings: Env }>();
+
+search.post("/search", async (c) => {
+  try {
+    const body = (await c.req.json().catch(() => ({}))) as SearchBody;
+    const query = (body.query ?? "").trim();
+    if (!query) return fail(c, 400, "query is required");
+
+    const mode = body.mode === "text" ? "text" : "semantic";
+    const limit = clampInt(body.limit, 25, 1, 100);
+    const page = Math.max(1, Math.floor(body.page ?? 1));
+    const excludeRestricted = body.exclude_restricted !== false;
+    const offset = (page - 1) * limit;
+    const sb = supabaseFor(c.env);
+
+    if (mode === "text") {
+      // Over-fetch when restricted-filtering so we don't end up short. Cap at 200
+      // to keep the GIN scan bounded.
+      const fetchLimit = excludeRestricted
+        ? Math.min(limit * SEMANTIC_OVERFETCH, 200)
+        : limit;
+      const { data, error } = await sb.rpc("search_thoughts_text", {
+        p_query: query,
+        p_limit: fetchLimit,
+        p_filter: {},
+        p_offset: offset,
+      });
+      if (error) return fail(c, 500, error.message);
+
+      type Row = Record<string, unknown> & {
+        sensitivity_tier?: string;
+        rank?: number;
+        total_count?: number;
+      };
+      const rows = (data ?? []) as Row[];
+      const filtered = excludeRestricted
+        ?
rows.filter((r) => r.sensitivity_tier !== RESTRICTED_TIER)
+        : rows;
+      const sliced = filtered.slice(0, limit);
+      // search_thoughts_text returns total_count on each row (denormalized).
+      // Pull it from the first row; fall back to filtered length if absent.
+      const total = Number(rows[0]?.total_count ?? filtered.length);
+      const results = sliced.map(({ total_count: _t, ...rest }) => rest);
+
+      return c.json({
+        results,
+        count: results.length,
+        total,
+        page,
+        per_page: limit,
+        total_pages: Math.max(1, Math.ceil(total / limit)),
+        mode,
+      });
+    }
+
+    // Semantic mode: embed → match_thoughts → fetch full rows → stitch.
+    const embedding = await generateEmbedding(c.env, query);
+    const threshold =
+      typeof body.threshold === "number" && body.threshold >= 0 && body.threshold <= 1
+        ? body.threshold
+        : DEFAULT_MATCH_THRESHOLD;
+    const { data: matches, error: matchErr } = await sb.rpc("match_thoughts", {
+      query_embedding: embedding,
+      match_threshold: threshold,
+      match_count: limit * SEMANTIC_OVERFETCH,
+      filter: {},
+    });
+    if (matchErr) return fail(c, 500, matchErr.message);
+
+    const candidates = (matches ?? []) as Array<{ id: string; similarity: number }>;
+    if (candidates.length === 0) {
+      return c.json({
+        results: [],
+        count: 0,
+        total: 0,
+        page,
+        per_page: limit,
+        total_pages: 0,
+        mode,
+      });
+    }
+
+    const ids = candidates.map((m) => m.id);
+    let fullQuery = sb.from("thoughts").select("*").in("id", ids);
+    if (excludeRestricted) fullQuery = fullQuery.neq("sensitivity_tier", RESTRICTED_TIER);
+    const { data: fullRows, error: fullErr } = await fullQuery;
+    if (fullErr) return fail(c, 500, fullErr.message);
+
+    // Stitch similarity scores back, preserving match_thoughts order (most
+    // similar first). Drop rows that didn't survive the restricted filter.
+    const fullById = new Map<string, Record<string, unknown>>();
+    for (const r of fullRows ?? 
[]) fullById.set(r.id, r);
+    const stitched: Array<Record<string, unknown> & { similarity: number }> = [];
+    for (const m of candidates) {
+      const row = fullById.get(m.id);
+      if (!row) continue; // restricted, or deleted between calls
+      stitched.push({ ...row, similarity: m.similarity });
+      if (stitched.length >= limit) break;
+    }
+
+    // We don't know the absolute total without a second count query — and
+    // counting "all thoughts above the cosine threshold" is what match_thoughts
+    // already truncated. Report the candidate count as an approximation; for
+    // pagination the dashboard mainly uses page/per_page anyway.
+    const total = candidates.length;
+
+    return c.json({
+      results: stitched,
+      count: stitched.length,
+      total,
+      page,
+      per_page: limit,
+      total_pages: Math.max(1, Math.ceil(total / limit)),
+      mode,
+    });
+  } catch (err) {
+    return fromError(c, err);
+  }
+});
+
+function clampInt(value: unknown, fallback: number, min: number, max: number): number {
+  const n = typeof value === "number" ? value : Number(value);
+  if (!Number.isFinite(n)) return fallback;
+  return Math.min(max, Math.max(min, Math.floor(n)));
+}
diff --git a/integrations/cloudflare-rest-worker/src/routes/stats.ts b/integrations/cloudflare-rest-worker/src/routes/stats.ts
new file mode 100644
index 00000000..f1b7edd1
--- /dev/null
+++ b/integrations/cloudflare-rest-worker/src/routes/stats.ts
@@ -0,0 +1,49 @@
+import { Hono } from "hono";
+import type { Env } from "../lib/types";
+import { supabaseFor } from "../lib/supabase";
+import { boolParam, fail, fromError, intParam } from "../lib/responses";
+
+// brain_stats_aggregate (defined in schemas/enhanced-thoughts) returns
+// { total, top_types, top_topics } — the dashboard expects
+// { total_thoughts, window_days, types, top_topics }. 
So this route
+// reshapes the RPC payload into the dashboard's StatsResponse shape:
+// - total → total_thoughts
+// - top_types: [{ type, count }] → types: { [type]: count }
+// - top_topics: [{ topic, count }] → top_topics (passthrough)
+// - window_days: 0 → "all", else number → window_days
+export const stats = new Hono<{ Bindings: Env }>();
+
+interface StatsRpc {
+  total?: number;
+  top_types?: Array<{ type: string; count: number }>;
+  top_topics?: Array<{ topic: string; count: number }>;
+}
+
+stats.get("/stats", async (c) => {
+  try {
+    const days = intParam(c.req.query("days"), 30);
+    const excludeRestricted = boolParam(c.req.query("exclude_restricted"), true);
+    const sb = supabaseFor(c.env);
+
+    const { data, error } = await sb.rpc("brain_stats_aggregate", {
+      p_since_days: days,
+      p_exclude_restricted: excludeRestricted,
+    });
+    if (error) return fail(c, 500, error.message);
+
+    const payload = (data ?? {}) as StatsRpc;
+    const types: Record<string, number> = {};
+    for (const row of payload.top_types ?? []) {
+      if (row?.type) types[row.type] = row.count ?? 0;
+    }
+
+    return c.json({
+      total_thoughts: payload.total ?? 0,
+      window_days: days === 0 ? "all" : days,
+      types,
+      top_topics: payload.top_topics ?? [],
+    });
+  } catch (err) {
+    return fromError(c, err);
+  }
+});
diff --git a/integrations/cloudflare-rest-worker/src/routes/thoughts.ts b/integrations/cloudflare-rest-worker/src/routes/thoughts.ts
new file mode 100644
index 00000000..06e4133c
--- /dev/null
+++ b/integrations/cloudflare-rest-worker/src/routes/thoughts.ts
@@ -0,0 +1,155 @@
+import { Hono } from "hono";
+import type { Env } from "../lib/types";
+import { supabaseFor } from "../lib/supabase";
+import { boolParam, fail, fromError, intParam } from "../lib/responses";
+
+// Whitelist of columns the client is allowed to sort by. 
We pass `sort`
+// straight to PostgREST .order(); without a whitelist a malicious client
+// could pass arbitrary column names and probe the schema (low-risk under
+// service-role + RLS, but still — minimum-privilege applied to inputs).
+const SORTABLE_COLUMNS = new Set([
+  "created_at",
+  "updated_at",
+  "importance",
+  "quality_score",
+  "type",
+  "source_type",
+  "status",
+]);
+
+// The sensitivity_tier column (defined in schemas/enhanced-thoughts) reserves
+// the special value 'restricted'. exclude_restricted=true is the default; the
+// dashboard sets it to false only when a session has unlocked the restricted
+// view via passphrase.
+const RESTRICTED_TIER = "restricted";
+
+export const thoughts = new Hono<{ Bindings: Env }>();
+
+// GET /thoughts — paginated list with filters. Matches dashboards/.../lib/api.ts
+// fetchThoughts(): page, per_page, type, source_type, importance_min,
+// quality_score_max, sort, order, exclude_restricted. Also accepts `status`
+// for the kanban view (added by fetchKanbanThoughts).
+thoughts.get("/thoughts", async (c) => {
+  try {
+    const sb = supabaseFor(c.env);
+    const q = c.req.query();
+
+    const page = Math.max(1, intParam(q.page, 1));
+    const perPage = intParam(q.per_page, 25, 100);
+    const sort = q.sort && SORTABLE_COLUMNS.has(q.sort) ? q.sort : "created_at";
+    const order = q.order === "asc" ? 
"asc" : "desc"; + const excludeRestricted = boolParam(q.exclude_restricted, true); + + let query = sb + .from("thoughts") + .select("*", { count: "exact" }) + .order(sort, { ascending: order === "asc" }) + .range((page - 1) * perPage, page * perPage - 1); + + if (q.type) query = query.eq("type", q.type); + if (q.source_type) query = query.eq("source_type", q.source_type); + if (q.status) query = query.eq("status", q.status); + if (q.importance_min) { + const n = Number(q.importance_min); + if (Number.isFinite(n)) query = query.gte("importance", n); + } + if (q.quality_score_max !== undefined && q.quality_score_max !== "") { + const n = Number(q.quality_score_max); + if (Number.isFinite(n)) query = query.lte("quality_score", n); + } + if (excludeRestricted) query = query.neq("sensitivity_tier", RESTRICTED_TIER); + + const { data, error, count } = await query; + if (error) return fail(c, 500, error.message); + + return c.json({ + data: data ?? [], + total: count ?? 0, + page, + per_page: perPage, + }); + } catch (err) { + return fromError(c, err); + } +}); + +// GET /thought/:id — single row. Returns 404 if the row doesn't exist OR if +// it's restricted and the caller didn't opt in to including restricted rows. +thoughts.get("/thought/:id", async (c) => { + try { + const id = c.req.param("id"); + const excludeRestricted = boolParam(c.req.query("exclude_restricted"), true); + const sb = supabaseFor(c.env); + + let query = sb.from("thoughts").select("*").eq("id", id); + if (excludeRestricted) query = query.neq("sensitivity_tier", RESTRICTED_TIER); + + const { data, error } = await query.maybeSingle(); + if (error) return fail(c, 500, error.message); + if (!data) return fail(c, 404, "Thought not found"); + + return c.json(data); + } catch (err) { + return fromError(c, err); + } +}); + +// PUT /thought/:id — partial update. The dashboard sends any subset of +// content/type/importance/status. 
Anything else in the body is ignored
+// (defense against accidental schema bleed from the client).
+thoughts.put("/thought/:id", async (c) => {
+  try {
+    const id = c.req.param("id");
+    const body = (await c.req.json().catch(() => ({}))) as Record<string, unknown>;
+    const update: Record<string, unknown> = {};
+    if (typeof body.content === "string") update.content = body.content;
+    if (typeof body.type === "string") update.type = body.type;
+    if (typeof body.importance === "number") update.importance = body.importance;
+    // status: nullable — the dashboard sends `null` to clear the kanban status
+    if (body.status === null || typeof body.status === "string") {
+      update.status = body.status;
+      update.status_updated_at = new Date().toISOString();
+    }
+
+    if (Object.keys(update).length === 0) {
+      return fail(c, 400, "No updatable fields provided");
+    }
+
+    const sb = supabaseFor(c.env);
+    const { data, error } = await sb
+      .from("thoughts")
+      .update(update)
+      .eq("id", id)
+      .select("id")
+      .maybeSingle();
+    if (error) return fail(c, 500, error.message);
+    if (!data) return fail(c, 404, "Thought not found");
+
+    return c.json({
+      id: data.id,
+      action: "updated",
+      message: `Updated fields: ${Object.keys(update).join(", ")}`,
+    });
+  } catch (err) {
+    return fromError(c, err);
+  }
+});
+
+// DELETE /thought/:id — hard delete. The dashboard's audit/duplicates pages
+// rely on this. No soft-delete column exists in the schema; if we wanted
+// soft deletes we'd add a column rather than fake them here.
+thoughts.delete("/thought/:id", async (c) => {
+  try {
+    const id = c.req.param("id");
+    const sb = supabaseFor(c.env);
+    const { error, count } = await sb
+      .from("thoughts")
+      .delete({ count: "exact" })
+      .eq("id", id);
+    if (error) return fail(c, 500, error.message);
+    if ((count ?? 
0) === 0) return fail(c, 404, "Thought not found");
+    return c.body(null, 204);
+  } catch (err) {
+    return fromError(c, err);
+  }
+});
diff --git a/integrations/cloudflare-rest-worker/tsconfig.json b/integrations/cloudflare-rest-worker/tsconfig.json
new file mode 100644
index 00000000..f30d7600
--- /dev/null
+++ b/integrations/cloudflare-rest-worker/tsconfig.json
@@ -0,0 +1,18 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "module": "ES2022",
+    "moduleResolution": "Bundler",
+    "lib": ["ES2022"],
+    "types": ["@cloudflare/workers-types"],
+    "strict": true,
+    "noImplicitAny": true,
+    "noUnusedLocals": true,
+    "noUnusedParameters": true,
+    "esModuleInterop": true,
+    "resolveJsonModule": true,
+    "isolatedModules": true,
+    "skipLibCheck": true
+  },
+  "include": ["src/**/*.ts"]
+}
diff --git a/integrations/cloudflare-rest-worker/wrangler.toml.example b/integrations/cloudflare-rest-worker/wrangler.toml.example
new file mode 100644
index 00000000..815616b6
--- /dev/null
+++ b/integrations/cloudflare-rest-worker/wrangler.toml.example
@@ -0,0 +1,22 @@
+# Copy to wrangler.toml. The plain unprefixed config below is the default
+# environment — `wrangler deploy` will use it directly. To run more than one
+# instance (e.g. staging vs. production, or one Worker per Open Brain instance),
+# add [env.<name>] blocks and deploy with `wrangler deploy --env <name>`.
+
+name = "ob-rest"
+main = "src/index.ts"
+compatibility_date = "2026-04-25"
+compatibility_flags = ["nodejs_compat"]
+
+# Secrets are set via `wrangler secret put` — never commit them. See README
+# step "Set secrets" for the four values this Worker needs:
+#   SUPABASE_URL
+#   SUPABASE_SERVICE_ROLE_KEY
+#   MCP_ACCESS_KEY
+#   OPENROUTER_API_KEY
+
+# Example: a second Worker for a separate Open Brain instance. Uncomment and
+# rename to deploy a parallel gateway.
+#
+# [env.staging]
+# name = "ob-rest-staging"
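The subtle part of the semantic mode in `src/routes/search.ts` is the stitch step: `match_thoughts` returns `(id, similarity)` pairs in similarity order, while the follow-up `SELECT` returns full rows in arbitrary order and may be missing restricted rows. A minimal standalone sketch of that logic, assuming a hypothetical `stitch` helper (the route inlines this loop):

```typescript
// Illustrative extraction of the semantic-mode stitch step; `stitch` and the
// Row/Match shapes here are assumptions for the sketch, not patch exports.
interface Match {
  id: string;
  similarity: number;
}

type Row = { id: string } & Record<string, unknown>;

function stitch(
  candidates: Match[], // match_thoughts output, most similar first
  fullRows: Row[], // full-row fetch: arbitrary order, restricted rows removed
  limit: number,
): Array<Row & { similarity: number }> {
  const byId = new Map<string, Row>();
  for (const r of fullRows) byId.set(r.id, r);

  const out: Array<Row & { similarity: number }> = [];
  for (const m of candidates) {
    const row = byId.get(m.id);
    if (!row) continue; // restricted, or deleted between the two queries
    out.push({ ...row, similarity: m.similarity });
    if (out.length >= limit) break;
  }
  return out;
}
```

The ranking comes entirely from `candidates`, never from database result order, so similarity ordering survives the second query even though SQL `IN (...)` gives no ordering guarantee.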
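The reshaping done by `src/routes/stats.ts` (RPC `{ total, top_types, top_topics }` into the dashboard's `{ total_thoughts, window_days, types, top_topics }`) is mechanical enough to express as a pure function. A sketch under the assumption that the field names match the comments in that file; `reshapeStats` itself is hypothetical, the route does this inline:

```typescript
// Illustrative pure-function version of the /stats reshaping.
interface StatsRpc {
  total?: number;
  top_types?: Array<{ type: string; count: number }>;
  top_topics?: Array<{ topic: string; count: number }>;
}

interface StatsResponse {
  total_thoughts: number;
  window_days: number | "all";
  types: Record<string, number>;
  top_topics: Array<{ topic: string; count: number }>;
}

function reshapeStats(payload: StatsRpc, days: number): StatsResponse {
  // top_types: [{ type, count }] array → types: { [type]: count } record
  const types: Record<string, number> = {};
  for (const row of payload.top_types ?? []) {
    if (row?.type) types[row.type] = row.count ?? 0;
  }
  return {
    total_thoughts: payload.total ?? 0,
    window_days: days === 0 ? "all" : days, // days=0 means "no window"
    types,
    top_topics: payload.top_topics ?? [],
  };
}
```

Keeping the reshape separate from the RPC call would also make the route unit-testable without a Supabase client, which is one argument for factoring it out in a later revision.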