The governance gateway between AI agents and the business systems they were never supposed to touch unsupervised.
Recent receipts. February 2026: Meta's Director of AI Alignment lost 200+ emails to an autonomous agent that ignored her stop commands. July 2025: Replit's AI deleted a live production database during a declared code freeze and lied about recovery. 97% of enterprises expect a major AI-agent incident within the next 12 months; only 14.4% of agents reach production with full security review.
The agents are working. The safety surface around them isn't.
Semantic GPS sits between agents and any MCP-connected system as a single control plane. It provides a live shadow→enforce policy swap (the observe-before-acting pattern those incidents lacked), an audit record on every call, saga rollback with explicit per-step input mapping, a Tool Relationship (TRel) MCP extension for workflow discovery, and a side-by-side Playground that contrasts raw MCP with the governed gateway under identical Opus 4.7 prompts.
Built for the Anthropic "Keep Thinking" Hackathon (April 2026). 5-day scope.
- Live demo: https://semantic-gps-hackathon.vercel.app/
- Demo video: https://www.youtube.com/watch?v=fYh2MpMw1ng
- Full story: docs/VISION.md
- Security: SECURITY.md
- Vendor-agnostic MCP gateway. Customers register their own MCP servers via `POST /api/servers` (HTTP-Streamable or OpenAPI). The gateway has zero hardcoded vendor knowledge. The demo recording happens to use a few real upstreams to prove end-to-end correctness; nothing is bundled.
- 12 gateway-native policies across 7 governance dimensions: time/state gates (`business_hours`, `write_freeze`), rate limiting, identity (`client_id`, `agent_identity_required`), residency (`ip_allowlist`, `geo_fence`), data hygiene (`pii_redaction` with libphonenumber-js, `injection_guard`), kill switches, idempotency.
- TRel extension methods on the gateway: `discover_relationships`, `find_workflow_path`, `validate_workflow`, `evaluate_goal`. Same JSON-RPC surface as standard MCP methods.
- Saga rollback with canonical per-step `rollback_input_mapping` DSL. Compensators get mapped args, not raw producer results.
- Playground A/B at `/dashboard/playground`. Same prompt, same Opus 4.7 client, two endpoints (raw MCP vs governed gateway). Honest variable isolation, no tool-count cheats.
- Shadow → enforce policy mode swap, demoed live from the Policies page.
- Three-tier scoped gateway: `/api/mcp` (org), `/api/mcp/domain/[slug]`, `/api/mcp/server/[id]`. Bearer-token auth, per-scope manifest caching.
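
To make the shadow → enforce swap concrete, here is a minimal TypeScript sketch of how a gateway could treat the same policy verdict differently depending on mode. The `Policy` shape, `evaluatePolicies`, and the `writeFreeze` example are hypothetical illustrations under that assumption, not the project's actual types or builtins.

```typescript
// Hypothetical types for illustration; not the project's real policy API.
type PolicyMode = "shadow" | "enforce";

interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

interface Policy {
  name: string;
  mode: PolicyMode;
  // Returns a violation reason, or null when the call is acceptable.
  check: (call: ToolCall) => string | null;
}

interface Decision {
  allowed: boolean;
  // Every violation is kept for the audit trail, whether or not it blocked.
  violations: { policy: string; mode: PolicyMode; reason: string }[];
}

function evaluatePolicies(policies: Policy[], call: ToolCall): Decision {
  const violations: Decision["violations"] = [];
  let allowed = true;
  for (const policy of policies) {
    let reason: string | null;
    try {
      reason = policy.check(call);
    } catch {
      // A crashing check is recorded as a violation rather than ignored,
      // so an enforce-mode policy that throws still blocks the call.
      reason = "policy check threw";
    }
    if (reason === null) continue;
    violations.push({ policy: policy.name, mode: policy.mode, reason });
    // Shadow mode only observes; enforce mode blocks.
    if (policy.mode === "enforce") allowed = false;
  }
  return { allowed, violations };
}

// Illustrative stand-in for a write freeze: flags any tool named delete_*.
const writeFreeze = (mode: PolicyMode): Policy => ({
  name: "write_freeze",
  mode,
  check: (call) => (call.tool.startsWith("delete_") ? "writes are frozen" : null),
});
```

In this sketch, flipping `mode` from `"shadow"` to `"enforce"` changes only whether a recorded violation blocks the call, which is what lets a policy be observed against live traffic before it starts rejecting anything.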
Requires: Node 20+, pnpm 10, Docker (for the local Supabase stack), openssl for key generation.
# 1. Install deps
pnpm install
# 2. Start local Supabase (Postgres + Auth on :54321)
pnpm supabase start
# 3. Wire up .env.local
cp .env.example .env.local
# fill in the local Supabase keys printed by `pnpm supabase start`,
# plus your ANTHROPIC_API_KEY and a freshly generated encryption key:
openssl rand -base64 32 # CREDENTIALS_ENCRYPTION_KEY
# 4. Apply migrations + seed the demo org
pnpm supabase db reset
# 5. (Optional) Load demo data: sample MCPs, tools, saga route, and policies that exercise the gateway end-to-end
docker exec -i supabase_db_semantic-gps-hackathon \
psql -U postgres -d postgres -f /dev/stdin \
< scripts/bootstrap-local-demo.sql
# 6. Run the app
pnpm dev
# http://localhost:3000

| Var | Purpose |
|---|---|
| `NEXT_PUBLIC_SUPABASE_URL` | Supabase project URL (local `http://127.0.0.1:54321` or hosted) |
| `NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY` | Supabase publishable key (2026 format: `sb_publishable_…`) |
| `SUPABASE_SECRET_KEY` | Supabase service-role key (gateway-only; `sb_secret_…`) |
| `ANTHROPIC_API_KEY` | Required for the Playground agent loops |
| `PLAYGROUND_MODEL` | Playground model ID (`claude-sonnet-4-6` by default) |
| `EVALUATE_GOAL_MODEL` | TRel `evaluate_goal` ranker (`claude-opus-4-7`) |
| `NEXT_PUBLIC_APP_URL` | Absolute app URL. Set to your Cloudflare tunnel URL when testing the Playground agent against local. |
| `CREDENTIALS_ENCRYPTION_KEY` | AES-256-GCM key for `servers.auth_config` ciphertext. Generate with `openssl rand -base64 32` |
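
For readers unfamiliar with how a base64 key from `openssl rand -base64 32` is used for AES-256-GCM, the following is a minimal sketch built on Node's `node:crypto` module. The function names and the `iv.ciphertext.tag` envelope layout are assumptions made for illustration; the project's actual storage format for `auth_config` is not described here.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Decode the base64 key and verify it is exactly 32 bytes,
// the key size AES-256 requires.
function loadKey(base64Key: string): Buffer {
  const key = Buffer.from(base64Key, "base64");
  if (key.length !== 32) {
    throw new Error(`expected a 32-byte key, got ${key.length} bytes`);
  }
  return key;
}

// Hypothetical envelope layout: iv.ciphertext.tag, each part base64-encoded.
// "." never appears in standard base64, so it is a safe separator.
function encryptConfig(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12); // 96-bit nonce, the recommended size for GCM
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return [iv, ciphertext, tag].map((part) => part.toString("base64")).join(".");
}

function decryptConfig(envelope: string, key: Buffer): string {
  const [iv, ciphertext, tag] = envelope
    .split(".")
    .map((part) => Buffer.from(part, "base64"));
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  // final() throws if the tag does not match (wrong key or tampered data).
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

GCM is authenticated, so decrypting with the wrong key or a modified ciphertext fails loudly instead of returning garbage. A fresh random IV per encryption matters here, because reusing an IV with the same key undermines GCM's guarantees.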
The gateway is vendor-agnostic; nothing below is required to boot. The three demo upstreams (Salesforce / Slack / GitHub) are co-deployed at app/api/mcps/<vendor>/route.ts so the demo recording can govern real third-party traffic end-to-end. Bring your own MCPs (HTTP-Streamable or OpenAPI) and skip these entirely.
| Var | Purpose |
|---|---|
| `SF_LOGIN_URL` | Salesforce demo MCP, org base URL |
| `SF_CLIENT_ID` | Salesforce demo MCP, Connected App client id (Client Credentials flow) |
| `SF_CLIENT_SECRET` | Salesforce demo MCP, Connected App client secret |
| `SLACK_BOT_TOKEN` | Slack demo MCP, bot token (`xoxb-…`); scopes: `chat:write`, `users:read.email`, `channels:read` |
| `GITHUB_PAT` | GitHub demo MCP, classic PAT with `repo` scope (owner/repo come from each tool call, not env) |
| Var | Purpose |
|---|---|
| `SSRF_ALLOW_LOCALHOST=1` | Set in `.env.local`, leave unset on Vercel. The gateway's `proxyHttp` roundtrips through `safeFetch` for every upstream including the co-deployed vendor routes. With `origin_url=http://localhost:3000/...` in dev, the SSRF guard would block the hop without this flag. Prod uses the live HTTPS domain so the guard stays tight. |
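
The effect of that flag can be illustrated with a small sketch of an SSRF check that rejects loopback and private targets unless localhost is explicitly allowed. `isUpstreamAllowed` and its helpers are hypothetical names, and the sketch inspects only the literal hostname; a production guard such as the project's `safeFetch` may also need to handle DNS resolution and redirects, which are not covered here.

```typescript
// Hypothetical guard for illustration; checks the literal host only.
function isLoopback(host: string): boolean {
  return host === "localhost" || host === "[::1]" || /^127\.\d+\.\d+\.\d+$/.test(host);
}

// RFC 1918 private ranges plus the 169.254.0.0/16 link-local block.
function isPrivateIPv4(host: string): boolean {
  const match = /^(\d{1,3})\.(\d{1,3})\.\d{1,3}\.\d{1,3}$/.exec(host);
  if (!match) return false;
  const a = Number(match[1]);
  const b = Number(match[2]);
  return (
    a === 10 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 169 && b === 254)
  );
}

function isUpstreamAllowed(rawUrl: string, allowLocalhost: boolean): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // unparseable URLs are rejected outright
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;
  const host = url.hostname.toLowerCase();
  // Loopback is permitted only when the dev flag is on.
  if (isLoopback(host)) return allowLocalhost;
  // Private network ranges stay blocked regardless of the flag.
  return !isPrivateIPv4(host);
}
```

A caller would pass something like `process.env.SSRF_ALLOW_LOCALHOST === "1"` as the second argument, so the relaxation applies only where the flag is set and only to loopback addresses.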
| Var | Purpose |
|---|---|
| `NEXT_PUBLIC_ENABLE_DEMO_SIMULATORS=1` | Renders the simulator row on the dashboard graph page |
| `REAL_PROXY_ENABLED=0` | Forces the dispatcher's mock canned-data path (default is real upstreams) |
| `MANIFEST_INTROSPECTION_ENABLED=1` | Opens `/api/internal/manifest/invalidate` on prod (dev auto-opens via `NODE_ENV`) |
| `SEMANTIC_GPS_GATEWAY_URL` | Explicit gateway URL for the Playground runner; falls back to `NEXT_PUBLIC_APP_URL` |
The app never reads the `VERIFY_*` variables below at runtime; only the Vitest suites under `__tests__/` reference them. Set them in your shell when running the gated tests locally. The gated tests are skipped by default so CI stays fast.
| Var | Purpose |
|---|---|
| `VERIFY_ANTHROPIC=1` | Hits real Anthropic API (needs `ANTHROPIC_API_KEY`) |
| `VERIFY_REAL_PROXY=1` | Runs proxy tests against real upstreams |
| `VERIFY_INTEGRATIONS=1` | Umbrella flag for all live-upstream tests |
| `VERIFY_SALESFORCE=1` | Salesforce live tests (requires `SF_*`) |
| `VERIFY_SLACK=1` | Slack live tests (requires `SLACK_BOT_TOKEN`) |
| `VERIFY_GITHUB=1` | GitHub live tests (requires `GITHUB_PAT`) |
| `VERIFY_GATEWAY_URL` | Base URL for E2E gateway tests (e.g. tunnel URL) |
All runtime env helpers throw loudly on missing values. No silent production fallbacks.
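
A "throw loudly" env helper can be as small as the sketch below. The name `requireEnv` and its signature are hypothetical; the environment is passed in as a parameter so the helper is easy to test.

```typescript
// Hypothetical helper; name and signature are illustrative.
type Env = Record<string, string | undefined>;

function requireEnv(name: string, env: Env): string {
  const value = env[name];
  // Treat unset and blank values the same: both are configuration errors.
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}
```

Called as `requireEnv("ANTHROPIC_API_KEY", process.env)`, this fails at the first read instead of letting an empty string flow into a request and surface later as a confusing upstream error.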
- `pnpm dev`: Next.js on :3000 (Turbopack)
- `pnpm test`: Vitest suite (`__tests__/*.vitest.ts`), 344 pass / 2 skip
- `pnpm typecheck`: `tsc --noEmit`
- `pnpm lint`: ESLint
- `pnpm supabase start`: local Docker Postgres + Auth
- `pnpm supabase db reset`: re-apply all migrations + seed locally
- `pnpm supabase db push`: apply pending migrations to hosted (deploy-only)
Demo-day recording aides:
- `node scripts/cleanup-demo-data.mjs`: close stale GH issues, prune Slack bot messages, delete recent SF Tasks. Idempotent; run between recording takes.
- `scripts/bootstrap-local-demo.sql`: re-seed the 3-MCP / 12-tool / saga-route / policy demo set after `db reset`.
- MCP gateway (`app/api/mcp/**`): stateless `@modelcontextprotocol/sdk` server factory, HTTP-Streamable transport, JSON-RPC 2.0. Fresh `McpServer` per request; no in-memory session state.
- Manifest cache (`lib/manifest/cache.ts`): per-scope compiled view of servers / tools / policies / routes. Invalidated on every mutation route.
- Policy engine (`lib/policies/**`): 12 builtins, pre-call + post-call phases, shadow/enforce mode is a DB column flip. Fail-closed by convention.
- Proxy layer (`lib/mcp/proxy-*.ts`): per-transport dispatchers (openapi, salesforce, slack, github, direct-http). Decrypts `auth_config`, SSRF-guarded fetches, typed `ExecuteResult` union.
- Routes + sagas (`lib/mcp/execute-route.ts`): ordered step execution, explicit `input_mapping` + `rollback_input_mapping` DSLs, `compensated_by` traversal on halt, shared `traceId` audit chain.
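
The saga behaviour described in the last item (ordered steps, then compensation in reverse with mapped arguments) can be sketched as follows. The field names `compensated_by` and `rollback_input_mapping` come from the description above, but their shapes, the dotted-path mapping syntax, and the `runSaga` function are assumptions made for illustration, not the project's actual DSL.

```typescript
type Json = Record<string, unknown>;

interface Step {
  tool: string;
  input: Json;
  compensated_by?: string; // tool to call when rolling this step back
  // Assumed mapping syntax: compensator arg name -> dotted path into
  // { input, result } of the step being compensated.
  rollback_input_mapping?: Record<string, string>;
}

interface SagaOutcome {
  ok: boolean;
  compensated: string[]; // compensator tools that ran, in execution order
}

type ToolRunner = (tool: string, args: Json) => Promise<Json>;

function resolvePath(source: unknown, path: string): unknown {
  return path.split(".").reduce<unknown>(
    (node, key) =>
      node !== null && typeof node === "object" ? (node as Json)[key] : undefined,
    source,
  );
}

async function runSaga(steps: Step[], run: ToolRunner): Promise<SagaOutcome> {
  const done: { step: Step; result: Json }[] = [];
  try {
    for (const step of steps) {
      done.push({ step, result: await run(step.tool, step.input) });
    }
    return { ok: true, compensated: [] };
  } catch {
    // On halt, walk completed steps newest-first and call each compensator
    // with explicitly mapped args rather than the raw producer result.
    const compensated: string[] = [];
    for (const { step, result } of done.reverse()) {
      if (!step.compensated_by) continue;
      const args: Json = {};
      for (const [arg, path] of Object.entries(step.rollback_input_mapping ?? {})) {
        args[arg] = resolvePath({ input: step.input, result }, path);
      }
      await run(step.compensated_by, args);
      compensated.push(step.compensated_by);
    }
    return { ok: false, compensated };
  }
}
```

The explicit mapping is the point of the design: a compensator such as a hypothetical `close_issue` tool usually needs a different argument shape than whatever the producing step returned, so each step states which fields feed its rollback. This sketch lets a failing compensator propagate its error; a real implementation would need a policy for that case.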
Full stack reference in docs/ARCHITECTURE.md.
Sprint-by-sprint build log in TASKS.md.
Roadmap and post-hackathon vision in docs/VISION.md.
MIT. See LICENSE.
Built by @mboss37. Claude Opus 4.7 1M-context used throughout the build loop.
