From b72a9124567863c94e2a7d13a2fa36f642cfc75e Mon Sep 17 00:00:00 2001
From: Willis Kirkham
Date: Mon, 16 Mar 2026 17:19:10 -0700
Subject: [PATCH 1/5] Updates plan with AUTH_MODE naming and living document instructions

Reverts agent-proposed BANKSY_MODE (internal/public/dev) to existing AUTH_MODE (sso-proxy/mural-oauth/dev) for migration simplicity. Adds "How to use this plan" section establishing the document as a living plan that should be revised in-place during implementation.
---
 .../00-migration-execution-strategy.md | 65 +++++++++++--------
 1 file changed, 38 insertions(+), 27 deletions(-)

diff --git a/fastmcp-migration/execution-strategy-research/00-migration-execution-strategy.md b/fastmcp-migration/execution-strategy-research/00-migration-execution-strategy.md
index d6daa3e..a2bf3a1 100644
--- a/fastmcp-migration/execution-strategy-research/00-migration-execution-strategy.md
+++ b/fastmcp-migration/execution-strategy-research/00-migration-execution-strategy.md
@@ -1,9 +1,18 @@

 # Banksy xmcp-to-FastMCP Migration

+## How to use this plan
+
+This is a **living document**. It reflects the intended final state of the migration and should be updated in-place as implementation reveals better approaches, new constraints, or scope changes.
+
+- When deviating from the plan during implementation, **update the plan first** before proceeding. The plan should always describe what we're actually building, not what we originally thought we'd build.
+- Mark revisions inline with a brief **`Revised:`** annotation so readers can tell what changed and why (e.g. *"Revised: switched from X to Y because Z"*). Don't silently overwrite — the revision trail is useful context.
+- The `.plan.md` is the working copy. The `.md` preview on the PR is a sharing snapshot and does not need to stay in sync during implementation.
+- Research documents linked in the Deep Research Index are **not revised** — they capture point-in-time analysis. If a research finding turns out to be wrong, note it in the plan rather than editing the research doc.
+
 ## Summary

-Rewrite banksy from a 3-process TypeScript/xmcp architecture to a Python/FastMCP server. `BANKSY_MODE` (internal/public/dev) selects the auth provider and tool set at runtime — one Docker image, multiple deployments. Two `FastMCP.from_openapi()` calls replace both `banksy-mural-api` (internal API, 39 tools) and `banksy-public-api` (Public API, 87 tools) code-gen pipelines. Auth uses FastMCP's built-in OAuth with Google as the initial IdP for Layer 1 (IDE to banksy) plus custom Python for Layer 2 (banksy to Mural API) token management. Database is a fresh PostgreSQL schema (no data migration). A React SPA is preserved for browser-facing pages (home, Session Activation, error) and served from the same process via Starlette's `StaticFiles`.
+Rewrite banksy from a 3-process TypeScript/xmcp architecture to a Python/FastMCP server. `AUTH_MODE` (sso-proxy/mural-oauth/dev) selects the auth provider and tool set at runtime — one Docker image, multiple deployments. Two `FastMCP.from_openapi()` calls replace both `banksy-mural-api` (internal API, 39 tools) and `banksy-public-api` (Public API, 87 tools) code-gen pipelines. Auth uses FastMCP's built-in OAuth with Google as the initial IdP for Layer 1 (IDE to banksy) plus custom Python for Layer 2 (banksy to Mural API) token management. Database is a fresh PostgreSQL schema (no data migration). A React SPA is preserved for browser-facing pages (home, Session Activation, error) and served from the same process via Starlette's `StaticFiles`.

 The repo uses a uv workspace structure under `pypackages/` — only `banksy-server` is created now. The workspace is ready to expand with `banksy-shared` (extracted shared code) and `banksy-harness` (agent orchestration) when those consumers are needed. Existing TS code in `packages/` stays as read-only reference until the final cleanup removes all TypeScript artifacts.

@@ -19,9 +28,9 @@ graph TD
 Core -.->|"code-gen at build time"| PublicAPI
 end

- subgraph after ["Target (FastMCP, 1 image, BANKSY_MODE per deploy)"]
- ClientInt["LLM Client"] -->|"MCP HTTP"| InternalDeploy["banksy BANKSY_MODE=internal"]
- ClientPub["LLM Client"] -->|"MCP HTTP"| PublicDeploy["banksy BANKSY_MODE=public"]
+ subgraph after ["Target (FastMCP, 1 image, AUTH_MODE per deploy)"]
+ ClientInt["LLM Client"] -->|"MCP HTTP"| InternalDeploy["banksy AUTH_MODE=sso-proxy"]
+ ClientPub["LLM Client"] -->|"MCP HTTP"| PublicDeploy["banksy AUTH_MODE=mural-oauth"]
 InternalDeploy -->|REST| MURAL2I["Mural API (internal)"]
 PublicDeploy -->|REST| MURAL2P["Mural API (public)"]
 Browser["Browser"] -->|"SPA + auth routes"| PublicDeploy
@@ -53,7 +62,7 @@ graph LR

 | Phase | What It Delivers | Depends On | Parallelism |
 |-------|-----------------|------------|-------------|
-| 1 Bootstrap | uv workspace skeleton (root + `banksy-server` under `pypackages/`), echo tool, health endpoint, `BANKSY_MODE` config, CI | Nothing | -- |
+| 1 Bootstrap | uv workspace skeleton (root + `banksy-server` under `pypackages/`), echo tool, health endpoint, `AUTH_MODE` config, CI | Nothing | -- |
 | 2 OpenAPI Tools | `from_openapi()` integration, Mural API tools | 1 | Parallel with 4, 5 |
 | 3 Tool Curation | LLM-friendly names, descriptions, transforms, composites | 2 | -- |
 | 4 Database | PostgreSQL schema, Alembic migrations, token storage | 1 | Parallel with 2, 5 |
@@ -76,12 +85,12 @@ Two hard constraints shape the FastMCP migration:

 ### Deployment Mode Selection (Option E)

-Build one Docker image. At runtime, `BANKSY_MODE` selects the auth provider and tool set. Within each mode, tags provide finer-grained client-side filtering.
+Build one Docker image. At runtime, `AUTH_MODE` selects the auth provider and tool set. Within each mode, tags provide finer-grained client-side filtering.

 ```
-BANKSY_MODE=internal -> FastMCP(auth=SSOProxyAuth) + internal tools + internal tags
-BANKSY_MODE=public -> FastMCP(auth=MuralOAuthAuth) + public tools + public tags
-BANKSY_MODE=dev -> FastMCP(auth=None) + all tools + all tags
+AUTH_MODE=sso-proxy -> FastMCP(auth=SSOProxyAuth) + internal tools + internal tags
+AUTH_MODE=mural-oauth -> FastMCP(auth=MuralOAuthAuth) + public tools + public tags
+AUTH_MODE=dev -> FastMCP(auth=None) + all tools + all tags
 ```

 ### Startup Flow
@@ -96,10 +105,10 @@ def create_server() -> FastMCP:
     register_common_routes(mcp) # /health, /version

     match settings.banksy_mode:
-        case "internal":
+        case "sso-proxy":
             register_internal_tools(mcp)
             register_session_activation_routes(mcp)
-        case "public":
+        case "mural-oauth":
             register_public_tools(mcp)
             register_mural_oauth_routes(mcp)
         case "dev":
@@ -116,9 +125,9 @@ def create_server() -> FastMCP:

 ### Auth Provider per Mode

-**Internal mode (`sso-proxy`):** Layer 1 uses `OAuthProxy` or `RemoteAuthProvider` with Google IdP via SSO proxy. Layer 2 stores session JWTs in `mural_tokens`. Tools call `banksy-mural-api` (internal REST) with session JWTs.
+**sso-proxy mode:** Layer 1 uses `OAuthProxy` or `RemoteAuthProvider` with Google IdP via SSO proxy. Layer 2 stores session JWTs in `mural_tokens`. Tools call `banksy-mural-api` (internal REST) with session JWTs.

-**Public mode (`mural-oauth`):** Layer 1 uses `OAuthProxy` wrapping Mural's OAuth authorization server. Layer 2 stores Mural OAuth access/refresh tokens in `mural_tokens`. Tools call mural-api's public API with OAuth access tokens.
+**mural-oauth mode:** Layer 1 uses `OAuthProxy` wrapping Mural's OAuth authorization server. Layer 2 stores Mural OAuth access/refresh tokens in `mural_tokens`. Tools call mural-api's public API with OAuth access tokens.

 **Dev mode:** Layer 1 has no auth (`auth=None` or `StaticTokenVerifier`). Layer 2 tokens loaded from dev seed data. Both tool sets registered; backend URLs configurable.

@@ -163,8 +172,8 @@ banksy/
 │ ├── src/
 │ │ └── banksy_server/
 │ │ ├── __init__.py
-│ │ ├── server.py # Entry point: reads BANKSY_MODE, wires auth + domains
-│ │ ├── config.py # pydantic-settings with BANKSY_MODE, DB URLs, auth
+│ │ ├── server.py # Entry point: reads AUTH_MODE, wires auth + domains
+│ │ ├── config.py # pydantic-settings with AUTH_MODE, DB URLs, auth
 │ │ ├── mural_api.py # FastMCP.from_openapi() integration
 │ │ ├── spa.py # SpaStaticFiles class
 │ │ ├── auth/ # providers.py, sso_proxy.py, mural_oauth.py, token_manager.py
@@ -339,12 +348,12 @@ Two separate `from_openapi()` sub-servers, one per API spec:
 - Filter to the operation IDs currently exposed by `banksy-public-api`
 - Uses standard OAuth tokens for all operations

-Both use `RouteMap`: GET → RESOURCE, POST/PUT/DELETE → TOOL, deprecated/internal → EXCLUDE. Each mounts onto the server within its respective `BANKSY_MODE` — `mount()` organizes tools by namespace within a single mode, not across auth modes (see Server Topology).
+Both use `RouteMap`: GET → RESOURCE, POST/PUT/DELETE → TOOL, deprecated/internal → EXCLUDE. Each mounts onto the server within its respective `AUTH_MODE` — `mount()` organizes tools by namespace within a single mode, not across auth modes (see Server Topology).

-**Phasing**: Start with the Public API spec in Phase 2 (when `BANKSY_MODE=public` or `dev`). Add the internal API spec as a follow-on (when `BANKSY_MODE=internal` or `dev`). The plumbing is identical — `from_openapi()` is called with different specs and different httpx clients (different base URLs, different auth injection per mode).
+**Phasing**: Start with the Public API spec in Phase 2 (when `AUTH_MODE=mural-oauth` or `dev`). Add the internal API spec as a follow-on (when `AUTH_MODE=sso-proxy` or `dev`). The plumbing is identical — `from_openapi()` is called with different specs and different httpx clients (different base URLs, different auth injection per mode).

 ```python
-# In BANKSY_MODE=public (or dev)
+# In AUTH_MODE=mural-oauth (or dev)
 public_api = FastMCP.from_openapi(
     openapi_spec=public_spec,
     client=public_http_client,
@@ -353,7 +362,7 @@ public_api = FastMCP.from_openapi(
 )
 mcp.mount(public_api, namespace="mural")

-# In BANKSY_MODE=internal (or dev)
+# In AUTH_MODE=sso-proxy (or dev)
 internal_api = FastMCP.from_openapi(
     openapi_spec=internal_spec,
     client=internal_http_client,
@@ -427,9 +436,9 @@ mcp.enable(tags={"murals"}, only=True) # Mural-focused deployment

 ### Deployment Modes (Resolved)

-Mode merging is not recommended. `BANKSY_MODE` is preserved as a runtime configuration flag. Auth modes are capability constraints — internal and public tools call different APIs with incompatible token types. FastMCP's one-auth-per-server constraint means a single server cannot cleanly handle multiple auth strategies. MCP clients support multiple servers, so separate deployments per auth mode is transparent to users.
+Mode merging is not recommended. `AUTH_MODE` is preserved as a runtime configuration flag. Auth modes are capability constraints — internal and public tools call different APIs with incompatible token types. FastMCP's one-auth-per-server constraint means a single server cannot cleanly handle multiple auth strategies. MCP clients support multiple servers, so separate deployments per auth mode is transparent to users.

-The current two TS Dockerfiles (`Dockerfile` for sso-proxy, `Dockerfile.mural-oauth` for mural-oauth) are replaced by a single `Dockerfile.server` — the mode is runtime config (`BANKSY_MODE` env var), not build-time. See Server Topology for the full design.
+The current two TS Dockerfiles (`Dockerfile` for sso-proxy, `Dockerfile.mural-oauth` for mural-oauth) are replaced by a single `Dockerfile.server` — the mode is runtime config (`AUTH_MODE` env var), not build-time. See Server Topology for the full design.

 ---

@@ -612,7 +621,7 @@ This `get_authenticated_user` helper belongs in the auth module and is reused ac
 | (new) `IDP_ISSUER` | Expected JWT issuer |
 | (new) `IDP_AUDIENCE` | Expected JWT audience |
 | (new) `IDP_AUTHORIZATION_SERVER` | IdP URL for PRM metadata |
-| (new) `BANKSY_MODE` | `internal`, `public`, or `dev` — selects auth provider and tool set (see Server Topology) |
+| (new) `AUTH_MODE` | `sso-proxy`, `mural-oauth`, or `dev` — selects auth provider and tool set (see Server Topology) |
 | (new) `ENABLED_TAGS` | Optional comma-separated tag filter for specialized deployments (e.g., `read`) |

 **Key TS reference**:
@@ -765,7 +774,7 @@ pypackages/server/src/banksy_server/domains/
 └── tools.py
 ```

-Each domain's `register_*_tools(mcp)` function takes a `FastMCP` instance and registers all tools for that domain, including tags and metadata. The domain owns its tool definitions, schemas, and any domain-specific helpers. `server.py` calls the appropriate registration functions based on `BANKSY_MODE` (see Server Topology).
+Each domain's `register_*_tools(mcp)` function takes a `FastMCP` instance and registers all tools for that domain, including tags and metadata. The domain owns its tool definitions, schemas, and any domain-specific helpers. `server.py` calls the appropriate registration functions based on `AUTH_MODE` (see Server Topology).

 ### from_openapi() in Domain Context

@@ -789,7 +798,7 @@ def register_public_tools(mcp: FastMCP) -> None:

 ### Routes by Concern

-Non-MCP HTTP routes (`routes/`) are organized by concern, not by mode. Mode-specific routes are registered conditionally in `server.py` based on `BANKSY_MODE` — for example, Session Activation routes are only registered in `internal` and `dev` modes.
+Non-MCP HTTP routes (`routes/`) are organized by concern, not by mode. Mode-specific routes are registered conditionally in `server.py` based on `AUTH_MODE` — for example, Session Activation routes are only registered in `sso-proxy` and `dev` modes.

 ### Canvas-MCP Absorption

@@ -908,7 +917,7 @@ pypackages/server/tests/
 ├── test_token_refresh.py # Token refresh logic
 ├── test_auth_flow.py # OAuth flow (HeadlessOAuth)
 ├── test_session_activation.py # Session Activation routes
-├── test_mode_selection.py # BANKSY_MODE startup paths
+├── test_mode_selection.py # AUTH_MODE startup paths
 └── test_integration/ # End-to-end tests
 ```

@@ -926,7 +935,7 @@ The TS codebase has ~15 Vitest test files. These should be reviewed as reference

 ### Dockerfile.server

-Workspace-aware multi-stage build using uv. One Docker image serves all modes — `BANKSY_MODE` is a runtime env var, not build-time. This replaces the two current TS Dockerfiles (`Dockerfile` for sso-proxy, `Dockerfile.mural-oauth` for mural-oauth).
+Workspace-aware multi-stage build using uv. One Docker image serves all modes — `AUTH_MODE` is a runtime env var, not build-time. This replaces the two current TS Dockerfiles (`Dockerfile` for sso-proxy, `Dockerfile.mural-oauth` for mural-oauth).

 ```dockerfile
 # Stage 1: Build SPA
@@ -1124,7 +1133,7 @@ banksy-shared = { workspace = true }
 - **Single workspace member → multi-member**: When a second Python service is needed (e.g., agent harness), add a directory under `pypackages/` with its own `pyproject.toml`. Extract shared code into `banksy-shared` at that time. The workspace glob auto-discovers new members.
 - **pre-commit → CI only**: If hooks cause friction during rapid iteration and the team is 1–2 developers, rely on CI alone.
 - **custom_route() → raw Starlette routing**: If HTTP routes grow complex, use `starlette.routing.Router` for grouping, `Mount` for sub-apps, or Starlette middleware wrappers for per-route concerns. FastAPI is a last resort — Starlette is already underneath FastMCP.
-- **`BANKSY_MODE` per-deployment → mode merging**: If a future need requires multi-auth in a single process, revisit Option B (protocol-level routing) or Option D (middleware-based auth) from the [server topology analysis](../banksy-research/tool-visibility-server-topology-research.md).
+- **`AUTH_MODE` per-deployment → mode merging**: If a future need requires multi-auth in a single process, revisit Option B (protocol-level routing) or Option D (middleware-based auth) from the [server topology analysis](../banksy-research/tool-visibility-server-topology-research.md).

 ---

@@ -1144,6 +1153,8 @@ Items from the canvas-mcp alignment assessment and architecture research that ar
 | 12 | When to extract `banksy-shared` | Trigger: when a second consumer (agent harness) needs shared code (models, auth utils, Mural client) | Open (deferred) |
 | 13 | When to create `banksy-harness` | Trigger: when agent orchestration work begins | Open (deferred) |

+**Future naming consideration:** The research documents proposed renaming `AUTH_MODE` to something like `BANKSY_MODE` with more semantic values (`internal`/`public`/`dev`). That has merit for clarity, but we keep the existing naming (`AUTH_MODE` with `sso-proxy`/`mural-oauth`/`dev`) for migration simplicity. Consider revisiting the rename once the migration stabilizes.
+
 ---

 ## Deep Research Index

From f2bdabafa9d1f02c76b941bed765a9d08df53a95 Mon Sep 17 00:00:00 2001
From: Willis Kirkham
Date: Wed, 18 Mar 2026 14:59:33 -0700
Subject: [PATCH 2/5] Converts migration plan to standalone roadmap

Strips Cursor plan frontmatter, renames to plain .md, updates "how to use" section to reference phase-specific .plan.md files for execution.
---
 .../00-migration-execution-strategy.md | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/fastmcp-migration/execution-strategy-research/00-migration-execution-strategy.md b/fastmcp-migration/execution-strategy-research/00-migration-execution-strategy.md
index a2bf3a1..36ab7a0 100644
--- a/fastmcp-migration/execution-strategy-research/00-migration-execution-strategy.md
+++ b/fastmcp-migration/execution-strategy-research/00-migration-execution-strategy.md
@@ -1,14 +1,13 @@
-
 # Banksy xmcp-to-FastMCP Migration

-## How to use this plan
+## How to use this roadmap

 This is a **living document**. It reflects the intended final state of the migration and should be updated in-place as implementation reveals better approaches, new constraints, or scope changes.

-- When deviating from the plan during implementation, **update the plan first** before proceeding. The plan should always describe what we're actually building, not what we originally thought we'd build.
+- When deviating from the roadmap during implementation, **update the roadmap first** before proceeding. It should always describe what we're actually building, not what we originally thought we'd build.
 - Mark revisions inline with a brief **`Revised:`** annotation so readers can tell what changed and why (e.g. *"Revised: switched from X to Y because Z"*). Don't silently overwrite — the revision trail is useful context.
-- The `.plan.md` is the working copy. The `.md` preview on the PR is a sharing snapshot and does not need to stay in sync during implementation.
-- Research documents linked in the Deep Research Index are **not revised** — they capture point-in-time analysis. If a research finding turns out to be wrong, note it in the plan rather than editing the research doc.
+- Implementation is driven by **phase-specific `.plan.md` files** created from this roadmap when starting each phase. The roadmap defines what to build; phase plans define how to execute each chunk.
+- Research documents linked in the Deep Research Index are **not revised** — they capture point-in-time analysis. If a research finding turns out to be wrong, note it in the roadmap rather than editing the research doc.

 ## Summary

From 619e9b47a81aa913b7b4ba4266c6d5d390b28a90 Mon Sep 17 00:00:00 2001
From: Willis Kirkham
Date: Mon, 23 Mar 2026 12:36:21 -0700
Subject: [PATCH 3/5] Adds security team tradeoff review for RS migration paths

---
 .../security-oauthproxy-tradeoff-review.md | 116 ++++++++++++++++++
 1 file changed, 116 insertions(+)
 create mode 100644 fastmcp-migration/security-oauthproxy-tradeoff-review.md

diff --git a/fastmcp-migration/security-oauthproxy-tradeoff-review.md b/fastmcp-migration/security-oauthproxy-tradeoff-review.md
new file mode 100644
index 0000000..787a9ee
--- /dev/null
+++ b/fastmcp-migration/security-oauthproxy-tradeoff-review.md
@@ -0,0 +1,116 @@
+# MCP Server Auth Migration: Tradeoff Review for Security Team
+
+**Author:** Willis Kirkham
+
+**Date:** March 2026
+
+**Status:** Seeking Security Team Input
+
+---
+
+## Context
+
+In early 2026, the security team audited our MCP server and recommended that it stop acting as an OAuth Authorization Server and become a pure Resource Server that validates externally-issued tokens. We agree with the direction — eliminating AS surface reduces attack vectors (confused deputy, DCR abuse, code interception).
+
+Before we commit to the implementation path, we want to make sure the full cost picture is visible.
+
+We've identified three concrete paths. Each has different costs and different security outcomes. We'd like your assessment of whether the strongest option is required, or whether a meaningful partial improvement is acceptable as an intermediate posture.
+
+---
+
+## The Three Paths
+
+### Path 1: Pure RS with External IdP (Auth0 or Descope)
+
+The MCP server serves only Protected Resource Metadata (RFC 9728), validates JWTs from an external identity provider, and has zero authorization endpoints. This is the audit recommendation fully implemented.
+
+**Costs:**
+
+- **Requires purchasing an external identity provider.** Mural's authentication infrastructure doesn't currently meet the requirements to serve as an IdP for the RS model — it issues HS256 tokens with no JWKS endpoint and no OAuth discovery metadata (Path 2 addresses this, but that work isn't planned). Until then, a third-party IdP (Auth0, Descope, or similar) is required to provide the protocol infrastructure the RS model needs: RS256 JWTs, JWKS, and OAuth discovery. We evaluated several providers; Auth0 and Descope are the leading candidates. Pricing ranges from free tiers for low usage to potentially significant subscription costs at scale — we don't have firm numbers yet and would need PoC work to determine the actual cost.
+- **User access risk for email/password users.** Enterprise SSO users can authenticate through their existing IdP directly. But email/password users have no external identity provider — the external IdP must be configured with Mural as a custom social connection to cover them. Without this, they lose access to the MCP server entirely. This is validated in concept but requires PoC work.
+- **Infrastructure complexity.** IdP configuration, custom social connection setup, possibly a token vault for single-step UX, and ongoing operational responsibility (monitoring, upgrades, incident response).
+
+**Security posture:** Strongest. No AS surface on the MCP server. Identity verification fully delegated to a purpose-built IdP.
+
+---
+
+### Path 2: Pure RS with Mural-API as the Authorization Server
+
+Instead of a third-party IdP, Mural's own API evolves to issue RS256 JWTs, expose a JWKS endpoint, and serve OAuth discovery metadata. The MCP server becomes a pure RS that validates Mural-issued tokens.
+
+**Costs:**
+
+- **Non-trivial changes to mural-api's auth infrastructure.** Requires migrating from HS256 to RS256 token signing, adding key management, exposing JWKS and discovery endpoints. This work touches every token validation path in the mural-api codebase.
+- **Not planned or prioritized.** This would be a new requirement that needs planning and priotization by R&D.
+- **Does not eliminate the AS risk — it moves it.** The same attack surface the audit identified (DCR, authorization code flow, token issuance) would exist in mural-api instead of the MCP server. The risk is arguably better managed there (larger team, more mature infrastructure), but it is not eliminated from the system.
+
+**Security posture:** Equivalent to Path 1 from the MCP server's perspective, but the AS surface still exists in mural-api.
+
+---
+
+### Path 3: OAuthProxy (Middle Ground)
+
+The MCP server uses FastMCP's OAuthProxy, which proxies OAuth flows to Mural's existing login endpoints. It serves Protected Resource Metadata (MCP spec compliant), issues its own RS256 JWTs, and stores Mural tokens server-side. Users authenticate with Mural using any login method they have today.
+
+**Costs:**
+
+- **The MCP server retains AS-like surface.** Authorization endpoints (`/authorize`, `/token`), DCR, and token issuance remain present — the audit's recommendation is only partially satisfied.
+- **No vendor cost, no cross-team dependency, no user access regression.** All Mural login methods continue to work. No subscription. No dependency on platform team prioritization.
+
+**Security posture:** Meaningfully improved over today (details below), but weaker than Paths 1 or 2 — authorization endpoints still exist.
+
+---
+
+### Comparison
+
+| | Path 1: External IdP | Path 2: Mural-API as AS | Path 3: OAuthProxy |
+|---|---|---|---|
+| AS surface on Banksy | Eliminated | Eliminated | Reduced, not eliminated |
+| AS surface in system | Eliminated | Moved to mural-api | Remains on Banksy |
+| Vendor cost | Ongoing subscription (TBD) | None | None |
+| Cross-team dependency | None (MCP team delivers) | Mural platform team | None (MCP team delivers) |
+| User access regression | None if IdP configured correctly | None | None |
+| Timeline risk | PoC needed (weeks) | Uncertain (months+) | Near-term |
+| MCP spec compliant | Yes | Yes | Yes |
+
+---
+
+## What OAuthProxy Improves Over Today
+
+The current system uses Better Auth with server-side sessions and HTTP-only cookies. OAuthProxy changes the security model in four specific ways:
+
+**Session management eliminated.** Better Auth maintains server-side sessions in a database and authenticates requests via HTTP-only cookies. OAuthProxy issues self-contained RS256 JWTs. No session store means no cookie theft, replay, or fixation risk.
+
+**Token validation is stateless and cryptographic.** Better Auth validates tokens by looking up sessions in Postgres. OAuthProxy validates JWT signatures against its own keys. This eliminates an entire class of session storage consistency bugs and removes the database as a runtime dependency for authentication.
+
+**Identity verification delegated.** Better Auth's callback creates local user records and issues tokens based on those records — the MCP server makes independent identity decisions. OAuthProxy's token issuance is directly tied to a successful upstream code exchange with Mural. No independent identity decisions.
+
+**PKCE and CIMD support.** OAuthProxy supports Proof Key for Code Exchange (mitigates authorization code interception) and Client ID Metadata Documents (reduces DCR abuse risk by allowing URL-based client identity verification instead of open registration).
+
+---
+
+## What OAuthProxy Does NOT Address
+
+To be clear about what remains:
+
+- **DCR endpoint still exists.** Clients can still register dynamically, though CIMD reduces the necessity for open DCR.
+- **Authorization code flow still runs through the MCP server.** The `/authorize` and `/token` endpoints are proxied but present.
+- **An attacker can still register clients and initiate authorization flows against the MCP server.** The attack surface is reduced (identity decisions are delegated, tokens are cryptographically verified) but not eliminated.
+
+---
+
+## The Question
+
+Given the costs of full RS migration — vendor dependency and subscription cost (Path 1), or unplanned cross-team platform work with uncertain timeline (Path 2) — **is the pure RS recommendation firm?**
+
+Or would OAuthProxy — which eliminates session management risks, adds cryptographic token validation, delegates identity verification to Mural, and satisfies the MCP specification — be an acceptable intermediate posture while we evaluate the long-term IdP path?
+
+We are not asking to skip the security improvements. We are asking whether a meaningful partial improvement now, followed by a planned evolution to full RS, is preferable to blocking on a vendor decision or cross-team dependency.
+
+---
+
+## Note on MCP Spec Compliance
+
+OAuthProxy satisfies the MCP specification's requirements: it serves Protected Resource Metadata (RFC 9728), issues audience-bound tokens, and does not pass client tokens through to upstream APIs. The spec explicitly allows the authorization server to be colocated with the resource server.
+
+The full RS migration is driven by the security audit, not by spec compliance. This distinction matters because the spec compliance timeline is more urgent — AI tools are actively dropping support for the legacy auth model — while the audit's AS-elimination recommendation can be phased. OAuthProxy addresses the urgent spec deadline. The question is whether it also addresses enough of the audit's concerns to be an acceptable intermediate state.

From cfe4ef383d70e95f8f1ba1cc0f268615edfd0a92 Mon Sep 17 00:00:00 2001
From: Willis Kirkham
Date: Mon, 23 Mar 2026 12:41:48 -0700
Subject: [PATCH 4/5] Tightens tradeoff review for conciseness

---
 .../security-oauthproxy-tradeoff-review.md | 105 +++++-------------
 1 file changed, 30 insertions(+), 75 deletions(-)

diff --git a/fastmcp-migration/security-oauthproxy-tradeoff-review.md b/fastmcp-migration/security-oauthproxy-tradeoff-review.md
index 787a9ee..f71c88c 100644
--- a/fastmcp-migration/security-oauthproxy-tradeoff-review.md
+++ b/fastmcp-migration/security-oauthproxy-tradeoff-review.md
@@ -1,10 +1,6 @@
 # MCP Server Auth Migration: Tradeoff Review for Security Team

-**Author:** Willis Kirkham
-
-**Date:** March 2026
-
-**Status:** Seeking Security Team Input
+**Author:** Willis Kirkham | **Date:** March 2026 | **Status:** Seeking Security Team Input

 ---

@@ -12,105 +8,64 @@

 In early 2026, the security team audited our MCP server and recommended that it stop acting as an OAuth Authorization Server and become a pure Resource Server that validates externally-issued tokens. We agree with the direction — eliminating AS surface reduces attack vectors (confused deputy, DCR abuse, code interception).

-Before we commit to the implementation path, we want to make sure the full cost picture is visible.
-
-We've identified three concrete paths. Each has different costs and different security outcomes. We'd like your assessment of whether the strongest option is required, or whether a meaningful partial improvement is acceptable as an intermediate posture.
+Before we commit to the implementation path, we want to make sure the full cost picture is visible. We've identified three paths and we'd like your assessment of whether the strongest option is required or whether a partial improvement is acceptable as an intermediate posture.

 ---

 ## The Three Paths

-### Path 1: Pure RS with External IdP (Auth0 or Descope)
-
-The MCP server serves only Protected Resource Metadata (RFC 9728), validates JWTs from an external identity provider, and has zero authorization endpoints. This is the audit recommendation fully implemented.
+### Path 1: Pure RS with External IdP

-**Costs:**
+The audit recommendation fully implemented — zero authorization endpoints, JWTs validated from an external IdP.

-- **Requires purchasing an external identity provider.** Mural's authentication infrastructure doesn't currently meet the requirements to serve as an IdP for the RS model — it issues HS256 tokens with no JWKS endpoint and no OAuth discovery metadata (Path 2 addresses this, but that work isn't planned). Until then, a third-party IdP (Auth0, Descope, or similar) is required to provide the protocol infrastructure the RS model needs: RS256 JWTs, JWKS, and OAuth discovery. We evaluated several providers; Auth0 and Descope are the leading candidates. Pricing ranges from free tiers for low usage to potentially significant subscription costs at scale — we don't have firm numbers yet and would need PoC work to determine the actual cost.
-- **User access risk for email/password users.** Enterprise SSO users can authenticate through their existing IdP directly. But email/password users have no external identity provider — the external IdP must be configured with Mural as a custom social connection to cover them. Without this, they lose access to the MCP server entirely. This is validated in concept but requires PoC work.
-- **Infrastructure complexity.** IdP configuration, custom social connection setup, possibly a token vault for single-step UX, and ongoing operational responsibility (monitoring, upgrades, incident response).
-
-**Security posture:** Strongest. No AS surface on the MCP server. Identity verification fully delegated to a purpose-built IdP.
- ---- +- **Requires purchasing an external IdP.** Mural's auth infrastructure doesn't currently meet the RS model's requirements (HS256 tokens, no JWKS, no OAuth discovery). A third-party IdP (Auth0, Descope) is needed. Pricing needs PoC work to determine but ranges from free tiers at low usage to potentially significant costs at scale. +- **User access risk for email/password users.** Enterprise SSO users authenticate through their existing IdP. But email/password users have no external IdP — the provider must be configured with Mural as a custom social connection to cover them, or they lose access entirely. Validated in concept but requires PoC. +- **Infrastructure complexity.** IdP configuration, custom social connection, possibly a token vault for single-step UX, ongoing operational responsibility. ### Path 2: Pure RS with Mural-API as the Authorization Server -Instead of a third-party IdP, Mural's own API evolves to issue RS256 JWTs, expose a JWKS endpoint, and serve OAuth discovery metadata. The MCP server becomes a pure RS that validates Mural-issued tokens. +Mural's API evolves to issue RS256 JWTs, expose JWKS, and serve OAuth discovery. The MCP server becomes a pure RS. -**Costs:** - -- **Non-trivial changes to mural-api's auth infrastructure.** Requires migrating from HS256 to RS256 token signing, adding key management, exposing JWKS and discovery endpoints. This work touches every token validation path in the mural-api codebase. -- **Not planned or prioritized.** This would be a new requirement that needs planning and priotization by R&D. -- **Does not eliminate the AS risk — it moves it.** The same attack surface the audit identified (DCR, authorization code flow, token issuance) would exist in mural-api instead of the MCP server. The risk is arguably better managed there (larger team, more mature infrastructure), but it is not eliminated from the system. 
- -**Security posture:** Equivalent to Path 1 from the MCP server's perspective, but the AS surface still exists in mural-api. - ---- +- **Non-trivial changes to mural-api.** HS256 → RS256 migration, key management, JWKS and discovery endpoints. Touches every token validation path. +- **Not planned or prioritized.** New requirement needing planning and prioritization by R&D. +- **Does not eliminate the AS risk — it moves it.** The same attack surface would exist in mural-api instead. Arguably better managed there, but not eliminated from the system. ### Path 3: OAuthProxy (Middle Ground) -The MCP server uses FastMCP's OAuthProxy, which proxies OAuth flows to Mural's existing login endpoints. It serves Protected Resource Metadata (MCP spec compliant), issues its own RS256 JWTs, and stores Mural tokens server-side. Users authenticate with Mural using any login method they have today. +The MCP server proxies OAuth flows to Mural's existing login endpoints via FastMCP's OAuthProxy. Serves Protected Resource Metadata, issues its own RS256 JWTs, stores Mural tokens server-side. All current login methods continue to work. -**Costs:** - -- **The MCP server retains AS-like surface.** Authorization endpoints (`/authorize`, `/token`), DCR, and token issuance remain present — the audit's recommendation is only partially satisfied. -- **No vendor cost, no cross-team dependency, no user access regression.** All Mural login methods continue to work. No subscription. No dependency on platform team prioritization. - -**Security posture:** Meaningfully improved over today (details below), but weaker than Paths 1 or 2 — authorization endpoints still exist. - ---- +- **Retains AS-like surface.** `/authorize`, `/token`, DCR, and token issuance remain present — audit recommendation only partially satisfied. 
+- **No vendor cost, no cross-team dependency, no user access regression.** ### Comparison -| | Path 1: External IdP | Path 2: Mural-API as AS | Path 3: OAuthProxy | +| | External IdP | Mural-API as AS | OAuthProxy | |---|---|---|---| -| AS surface on Banksy | Eliminated | Eliminated | Reduced, not eliminated | -| AS surface in system | Eliminated | Moved to mural-api | Remains on Banksy | -| Vendor cost | Ongoing subscription (TBD) | None | None | -| Cross-team dependency | None (MCP team delivers) | Mural platform team | None (MCP team delivers) | -| User access regression | None if IdP configured correctly | None | None | -| Timeline risk | PoC needed (weeks) | Uncertain (months+) | Near-term | -| MCP spec compliant | Yes | Yes | Yes | +| AS surface on MCP server | Eliminated | Eliminated | Reduced | +| AS surface in system | Eliminated | Moved to mural-api | Remains | +| Vendor cost | Ongoing (TBD) | None | None | +| Cross-team dependency | None | Mural platform team | None | +| User access regression | Risk without config | None | None | +| Timeline | PoC needed (weeks) | Uncertain (months+) | Near-term | --- -## What OAuthProxy Improves Over Today - -The current system uses Better Auth with server-side sessions and HTTP-only cookies. OAuthProxy changes the security model in four specific ways: +## OAuthProxy: What It Improves and What It Doesn't -**Session management eliminated.** Better Auth maintains server-side sessions in a database and authenticates requests via HTTP-only cookies. OAuthProxy issues self-contained RS256 JWTs. No session store means no cookie theft, replay, or fixation risk. +The current system uses Better Auth with server-side sessions and HTTP-only cookies. OAuthProxy improves this in three ways: -**Token validation is stateless and cryptographic.** Better Auth validates tokens by looking up sessions in Postgres. OAuthProxy validates JWT signatures against its own keys. 
This eliminates an entire class of session storage consistency bugs and removes the database as a runtime dependency for authentication. +1. **Stateless cryptographic tokens replace sessions.** RS256 JWTs validated by signature, not database lookup. Eliminates cookie theft/replay/fixation, session storage bugs, and the database as a runtime auth dependency. +2. **Identity verification delegated.** Token issuance tied to a successful upstream code exchange with Mural — no independent identity decisions. +3. **PKCE and CIMD support.** PKCE mitigates code interception. CIMD reduces DCR abuse risk via URL-based client identity verification. -**Identity verification delegated.** Better Auth's callback creates local user records and issues tokens based on those records — the MCP server makes independent identity decisions. OAuthProxy's token issuance is directly tied to a successful upstream code exchange with Mural. No independent identity decisions. - -**PKCE and CIMD support.** OAuthProxy supports Proof Key for Code Exchange (mitigates authorization code interception) and Client ID Metadata Documents (reduces DCR abuse risk by allowing URL-based client identity verification instead of open registration). - ---- - -## What OAuthProxy Does NOT Address - -To be clear about what remains: - -- **DCR endpoint still exists.** Clients can still register dynamically, though CIMD reduces the necessity for open DCR. -- **Authorization code flow still runs through the MCP server.** The `/authorize` and `/token` endpoints are proxied but present. -- **An attacker can still register clients and initiate authorization flows against the MCP server.** The attack surface is reduced (identity decisions are delegated, tokens are cryptographically verified) but not eliminated. +What remains: DCR endpoint still exists (though CIMD reduces its necessity), and `/authorize` and `/token` endpoints are proxied but present. An attacker can still register clients and initiate authorization flows. 
--- ## The Question -Given the costs of full RS migration — vendor dependency and subscription cost (Path 1), or unplanned cross-team platform work with uncertain timeline (Path 2) — **is the pure RS recommendation firm?** - -Or would OAuthProxy — which eliminates session management risks, adds cryptographic token validation, delegates identity verification to Mural, and satisfies the MCP specification — be an acceptable intermediate posture while we evaluate the long-term IdP path? - -We are not asking to skip the security improvements. We are asking whether a meaningful partial improvement now, followed by a planned evolution to full RS, is preferable to blocking on a vendor decision or cross-team dependency. - ---- - -## Note on MCP Spec Compliance +Given the costs of full RS migration — vendor dependency (Path 1) or unplanned cross-team work with uncertain timeline (Path 2) — **is the pure RS recommendation firm?** Or would OAuthProxy be an acceptable intermediate posture while we evaluate the long-term IdP path? -OAuthProxy satisfies the MCP specification's requirements: it serves Protected Resource Metadata (RFC 9728), issues audience-bound tokens, and does not pass client tokens through to upstream APIs. The spec explicitly allows the authorization server to be colocated with the resource server. +We are not asking to skip the security improvements. We are asking whether a meaningful partial improvement now is preferable to blocking on a vendor decision or cross-team dependency. -The full RS migration is driven by the security audit, not by spec compliance. This distinction matters because the spec compliance timeline is more urgent — AI tools are actively dropping support for the legacy auth model — while the audit's AS-elimination recommendation can be phased. OAuthProxy addresses the urgent spec deadline. The question is whether it also addresses enough of the audit's concerns to be an acceptable intermediate state. 
+**Note on MCP spec compliance:** OAuthProxy satisfies the MCP spec (Protected Resource Metadata, audience-bound tokens, no token passthrough). The spec allows the AS to be colocated with the RS. The full RS migration is driven by the audit, not spec compliance — and the spec timeline is more urgent (AI tools are dropping legacy auth support), while the audit's AS-elimination recommendation can be phased. From 412bc1ce6283ae8b5e501bc07d7796f892ea37dd Mon Sep 17 00:00:00 2001 From: Willis Kirkham Date: Mon, 23 Mar 2026 12:57:15 -0700 Subject: [PATCH 5/5] Surfaces user coverage and competitive context, reframes ask --- .../security-oauthproxy-tradeoff-review.md | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/fastmcp-migration/security-oauthproxy-tradeoff-review.md b/fastmcp-migration/security-oauthproxy-tradeoff-review.md index f71c88c..ab8d09c 100644 --- a/fastmcp-migration/security-oauthproxy-tradeoff-review.md +++ b/fastmcp-migration/security-oauthproxy-tradeoff-review.md @@ -6,9 +6,11 @@ ## Context -In early 2026, the security team audited our MCP server and recommended that it stop acting as an OAuth Authorization Server and become a pure Resource Server that validates externally-issued tokens. We agree with the direction — eliminating AS surface reduces attack vectors (confused deputy, DCR abuse, code interception). +In early 2026, the security team audited our MCP server and recommended that it stop acting as an OAuth Authorization Server (AS) and become a pure Resource Server (RS) that validates externally-issued tokens. We agree with the direction — eliminating AS surface reduces attack vectors (confused deputy, DCR abuse, code interception). -Before we commit to the implementation path, we want to make sure the full cost picture is visible. We've identified three paths and we'd like your assessment of whether the strongest option is required or whether a partial improvement is acceptable as an intermediate posture. 
+Before we commit to the implementation path, we want to make sure the full cost picture is visible — particularly around user coverage. Today, the MCP server supports all Mural users regardless of how they authenticate (email/password, Google, Microsoft, SAML SSO, OAuth2 SSO). The pure RS migration puts this at risk: enterprise users with SSO can authenticate through their existing IdP, but users without an external identity provider (primarily email/password) have no path unless we add one. Miro's MCP integration supports all their users today, so a coverage regression carries competitive weight. + +We could move ahead with pure RS and limit access to enterprise SSO users, accepting the regression. Or we could pursue a middle ground. We've identified three paths and we'd like your assessment to help inform this business and engineering decision. --- @@ -19,7 +21,7 @@ Before we commit to the implementation path, we want to make sure the full cost The audit recommendation fully implemented — zero authorization endpoints, JWTs validated from an external IdP. - **Requires purchasing an external IdP.** Mural's auth infrastructure doesn't currently meet the RS model's requirements (HS256 tokens, no JWKS, no OAuth discovery). A third-party IdP (Auth0, Descope) is needed. Pricing needs PoC work to determine but ranges from free tiers at low usage to potentially significant costs at scale. -- **User access risk for email/password users.** Enterprise SSO users authenticate through their existing IdP. But email/password users have no external IdP — the provider must be configured with Mural as a custom social connection to cover them, or they lose access entirely. Validated in concept but requires PoC. +- **User coverage gap.** Without configuring Mural as a custom social connection in the IdP, email/password users lose access entirely (see Context above). This configuration is validated in concept but requires PoC. 
- **Infrastructure complexity.** IdP configuration, custom social connection, possibly a token vault for single-step UX, ongoing operational responsibility. ### Path 2: Pure RS with Mural-API as the Authorization Server @@ -45,7 +47,7 @@ The MCP server proxies OAuth flows to Mural's existing login endpoints via FastM | AS surface in system | Eliminated | Moved to mural-api | Remains | | Vendor cost | Ongoing (TBD) | None | None | | Cross-team dependency | None | Mural platform team | None | -| User access regression | Risk without config | None | None | +| User coverage | SSO only without custom social config | All users | All users | | Timeline | PoC needed (weeks) | Uncertain (months+) | Near-term | --- @@ -64,8 +66,8 @@ What remains: DCR endpoint still exists (though CIMD reduces its necessity), and ## The Question -Given the costs of full RS migration — vendor dependency (Path 1) or unplanned cross-team work with uncertain timeline (Path 2) — **is the pure RS recommendation firm?** Or would OAuthProxy be an acceptable intermediate posture while we evaluate the long-term IdP path? +Given the costs of full RS migration — vendor dependency and user coverage risk (Path 1), or unplanned cross-team work with uncertain timeline (Path 2) — **we'd like your assessment of OAuthProxy as an intermediate posture.** It addresses the most concrete risks from the audit (session management, independent identity decisions, lack of PKCE) while preserving full user coverage and avoiding external dependencies. -We are not asking to skip the security improvements. We are asking whether a meaningful partial improvement now is preferable to blocking on a vendor decision or cross-team dependency. +Does this provide enough of a security improvement to move forward with while we evaluate the long-term IdP path? If so, are there specific residual risks you'd want us to mitigate, or additional hardening measures we should apply to the OAuthProxy surface? 
**Note on MCP spec compliance:** OAuthProxy satisfies the MCP spec (Protected Resource Metadata, audience-bound tokens, no token passthrough). The spec allows the AS to be colocated with the RS. The full RS migration is driven by the audit, not spec compliance — and the spec timeline is more urgent (AI tools are dropping legacy auth support), while the audit's AS-elimination recommendation can be phased.
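Reviewer aside on the PKCE item in the review above ("PKCE mitigates code interception"): the following is a minimal sketch of the S256 check defined in RFC 7636, written with the Python standard library only. It is not FastMCP's or OAuthProxy's implementation. The function names `make_code_verifier`, `make_code_challenge`, and `verify_pkce` are hypothetical and chosen for illustration. The client sends the challenge with the `/authorize` request and the verifier with the `/token` request, so an intercepted authorization code cannot be redeemed without the verifier.

```python
import base64
import hashlib
import hmac
import secrets


def _b64url(data: bytes) -> str:
    """Base64url-encode without '=' padding, as RFC 7636 requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_code_verifier(num_bytes: int = 32) -> str:
    """Client side: generate a random verifier (32 bytes gives 43 characters)."""
    return _b64url(secrets.token_bytes(num_bytes))


def make_code_challenge(verifier: str) -> str:
    """Client side: derive the S256 challenge sent with the authorization request."""
    return _b64url(hashlib.sha256(verifier.encode("ascii")).digest())


def verify_pkce(verifier: str, stored_challenge: str) -> bool:
    """Server side (hypothetical helper): compare the verifier presented at the
    token request against the challenge stored at the authorization request."""
    return hmac.compare_digest(make_code_challenge(verifier), stored_challenge)


if __name__ == "__main__":
    verifier = make_code_verifier()
    challenge = make_code_challenge(verifier)
    print("verifier accepted:", verify_pkce(verifier, challenge))
```

The comparison uses `hmac.compare_digest` so that the check runs in constant time. Whether OAuthProxy enforces S256 only, or also accepts the `plain` method, is something to confirm against the FastMCP documentation rather than assume from this sketch.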