From 41d03d2fcc123a152e4172fc2bcd16e81f5606f8 Mon Sep 17 00:00:00 2001 From: Willis Kirkham Date: Mon, 16 Mar 2026 13:38:25 -0700 Subject: [PATCH 1/4] Adds FastMCP migration execution plan and linked research --- .../auth-provider-alternatives.md | 529 ++++++ .../banksy-architecture-research.md | 1456 +++++++++++++++++ .../canvas-mcp-alignment-assessment.md | 178 ++ .../fastmcp-auth-migration-research.md | 1095 +++++++++++++ .../fastmcp-custom-routes-research.md | 894 ++++++++++ .../fastmcp-react-spa-serving-research.md | 815 +++++++++ .../fastmcp-starlette-routing-research.md | 475 ++++++ .../monorepo-layout-agent-harness-research.md | 1118 +++++++++++++ ...right-strict-dependency-typing-research.md | 597 +++++++ .../python-314-compatibility-research.md | 384 +++++ ...ool-visibility-server-topology-research.md | 779 +++++++++ .../00-migration-execution-strategy.md | 1198 ++++++++++++++ .../01-supporting-docs-digest.md | 127 ++ .../02-banksy-repo-exploration.md | 351 ++++ .../03-fastmcp-project-patterns.md | 33 + .../04-repo-organization-evaluation.md | 184 +++ .../05-toolchain-project-mgmt-monorepo.md | 187 +++ ...-toolchain-linting-typing-testing-hooks.md | 290 ++++ .../07-toolchain-db-http-config.md | 232 +++ fastmcp-migration/fastmcp-auth-strategy.md | 168 ++ .../research-fastmcp-project-structure.md | 624 +++++++ ...earch-fastmcp-structuredContent-support.md | 163 ++ .../research-mcp-server-frameworks.md | 395 +++++ .../research-xmcp-vs-fastmcp-deep-dive.md | 366 +++++ .../resource-server-migration-eval.md | 373 +++++ fastmcp-migration/security-audit-analysis.md | 297 ++++ 26 files changed, 13308 insertions(+) create mode 100644 fastmcp-migration/auth-provider-alternatives.md create mode 100644 fastmcp-migration/banksy-research/banksy-architecture-research.md create mode 100644 fastmcp-migration/banksy-research/canvas-mcp-alignment-assessment.md create mode 100644 fastmcp-migration/banksy-research/fastmcp-auth-migration-research.md create mode 100644 
fastmcp-migration/banksy-research/fastmcp-custom-routes-research.md create mode 100644 fastmcp-migration/banksy-research/fastmcp-react-spa-serving-research.md create mode 100644 fastmcp-migration/banksy-research/fastmcp-starlette-routing-research.md create mode 100644 fastmcp-migration/banksy-research/monorepo-layout-agent-harness-research.md create mode 100644 fastmcp-migration/banksy-research/pyright-strict-dependency-typing-research.md create mode 100644 fastmcp-migration/banksy-research/python-314-compatibility-research.md create mode 100644 fastmcp-migration/banksy-research/tool-visibility-server-topology-research.md create mode 100644 fastmcp-migration/execution-strategy-research/00-migration-execution-strategy.md create mode 100644 fastmcp-migration/execution-strategy-research/01-supporting-docs-digest.md create mode 100644 fastmcp-migration/execution-strategy-research/02-banksy-repo-exploration.md create mode 100644 fastmcp-migration/execution-strategy-research/03-fastmcp-project-patterns.md create mode 100644 fastmcp-migration/execution-strategy-research/04-repo-organization-evaluation.md create mode 100644 fastmcp-migration/execution-strategy-research/05-toolchain-project-mgmt-monorepo.md create mode 100644 fastmcp-migration/execution-strategy-research/06-toolchain-linting-typing-testing-hooks.md create mode 100644 fastmcp-migration/execution-strategy-research/07-toolchain-db-http-config.md create mode 100644 fastmcp-migration/fastmcp-auth-strategy.md create mode 100644 fastmcp-migration/research-fastmcp-project-structure.md create mode 100644 fastmcp-migration/research-fastmcp-structuredContent-support.md create mode 100644 fastmcp-migration/research-mcp-server-frameworks.md create mode 100644 fastmcp-migration/research-xmcp-vs-fastmcp-deep-dive.md create mode 100644 fastmcp-migration/resource-server-migration-eval.md create mode 100644 fastmcp-migration/security-audit-analysis.md diff --git a/fastmcp-migration/auth-provider-alternatives.md 
b/fastmcp-migration/auth-provider-alternatives.md new file mode 100644 index 0000000..eb45add --- /dev/null +++ b/fastmcp-migration/auth-provider-alternatives.md @@ -0,0 +1,529 @@ +# Auth Provider Alternatives for the Resource Server Migration + +## Executive Summary + +Banksy's migration from an OAuth Authorization Server to a Resource Server under the MCP specification requires an external Identity Provider that issues RS256/ES256 JWTs validatable via JWKS, serves OAuth discovery metadata, and — critically — supports configuring an arbitrary upstream OAuth provider (Mural) as an identity source. This last requirement is the primary differentiator: without it, non-enterprise Mural users (email/password, non-Google social login, SAML SSO through non-supported providers) cannot authenticate through Banksy at all. + +After evaluating twelve providers across four dimensions (technical fit, pricing, operational complexity, and MCP ecosystem compatibility), three stand out: + +| | Auth0 | Descope | Keycloak | +|---|---|---|---| +| **Custom upstream OAuth** | Yes — custom social connections with manual endpoint configuration | Yes — custom OAuth providers with manual endpoint configuration | Yes — identity brokering with arbitrary OIDC/OAuth providers | +| **Upstream token storage** | Yes — Token Vault stores access + refresh tokens, retrievable via token exchange | Yes — Outbound Apps stores tokens, retrievable via management API | Yes — "Store Token" feature, retrievable via broker token endpoint | +| **DCR (RFC 7591)** | Yes — enables `RemoteAuthProvider` | Yes — enables `RemoteAuthProvider` | Yes (OIDC DCR certified) — enables `RemoteAuthProvider` | +| **FastMCP integration** | Built-in `Auth0Provider` class | Built-in `DescopeProvider` class; dedicated MCP SDKs | No dedicated provider; manual `RemoteAuthProvider` configuration | +| **Pricing model** | Free up to 25K MAU; $35–$240/mo paid; Enterprise for Token Vault | Free up to 7.5K MAU; $249/mo (Pro) for 10K MAU | 
Free and open-source; self-hosted infrastructure costs only | +| **Key risk** | Token Vault may require Enterprise plan (opaque pricing) | Custom OAuth + Outbound Apps interplay needs PoC validation | Operational burden of self-hosting (HA, upgrades, certificates) | + +**Recommended PoC order:** Auth0 first (strongest upstream token storage story, built-in FastMCP support, most documentation), Descope second (MCP-native, strong DCR, competitive alternative), Keycloak third (self-hosted fallback if managed provider pricing or capabilities prove unacceptable). + +The remaining nine providers evaluated either fail the custom upstream OAuth requirement entirely (WorkOS, Azure AD/Entra ID, Google Cloud Identity Platform), partially meet requirements with significant gaps (Clerk, Stytch, AWS Cognito, FusionAuth), or are viable self-hosted alternatives with less ecosystem maturity (Ory Hydra, Authentik). + +--- + +## Requirements Recap + +Banksy is migrating from operating as an OAuth Authorization Server (via Better Auth's MCP plugin) to a Resource Server (via FastMCP's `RemoteAuthProvider` or `OAuthProxy`). The external IdP must satisfy five requirements, in descending order of criticality: + +1. **RS256/ES256 JWT issuance with JWKS** — Banksy must validate IDE-presented tokens without shared secrets. The IdP must issue asymmetrically-signed JWTs and expose a `/.well-known/jwks.json` or equivalent endpoint. + +2. **OAuth discovery metadata** — `/.well-known/openid-configuration` or RFC 8414 `/.well-known/oauth-authorization-server` — so IDE clients can discover endpoints automatically after following Banksy's `/.well-known/oauth-protected-resource` link. + +3. **Dynamic Client Registration (DCR, RFC 7591)** — so IDE MCP clients can register automatically. With DCR, Banksy uses FastMCP's `RemoteAuthProvider` for a pure resource server. Without DCR, Banksy must use `OAuthProxy`, which reintroduces some AS-like surface area. + +4. 
**Custom upstream OAuth connections** — the critical differentiator. The IdP must allow configuring Mural as an identity source by specifying authorization URL, token URL, and user info URL manually. Without this, non-enterprise Mural users (email/password, non-Google social login) have no account with the IdP and cannot authenticate. + +5. **Upstream token storage** (strong preference, not hard requirement) — the IdP stores access and refresh tokens obtained from Mural during authentication. Banksy retrieves them via API, eliminating the separate browser-based "Mural connect" step (Layer 2). Without this, users complete a two-step flow (IdP auth + Mural connect) instead of one. + +Requirement 4 eliminates most providers. Requirement 5 is the differentiator between a one-step and two-step user experience. + +--- + +## Provider Evaluations + +### 1. Auth0 (Okta CIC) + +Auth0 is the strongest overall candidate. It meets all five requirements, has the most mature upstream token storage capability, and is the only provider with both a dedicated FastMCP authentication class and comprehensive MCP documentation. + +#### Technical Fit + +**JWT issuance:** Auth0 issues RS256 JWTs by default, with ES256 as a configurable option. Audiences and scopes are fully customizable per API definition in the Auth0 Dashboard. Banksy would define an API with audience `https://banksy.example.com` and custom scopes like `mcp:tools`, `mural:read`, `mural:write`. + +**JWKS endpoint:** Exposed at `https://{tenant}.auth0.com/.well-known/jwks.json`. Keys rotate automatically. + +**Discovery metadata:** Full OIDC Discovery at `https://{tenant}.auth0.com/.well-known/openid-configuration`. Also serves RFC 8414 metadata. + +**DCR support:** Auth0 supports RFC 7591 Dynamic Client Registration, enabled at the tenant level via the `enable_dynamic_client_registration` flag. This enables FastMCP's `RemoteAuthProvider` for a pure resource server with no AS-like surface. 
Auth0 implements Open Dynamic Registration — any client can register if DCR is enabled — so Banksy would need to configure Auth0's recommended DCR security policies (IP allowlisting, rate limiting, or a registration access token). + +**Custom upstream OAuth:** Auth0's "Generic OAuth2 Authorization Server" connection type allows configuring any OAuth provider as an identity source. Configuration requires: +- Authorization URL (Mural's `/api/public/v1/oauth/authorize`) +- Token URL (Mural's `/api/public/v1/oauth/token`) +- Client ID and Client Secret (from Mural's OAuth app registration) +- A Fetch User Profile Script (Node.js function that calls Mural's user info endpoint and returns a normalized profile) +- Scopes (Mural's `identity:read`, `murals:read`, etc.) + +This is the exact pattern needed for "Sign in with Mural." Auth0 handles the OAuth flow with Mural, obtains tokens, fetches user info, and creates/links an Auth0 user record. + +**Upstream token storage (Token Vault):** Auth0's Token Vault is the most mature implementation of upstream token storage among the evaluated providers. 
When configured with the "Connected Accounts" purpose on a social connection: +- Auth0 stores the upstream provider's access and refresh tokens after authentication +- Applications exchange an Auth0 access token for the upstream provider's access token via a dedicated grant type (`urn:auth0:params:oauth:grant-type:token-exchange:federated-connection-access-token`) at the `/oauth/token` endpoint +- Refresh token exchange is supported: Banksy can exchange an Auth0 refresh token for a fresh Mural access token without user interaction, provided Refresh Token Rotation is disabled for the application +- The Privileged Worker Token Exchange (currently in Beta) allows backend services to exchange JWT bearer tokens for external provider tokens without active user sessions + +If Token Vault works as documented with a custom social connection to Mural, it would eliminate the separate "Mural connect" browser step entirely. The authentication flow would be: +1. IDE discovers Banksy's PRM, follows `authorization_servers` link to Auth0 +2. Auth0 presents login, user chooses "Sign in with Mural" +3. Mural OAuth consent → Auth0 receives Mural tokens, stores them in Token Vault +4. Auth0 issues its own RS256 JWT to the IDE +5. IDE presents Auth0 JWT to Banksy; Banksy validates via JWKS +6. Banksy exchanges Auth0 tokens for Mural access token via Token Vault API +7. Banksy calls Mural API with the retrieved access token + +This collapses the two-layer auth into a single user-facing step — the same UX advantage that mural-oauth mode has today. + +**Upstream token refresh:** Token Vault handles refresh via the refresh token exchange endpoint. Banksy presents its Auth0 refresh token, Auth0 uses the stored Mural refresh token to obtain a fresh Mural access token, and returns it. Banksy does not need to manage Mural refresh tokens directly. + +**User account linking:** Auth0 creates a unified user profile linked to the Mural social connection. 
The Mural user ID is stored in the `user_id` field of the connection identity. Banksy maps Auth0's `sub` claim to its internal user records. + +**Caveats requiring PoC validation:** +- Token Vault documentation refers to "Connected Accounts" and "Connected Accounts Flow" which may differ from the standard social login flow. The PoC must confirm that a custom social connection + Token Vault can be configured to store Mural tokens during the standard MCP OAuth flow initiated by an IDE, not only through Auth0's dedicated Connected Accounts UI. +- The Fetch User Profile Script for custom social connections is a Node.js function that runs in Auth0's sandbox. It must return a `user_id` field — the PoC must confirm that Mural's user info endpoint provides sufficient data. +- MFA policies must be set to "Never" to retrieve access tokens from Token Vault. If Banksy later requires MFA, this conflicts. +- Scope handling for custom social connections has known quirks — Auth0 escapes spaces differently than some providers. The PoC must confirm Mural's scopes are requested correctly. + +#### Pricing + +Auth0's pricing is tiered by MAU: + +| Plan | MAU Included | Price | Key Features | +|---|---|---|---| +| Free | 25,000 | $0/mo | Unlimited social connections, 1 enterprise connection, custom domain | +| Essentials | 500 | $35/mo | Standard support, RBAC, Pro MFA | +| Professional | 500 | $240/mo | Enterprise MFA, enhanced attack protection, existing user DB | +| Enterprise | Custom | Contact sales | Token Vault, CIBA, advanced features | + +Token Vault is listed as an Enterprise add-on. This is the critical pricing question: if Token Vault requires Enterprise (which typically starts at $1,000+/mo), the cost equation changes significantly. Without Token Vault, Auth0 still meets Requirements 1–4 and Banksy falls back to the two-step auth flow — functional but not the optimal UX. + +Custom social connections (Generic OAuth2) are available on the Free plan. 
DCR is configurable at all tiers. The limiting factor is Token Vault. + +**Estimated costs without Token Vault:** +- 100 MAU: Free tier ($0/mo) +- 1,000 MAU: Essentials at $35/mo or Free tier ($0/mo) +- 10,000 MAU: Free tier ($0/mo, up to 25K) + +**Estimated costs with Token Vault:** Enterprise pricing, likely $1,000–$3,000+/mo. Requires sales engagement to confirm. + +#### Operational Complexity + +**Setup effort:** Moderate. Configuring a custom social connection requires writing a Fetch User Profile Script, configuring the connection in the Dashboard, enabling DCR, and setting up the Auth0 API definition with appropriate audiences and scopes. Auth0's documentation for this pattern is extensive, with working examples for custom providers. + +**Maintenance:** Low for managed service. Auth0 handles key rotation, certificate management, and infrastructure. Banksy needs to monitor token exchange health, handle Auth0 outage scenarios, and keep the Fetch User Profile Script in sync with any changes to Mural's user info endpoint. + +**Vendor lock-in:** Moderate. Auth0's custom social connection configuration is Auth0-specific, but the underlying pattern (custom OAuth connection with upstream token storage) is available from other providers. The Token Vault API is proprietary. Migration would require moving to another provider's equivalent feature. + +**Documentation quality:** Excellent. Auth0 has dedicated documentation for custom social connections, Token Vault, DCR, and MCP integration. The MCP documentation at `auth0.com/ai/docs/` is purpose-built. + +#### MCP Ecosystem Fit + +**FastMCP integration:** FastMCP includes a built-in `Auth0Provider` class (`fastmcp.server.auth.providers.auth0`) that handles PRM endpoint generation, token validation, and Auth0-specific configuration. Minimal custom code needed. + +**IDE compatibility:** Auth0 has published integration guides for MCP servers, including examples with Cloudflare Workers. 
The combination of Auth0 + FastMCP + Cursor/VS Code/Claude Desktop is a documented, tested pattern. + +**MCP documentation:** Auth0 maintains `auth0.com/ai/docs/mcp/` with guides for resource parameter compatibility, DCR setup, MCP Inspector testing, and deployment patterns. + +--- + +### 2. Descope + +Descope is the most MCP-native provider evaluated. It was designed with agentic identity as a core use case and provides purpose-built SDKs for MCP server authentication. It meets all five requirements, though the interplay between custom OAuth providers and Outbound Apps needs PoC validation. + +#### Technical Fit + +**JWT issuance:** Descope issues RS256 JWTs with configurable audiences and scopes. Session tokens and access tokens are standard JWTs. + +**JWKS endpoint:** Exposed via standard OIDC Discovery. Keys are managed by Descope. + +**Discovery metadata:** Full OIDC Discovery at the project's base URL. Serves `/.well-known/openid-configuration`. + +**DCR support:** Descope supports both DCR (RFC 7591) and CIMD (Client-Initiated Metadata Discovery) for MCP client registration. DCR is the primary onboarding method for MCP servers in Descope's architecture. + +**Custom upstream OAuth:** Descope's "Custom Providers" feature allows configuring any OAuth provider by specifying: +- Authorization Endpoint (e.g., Mural's `/api/public/v1/oauth/authorize`) +- Token Endpoint (e.g., Mural's `/api/public/v1/oauth/token`) +- User Info Endpoint (e.g., Mural's `/api/public/v1/users/me`) +- Client ID and Client Secret +- Scopes +- User attribute mapping (email, display name from the User Info response) + +The documentation provides examples for Spotify, Login.gov, LinkedIn, and other non-default providers, confirming that arbitrary upstream OAuth providers are supported. Mural would be configured the same way. + +**Upstream token storage (Outbound Apps):** Descope's Outbound Apps feature provides token vault functionality. 
After a user authenticates via a custom OAuth provider, Descope can store the upstream provider's tokens. Applications retrieve them via the management API: + +``` +POST /v1/mgmt/outbound/app/user/token +{ + "appId": "", + "userId": "", + "scopes": ["murals:read", "murals:write"], + "options": { "withRefreshToken": true, "forceRefresh": false } +} +``` + +The response includes `accessToken`, `accessTokenExpiry`, `refreshToken`, and `scopes`. Descope's Python MCP SDK exposes this as `get_connection_token()`. + +**Interplay caveat:** The documentation describes Custom Providers (inbound authentication) and Outbound Apps (outbound token management) as distinct features. The PoC must confirm that authenticating via a custom OAuth provider automatically populates an Outbound App's token store — or whether this requires separate configuration where the user both authenticates through the custom provider and then separately connects to an Outbound App. If these are separate flows, the single-step UX advantage may not materialize without additional orchestration. + +**Upstream token refresh:** Outbound Apps support the `forceRefresh` option in the token retrieval API, which triggers a refresh using the stored refresh token. Descope handles the refresh exchange with the upstream provider. + +**User account linking:** Descope creates a unified user record linked to the custom OAuth provider. User attributes are mapped from the upstream provider's user info response during authentication. + +#### Pricing + +| Plan | MAU Included | Price | Key Features | +|---|---|---|---| +| Free | 7,500 | $0/mo | Core auth features | +| Pro | 10,000 | $249/mo (annual) | 2 federated apps, $0.05 overage per MAU | +| Growth | 25,000 | $799/mo (annual) | 10 SSO connections | +| Enterprise | Custom | Contact sales | Unlimited test users, advanced features | + +Custom OAuth providers are available on the Free plan. 
Outbound Apps availability by tier needs confirmation — the documentation does not clearly gate this feature. + +**Estimated costs:** +- 100 MAU: Free tier ($0/mo) +- 1,000 MAU: Free tier ($0/mo, up to 7.5K) +- 10,000 MAU: Pro at $249/mo + +#### Operational Complexity + +**Setup effort:** Low to moderate. Descope's Console provides a visual flow builder for authentication. Custom OAuth providers are configured through the Dashboard with a form-based interface (no code needed for the connection itself). The MCP SDK handles server-side integration. + +**Maintenance:** Low for managed service. Descope handles infrastructure, key rotation, and token management. + +**Vendor lock-in:** Moderate. Descope's Flow Builder and SDK APIs are proprietary. The underlying OAuth/OIDC configuration is portable, but the Outbound Apps API and MCP SDK would need replacement if migrating. + +**Documentation quality:** Good. Custom provider setup is documented with step-by-step guides and examples (Spotify, Login.gov). MCP-specific documentation is purpose-built. However, the documentation for the interplay between custom providers and Outbound Apps is sparse — this is the gap the PoC must close. + +#### MCP Ecosystem Fit + +**FastMCP integration:** FastMCP includes a built-in `DescopeProvider` class. Descope also provides a standalone Python MCP SDK (`descope-mcp`) with `validate_token()`, `require_scopes()`, and `get_connection_token()`. + +**IDE compatibility:** Descope documents MCP server integration patterns and provides an Express SDK for TypeScript MCP servers. The DCR + PRM pattern works with Cursor, VS Code, and Claude Desktop. + +**MCP documentation:** Descope maintains dedicated MCP documentation at `docs.descope.com/mcp/` covering authorization, server setup, client connections, and SDK usage. + +--- + +### 3. 
WorkOS AuthKit + +WorkOS AuthKit is a strong MCP-oriented provider with DCR support and documented MCP integration, but it **fails Requirement 4** — it does not support configuring arbitrary upstream OAuth providers. + +**Custom upstream OAuth:** WorkOS AuthKit only supports predefined social login providers: Google, GitHub, Microsoft, Apple, LinkedIn, Slack, GitLab, Bitbucket, Xero, Rippling, ADP, and Intuit. There is no mechanism to configure Mural (or any other arbitrary OAuth provider) as an identity source by specifying manual authorization/token/userinfo URLs. WorkOS's enterprise SSO supports generic SAML and OIDC federation, but this is for enterprise customers' identity providers (Okta, Azure AD, etc.), not for adding arbitrary social login options. + +**What it does well:** WorkOS has generous pricing (first 1M MAU free, $2,500 per additional 1M), supports DCR with MCP server authorization (feature preview since May 2025), and provides clean integration documentation. If Mural were a predefined WorkOS provider, this would be a strong candidate. + +**Verdict:** Disqualified by Requirement 4. Non-enterprise Mural users (email/password, non-Google social login) would have no way to authenticate. WorkOS could only serve a deployment where all users happen to have Google, GitHub, or Microsoft accounts — a subset, not the full Mural user population. + +--- + +### 4. Azure AD / Entra ID + +Microsoft Entra External ID added OIDC federation for custom identity providers in March 2025 (GA). This theoretically allows configuring Mural as an upstream provider, but the implementation has a critical dependency. + +**Custom upstream OAuth:** Entra External ID's OIDC federation requires the upstream provider to expose a well-known/metadata endpoint. Configuration in the Entra admin center takes a "well-known endpoint" URL, from which it automatically discovers authorization, token, and JWKS URLs. 
Mural has no `/.well-known/openid-configuration` or equivalent discovery endpoint. There is no option to specify individual URLs manually. This makes Entra OIDC federation incompatible with Mural as it exists today. + +**DCR:** Azure AD / Entra ID does not support Dynamic Client Registration for arbitrary clients. Banksy would need FastMCP's `OAuthProxy`, reintroducing some AS-like surface area. + +**Other capabilities:** Entra ID issues RS256 JWTs, exposes JWKS, and serves full OIDC Discovery. It is a production-grade IdP used by Microsoft's own MCP integration with VS Code. External ID pricing is usage-based: the first 50K MAU/mo are free, with charges of roughly $0.01–$0.03 per authentication beyond that. + +**Verdict:** Blocked by Mural's lack of discovery metadata. If Mural added `/.well-known/openid-configuration` in the future, Entra ID could become viable — but that is a Mural platform dependency, not something Banksy controls. Also lacks DCR, requiring `OAuthProxy`. + +--- + +### 5. AWS Cognito + +AWS Cognito User Pools support OIDC federation with external identity providers and issue JWTs validatable via JWKS. + +**Custom upstream OAuth:** Cognito supports adding OIDC identity providers with manual configuration of issuer, authorization URL, token URL, user info URL, JWKS URL, and client credentials. However, Cognito expects the upstream provider to be OIDC-compliant — it validates tokens from the upstream provider using the upstream's JWKS endpoint. Mural's HS256 tokens with no JWKS endpoint present a problem: Cognito would obtain tokens from Mural during the federation flow but cannot validate them using standard OIDC procedures. Whether Cognito proceeds despite validation failure (treating the upstream as opaque OAuth rather than OIDC) needs testing. + +**DCR:** Cognito does not support Dynamic Client Registration. Banksy would need `OAuthProxy`. 
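For context, the registration step Cognito lacks is small: under RFC 7591, an IDE MCP client POSTs a JSON metadata document to the authorization server's registration endpoint and receives a `client_id` back. A minimal sketch of such a payload (field names come from RFC 7591; the client name and loopback redirect URI are illustrative, not taken from any specific IDE):

```python
import json

# Hypothetical RFC 7591 registration payload an IDE MCP client could send.
# Field names are defined by RFC 7591; all values here are illustrative.
registration = {
    "client_name": "Example IDE MCP client",
    "redirect_uris": ["http://127.0.0.1:33418/callback"],  # loopback redirect
    "grant_types": ["authorization_code", "refresh_token"],
    "response_types": ["code"],
    "token_endpoint_auth_method": "none",  # public client: PKCE, no client secret
}

body = json.dumps(registration)
print(body)
```

An AS that supports DCR answers with a JSON document containing at least a `client_id`; without that endpoint, every IDE client needs a manually pre-registered client, which is the gap `OAuthProxy` works around.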
+ +**Upstream token storage:** Cognito does not expose upstream provider tokens to applications after federation. After a user authenticates through an external OIDC provider, Cognito issues its own tokens — the upstream provider's tokens are used internally during federation and discarded. This means Layer 2 (separate Mural connect) would always be needed. + +**Pricing:** Pay-per-use, approximately $0.0055 per MAU for basic tier (Essentials). Free tier includes 10,000 MAU/mo. Advanced features (threat protection, access token customization) add cost. + +**Verdict:** Partially viable. Meets Requirements 1–2, partially meets Requirement 4 (OIDC federation, but Mural's non-OIDC-compliant tokens may cause issues), fails Requirement 3 (no DCR), and fails Requirement 5 (no upstream token storage). Would require `OAuthProxy` and two-step auth. The OIDC compliance gap with Mural is a significant risk. + +--- + +### 6. Google Cloud Identity Platform / Firebase Auth + +Google Cloud Identity Platform (GCIP) and Firebase Auth share the same underlying infrastructure for identity management. + +**Custom upstream OAuth:** GCIP/Firebase Auth supports a fixed set of OAuth providers: Apple, Apple Game Center, Facebook, GitHub, Google, Google Play Games, LinkedIn, Microsoft, Twitter, and Yahoo. Adding providers outside this list requires using generic SAML or OIDC federation — but like Azure AD, generic OIDC requires the upstream provider to serve discovery metadata. There is no "Generic OAuth2" option with manual endpoint configuration equivalent to Auth0's. Firebase Auth's `signInWithCustomToken()` allows custom token-based auth but requires Banksy to implement its own token issuance — this defeats the purpose of using an external IdP. + +**DCR:** Not supported. Would require `OAuthProxy`. + +**Upstream token storage:** Not available through standard APIs for federated providers. + +**Verdict:** Disqualified by Requirement 4. 
No mechanism to configure Mural as an arbitrary OAuth identity source without discovery metadata. + +--- + +### 7. Clerk + +Clerk is a developer-focused auth provider that recently expanded its OAuth capabilities significantly (June 2025). + +**Custom upstream OAuth:** Clerk supports custom OAuth/OIDC providers as social connections. Configuration accepts either a discovery endpoint URL or manual specification of authorization, token, and userinfo endpoints. Client ID, Client Secret, and attribute mapping are configurable. PKCE support for custom OAuth providers was added in November 2025. This means Mural could be configured as a custom social connection in Clerk. + +**DCR:** Clerk added Dynamic Client Registration and OAuth server capabilities in June 2025, alongside consent screens, public client support, and MCP service compatibility. This positions Clerk as compatible with FastMCP's `RemoteAuthProvider`. + +**JWT/JWKS/Discovery:** Clerk issues RS256 JWTs and exposes JWKS and OIDC Discovery endpoints. + +**Upstream token storage:** This is the gap. Clerk's documentation does not describe a Token Vault or equivalent feature for storing and retrieving upstream provider tokens after social login. Clerk's primary model is user-facing authentication — it handles the OAuth flow, extracts user identity, and issues Clerk tokens. Whether the upstream provider's access/refresh tokens are accessible to the application (via API or webhook) is not clearly documented. Without upstream token storage, Banksy would need the two-step auth flow. + +**Pricing:** Clerk's pricing is per-MAU, starting free for up to 10,000 MAU. Beyond that, pricing scales per-MAU (approximately $0.02/MAU). Enterprise pricing is available for advanced features. + +**Verdict:** Meets Requirements 1–4 (custom OAuth, DCR, JWT/JWKS, discovery). Unclear on Requirement 5 (upstream token storage). 
Worth investigating if Auth0 and Descope prove unacceptable, but the token storage gap would likely mean two-step auth. + +--- + +### 8. Stytch + +Stytch is a developer-focused auth provider with Connected Apps and DCR support. + +**Custom upstream OAuth:** Stytch supports OIDC connections with manual configuration of issuer, client ID, client secret, authorization URL, token URL, userinfo URL, and JWKS URL. The configuration model expects the upstream provider to have a JWKS URL — Mural does not. However, Stytch may treat the upstream as an opaque OAuth provider if the OIDC fields are partially populated. This needs validation. + +**DCR:** Stytch fully supports Dynamic Client Registration for both public and confidential clients. DCR is enabled at the project level in the Connected Apps section. + +**JWT/JWKS/Discovery:** Stytch issues RS256 JWTs and exposes JWKS and discovery endpoints. + +**Upstream token storage:** Stytch's Connected Apps feature positions Stytch as an OAuth provider — it manages client registrations, consent, and token issuance. However, upstream token storage for external identity providers (the reverse direction) is not clearly documented. Stytch's "External Identity Providers" feature exchanges third-party JWTs for Stytch sessions, but this is about accepting external tokens, not storing them for later retrieval. + +**Pricing:** Free tier includes 10,000 MAU. Beyond that, $0.10 per MAU. All features are available on the free tier (no feature gating). Additional SSO connections cost $125 each beyond the 5 included. + +**Estimated costs:** +- 100 MAU: Free ($0/mo) +- 1,000 MAU: Free ($0/mo) +- 10,000 MAU: Free ($0/mo, at the limit) +- Beyond 10K: $0.10/MAU + +**Verdict:** Partially meets requirements. DCR support and JWT infrastructure are solid. Custom OAuth connection support is present but requires a JWKS URL from the upstream provider — a potential blocker with Mural. Upstream token storage is unclear. 
Lower priority than Auth0 and Descope. + +--- + +### 9. FusionAuth + +FusionAuth is a self-hosted/cloud auth provider with a generous free Community plan. + +**Custom upstream OAuth:** FusionAuth supports generic OIDC identity providers (automatic discovery or manual endpoint configuration) and an External JWT identity provider for custom JWT-based federation. The OIDC provider can be configured with manual URLs if discovery is not available. The External JWT provider accepts JWTs from third parties and maps claims to FusionAuth users. However, External JWT is API-only (not compatible with hosted login pages), which limits its usefulness for browser-based OAuth flows. + +**DCR:** FusionAuth's DCR support is not clearly documented. The feature is not mentioned in pricing or feature documentation. This likely means Banksy would need `OAuthProxy`. + +**JWT/JWKS/Discovery:** FusionAuth issues RS256 JWTs, exposes JWKS, and serves OIDC Discovery metadata. + +**Upstream token storage:** FusionAuth stores the upstream provider's tokens after OIDC federation. These can be accessed via the FusionAuth API (Identity Provider Link API). The level of control over token refresh and retrieval needs validation. + +**Pricing:** +- Community (self-hosted): Free, unlimited MAU +- Starter (self-hosted): $125/mo for up to 240K MAU +- Essentials: $850/mo +- Enterprise: $3,300/mo + +**Verdict:** Partially viable as a self-hosted option. Custom OIDC provider support is present, upstream token storage exists, and the Community plan is free. However, DCR support is unclear, MCP ecosystem presence is minimal, and the External JWT provider's API-only limitation adds friction. Lower priority than Keycloak for self-hosted. + +--- + +### 10. Keycloak + +Keycloak is the strongest open-source, self-hosted option. It meets all five requirements and has the most mature identity brokering implementation among self-hosted alternatives. 
+ +#### Technical Fit + +**JWT issuance:** Keycloak issues RS256 JWTs by default (ES256 and other algorithms configurable). Audiences, scopes, and claims are fully customizable via client scopes and protocol mappers. + +**JWKS endpoint:** Exposed at `https://{host}/realms/{realm}/protocol/openid-connect/certs`. + +**Discovery metadata:** Full OIDC Discovery at `https://{host}/realms/{realm}/.well-known/openid-configuration`. + +**DCR support:** Keycloak supports both RFC 7591 and OpenID Connect Dynamic Client Registration (OIDC DCR certified as of Keycloak 18.0.0). The registration endpoint is at `/realms/{realm}/clients-registrations/openid-connect`. Authentication can use bearer tokens, initial access tokens, or registration access tokens. This enables FastMCP's `RemoteAuthProvider`. Known limitations: no Software Statement support (RFC 7591 optional parameter), limited custom metadata extensions. + +**Custom upstream OAuth:** Keycloak's Identity Brokering feature supports arbitrary OIDC and OAuth 2.0 upstream providers. Configuration specifies: +- Authorization URL +- Token URL +- User Info URL (optional) +- Client ID and Client Secret +- Scopes +- Claim mapping (via mappers) + +No discovery endpoint is required from the upstream provider. Mural would be configured as a generic OAuth 2.0 identity provider in Keycloak's admin console. + +**Upstream token storage:** Keycloak's "Store Token" configuration option on the identity provider settings page stores upstream access and refresh tokens after authentication. Applications retrieve them via the broker token endpoint: + +``` +GET /realms/{realm}/broker/{provider_alias}/token +Authorization: Bearer {keycloak_access_token} +``` + +The accessing client requires the `broker` client-level role `read-token`. The "Stored Tokens Readable" switch automatically assigns this role to new users. Upstream tokens can be re-established via re-authentication or the client-initiated account linking API. 
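Since the broker endpoint is a plain authenticated GET, the request Banksy would issue can be sketched as a small helper. This is a hedged sketch: the realm name (`banksy`) and provider alias (`mural`) are illustrative assumptions, and any HTTP client (e.g. httpx) can send the result.

```python
from urllib.parse import urljoin

def broker_token_request(
    base_url: str, realm: str, provider_alias: str, keycloak_token: str
) -> tuple[str, dict[str, str]]:
    """Build the URL and headers for Keycloak's stored-token endpoint.

    Per the Keycloak behavior described above, the calling client must
    hold the `broker` client-level role `read-token`.
    """
    url = urljoin(base_url, f"/realms/{realm}/broker/{provider_alias}/token")
    return url, {"Authorization": f"Bearer {keycloak_token}"}

# Realm and provider alias below are assumptions for illustration:
url, headers = broker_token_request(
    "https://keycloak.example.com", "banksy", "mural", "eyJhbGciOi..."
)
# Send with any client, e.g. httpx.get(url, headers=headers)
```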
+ +**Upstream token refresh:** Keycloak does not automatically refresh stored upstream tokens. When a stored token expires, the application must trigger re-authentication or use Keycloak's token exchange endpoint (if configured with RFC 8693 support) to obtain fresh upstream tokens. This is a gap compared to Auth0's Token Vault, which handles refresh transparently. + +**User account linking:** Keycloak creates a linked identity between its internal user record and the upstream provider's user identity. Automatic linking can be configured by email match or explicit user consent. + +#### Pricing + +Keycloak is fully open-source under the Apache 2.0 license. There are no per-MAU costs. Red Hat offers a commercially supported distribution (Red Hat build of Keycloak) with subscription pricing, but the community edition is free. + +**Costs are infrastructure-only:** +- Compute: Keycloak runs as a Java application (Quarkus-based since Keycloak 17+). A minimum viable deployment fits on a small VM or container (~2 CPU, 4GB RAM). +- Database: PostgreSQL or MySQL required. +- HA: Production deployments need at least 2 Keycloak instances with shared database and cache (Infinispan). +- Estimated cloud infrastructure: $50–$200/mo for a minimal production setup on Azure/AWS. + +#### Operational Complexity + +**Setup effort:** Moderate to high. Keycloak requires deploying and configuring a Java application, setting up a database, configuring realms, clients, identity providers, and protocol mappers. The identity brokering setup for a custom OAuth provider is well-documented but involves multiple configuration screens. + +**Maintenance:** Moderate. Keycloak requires regular version upgrades, database maintenance, certificate management, monitoring, and capacity planning. The Keycloak team releases quarterly updates. Running an HA cluster adds operational overhead for cache synchronization and session management. + +**Vendor lock-in:** Low.
Keycloak is open-source with standard OIDC/OAuth interfaces. Configuration can be exported and imported. Migration to another OIDC provider is straightforward at the protocol level. + +**Documentation quality:** Good. Keycloak has extensive official documentation for identity brokering, token storage, DCR, and all OIDC features. Community documentation and third-party guides are abundant. + +#### MCP Ecosystem Fit + +**FastMCP integration:** No built-in Keycloak provider class in FastMCP. Would require manual `RemoteAuthProvider` configuration with Keycloak's OIDC Discovery URL, JWKS URI, and audience settings. This is straightforward but requires more initial setup than Auth0 or Descope. + +**IDE compatibility:** Keycloak is a standards-compliant OIDC provider. Any IDE that supports standard OIDC Discovery + DCR will work with Keycloak. No Keycloak-specific IDE integration is needed or available. + +**MCP documentation:** No Keycloak-specific MCP documentation exists. The Quarkus MCP server tutorial references Keycloak as a potential auth provider, but this is a third-party example rather than official guidance. + +--- + +### 11. Ory Hydra + +Ory Hydra is an open-source OAuth 2.0 / OIDC server that separates the authorization server (Hydra) from the identity management layer (Kratos). + +**Custom upstream OAuth:** Upstream OAuth federation is handled by Ory Kratos (the identity layer), not Hydra itself. Kratos supports generic OIDC social sign-in with manual configuration of issuer URL, client credentials, and scopes. Configuring Mural as a social sign-in provider in Kratos is possible but requires running both Hydra and Kratos, which adds architectural complexity. + +**DCR:** Ory Hydra supports Dynamic Client Registration. + +**JWT/JWKS/Discovery:** Hydra issues RS256 JWTs, exposes JWKS, and serves OIDC Discovery. + +**Upstream token storage:** Kratos stores the upstream provider's tokens during social sign-in. 
Retrieval mechanisms are less mature than Keycloak's broker token endpoint. + +**Pricing:** Open-source under Apache 2.0. Ory offers a cloud-hosted version (Ory Network) with per-MAU pricing. Self-hosted deployments require the Ory Enterprise License for production use. + +**Verdict:** Technically capable but architecturally complex. Running Hydra + Kratos + a database for a feature that Keycloak provides in a single application is hard to justify. If already invested in the Ory ecosystem, this could work. Otherwise, Keycloak is the simpler self-hosted choice. + +--- + +### 12. Authentik + +Authentik is an open-source identity provider with a clean UI and growing community. + +**Custom upstream OAuth:** Authentik supports custom OAuth sources with manual configuration of authorization URL, token URL, profile URL, client key/secret, and scopes. If the upstream provider supports OIDC Discovery, Authentik can auto-populate configuration from the well-known URL. Mural would need manual configuration since it lacks discovery. + +**DCR:** Authentik's DCR support is not clearly documented in the current version. The OAuth2 provider feature supports standard OIDC flows but explicit DCR documentation is absent. + +**JWT/JWKS/Discovery:** Authentik issues RS256 JWTs with configurable JWKS. It supports OIDC Discovery at `/.well-known/openid-configuration`. JWKS can be configured to trust external sources for machine-to-machine authentication. + +**Upstream token storage:** Not clearly documented. Authentik stores user identity information from upstream sources but whether upstream access/refresh tokens are stored and retrievable via API needs validation. + +**Pricing:** Open-source and free for self-hosted deployments. Authentik offers an Enterprise plan with additional features for $5/user/month. + +**Verdict:** Promising as a lightweight self-hosted alternative to Keycloak. Custom OAuth source support is confirmed. DCR and upstream token storage need validation. 
Less mature than Keycloak for production use, but simpler to operate. Worth evaluating if Keycloak's complexity is a concern and Auth0/Descope pricing is prohibitive. + +--- + +## Comparison Matrix + +| Provider | JWT (RS256/ES256) | JWKS | Discovery | DCR (RFC 7591) | Custom Upstream OAuth | Upstream Token Storage | FastMCP Class | Pricing Model | Estimated Cost (1K MAU) | +|---|---|---|---|---|---|---|---|---|---| +| **Auth0** | Yes | Yes | Yes | Yes | Yes (Generic OAuth2) | Yes (Token Vault) | `Auth0Provider` | Per-MAU, tiered | Free (up to 25K MAU) | +| **Descope** | Yes | Yes | Yes | Yes | Yes (Custom Providers) | Yes (Outbound Apps) | `DescopeProvider` | Per-MAU, tiered | Free (up to 7.5K MAU) | +| **WorkOS** | Yes | Yes | Yes | Yes | No (predefined only) | N/A | Manual config | Per-MAU | Free (up to 1M MAU) | +| **Azure AD** | Yes | Yes | Yes | No | Partial (needs upstream discovery) | N/A | `OAuthProxy` | Per-auth | ~Free tier | +| **AWS Cognito** | Yes | Yes | Yes | No | Partial (OIDC, needs JWKS) | No | `OAuthProxy` | Per-MAU | ~$5.50/mo | +| **Google/Firebase** | Yes | Yes | Yes | No | No (fixed provider list) | No | `OAuthProxy` | Per-MAU/auth | ~Free tier | +| **Clerk** | Yes | Yes | Yes | Yes | Yes (manual endpoints) | Unclear | Manual config | Per-MAU | Free (up to 10K MAU) | +| **Stytch** | Yes | Yes | Yes | Yes | Partial (needs upstream JWKS) | Unclear | Manual config | Per-MAU | Free (up to 10K MAU) | +| **FusionAuth** | Yes | Yes | Yes | Unclear | Yes (OIDC + External JWT) | Partial | Manual config | Self-hosted free; cloud tiered | $0 (self-hosted) | +| **Keycloak** | Yes | Yes | Yes | Yes (certified) | Yes (Identity Brokering) | Yes (Store Token) | Manual config | Free (open-source) | $50–200/mo infra | +| **Ory Hydra** | Yes | Yes | Yes | Yes | Yes (via Kratos) | Partial | Manual config | Free (open-source) | $50–150/mo infra | +| **Authentik** | Yes | Yes | Yes | Unclear | Yes (OAuth Sources) | Unclear | Manual config | Free (open-source) | 
$30–100/mo infra | + +**Legend:** +- "Yes" = confirmed via documentation +- "Partial" = supported with caveats (noted in provider evaluation) +- "Unclear" = not clearly documented; needs PoC validation +- "N/A" = not applicable (provider fails prerequisite requirement) +- Infrastructure costs for self-hosted options are rough estimates for a minimal production deployment. + +--- + +## Recommendation + +### Primary Recommendation: Auth0 PoC First + +Auth0 should be the first proof-of-concept for three reasons: + +1. **Token Vault is the only mature mechanism for preserving single-step auth.** Auth0's Token Vault + custom social connection pattern is the most documented path to eliminating the separate "Mural connect" browser step. If Token Vault works with a custom social connection to Mural, Banksy's auth UX post-migration is equivalent to — or better than — the current mural-oauth mode's single-step experience. No other managed provider has an equivalent feature at the same level of maturity. + +2. **FastMCP provides a built-in Auth0Provider.** The integration path is the shortest of any provider: configure Auth0, pass the configuration to FastMCP's `Auth0Provider`, and Banksy gets PRM, DCR, and JWT validation with minimal custom code. This has been tested by multiple MCP server implementations. + +3. **Free tier covers initial development.** Auth0's 25K MAU free tier is generous enough for development, testing, and early adoption. The pricing risk is concentrated on Token Vault, which appears to be an Enterprise feature. The PoC should confirm whether Token Vault is required (it is only needed if single-step auth is a priority) and what Enterprise pricing looks like. 
+ +**Auth0 PoC validation targets:** +- Configure Mural as a custom social connection (authorization URL, token URL, Fetch User Profile Script) +- Authenticate through the custom connection from a browser — confirm Auth0 obtains Mural tokens and creates a user record +- Enable Token Vault on the custom social connection — confirm Mural access + refresh tokens are stored +- Exchange an Auth0 access token for a Mural access token via the Token Vault grant type — confirm Banksy can retrieve Mural tokens programmatically +- Test the refresh flow — exchange an Auth0 refresh token for a fresh Mural access token, confirm Mural's refresh token is used automatically +- Enable DCR, configure FastMCP's `Auth0Provider`, connect from Cursor and VS Code — confirm end-to-end PRM discovery, OAuth flow, and token validation +- Determine the minimum Auth0 plan required for Token Vault + custom social connections + DCR + +### Secondary Recommendation: Descope PoC Second + +If Auth0's Token Vault requires Enterprise pricing that exceeds Banksy's budget, or if the Token Vault + custom social connection integration proves unreliable, Descope is the next candidate: + +1. **MCP-native design.** Descope was built with agentic identity as a first-class use case. Its MCP SDKs, DCR support, and Outbound Apps feature align with Banksy's architecture more naturally than Auth0's general-purpose platform. + +2. **Outbound Apps as an alternative to Token Vault.** Descope's Outbound Apps provide similar upstream token storage and retrieval, with a cleaner API (`get_connection_token()`). The risk is that Custom Providers (inbound auth) and Outbound Apps (outbound token management) may not integrate seamlessly — the PoC must validate this. + +3. **Pricing is transparent.** Pro plan at $249/mo for 10K MAU is a known cost, unlike Auth0's opaque Enterprise tier. 
+ +**Descope PoC validation targets:** +- Configure Mural as a custom OAuth provider (authorization, token, user info endpoints) +- Authenticate through the custom provider — confirm Descope creates a user record with Mural identity +- Configure an Outbound App for Mural — determine whether authenticating via the custom provider automatically populates the Outbound App's token store +- Retrieve Mural access tokens via `get_connection_token()` — confirm Banksy can access Mural tokens programmatically +- Test the refresh flow — confirm `forceRefresh` obtains fresh Mural tokens +- Enable DCR, configure FastMCP's `DescopeProvider`, connect from Cursor and VS Code — confirm end-to-end flow +- Determine which pricing tier is needed for Custom Providers + Outbound Apps + DCR + +### Tertiary Recommendation: Keycloak as Self-Hosted Fallback + +If both managed providers prove unacceptable (Auth0 too expensive for Token Vault, Descope's custom provider + Outbound Apps integration doesn't work), Keycloak provides a self-hosted path: + +1. **No per-MAU costs.** Infrastructure costs ($50–$200/mo) are fixed regardless of user count. +2. **Full feature set.** Identity brokering + Store Token + DCR + full OIDC stack — all requirements met in a single application. +3. **Operational tradeoff.** The cost is ongoing maintenance: upgrades, HA configuration, database management, certificate rotation, monitoring. + +Keycloak is the right choice if Banksy's team has Kubernetes infrastructure (likely, given Mural's deployment patterns) and the operational capacity to maintain another stateful service. It is not the right choice if the team wants to minimize infrastructure scope. + +### Build vs. Buy Framing + +The twelve providers evaluated represent the "buy" side of the spectrum (managed service or self-hosted open-source). 
The "build" alternative is for Mural to evolve its own OAuth infrastructure: + +**What Mural would need to change:** +- Add asymmetric signing (RS256 or ES256) for OAuth tokens, alongside or replacing HS256 +- Expose a JWKS endpoint (`/.well-known/jwks.json`) +- Add OAuth/OIDC Discovery (`/.well-known/openid-configuration` or `/.well-known/oauth-authorization-server`) +- Optionally add DCR support (without DCR, Banksy uses `OAuthProxy`) + +**What this would enable:** Mural could serve as the Layer 1 IdP. The IDE would authenticate with Mural (through Banksy's PRM → Mural's discovery → Mural's OAuth), receive a Mural-issued RS256 JWT, and present it to Banksy. Banksy would validate via Mural's JWKS. Layer 2 (Mural API tokens) would still be separate per the MCP spec's token passthrough prohibition, but the user authenticates with Mural for both layers — and if Mural supported RFC 8693 Token Exchange, the Layer 2 step could happen server-to-server without additional user interaction. + +**Assessment:** This eliminates the intermediary provider entirely and covers all Mural user segments by definition (every Mural user has a Mural account). However, it depends on Mural platform engineering work with uncertain timeline and scope. The HS256 → RS256 migration in `api/src/core/session/tokens/` affects every token validation path in the mural-api codebase. Adding discovery and JWKS endpoints is net-new infrastructure. This is not a patch — it is a meaningful evolution of Mural's auth platform. + +**Recommended framing:** Pursue a managed provider PoC (Auth0 or Descope) for the near term. In parallel, socialize the "Mural as OIDC provider" requirements with the Mural platform team as a long-term direction. If Mural evolves its OAuth infrastructure, Banksy can migrate from the intermediary provider to Mural-as-IdP later — the resource server architecture (PRM, JWT validation, Layer 2 token storage) remains the same regardless of who issues the Layer 1 JWT. 
diff --git a/fastmcp-migration/banksy-research/banksy-architecture-research.md b/fastmcp-migration/banksy-research/banksy-architecture-research.md new file mode 100644 index 0000000..ec4b9d8 --- /dev/null +++ b/fastmcp-migration/banksy-research/banksy-architecture-research.md @@ -0,0 +1,1456 @@ +# Banksy Architecture Research: Server Topology, Monorepo Layout, and Code Organization + +## 1. Executive Summary + +Banksy is a monorepo of MCP capabilities that serves different audiences with different +tool sets under different authentication modes. The migration from TypeScript/xmcp to +Python/FastMCP requires resolving three interrelated design questions: server topology +(how auth modes map to deployments), monorepo layout (how the repo accommodates multiple +Python services), and code organization (how tools, auth, and shared code are structured +within the workspace). + +### Server Topology + +The current TypeScript architecture uses two entirely separate deployment modes: an +internal mode (39 tools, SSO proxy + session JWTs) and a public mode (87 tools, Mural +OAuth tokens). Two hard constraints shape the FastMCP migration: + +1. **FastMCP enforces one auth provider per server instance.** The `auth=` parameter + accepts exactly one provider. When a child server is mounted via `mount()`, the + parent's auth applies -- the child's auth is ignored. + +2. **The MCP protocol defines auth at the transport level, not per-tool.** There is no + mechanism in `tools/list` responses for a server to advertise per-tool auth + requirements. + +**Recommendation: Option E -- Deployment Mode Selection with Tag-Based Refinement.** +Build one Docker image. At runtime, `BANKSY_MODE` selects the auth provider and tool +set. Within each mode, tags provide finer-grained client-side filtering. 
+ +### Monorepo Layout + +Beyond the MCP server migration, the repo must house an agent orchestration service +(MCP client, not server) with distinct dependencies (LLM provider SDKs) and potential +for independent deployment. + +**Recommendation: Use uv workspaces from Phase 1** with three workspace members: +`banksy-server` (FastMCP MCP server), `banksy-harness` (MCP client / agent +orchestration), and `banksy-shared` (Pydantic models, database access, auth utilities). +Python workspace members live under `pypackages/` to coexist with the existing +TypeScript `packages/` during the transition. + +### Code Organization + +**The `domains/` concept remains valid inside the server workspace member.** Workspace +boundaries handle dependency isolation and deployment. Domains handle tool organization +within the server. Canvas-mcp tools are absorbed into the server under +`domains/canvas/`, not as a separate workspace member. + + +## 2. Auth x Tool Group Matrix + +### Current State + +| Tool Group | Count | Auth Mode | Token Type | Backend Target | Why | +|---|---|---|---|---|---| +| Internal API tools | 39 | `sso-proxy` | Session JWT | `banksy-mural-api:5678` -> mural-api internal REST | Calls internal endpoints requiring corporate IdP session | +| Public API tools | 87 | `mural-oauth` | OAuth access token | `banksy-public-api:5679` -> mural-api `/api/public/v1` | Calls public API requiring user's Mural OAuth token | + +### Future Tool Groups + +| Tool Group | Expected Auth Mode | Rationale | +|---|---|---| +| canvas-mcp tools | `mural-oauth` (public) | Canvas operations use Mural's public API; no internal endpoints needed | +| Composite tools | Depends on downstream API | A tool calling both internal and public APIs would need both token types; no such tool exists today | +| Utility/diagnostic tools | None | Health checks, echo, version info need no auth | +| Machine-to-machine tools | `m2m` (future) | Worker-to-worker calls using `client_credentials` grant; already 
defined as a type in `auth-mode.ts` | + +### Cross-Mode Tool Overlap + +No tools currently work under both auth modes. The internal and public tool directories +are completely disjoint. Some tool names appear in both directories (e.g., +`get-mural-by-id`, `get-workspace`, `duplicate-mural`, `create-room`), but they import +from different generated clients (`client.banksy-mural-api` vs `client.banksy-public-api`) +and call different backend APIs. + +Both modes share infrastructure code: + +- `src/lib/call-mural-tool.ts` -- unified tool-calling function (backend URL differs per mode via `BANKSY_MURAL_API_URL`) +- `src/lib/auth/` -- Better Auth provider, session management, user context +- `src/lib/db/` -- PostgreSQL connection pool, token storage +- `src/lib/config.ts` -- centralized config with `AUTH_MODE` env var +- `src/lib/mural-session/` -- token refresh, content session creation +- `src/middleware.ts` -- Express middleware for auth and mural session + +### Auth Mode as Capability Constraint + +The auth mode is not merely a policy choice -- it is a capability constraint: + +- Internal tools call `banksy-mural-api` which proxies to mural-api's internal REST. + These endpoints require session JWTs obtained through Session Activation (SSO proxy -> + Google OAuth -> session creation). A Mural OAuth token cannot authenticate to these + endpoints. + +- Public tools call `banksy-public-api` which proxies to mural-api's `/api/public/v1`. + These endpoints require Mural OAuth access tokens with specific scopes + (`murals:read,murals:write,workspaces:read,...`). A session JWT cannot authenticate + to these endpoints. + +- The `callMuralTool` function retrieves the active token type via + `getActiveTokenType()`, which reads from `config().authMode`. The token type + (`session` vs `oauth`) determines which credentials are fetched from PostgreSQL and + sent to the backend. 
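In the Python port, this dispatch reduces to a small mapping. A hedged sketch -- the enum values mirror the current auth modes, while the function and type names are illustrative:

```python
from enum import Enum

class AuthMode(str, Enum):
    SSO_PROXY = "sso-proxy"
    MURAL_OAUTH = "mural-oauth"

def active_token_type(mode: AuthMode) -> str:
    """Session JWTs authenticate internal tools; Mural OAuth access tokens
    authenticate public tools. The wrong type fails at the backend."""
    return "session" if mode is AuthMode.SSO_PROXY else "oauth"

assert active_token_type(AuthMode.SSO_PROXY) == "session"
```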
+ +This means that even if both tool sets were registered on a single server, a tool from +the wrong auth mode would fail at the API call layer because the token type would be +wrong for the target endpoint. + + +## 3. FastMCP Capabilities and Constraints + +### 3.1 Single Auth Provider Per Server + +FastMCP accepts exactly one auth provider per server instance: + +```python +mcp = FastMCP("Banksy", auth=auth_provider) +``` + +The provider: +1. Registers discovery routes via `get_routes()` (e.g., `/.well-known/oauth-protected-resource`) +2. Registers HTTP middleware via `get_middleware()` (e.g., `BearerAuthBackend`, `AuthContextMiddleware`) +3. Supplies `verify_token()` for token validation + +All HTTP routes -- MCP transport, custom routes, and mounted sub-server routes -- share +the same auth middleware. There is no per-route or per-endpoint auth configuration built +into FastMCP. + +### 3.2 mount() Auth Inheritance + +Verified against FastMCP source (`server.py`, `transport.py`, `http.py`): + +- When `parent.mount(child)` is called, the child is wrapped as a `FastMCPProvider` + and added to the parent's provider list. +- Only the parent's `http_app()` is built. The child's `http_app()` is never + constructed. +- The parent's auth middleware applies to all traffic, including requests that invoke + child tools. +- The child's `auth=` parameter is completely ignored at runtime. + +| Scenario | Who controls auth? | Effect | +|---|---|---| +| `parent.mount(child)` where child has `auth=X` | Parent | Child's `auth=X` is ignored | +| `parent.mount(child)` where child has `auth=None` | Parent | Fine; parent's auth applies to all | +| Child tool has `tool.auth=AuthCheck` | Parent's token is used | `AuthCheck` runs against the parent's verified token | + +**Implication:** `mount()` cannot be used to create tool groups with different auth +strategies. All mounted sub-servers share the parent's single auth provider. 
+ +### 3.3 Per-Tool Auth Checks (Component-Level) + +FastMCP supports `auth=AuthCheck` on individual tools: + +```python +@mcp.tool(auth=my_auth_check) +def admin_tool() -> str: ... +``` + +During `list_tools()` and `get_tool()`, the framework calls `run_auth_checks(tool.auth, +AuthContext(token=token))` where `token` is the token from the parent's auth middleware. + +This is a **visibility filter**, not a separate auth strategy. It can hide tools from +users who lack certain claims in their token, but it cannot change how the token was +obtained or validate a different token type. + +When tools are mounted from a child server, `FastMCPProviderTool.wrap()` does not copy +the child tool's `auth` field. The wrapped tool has `auth=None` on the parent. To add +per-tool auth checks to mounted tools, a custom Transform is needed. + +### 3.4 Transforms Pipeline + +Custom `Transform` subclasses filter or modify tools as they flow from providers to +clients: + +```python +class TagFilter(Transform): + def __init__(self, required_tags: set[str]): + self.required_tags = required_tags + + async def list_tools(self, tools: Sequence[Tool]) -> Sequence[Tool]: + return [t for t in tools if t.tags & self.required_tags] + + async def get_tool(self, name: str, call_next: GetToolNext) -> Tool | None: + tool = await call_next(name) + if tool and tool.tags & self.required_tags: + return tool + return None +``` + +Transforms can be applied at the server level (all providers) or provider level (specific +mount). They are a visibility mechanism, not a security boundary -- a client that knows a +tool's name can still attempt to call it unless a `get_tool` transform also blocks the +lookup. + +### 3.5 Server-Level Tag Filtering + +FastMCP provides `enable()` for startup-time tag filtering: + +```python +mcp = FastMCP("Production") +mcp.mount(api_server, namespace="api") +mcp.enable(tags={"production"}, only=True) +``` + +This applies tag filtering recursively to all mounted servers. 
Combined with +`tags={...}` on tool definitions, this provides a declarative way to control which +tools are visible at the server level. + +### 3.6 Custom Routes + +`custom_route()` registers non-MCP HTTP endpoints on the same Starlette app: + +```python +@mcp.custom_route("/health", methods=["GET"]) +async def health(request): + return JSONResponse({"status": "ok"}) +``` + +Custom routes from mounted sub-servers propagate to the parent (per current FastMCP +docs). Auth on custom routes comes from the parent's `AuthenticationMiddleware` but does +NOT get `RequireAuthMiddleware` (which only wraps the MCP transport route). Protected +custom routes must check `isinstance(request.user, AuthenticatedUser)` explicitly. + +### 3.7 MCP Protocol Constraints + +From the MCP specification (2025-03-26): + +- **Transport-level auth:** OAuth is defined at the HTTP transport layer. Every HTTP + request from client to server must include `Authorization: Bearer `. +- **No per-tool auth:** `tools/list` responses contain tool name, description, + inputSchema, and annotations. There is no field for auth requirements, scopes, or + permissions. +- **No per-tool scopes:** The protocol has no mechanism for a server to advertise that + different tools need different OAuth scopes. +- **Clients support multiple servers:** Cursor, Claude Desktop, and VS Code Copilot all + support connecting to multiple MCP servers simultaneously. Each connection has its own + auth flow. Separate servers per auth mode is transparent to the client. +- **No ecosystem precedent for multi-auth servers:** The MCP ecosystem uniformly assumes + one auth strategy per server. Multi-auth would require custom, non-standard solutions. + + +## 4. Server Topology Options + +### Option A: Separate Servers per Auth Mode (Current Model Preserved) + +One FastMCP server per auth mode, each with its own `auth=` provider. +Different Docker images, different deployments. Direct port of the current TS +architecture. 
+ +| Dimension | Assessment | +|---|---| +| Auth handling | Clean -- each server has exactly one auth strategy | +| Tool visibility | Hard isolation -- a tool only exists in one server | +| Operational model | Two Docker images, two deployments, two scaling configs | +| Code sharing | Shared library code in the monorepo, no runtime sharing | +| Growth path | New auth mode = new Docker image, new deployment | +| CI/CD | Duplicated build pipelines | + +**Pros:** Zero risk of auth cross-contamination. Proven model. Independent scaling. + +**Cons:** Duplicated Docker images and deployment configs. Changes to shared code require +rebuilding both images. Operational overhead scales linearly with auth modes. + +### Option B: Single Server with Auth Multiplexer + +One FastMCP server with a custom `TokenVerifier` that inspects the token format and +dispatches to different verification logic. + +**Verdict: Not recommended.** Token type sniffing is fragile and a security risk. SSO +JWTs and Mural OAuth tokens are both JWTs; distinguishing them by format inspection is +unreliable. A bug in the multiplexer could grant access to the wrong tool set. No +ecosystem precedent. + +### Option C: Single Image, Auth Mode as Runtime Config + +One Docker image with a `BANKSY_MODE` environment variable. At startup, the server reads +the mode, configures the matching auth provider, and registers only the tools for that +mode. + +| Dimension | Assessment | +|---|---| +| Auth handling | Clean -- each deployment instance has one auth provider | +| Tool visibility | Startup-time registration -- unregistered tools don't exist | +| Operational model | One Docker image, multiple deployments with different env vars | +| Code sharing | Full runtime sharing of infrastructure code | +| Growth path | New auth mode = new enum value, new registration function | +| CI/CD | Single build pipeline | + +**Pros:** Single Docker image simplifies CI/CD. Auth isolation preserved per deployment. 
+Clean startup-time tool registration. Shared infrastructure code runs once. + +**Cons:** Composite tools needing multiple auth modes cannot run in any single mode. +Mode validation must happen at startup. Slightly more complex entrypoint logic. + +### Option D: Server-per-Mount with Shared Infrastructure + +Multiple FastMCP server instances in one process, each with its own `auth=`. Shared +database connections, HTTP clients, config. Different ports or path prefixes behind a +reverse proxy. + +**Verdict: Not recommended.** Significant operational complexity (reverse proxy, +multi-port management) provides no meaningful advantage over Option C. Same isolation, +more complexity. + +### Option E: Hybrid -- Deployment Mode Selection with Tag-Based Refinement (Recommended) + +Combines Option C's runtime mode selection with tag-based visibility filtering within +each mode. `BANKSY_MODE` selects the auth provider and broad tool set. Tags and +transforms provide finer-grained organization within that boundary. + +| Dimension | Assessment | +|---|---| +| Auth handling | Clean -- deployment mode selects one auth provider | +| Tool visibility | Layered: deployment (coarse), tags/transforms (fine) | +| Operational model | One Docker image, multiple deployments | +| Growth path | New mode for new auth; new tags for new groupings within a mode | +| Flexibility | Tags enable client-side filtering without server changes | + +**Pros:** All of Option C's benefits. Tags provide within-mode organization. `enable()` +can create specialized deployments. Custom transforms can implement runtime visibility +rules. Enterprise per-customer tool subsets achievable via tags + `enable()`. + +**Cons:** Tags are not a security boundary. Tag taxonomy requires design and +maintenance. + + +## 5. 
Recommended Server Topology: Option E Detailed + +### 5.1 Runtime Mode Selection + +``` +BANKSY_MODE=internal -> FastMCP(auth=SSOProxyAuth) + internal tools + internal tags +BANKSY_MODE=public -> FastMCP(auth=MuralOAuthAuth) + public tools + public tags +BANKSY_MODE=dev -> FastMCP(auth=None) + all tools + all tags +``` + +At startup: + +1. Read `BANKSY_MODE` from environment (default: fail with clear error if unset) +2. Instantiate the matching auth provider (Layer 1) +3. Create the FastMCP server with `auth=provider` +4. Register tool domains for the mode (e.g., `register_internal_tools(mcp)` or + `register_public_tools(mcp)`) +5. Register common infrastructure: health endpoint, auth callback routes +6. Optionally apply tag filters via `mcp.enable(tags={...}, only=True)` if a + specialized deployment is needed + +### 5.2 Startup Flow + +```python +from banksy_server.config import settings + +def create_server() -> FastMCP: + auth = create_auth_provider(settings.banksy_mode) + mcp = FastMCP("Banksy", auth=auth) + + register_common_routes(mcp) # /health, /version + + match settings.banksy_mode: + case "internal": + register_internal_tools(mcp) + register_session_activation_routes(mcp) + case "public": + register_public_tools(mcp) + register_mural_oauth_routes(mcp) + case "dev": + register_internal_tools(mcp) + register_public_tools(mcp) + register_session_activation_routes(mcp) + register_mural_oauth_routes(mcp) + + if settings.enabled_tags: + mcp.enable(tags=settings.enabled_tags, only=True) + + return mcp +``` + +### 5.3 Auth Provider per Mode + +**Internal mode (`sso-proxy`):** +- Layer 1: `OAuthProxy` or `RemoteAuthProvider` with Google IdP via SSO proxy +- Layer 2: Session Activation flow stores session JWTs in `mural_tokens` table +- Tools call `banksy-mural-api` (internal REST) with session JWTs + +**Public mode (`mural-oauth`):** +- Layer 1: `OAuthProxy` wrapping Mural's OAuth authorization server +- Layer 2: Mural OAuth access/refresh tokens stored in 
`mural_tokens` table +- Tools call mural-api's public API with OAuth access tokens + +**Dev mode:** +- Layer 1: No auth (`auth=None` or `StaticTokenVerifier`) +- Layer 2: Tokens loaded from dev seed data or `DISABLE_AUTH=true` bypass +- Both tool sets registered; backend URLs configurable + +### 5.4 Tag-Based Refinement + +Within each mode, tags organize tools along orthogonal dimensions: + +```python +@mcp.tool(tags={"murals", "read"}) +def get_mural_by_id(mural_id: str) -> dict: ... + +@mcp.tool(tags={"murals", "write"}) +def create_mural(title: str, workspace_id: str) -> dict: ... + +@mcp.tool(tags={"widgets", "write"}) +def create_sticky_note(mural_id: str, text: str) -> dict: ... +``` + +Clients can filter by tags to get focused tool sets. A specialized deployment could +use `enable(tags={"murals"}, only=True)` to expose only mural-related tools. + +### 5.5 How MCP Clients Connect + +MCP clients (Cursor, Claude Desktop, VS Code Copilot) support multiple simultaneous +server connections. A typical user's MCP configuration: + +```json +{ + "mcpServers": { + "banksy-internal": { + "url": "https://banksy-internal.example.com/mcp" + }, + "banksy-public": { + "url": "https://banksy-public.example.com/mcp" + } + } +} +``` + +Each connection has its own independent OAuth flow. The client handles auth for each +server separately. From the user's perspective, all tools appear in a unified list +regardless of which server provides them. + + +## 6. Monorepo Workspace Architecture + +### 6.1 Why Workspaces + +The migration strategy's original escape hatch states: "If a second Python service is +needed in the same repo, graduate to uv workspaces." That trigger has arrived. Beyond +the FastMCP server, the repo must house an agent orchestration service (MCP client) with +distinct dependencies and potential for independent deployment. 
+ +Starting with uv workspaces in Phase 1 costs ~1 hour of additional setup (3 +`pyproject.toml` files instead of 1, workspace config, `--package` flags). Restructuring +from single package to workspaces later costs 2-4 hours of churn (moving files, updating +every import, rewriting configs) plus regression risk. Every file, test, and Docker layer +created before the switch would need to be moved. The compounding cost grows linearly +with code written before restructuring. + +### 6.2 uv Workspace Mechanics + +A uv workspace is declared via `[tool.uv.workspace]` in a root `pyproject.toml`: + +```toml +[tool.uv.workspace] +members = ["pypackages/*"] +``` + +**Cross-member dependencies** use `[tool.uv.sources]` with `workspace = true`: + +```toml +# pypackages/server/pyproject.toml +[project] +dependencies = ["banksy-shared"] + +[tool.uv.sources] +banksy-shared = { workspace = true } +``` + +This resolves `banksy-shared` from the workspace rather than PyPI. Workspace member +dependencies are installed as editable by default. + +**Single lockfile:** The workspace produces one `uv.lock` resolving dependencies for all +members together, guaranteeing version consistency. If `banksy-server` and +`banksy-harness` both depend on `httpx`, they get the same version. + +**Targeted execution:** + +```bash +uv run --package banksy-server pytest # Run server tests +uv sync --package banksy-harness # Sync only harness deps +uv run --exact --package banksy-server pytest # Strict isolation check +``` + +**`requires-python` constraint:** All members must share a compatible `requires-python`. +For this project, all members use `>=3.14`. 
+ +### 6.3 Package Boundary Design + +Three workspace members: + +| Member | Role | Key Dependencies | +|--------|------|-----------------| +| **banksy-server** | FastMCP MCP server serving tools to IDE clients | `fastmcp`, `httpx`, `starlette`, `sqlalchemy`, `pydantic-settings` | +| **banksy-harness** | MCP client, agent orchestration loop | `openai`, `anthropic`, `mcp` SDK, `pydantic` | +| **banksy-shared** | Shared library consumed by server and harness | `pydantic`, `sqlalchemy`, `httpx` | + +``` +banksy-server ──depends-on──> banksy-shared +banksy-harness ──depends-on──> banksy-shared +``` + +Neither `banksy-server` nor `banksy-harness` depends on each other. Both depend only on +`banksy-shared`. This creates a clean DAG with no cycles. + +**Why three (not two, not four):** +- Too few (1-2): A single package forces LLM SDKs into the server. Two packages + (server + harness, no shared) either duplicates code or couples the harness to the + server's full dependency tree. +- Too many (4+): Splitting `banksy-shared` into sub-packages (`banksy-db`, + `banksy-auth`, `banksy-models`) adds `pyproject.toml` overhead without meaningful + isolation -- these would almost always be installed together. + +### 6.4 Agent Harness: Workspace-Relevant Constraints + +The agent harness is an MCP **client** that connects to MCP servers and orchestrates +agent loops with LLM providers. Its dependency profile diverges significantly from the +server: + +- **LLM SDKs**: `openai`, `anthropic`, `azure-identity`. The `openai` package alone + pulls in 6+ transitive dependencies. The server does not need any of them. +- **MCP client SDK**: Uses `mcp` SDK or `fastmcp.Client` for connecting to servers. + Different usage than the server side. +- **No FastMCP server deps**: The harness does not import server machinery, middleware, + or transport. + +This dependency divergence motivates a separate workspace member. 
Without it, deploying +the MCP server would install LLM SDKs unnecessarily (~30-50MB of packages, increased +security surface area). + +Deployment is uncertain: the harness could run in-process, as a sidecar, or as a +separate deployment. A separate workspace member with its own entry point supports all +three models naturally. + +Multiple agents with different focuses may exist over time, but they will likely share +the same base dependencies and differ in configuration/prompts, not code structure. +They are instances of a harness, not separate packages. If a future agent has truly +different dependencies, it can be added as a new workspace member -- the glob +`pypackages/*` picks it up automatically. + +### 6.5 Canvas-mcp: Domain, Not Package + +The canvas-mcp alignment assessment concluded that canvas-mcp's tools should be copied +into the server rather than maintained as a separate workspace member: + +1. canvas-mcp has only two trivial tools (`check_health`, `canvas_haiku`). The overhead + of a separate workspace member is not justified. +2. `mount()` ignores child auth -- the parent's auth applies anyway. +3. Custom routes and middleware would need re-registration on the parent. +4. The domain concept within the server (`domains/canvas/`) handles organizational + separation. + +Canvas-mcp tools are absorbed into `banksy-server` under `domains/canvas/` with +`register_canvas_tools(mcp)`. + +### 6.6 Workspace Limitations + +1. **No import boundary enforcement.** Python's module system does not prevent member A + from importing a transitive dependency installed for member B. Linting rules (ruff + import conventions) can partially enforce this. +2. **Single `requires-python`.** All members target compatible Python versions. +3. **Single lockfile resolution.** If two members need incompatible versions of the same + transitive dependency, the workspace cannot resolve. Rare in practice. + + +## 7. 
Recommended Repo Layout + +### 7.1 Directory Tree + +``` +banksy/ +├── packages/ # EXISTING TS (untouched during transition) +│ ├── banksy-core/ +│ ├── banksy-mural-api/ +│ └── banksy-public-api/ +├── pypackages/ # Python workspace members +│ ├── server/ +│ │ ├── pyproject.toml +│ │ ├── src/ +│ │ │ └── banksy_server/ +│ │ │ ├── __init__.py +│ │ │ ├── server.py +│ │ │ ├── config.py +│ │ │ ├── mural_api.py +│ │ │ ├── spa.py +│ │ │ ├── auth/ +│ │ │ │ ├── __init__.py +│ │ │ │ ├── providers.py +│ │ │ │ ├── sso_proxy.py +│ │ │ │ ├── mural_oauth.py +│ │ │ │ ├── token_manager.py +│ │ │ │ └── token_verifier.py +│ │ │ ├── domains/ +│ │ │ │ ├── __init__.py +│ │ │ │ ├── internal/ +│ │ │ │ │ ├── __init__.py +│ │ │ │ │ └── tools.py +│ │ │ │ ├── public/ +│ │ │ │ │ ├── __init__.py +│ │ │ │ │ └── tools.py +│ │ │ │ ├── canvas/ +│ │ │ │ │ ├── __init__.py +│ │ │ │ │ └── tools.py +│ │ │ │ └── shared/ +│ │ │ │ ├── __init__.py +│ │ │ │ └── tools.py +│ │ │ ├── routes/ +│ │ │ │ ├── __init__.py +│ │ │ │ ├── health.py +│ │ │ │ ├── session_activation.py +│ │ │ │ └── mural_oauth_callback.py +│ │ │ └── middleware/ +│ │ │ ├── __init__.py +│ │ │ ├── logging.py +│ │ │ └── metrics.py +│ │ └── tests/ +│ │ ├── conftest.py +│ │ ├── test_tools/ +│ │ ├── test_auth/ +│ │ └── test_integration/ +│ ├── harness/ +│ │ ├── pyproject.toml +│ │ ├── src/ +│ │ │ └── banksy_harness/ +│ │ │ ├── __init__.py +│ │ │ ├── agent.py +│ │ │ ├── config.py +│ │ │ ├── llm/ +│ │ │ │ ├── __init__.py +│ │ │ │ ├── openai.py +│ │ │ │ └── anthropic.py +│ │ │ └── mcp_client/ +│ │ │ ├── __init__.py +│ │ │ └── client.py +│ │ └── tests/ +│ │ ├── conftest.py +│ │ ├── test_agent/ +│ │ └── test_llm/ +│ └── shared/ +│ ├── pyproject.toml +│ ├── src/ +│ │ └── banksy_shared/ +│ │ ├── __init__.py +│ │ ├── models/ +│ │ │ ├── __init__.py +│ │ │ └── tokens.py +│ │ ├── auth/ +│ │ │ ├── __init__.py +│ │ │ └── token_utils.py +│ │ ├── mural_client/ +│ │ │ ├── __init__.py +│ │ │ └── client.py +│ │ └── observability/ +│ │ ├── __init__.py +│ │ └── 
logging.py +│ └── tests/ +│ ├── conftest.py +│ ├── test_models/ +│ └── test_auth/ +├── conftest.py # Shared test fixtures (root level) +├── ui/ # React SPA (standalone Node.js project) +│ ├── package.json +│ └── src/ +├── migrations/ # Alembic (shared DB schema) +│ ├── alembic.ini +│ ├── env.py +│ └── versions/ +├── pyproject.toml # Workspace root +├── uv.lock # Single lockfile +├── .python-version # 3.14 +├── package.json # EXISTING TS workspace root +├── package-lock.json # EXISTING npm lockfile +└── .github/workflows/ + ├── build.yml # EXISTING TS Docker builds + ├── quality.yml # EXISTING TS lint/test + └── python.yml # NEW Python CI +``` + +### 7.2 Workspace Root `pyproject.toml` + +```toml +[project] +name = "banksy-workspace" +version = "0.0.0" +description = "Banksy monorepo workspace root" +requires-python = ">=3.14" + +[tool.uv.workspace] +members = ["pypackages/*"] + +[build-system] +requires = ["hatchling"] +build-backend = "hatchling.build" + +[dependency-groups] +dev = [ + "pyright>=1.1.0", + "pytest>=8.0.0", + "pytest-asyncio>=0.24.0", + "ruff>=0.8.0", + "pre-commit>=4.0.0", +] + +[tool.ruff] +target-version = "py314" +src = ["pypackages/*/src", "pypackages/*/tests"] + +[tool.ruff.lint] +select = ["E", "F", "W", "I", "UP", "B", "SIM", "TCH"] + +[tool.pyright] +pythonVersion = "3.14" +typeCheckingMode = "strict" +include = ["pypackages/*/src", "pypackages/*/tests"] + +[tool.pytest.ini_options] +asyncio_mode = "auto" +``` + +### 7.3 Server `pyproject.toml` + +```toml +[project] +name = "banksy-server" +version = "0.1.0" +description = "Banksy MCP server" +requires-python = ">=3.14" +dependencies = [ + "banksy-shared", + "fastmcp>=3.1.0", + "httpx>=0.28.0", + "pydantic-settings>=2.0.0", + "sqlalchemy[asyncio]>=2.0.0", + "asyncpg>=0.30.0", + "alembic>=1.14.0", +] + +[tool.uv.sources] +banksy-shared = { workspace = true } + +[project.scripts] +banksy-server = "banksy_server.server:main" + +[build-system] +requires = ["hatchling"] +build-backend = 
"hatchling.build" + +[tool.pytest.ini_options] +testpaths = ["tests"] +``` + +### 7.4 Harness `pyproject.toml` + +```toml +[project] +name = "banksy-harness" +version = "0.1.0" +description = "Banksy agent harness (MCP client)" +requires-python = ">=3.14" +dependencies = [ + "banksy-shared", + "openai>=1.0.0", + "anthropic>=0.40.0", + "azure-identity>=1.19.0", + "mcp>=1.0.0", +] + +[tool.uv.sources] +banksy-shared = { workspace = true } + +[project.scripts] +banksy-harness = "banksy_harness.agent:main" + +[build-system] +requires = ["hatchling"] +build-backend = "hatchling.build" + +[tool.pytest.ini_options] +testpaths = ["tests"] +``` + +### 7.5 Shared `pyproject.toml` + +```toml +[project] +name = "banksy-shared" +version = "0.1.0" +description = "Shared models, auth, and utilities for banksy" +requires-python = ">=3.14" +dependencies = [ + "pydantic>=2.0.0", + "sqlalchemy[asyncio]>=2.0.0", + "httpx>=0.28.0", +] + +[build-system] +requires = ["hatchling"] +build-backend = "hatchling.build" + +[tool.pytest.ini_options] +testpaths = ["tests"] +``` + +### 7.6 Docker Builds + +**Server Dockerfile** (`Dockerfile.server`): + +```dockerfile +FROM python:3.14-slim AS builder +COPY --from=ghcr.io/astral-sh/uv:0.10 /uv /uvx /bin/ +WORKDIR /app + +# Copy workspace root config and lockfile +COPY pyproject.toml uv.lock ./ + +# Copy only the member pyproject.toml files needed for resolution +COPY pypackages/server/pyproject.toml pypackages/server/pyproject.toml +COPY pypackages/shared/pyproject.toml pypackages/shared/pyproject.toml + +# Install dependencies (no source code yet) +RUN --mount=type=cache,target=/root/.cache/uv \ + uv sync --frozen --no-install-workspace --no-dev --package banksy-server + +# Copy source code +COPY pypackages/server/src pypackages/server/src +COPY pypackages/shared/src pypackages/shared/src + +# Install workspace members (non-editable for production) +RUN --mount=type=cache,target=/root/.cache/uv \ + uv sync --locked --no-dev --no-editable 
--package banksy-server + +# SPA build stage (Node.js) +FROM node:22 AS spa-builder +WORKDIR /app/ui +COPY ui/package.json ui/package-lock.json ./ +RUN npm ci +COPY ui/ . +RUN npm run build + +# Production image +FROM python:3.14-slim +COPY --from=builder /app/.venv /app/.venv +COPY --from=spa-builder /app/ui/dist /app/ui/dist +COPY migrations /app/migrations +ENV PATH="/app/.venv/bin:$PATH" +WORKDIR /app +CMD ["banksy-server"] +``` + +**Harness Dockerfile** (`Dockerfile.harness`): + +```dockerfile +FROM python:3.14-slim AS builder +COPY --from=ghcr.io/astral-sh/uv:0.10 /uv /uvx /bin/ +WORKDIR /app + +COPY pyproject.toml uv.lock ./ +COPY pypackages/harness/pyproject.toml pypackages/harness/pyproject.toml +COPY pypackages/shared/pyproject.toml pypackages/shared/pyproject.toml + +RUN --mount=type=cache,target=/root/.cache/uv \ + uv sync --frozen --no-install-workspace --no-dev --package banksy-harness + +COPY pypackages/harness/src pypackages/harness/src +COPY pypackages/shared/src pypackages/shared/src + +RUN --mount=type=cache,target=/root/.cache/uv \ + uv sync --locked --no-dev --no-editable --package banksy-harness + +FROM python:3.14-slim +COPY --from=builder /app/.venv /app/.venv +ENV PATH="/app/.venv/bin:$PATH" +WORKDIR /app +CMD ["banksy-harness"] +``` + +The server image includes FastMCP, the SPA, and Alembic migrations but not LLM SDKs. +The harness image includes LLM SDKs but not FastMCP server machinery, the SPA, or +migrations. Both share `banksy-shared`. + +### 7.7 CI + +```yaml +# .github/workflows/python.yml +name: Python CI + +on: [push, pull_request] + +jobs: + lint: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + - uses: astral-sh/setup-uv@v4 + - run: uv sync --frozen + - run: uv run ruff check . + - run: uv run ruff format --check . 
+ - run: uv run pyright + + test-server: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + - uses: astral-sh/setup-uv@v4 + - run: uv sync --frozen --package banksy-server + - run: uv run --package banksy-server pytest + + test-harness: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + - uses: astral-sh/setup-uv@v4 + - run: uv sync --frozen --package banksy-harness + - run: uv run --package banksy-harness pytest + + test-shared: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + - uses: astral-sh/setup-uv@v4 + - run: uv sync --frozen --package banksy-shared + - run: uv run --package banksy-shared pytest +``` + +### 7.8 Test Co-location + +Tests are co-located inside each workspace member (`pypackages/server/tests/`, +`pypackages/harness/tests/`, `pypackages/shared/tests/`) rather than in a top-level +`tests/` directory. This is the dominant pattern for uv workspaces (modeled after Cargo) +and keeps tests coupled to the code they verify. + +A root-level `conftest.py` provides shared fixtures (database factories, mock Mural API +responses). Per-member `conftest.py` files provide fixtures specific to that member +(mock FastMCP server instances, domain-specific test data). pytest auto-discovers +`conftest.py` files up the directory tree. + +Per-member `[tool.pytest.ini_options]` sets `testpaths = ["tests"]`, so +`uv run --package X pytest` discovers the correct tests automatically. For stricter +dependency isolation, `uv run --exact --package X pytest` ensures only that member's +declared dependencies are available. + +### 7.9 Rejected Layout Alternatives + +**2-Member Workspace (Server + Harness, No Shared):** The harness would depend on the +server package to access shared code, coupling their deployments and install footprints. +The harness `pip install` would pull FastMCP and all server deps. Rejected. 
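For contrast, a sketch of what the harness manifest would look like under the rejected 2-member layout (hypothetical file; the dependency list is illustrative):

```toml
# Hypothetical pypackages/harness/pyproject.toml in the rejected 2-member layout
[project]
name = "banksy-harness"
requires-python = ">=3.14"
dependencies = [
    "banksy-server",   # drags in fastmcp, sqlalchemy, asyncpg, alembic transitively
    "openai>=1.0.0",
    "anthropic>=0.40.0",
]

[tool.uv.sources]
banksy-server = { workspace = true }
```

Every harness install would carry the server's full dependency tree -- exactly the coupling the 3-member design avoids.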
+ +**Single Package with Optional Dependency Groups:** No import isolation, no independent +deployment, optional deps not enforced at import time. Adequate if the harness were a +small add-on; it isn't. Rejected. + +### 7.10 Transition Period Coexistence + +- `packages/` holds TypeScript npm workspace members (read-only reference) +- `pypackages/` holds Python uv workspace members (new code) +- `package.json` and `pyproject.toml` coexist at the repo root +- `.github/workflows/quality.yml` handles TS CI; `.github/workflows/python.yml` handles + Python CI +- No naming collision: npm workspaces use `packages/*`, uv workspaces use `pypackages/*` + +### 7.11 After TS Cleanup + +``` +banksy/ +├── pypackages/ # Optionally renamed to packages/ +│ ├── server/ +│ │ ├── pyproject.toml +│ │ ├── src/banksy_server/ +│ │ └── tests/ +│ ├── harness/ +│ │ ├── pyproject.toml +│ │ ├── src/banksy_harness/ +│ │ └── tests/ +│ └── shared/ +│ ├── pyproject.toml +│ ├── src/banksy_shared/ +│ └── tests/ +├── conftest.py +├── ui/ +├── migrations/ +├── pyproject.toml # Workspace root +├── uv.lock +├── .python-version +├── Dockerfile.server +├── Dockerfile.harness +├── Makefile +└── .github/workflows/ +``` + +Once `packages/` (TS) is deleted, `pypackages/` can optionally be renamed to `packages/` +for simplicity, updating the workspace glob accordingly. + +### 7.12 Import Examples + +```python +# From banksy-server code: +from banksy_shared.models.tokens import TokenRecord +from banksy_shared.auth.token_utils import verify_token +from banksy_server.domains.internal.tools import get_mural_content +from banksy_server.config import ServerSettings + +# From banksy-harness code: +from banksy_shared.models.tokens import TokenRecord +from banksy_shared.mural_client.client import MuralApiClient +from banksy_harness.llm.openai import call_openai +from banksy_harness.config import HarnessSettings +``` + +The harness never imports from `banksy_server`. The server never imports from +`banksy_harness`. 
Both import from `banksy_shared`. + +### 7.13 Adding Future Workspace Members + +When a new service is needed (e.g., a webhook receiver, a data pipeline): + +```bash +cd pypackages +uv init new-service --package +``` + +This creates `pypackages/new-service/pyproject.toml` and the `src/` layout. The +workspace glob `pypackages/*` picks it up automatically. + + +## 8. Code Organization: Domains within Workspace + +### 8.1 Directory Structure + +Within the `banksy-server` workspace member, tools are organized by domain: + +``` +pypackages/server/src/banksy_server/ + __init__.py + server.py # Entry point: reads BANKSY_MODE, wires auth + domains + config.py # pydantic-settings with BANKSY_MODE, DB URLs, auth config + mural_api.py # FastMCP.from_openapi() integration + spa.py # SpaStaticFiles for serving the React auth UI + auth/ + __init__.py + providers.py # create_auth_provider(mode) -> AuthProvider | None + sso_proxy.py # OAuthProxy/RemoteAuthProvider for SSO proxy mode + mural_oauth.py # OAuthProxy wrapping Mural OAuth + token_manager.py # Layer 2: per-user Mural token CRUD and refresh + token_verifier.py # BanksyTokenVerifier (custom subclass of TokenVerifier) + domains/ + __init__.py + internal/ + __init__.py # register_internal_tools(mcp) function + tools.py # Tool definitions (from_openapi or manual) + public/ + __init__.py # register_public_tools(mcp) function + tools.py # Tool definitions (from_openapi or manual) + canvas/ + __init__.py # register_canvas_tools(mcp) function + tools.py + shared/ + __init__.py # Utility tools shared across modes (echo, health) + tools.py + routes/ + __init__.py + health.py # GET /health custom_route + session_activation.py # POST /auth/mural-link/code, /claim (internal mode) + mural_oauth_callback.py # GET /auth/mural-oauth/callback (public mode) + middleware/ + __init__.py + logging.py # MCP protocol middleware for request logging + metrics.py # MCP protocol middleware for Datadog metrics +``` + +Database models and auth 
utilities live in `banksy-shared`, not inside the server: + +``` +pypackages/shared/src/banksy_shared/ + __init__.py + models/ + __init__.py + tokens.py # mural_tokens, pending_connections tables + auth/ + __init__.py + token_utils.py # Token verification, refresh helpers + mural_client/ + __init__.py + client.py # httpx-based Mural API wrapper + observability/ + __init__.py + logging.py # Structured logging, metrics utilities +``` + +### 8.2 Domains vs. Packages: Two Orthogonal Concepts + +Workspace members and domains serve different purposes: + +| Concern | Mechanism | Granularity | +|---------|-----------|-------------| +| Tool organization within the server | `domains/` directories with `register_*_tools(mcp)` | Per-tool-group | +| Dependency isolation between services | uv workspace members | Per-deployable | + +They coexist without conflict. The server workspace member contains `domains/` +internally. The workspace boundary wraps the server (and its domains) as a single unit. + +### 8.3 Key Design Decisions + +**`domains/` directory:** Each tool domain is a self-contained module with a +`register_*_tools(mcp)` function. This function takes a `FastMCP` instance and registers +all tools for that domain, including tags and metadata. The domain owns its tool +definitions, schemas, and any domain-specific helpers. + +**`auth/providers.py` as factory:** A single `create_auth_provider(mode)` function +returns the correct `AuthProvider` for the given mode. This keeps the server entry point +clean and makes mode-specific auth configuration testable in isolation. + +**`routes/` for custom HTTP endpoints:** Non-MCP HTTP routes (health, auth callbacks, +OAuth flows) are organized by concern, not by mode. Mode-specific routes are registered +conditionally in `server.py` based on `BANKSY_MODE`. + +**`shared/` domain:** Tools that work under any auth mode (or no auth) live here. The +`echo` tool and future diagnostic tools belong in this domain. 
+
+**Shared code in `banksy-shared`, not `banksy-server`:** Database models, auth
+utilities, and Mural API client code that both the server and harness consume live in
+the `banksy-shared` workspace member. This avoids forcing the harness to depend on the
+server package.
+
+### 8.4 `from_openapi()` Generated Tools
+
+`FastMCP.from_openapi()` tools are generated programmatically from OpenAPI specs at
+server startup, not hand-written as files. In the domain structure, they are invoked
+from within a domain's registration function:
+
+```python
+# banksy_server/domains/internal/__init__.py
+from fastmcp import FastMCP
+
+from banksy_server.mural_api import create_mural_api_tools
+
+def register_internal_tools(mcp: FastMCP) -> None:
+    from .tools import my_composite_tool
+    mcp.add_tool(my_composite_tool)
+
+    mural_api = create_mural_api_tools()
+    mcp.mount(mural_api, namespace="mural")
+```
+
+The `from_openapi()` call produces a FastMCP sub-server. It can be mounted or its tools
+registered individually. Either way, it is invoked from a domain registration function,
+not represented as files in the directory tree.
+
+
+## 9. Tag Taxonomy
+
+Tags are a client-side filtering mechanism. They organize tools for discoverability and
+enable specialized deployments via `enable()`. Tags are NOT a security boundary.
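Since tags are purely a filtering mechanism, the client-side half can be illustrated with a small sketch. This assumes, hypothetically, that each listed tool exposes its tags as a plain list; the exact wire location of tags in `tools/list` responses depends on the FastMCP version:

```python
def filter_tools(tools: list[dict], required_tags: set[str]) -> list[dict]:
    """Keep only tools whose tag set contains every required tag."""
    return [t for t in tools if required_tags <= set(t.get("tags", []))]


# Sample listing mirroring the taxonomy below: domain + entity + capability tags
tools = [
    {"name": "get_mural_by_id", "tags": ["internal-api", "murals", "read"]},
    {"name": "create_sticky_note", "tags": ["public-api", "widgets", "write"]},
    {"name": "echo", "tags": ["utility", "read"]},
]

read_only = filter_tools(tools, {"read"})  # keeps get_mural_by_id and echo
```

Because this filtering runs in the client (or via `enable()` at startup), it cannot substitute for the auth boundary that `BANKSY_MODE` provides.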
+ +### Taxonomy Dimensions + +Three orthogonal dimensions allow cross-cutting queries: + +**Domain tags** (which API surface): +- `internal-api` -- tools calling mural-api internal REST +- `public-api` -- tools calling mural-api public REST +- `canvas` -- future canvas-mcp tools +- `utility` -- diagnostic and infrastructure tools + +**Entity tags** (what resource type): +- `murals` -- mural CRUD and lifecycle +- `workspaces` -- workspace operations +- `rooms` -- room management +- `widgets` -- widget creation and manipulation +- `templates` -- template operations +- `users` -- user management and invitations +- `assets` -- file and image assets +- `voting` -- voting session management +- `labels` -- label/tag operations +- `search` -- search across entities + +**Capability tags** (what operation type): +- `read` -- read-only operations (GET) +- `write` -- create/update operations (POST/PUT/PATCH) +- `delete` -- destructive operations (DELETE) +- `admin` -- administrative operations (user creation, company setup) + +### Example Tool Tagging + +```python +# Internal mode tool +@mcp.tool(tags={"internal-api", "murals", "read"}) +def get_mural_by_id(mural_id: str) -> dict: ... + +# Public mode tool +@mcp.tool(tags={"public-api", "widgets", "write"}) +def create_sticky_note(mural_id: str, text: str) -> dict: ... + +# Utility tool (no auth needed) +@mcp.tool(tags={"utility", "read"}) +def echo(message: str) -> str: ... 
+``` + +### Cross-Cutting Queries + +Tags enable queries like: +- "All read-only mural tools" -> `{"murals", "read"}` +- "All widget tools" -> `{"widgets"}` +- "All admin tools" -> `{"admin"}` +- "All public API tools for rooms" -> `{"public-api", "rooms"}` + +### Specialized Deployments + +Tags combined with `enable()` enable deployment-time subsetting: + +```python +# Read-only deployment for auditors +mcp.enable(tags={"read"}, only=True) + +# Mural-focused deployment +mcp.enable(tags={"murals"}, only=True) + +# Full deployment (default -- no filtering) +# Don't call enable() at all +``` + +### Tag Governance + +- Domain tags are mandatory: every tool must have exactly one domain tag +- Entity tags are mandatory: every tool must have at least one entity tag +- Capability tags are mandatory: every tool must have exactly one of `read`, `write`, + `delete`, or `admin` +- New tags require updating this taxonomy document +- Tags must use lowercase kebab-case + + +## 10. Migration Plan Impact + +The following updates are needed to the migration execution strategy plan at +`willik-notes/fastmcp-migration/execution-strategy-research/00-migration-execution-strategy.plan.md`. + +### 10.1 Repo Layout (Lines 97-196) + +**Current:** Shows `src/banksy/` as a single Python package at the repo root. + +**Updated:** Replace with the workspace layout from Section 7.1: +- `src/banksy/` becomes `pypackages/server/src/banksy_server/` +- Add `pypackages/shared/src/banksy_shared/` for extracted shared code +- Add `pypackages/harness/src/banksy_harness/` (placeholder or populated later) +- `pyproject.toml` at root becomes workspace root instead of single project +- `tests/` moves from top-level to co-located inside each workspace member + +The "After Cleanup (Phase 9)" layout (lines 160-192) is replaced with the workspace +post-cleanup layout from Section 7.11. 
+ +### 10.2 Resolution of the Mode Merging Open Question + +The open question at approximately line 318 ("A dedicated research prompt is needed to +explore whether mode merging is feasible or whether mode selection should be preserved as +a runtime configuration flag") is now resolved: + +**Mode merging is not recommended. Mode selection is preserved as a runtime +configuration flag (`BANKSY_MODE`).** + +Rationale: Auth modes are capability constraints, not policy choices. Internal and public +tools call different APIs with incompatible token types. FastMCP's one-auth-per-server +constraint means a single server cannot cleanly handle multiple auth strategies. MCP +clients support multiple servers, so separate deployments per auth mode is transparent to +users. + +### 10.3 Server Topology Updates + +The plan's current server topology section describes `mount()` for composing sub-servers +across auth modes: + +```python +mcp.mount("internal", internal_api) +mcp.mount("public", public_api) +``` + +This should be updated to reflect that `mount()` is used within a single mode for +organizing tools by namespace, not across auth modes: + +```python +# In BANKSY_MODE=internal +mcp.mount(internal_api, namespace="mural") + +# In BANKSY_MODE=public +mcp.mount(public_api, namespace="mural") +mcp.mount(canvas_tools, namespace="canvas") # future +``` + +### 10.4 Docker Build Updates + +**Current:** Single Dockerfile builds one Python image. + +**Updated:** Two Dockerfiles (`Dockerfile.server`, `Dockerfile.harness`) as shown in +Section 7.6. Both use workspace-aware `--no-install-workspace` and `--package` flags. +The two current TS Dockerfiles (`Dockerfile` and `Dockerfile.mural-oauth`) are replaced +by a single server Dockerfile (mode is runtime config, not build-time). 
+ +### 10.5 Config Schema Updates + +Add to the `config.py` pydantic-settings model: + +```python +class Settings(BaseSettings): + banksy_mode: Literal["internal", "public", "dev"] = "dev" + enabled_tags: set[str] | None = None + # ... existing fields ... +``` + +Add `BANKSY_MODE` and `ENABLED_TAGS` to the Environment Variable Changes table +(lines 490-500). + +### 10.6 CI Updates + +**Current:** Single `python.yml` workflow runs tests for one Python package. + +**Updated:** `python.yml` runs per-package test jobs in parallel (server, harness, +shared) plus a shared lint/typecheck job. See Section 7.7. + +### 10.7 Phase Updates + +**Phase 1 (Bootstrap):** +- Create workspace root `pyproject.toml` with `[tool.uv.workspace]` +- Create `pypackages/server/` with server `pyproject.toml` and echo tool +- Create `pypackages/shared/` with shared `pyproject.toml` (minimal, can start empty) +- `pypackages/harness/` is omitted or a placeholder until agent work begins + +**Phase 2 (Public API tools):** +- Register public tools when `BANKSY_MODE=public` or `BANKSY_MODE=dev` +- Apply `public-api` domain tag to all public tools +- Wire Mural OAuth auth provider for public mode + +**Phase 3 (Tool Curation):** +- Implement tag taxonomy from Section 9 +- Add `enable(tags=..., only=True)` support for specialized deployments +- Tag-based visibility is a refinement layer, not a replacement for mode selection + +**Phase 4+ (Internal API tools):** +- Register internal tools when `BANKSY_MODE=internal` or `BANKSY_MODE=dev` +- Apply `internal-api` domain tag to all internal tools +- Wire SSO proxy auth provider for internal mode + +### 10.8 Escape Hatches Update + +**Current:** Lists "Single pyproject.toml → uv workspaces: If a second Python service +is needed in the same repo, graduate to uv workspaces." + +**Updated:** Remove this escape hatch -- it has been exercised. 
Replace with: +"3-member workspace → additional members: If a new Python service with distinct +dependencies is needed, add a directory under `pypackages/` with its own +`pyproject.toml`." + +### 10.9 Decision 8 Resolution (Tool Tags and Meta) + +Decision 8 in the plan ("Tool tags and meta -- use as tag-based visibility in Phase 3, or +leave for implementation") should be resolved as: + +**Use tags as the primary tool organization mechanism.** Follow the three-dimensional +taxonomy (domain, entity, capability) defined in Section 9. Tags are mandatory on all +tools. `meta={}` can carry additional structured metadata (e.g., API version, rate limit +hints) but is not used for visibility filtering. + +### 10.10 Deep Research Index + +Add to the Deep Research Index: + +- [Tool visibility, auth modes, and server topology](../banksy/docs/tool-visibility-server-topology-research.md) -- Auth x tool matrix, FastMCP constraints, Option A-E analysis, tag taxonomy +- [Monorepo layout for MCP servers and agent harness](../banksy/docs/monorepo-layout-agent-harness-research.md) -- uv workspaces, package boundaries, agent harness constraints, layout options +- [Combined architecture research](../banksy/docs/banksy-architecture-research.md) -- Unified server topology, workspace layout, code organization, and migration impact + +### 10.11 Sections Not Affected + +The following sections of the migration strategy remain unchanged: +- Auth architecture (SSO proxy, Mural OAuth, Session Activation) +- Tool migration approach (from_openapi, hand-written composites) +- SPA architecture (Vite, React, served via StaticFiles) +- Database schema (PostgreSQL, Alembic, token storage) +- Risk matrix (risks remain the same; workspace adds no new risks) + + +## 11. Open Questions + +### 11.1 Composite Tools Needing Multiple Auth Modes + +No composite tools exist today that call both internal and public APIs. 
If one is needed: + +- **Option A:** Run it in a mode that has access to one API, and call the other API via + a service-to-service token (machine-to-machine auth). +- **Option B:** Split the composite operation into two tools, one per mode, and let the + AI agent orchestrate them. +- **Option C:** Create a new mode that carries both token types (requires custom Layer 2 + token management). + +**Recommendation:** Defer until a concrete use case exists. Option B (agent orchestration) +is the most aligned with MCP's design philosophy of composable tools. + +### 11.2 Enterprise Per-Customer Tool Subsets + +Some enterprise customers may need restricted tool sets (e.g., read-only, no admin +tools). This is achievable with the recommended architecture: + +- Deploy with `BANKSY_MODE=public` and `ENABLED_TAGS=read` to restrict to read-only tools +- Or implement a custom `Transform` that filters tools based on claims in the user's + token (e.g., organization membership, role) +- No architectural changes needed + +### 11.3 API Key Auth for Machine-to-Machine + +The `m2m` auth mode is already defined as a type in the current TS codebase +(`auth-mode.ts`). In the FastMCP migration: + +- Add `BANKSY_MODE=m2m` as a valid mode +- Implement a `TokenVerifier` that validates API keys (or client_credentials JWTs) +- Register the appropriate tool set for machine-to-machine use cases +- Same architecture, new mode value + +### 11.4 canvas-mcp Auth Requirements + +The canvas-mcp prototype currently has no auth. When absorbed into banksy: + +- Confirm that canvas operations use Mural's public API (most likely) +- If so, canvas tools belong in `BANKSY_MODE=public` with a `canvas` domain tag +- If canvas tools need internal API access, they belong in `BANKSY_MODE=internal` +- The `domains/canvas/` structure is ready for either case + +### 11.5 `from_openapi()` and Mode Selection + +The migration plan uses `FastMCP.from_openapi()` to generate tools from OpenAPI specs. 
+
+This needs to work with mode selection:
+
+- In `BANKSY_MODE=public`: load the public API OpenAPI spec, generate tools, tag with
+  `public-api`
+- In `BANKSY_MODE=internal`: load the internal API OpenAPI spec, generate tools, tag with
+  `internal-api`
+- In `BANKSY_MODE=dev`: load both specs, generate both tool sets with appropriate
+  namespace prefixes
+
+Verify that `from_openapi()` supports passing `tags=` to generated tools. If not, tags
+can be applied via a `Transform` after generation.
+
+### 11.6 Migration Ordering
+
+The migration plan currently targets public API tools first (Phase 2). This research
+confirms that this is the right sequencing:
+
+- Public mode is the primary external-facing deployment
+- Public API tools (87) outnumber internal tools (39) by more than 2:1
+- canvas-mcp absorption (public mode) will happen alongside public tool migration
+- Internal mode can be migrated later with the SSO proxy auth provider
+
+### 11.7 Workspace Phasing
+
+Workspaces don't require all members to exist from day one:
+
+1. **Phase 1**: Create workspace structure with `server` and `shared` as members.
+   `harness` can be an empty placeholder or omitted.
+2. **Phases 2-8**: Migration proceeds as planned, but code goes into
+   `pypackages/server/` instead of `src/banksy/`.
+3. **When agent work begins**: Add or flesh out `harness/` under `pypackages/`. The
+   workspace glob picks it up automatically.
diff --git a/fastmcp-migration/banksy-research/canvas-mcp-alignment-assessment.md b/fastmcp-migration/banksy-research/canvas-mcp-alignment-assessment.md
new file mode 100644
index 0000000..7fcf9b9
--- /dev/null
+++ b/fastmcp-migration/banksy-research/canvas-mcp-alignment-assessment.md
@@ -0,0 +1,178 @@
+# Canvas-MCP Alignment Assessment
+
+Assessment of `tactivos/canvas-mcp` against the banksy FastMCP migration plan (`00-migration-execution-strategy.plan.md`). 
The goal is to ensure the banksy migration produces a project that can fully absorb canvas-mcp's tools and supporting code — either via `mount()` or direct migration. + +**canvas-mcp repo**: `/Users/wkirkham/dev/canvas-mcp/` +**banksy migration plan**: `/Users/wkirkham/dev/willik-notes/fastmcp-migration/execution-strategy-research/00-migration-execution-strategy.plan.md` + +--- + +## 1. Toolchain Comparison + +| Category | Banksy Plan | canvas-mcp Actual | Status | +|---|---|---|---| +| Package manager | uv, PEP 621 `pyproject.toml`, hatchling build | uv, PEP 621 `pyproject.toml`, **no `[build-system]`** | Partial match | +| Python version | `3.12` (`.python-version`) | `3.14` (`.python-version`, `pyrightconfig.json`, CI) | **Diverges** | +| Linting/formatting | Ruff (`select = ["E","W","F","I","B","UP","S","ASYNC","RUF"]`, `line-length = 88`, `quote-style = "double"`) | **None** — no Ruff config, no lint step in CI | **Diverges** | +| Type checking | Pyright (standard) + mypy w/ Pydantic plugin | Pyright only (**strict** mode, standalone `pyrightconfig.json`) | Partial match | +| Testing | pytest + pytest-asyncio, `asyncio_mode = "auto"` | pytest only — no pytest-asyncio, no `asyncio_mode` | Partial match | +| HTTP client | httpx + httpx-retries | **None** — no HTTP client | Diverges (n/a currently) | +| Configuration | pydantic-settings `BaseSettings` | **None** — bare `os.environ.get()` for `ENABLE_PROFILING` | **Diverges** | +| Database | SQLAlchemy async + Alembic | **None** | Diverges (n/a currently) | +| Git hooks | pre-commit + pre-commit-uv | **None** | **Diverges** | +| CI | `python.yml`: ruff check, ruff format --check, pyright, pytest | `build.yml`: uv sync, pyright, pytest — **no lint step** | Partial match | + +### Key Divergences + +- **Python 3.14 vs 3.12**: canvas-mcp targets Python 3.14 (pre-release as of March 2026). Banksy plan specifies 3.12. 
This is the most significant toolchain divergence — it affects base image availability, dependency compatibility, and CI reproducibility. +- **No Ruff**: canvas-mcp has zero linting or formatting configuration. No `[tool.ruff]` in `pyproject.toml`, no CI lint step. +- **No `[build-system]`**: canvas-mcp's `pyproject.toml` omits the `[build-system]` table entirely. Banksy plan specifies hatchling. This matters for `pip install -e .` and wheel builds. +- **Pyright strict vs standard**: canvas-mcp uses `typeCheckingMode = "strict"` in a standalone `pyrightconfig.json`. Banksy plan uses `"standard"` in `pyproject.toml`. Strict is stricter but may cause friction when absorbing code that only passes standard checks. +- **Dev deps via `[dependency-groups]` vs `[project.optional-dependencies]`**: canvas-mcp uses PEP 735 dependency groups; banksy plan uses the traditional `[project.optional-dependencies].dev` pattern. Both work with uv, but the banksy plan's approach is more portable to non-uv tools. + +--- + +## 2. Project Structure Comparison + +| Aspect | Banksy Plan | canvas-mcp | +|---|---|---| +| Layout | `src/banksy/` package (src layout) | Flat — single `main.py` at repo root | +| Server instantiation | `src/banksy/server.py` | `main.py` line ~40 | +| Tool organization | `src/banksy/tools/` directory, one file per tool | All tools inline in `main.py` | +| Config module | `src/banksy/config.py` (pydantic-settings `BaseSettings`) | None — raw `os.environ.get()` | +| Auth | `src/banksy/auth/` package (OAuth, sessions, Mural tokens) | None | +| Database | `src/banksy/db/` package (SQLAlchemy models, session, storage) | None | +| Tests | `tests/` with `conftest.py`, per-feature test modules | `tests/test_health.py` (single file) | +| Entry point | `uv run fastmcp dev src/banksy/server.py` | `uv run fastmcp run main.py --transport http` or `python main.py` | + +### Key Observations + +- canvas-mcp is a **single-file prototype** (`main.py` ~97 lines of code). 
Banksy's target is a multi-package project. The structural gap is large but expected — canvas-mcp has 2 placeholder tools, banksy will have 50+. +- canvas-mcp uses a flat layout (`pythonpath = ["."]` in pytest config), which conflicts with banksy's `src/` layout. When absorbed, all imports (`from main import ...`) break. +- No separation of concerns: server instantiation, tool definitions, middleware, config, and health endpoint all live in one file. + +--- + +## 3. FastMCP Usage Patterns + +| Pattern | Banksy Plan | canvas-mcp | +|---|---|---| +| `FastMCP.from_openapi()` | Core pattern — generates all Mural API tools | Not used | +| Tool definition | `@server.tool()` decorators in separate files | `@mcp.tool()` decorators inline in `main.py` | +| `mount()` sub-servers | OpenAPI sub-server mounted onto main server | Not used | +| Custom routes | Not mentioned in plan | `@mcp.custom_route("/health")` for health endpoint | +| Middleware | Not mentioned in plan | `ProfilingMiddleware` (Pyinstrument-based, opt-in via `ENABLE_PROFILING`) | +| Server variable name | `server` (implied by `server.py`) | `mcp` | +| Tool metadata | Not specified | Uses `tags={}` and `meta={}` kwargs | +| Entry point | `uv run fastmcp dev src/banksy/server.py` | `mcp.run(transport="http")` or `fastmcp run main.py` | + +### Features canvas-mcp uses that banksy plan doesn't account for + +1. **`mcp.custom_route()`** — canvas-mcp adds a `GET /health` route for load balancer probes. Banksy will likely need this too. The plan should add a health endpoint. +2. **`Middleware` class** — canvas-mcp uses `fastmcp.server.middleware.Middleware` for profiling. The banksy plan doesn't mention middleware, but this is a clean pattern for cross-cutting concerns (logging, auth context, etc.). +3. **Tool `tags` and `meta`** — canvas-mcp annotates tools with tags and metadata. Banksy's PR3 (tool curation) mentions "tag-based tool visibility scaffolding" but doesn't specify the mechanism. 
+ +### Patterns that would conflict + +- **Server variable naming**: canvas-mcp uses `mcp`, banksy plan implies `server` or `app`. Minor but relevant for merge. +- **Flat imports**: canvas-mcp tools import from `main`, which won't work in banksy's `src/banksy/` package structure. + +--- + +## 4. Dependency Delta + +### In canvas-mcp but NOT in banksy plan + +| Dependency | Version | Group | Notes | +|---|---|---|---| +| `memray` | `>=1.19.0` | dev | Memory profiler. | +| `pyinstrument` | `>=5.1.0` | dev | CPU profiler used by `ProfilingMiddleware`. | + +### In banksy plan but NOT in canvas-mcp + +| Dependency | Version | Group | Notes | +|---|---|---|---| +| `httpx` | `>=0.27` | prod | Required for `from_openapi()`. canvas-mcp has no HTTP calls. | +| `httpx-retries` | `>=0.1` | prod | Retry wrapper. | +| `pydantic` | `>=2.0` | prod | Explicit dep. FastMCP bundles it transitively but banksy pins it. | +| `pydantic-settings` | `>=2.0` | prod | Config management. | +| `sqlalchemy[asyncio]` | `>=2.0` | prod | ORM + async. | +| `asyncpg` | `>=0.30` | prod | PostgreSQL driver. | +| `alembic` | `>=1.15` | prod | Migrations. | +| `pytest-asyncio` | `>=1.0` | dev | Async test support. | +| `pytest-cov` | `>=6.0` | dev | Coverage. | +| `inline-snapshot` | `>=0.15` | dev | Snapshot testing. | +| `dirty-equals` | `>=0.8` | dev | Flexible assertions. | +| `ruff` | `>=0.9` | dev | Linter/formatter. | +| `mypy` | `>=1.14` | dev | Type checker (w/ Pydantic plugin). | +| `pydantic[mypy]` | `>=2.0` | dev | mypy plugin support. | + +### Shared dependencies with version range differences + +| Dependency | Banksy Plan | canvas-mcp | Delta | +|---|---|---|---| +| `fastmcp` | `>=3.1` | `>=3.1.0` | Equivalent (`.0` is implicit) | +| `pyright` | `>=1.1` | `>=1.1.0` | Equivalent | +| `pytest` | `>=8.0` | `>=8.0.0` | Equivalent | + +No actual version conflicts in shared deps. + +--- + +## 5. 
Absorption Feasibility + +### Option A: Mount as FastMCP sub-server via `mount()` + +**Feasibility: Medium-Low in current state** + +- canvas-mcp's tools are trivial placeholders (health check, haiku). There's no real business logic to mount. +- The `mcp.custom_route("/health")` pattern is useful but custom routes don't compose via `mount()` — they'd need to be re-registered on the parent server. +- The `ProfilingMiddleware` is a reusable pattern but middleware also doesn't compose automatically via `mount()`. +- **Verdict**: Not worth mounting as a sub-server. The tools are too simple and the infrastructure patterns should be adopted directly. + +### Option B: Copy tools directly into `src/banksy/tools/` + +**Feasibility: High (trivial)** + +- The two tools (`check_health`, `canvas_haiku`) are self-contained functions with no external dependencies. Copying them into `src/banksy/tools/health.py` and registering them on the banksy server is straightforward. +- The `_health_payload()` helper and `/health` custom route could be adopted into banksy's server setup. + +### Shared concerns that need unification + +| Concern | canvas-mcp pattern | Banksy plan pattern | Unification needed | +|---|---|---|---| +| Config | `os.environ.get()` | pydantic-settings `BaseSettings` | Yes — canvas-mcp config must migrate to BaseSettings | +| HTTP client | None | httpx.AsyncClient (lifespan-managed) | N/A until canvas-mcp adds Mural API calls | +| Auth | None | OAuth + Mural token mgmt | N/A until canvas-mcp adds auth | +| Server instance | `mcp = FastMCP(...)` in `main.py` | `server` in `src/banksy/server.py` | Naming convention alignment | +| Health endpoint | `@mcp.custom_route("/health")` | Not in plan | Banksy should adopt this pattern | + +### Architectural conflicts + +- **None that are blocking.** canvas-mcp is small enough that all patterns can be adapted. 
+- The only structural friction is the flat layout vs src layout, and the `[dependency-groups]` vs `[project.optional-dependencies]` pattern. + +### Refactoring needed at absorption time + +1. Extract tools from `main.py` into individual files under banksy's `src/banksy/tools/` structure. +2. Replace `os.environ.get()` config with pydantic-settings. +3. Change all imports from flat `main` module to `banksy.*` package paths. +4. Adopt banksy's `[build-system]` (hatchling) and `[project.optional-dependencies]` pattern. +5. Align Python version requirement. + +--- + +## 6. Decisions Needed + +Each divergence must be resolved in the banksy migration plan so the resulting project can absorb canvas-mcp cleanly. For each item below, the decision determines whether banksy adapts, canvas-mcp adapts at absorption time, or both meet in the middle. + +| # | Topic | Options | Affects | Status | +|---|---|---|---|---| +| 1 | Python version (banksy 3.12 vs canvas-mcp 3.14) | (a) Stay 3.12 — canvas-mcp downgrades at absorption (b) Move banksy to 3.13 — both meet in the middle (c) Move banksy to 3.14 — adopt canvas-mcp's version | PR1, canvas-mcp | Resolved -- 3.14. See [Python 3.14 compatibility research](python-314-compatibility-research.md) | +| 2 | Pyright mode (banksy standard vs canvas-mcp strict) | (a) Stay standard — canvas-mcp relaxes at absorption (b) Adopt strict in banksy | PR1, canvas-mcp | Resolved -- strict. 
See [Pyright strict dependency typing research](pyright-strict-dependency-typing-research.md) | +| 3 | mypy (banksy has it, canvas-mcp doesn't) | (a) Keep mypy + Pydantic plugin in banksy (b) Drop mypy, go Pyright-only (aligns with canvas-mcp) | PR1 | Resolved -- drop mypy, Pyright strict only | +| 4 | Dev dep format (banksy `[project.optional-dependencies]` vs canvas-mcp `[dependency-groups]`) | (a) Stay `optional-dependencies` — canvas-mcp switches at absorption (b) Switch banksy to PEP 735 `dependency-groups` | PR1, canvas-mcp | Resolved -- PEP 735 `[dependency-groups]` | +| 5 | Health endpoint (`GET /health` custom route) | (a) Add to banksy in PR1 (alongside echo tool) (b) Add to banksy in PR2 (alongside OpenAPI tools) (c) Skip — add only at absorption time | PR1 or PR2 | Resolved -- incorporated into plan | +| 6 | Profiling middleware (canvas-mcp `ProfilingMiddleware`) | (a) Adopt in banksy PR1 as opt-in dev tool (b) Skip — add only at absorption time | PR1 | Resolved -- skip, absorb with canvas-mcp | +| 7 | Profiling dev deps (`pyinstrument`, `memray`) | (a) Add to banksy dev deps (b) Skip — canvas-mcp-only concern | PR1 | Resolved -- skip, absorb with canvas-mcp | +| 8 | Tool tags and meta (`tags={}`, `meta={}` kwargs) | (a) Specify as the mechanism for tag-based visibility in PR3 (b) Leave PR3 unspecified — decide during implementation | PR3 | Open | diff --git a/fastmcp-migration/banksy-research/fastmcp-auth-migration-research.md b/fastmcp-migration/banksy-research/fastmcp-auth-migration-research.md new file mode 100644 index 0000000..3a050db --- /dev/null +++ b/fastmcp-migration/banksy-research/fastmcp-auth-migration-research.md @@ -0,0 +1,1095 @@ +# FastMCP Auth Architecture Compatibility with Banksy Auth Modes + +## 1. Executive Summary + +FastMCP's auth system can fully support banksy's target Resource Server (RS) architecture. 
Layer 1 (IDE → banksy) maps cleanly: `RemoteAuthProvider` with `JWTVerifier` validates IdP-issued JWTs and auto-generates the RFC 9728 Protected Resource Metadata endpoint — replacing Better Auth's MCP plugin, session cookies, and `/.well-known/oauth-authorization-server` in under 20 lines of Python. If the chosen IdP lacks DCR, `OAuthProxy` serves as a fallback, acting as a local Authorization Server that proxies to the upstream IdP, issues its own JWTs, and stores upstream tokens encrypted.
+
+Layer 2 (banksy → Mural API) has no FastMCP equivalent and must remain custom: per-user Mural token storage, refresh, and per-request injection via a `Depends()`-style dependency that reads `TokenClaim("sub")` to look up the user's Mural tokens from the database. The SPA OAuth callback pages are eliminated — both `OAuthProxy` and standard Starlette routes handle OAuth callbacks server-side — but the React SPA itself is preserved for the home/landing page, Session Activation UI, and completion/error pages, served from the same FastMCP process via Starlette's `StaticFiles` (see [SPA serving research](fastmcp-react-spa-serving-research.md) for details).
+
+The biggest implementation-level decisions are: (1) `RemoteAuthProvider` vs `OAuthProxy` (determined by IdP DCR support), (2) where to store Layer 2 Mural tokens (IdP upstream token storage vs banksy's own DB), and (3) whether Session Activation survives or is replaced by IdP-mediated Mural token acquisition.
+
+## 2. Detailed Findings
+
+### 2.1 Session Cookies vs. Bearer Tokens (Q1)
+
+**Finding: Bearer tokens are required by the MCP protocol. Session cookies are not supported.**
+
+The MCP specification mandates OAuth 2.1 with Bearer token authentication. MCP clients (Cursor, Claude Desktop, VS Code Copilot) send `Authorization: Bearer <token>` on every HTTP request to the MCP endpoint. FastMCP enforces this via `BearerAuthBackend` from the MCP SDK, which reads the `Authorization` header and calls `provider.verify_token(token)`.
+ +Banksy's current approach — session cookies set by Better Auth and validated via `auth.api.getMcpSession()` — is incompatible with the MCP spec. This was an implementation artifact of using Better Auth (which is session-oriented) for an MCP server. The migration to FastMCP naturally resolves this by switching to standard Bearer token auth. + +**What changes:** + +| Concern | Current (TS/Better Auth) | Target (Python/FastMCP) | +|---------|--------------------------|------------------------| +| MCP request auth | Session cookie → `getMcpSession()` | Bearer token → `verify_token()` | +| Token format | Better Auth session ID | IdP-issued RS256 JWT | +| Token storage (client) | Browser cookie | IDE-managed OAuth token | +| Token lifetime | Session expiry (cookie) | JWT `exp` claim | +| Discovery | `/.well-known/oauth-authorization-server` | `/.well-known/oauth-protected-resource` (RS) | + +**IDE client contract:** The switch from cookies to Bearer tokens does **not** break the IDE client contract because MCP clients already expect Bearer token auth. The current cookie-based approach was non-standard — IDEs were working around it via Better Auth's MCP plugin, which hid the cookie mechanics. With FastMCP, banksy becomes a standard MCP Resource Server that IDEs know how to talk to natively. + +**Token flow with FastMCP:** + +```python +# FastMCP's BearerAuthBackend (from MCP SDK) does this automatically: +# 1. Extract token from Authorization header +# 2. Call provider.verify_token(token) +# 3. Set request.scope["user"] = AuthenticatedUser(access_token) + +# In tools, access the validated token via DI: +from fastmcp.server.auth import AccessToken +from fastmcp.server.dependencies import CurrentAccessToken, TokenClaim + +@mcp.tool() +async def my_tool( + user_id: str = TokenClaim("sub"), + token: AccessToken = CurrentAccessToken(), +): + # user_id is extracted from the JWT's "sub" claim + # token.claims has the full JWT payload + ... 
+``` + +### 2.2 Resource Server Implementation with FastMCP Auth Providers (Q2) + +#### RemoteAuthProvider (Primary Path — IdP Supports DCR) + +`RemoteAuthProvider` is the simplest RS implementation. It composes a `TokenVerifier` with authorization server metadata to create PRM endpoints. The IdP handles all OAuth flows (authorization, token issuance, DCR); banksy only validates the resulting JWTs. + +**Configuration:** + +```python +from fastmcp import FastMCP +from fastmcp.server.auth import RemoteAuthProvider +from fastmcp.server.auth.providers.jwt import JWTVerifier + +# Create JWT verifier for the external IdP +jwt_verifier = JWTVerifier( + jwks_uri="https://idp.example.com/.well-known/jwks.json", + issuer="https://idp.example.com", + audience="banksy-mcp-server", + algorithm="RS256", + required_scopes=["openid", "mural"], +) + +# Create RemoteAuthProvider +auth = RemoteAuthProvider( + token_verifier=jwt_verifier, + authorization_servers=["https://idp.example.com"], + base_url="https://banksy.example.com", + resource_name="Banksy MCP Server", +) + +mcp = FastMCP("Banksy", auth=auth) +``` + +**PRM endpoint (auto-generated):** + +`RemoteAuthProvider.get_routes()` creates `/.well-known/oauth-protected-resource` and `/.well-known/oauth-protected-resource/{mcp_path}` routes via the MCP SDK's `create_protected_resource_routes()`. The response follows RFC 9728: + +```json +{ + "resource": "https://banksy.example.com/mcp", + "authorization_servers": ["https://idp.example.com"], + "scopes_supported": ["openid", "mural"], + "resource_name": "Banksy MCP Server" +} +``` + +The IDE reads this, discovers the IdP, performs OAuth with the IdP directly, and sends the resulting JWT as a Bearer token to banksy. 
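The client side of that discovery hop can be sketched with plain JSON handling (stdlib-only; `resolve_authorization_server` is a hypothetical helper illustrating RFC 9728 consumption, not an MCP client API):

```python
import json


def resolve_authorization_server(prm_body: str) -> str:
    """Pick the authorization server out of a Protected Resource Metadata
    response. RFC 9728 allows several; a simple client takes the first."""
    prm = json.loads(prm_body)
    servers = prm.get("authorization_servers", [])
    if not servers:
        raise ValueError("PRM document lists no authorization servers")
    return servers[0]


# The example PRM body from above
prm_body = json.dumps({
    "resource": "https://banksy.example.com/mcp",
    "authorization_servers": ["https://idp.example.com"],
    "scopes_supported": ["openid", "mural"],
    "resource_name": "Banksy MCP Server",
})
```

A real client fetches this body from `/.well-known/oauth-protected-resource` and then runs OAuth against the resolved server.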
+ +**JWT validation internals:** `JWTVerifier` handles: +- JWKS fetching with 1-hour caching and automatic key rotation (`_get_jwks_key`) +- JWT signature verification via `authlib.jose.JsonWebToken` +- Issuer validation (string or list of allowed issuers) +- Audience validation (string or list, handles both string and array `aud` claims) +- Scope extraction from `scope` or `scp` claims +- Expiration checking + +**Custom validation on top of JWT verification:** + +Subclass `TokenVerifier` to add banksy-specific validation (e.g., checking user exists in DB): + +```python +from fastmcp.server.auth import AccessToken, TokenVerifier +from fastmcp.server.auth.providers.jwt import JWTVerifier + +class BanksyTokenVerifier(TokenVerifier): + def __init__(self, jwt_verifier: JWTVerifier, db_pool): + super().__init__(required_scopes=jwt_verifier.required_scopes) + self._jwt_verifier = jwt_verifier + self._db_pool = db_pool + + async def verify_token(self, token: str) -> AccessToken | None: + # First: standard JWT validation + access_token = await self._jwt_verifier.verify_token(token) + if access_token is None: + return None + + # Then: banksy-specific validation + user_id = access_token.claims.get("sub") + if not user_id: + return None + + # Check user exists in banksy DB + async with self._db_pool.acquire() as conn: + user = await conn.fetchrow( + "SELECT id FROM users WHERE idp_sub = $1", user_id + ) + if user is None: + return None + + return access_token +``` + +**Using the DescopeProvider as a reference:** `DescopeProvider` (at `fastmcp/server/auth/providers/descope.py`) is the canonical example of a `RemoteAuthProvider` subclass. It takes a `config_url`, creates a `JWTVerifier` from the OIDC configuration, and adds a `/.well-known/oauth-authorization-server` forwarding route. This is exactly the pattern banksy would follow with any IdP that supports DCR. 
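Of the `JWTVerifier` internals listed above, the 1-hour JWKS caching boils down to a timestamped fetch wrapper; a stdlib-only sketch of the pattern (illustrative only, not `JWTVerifier`'s actual `_get_jwks_key` code; the key-rotation handling it also performs would layer on top of this):

```python
import time
from typing import Any, Callable


class TTLCache:
    """Cache one fetched value (e.g. a JWKS document) for ttl seconds."""

    def __init__(self, fetch: Callable[[], Any], ttl: float = 3600.0) -> None:
        self._fetch = fetch
        self._ttl = ttl
        self._value: Any = None
        self._fetched_at = float("-inf")  # forces a fetch on first access

    def get(self) -> Any:
        now = time.monotonic()
        if now - self._fetched_at >= self._ttl:
            # Cache miss or expiry: refetch and restamp
            self._value = self._fetch()
            self._fetched_at = now
        return self._value
```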
+ +#### OAuthProxy (Fallback — IdP Lacks DCR) + +If the chosen IdP does not support Dynamic Client Registration (DCR), banksy must act as an OAuth Authorization Server that proxies to the upstream IdP. `OAuthProxy` handles this. + +**Key differences from RemoteAuthProvider:** + +| Aspect | RemoteAuthProvider | OAuthProxy | +|--------|-------------------|------------| +| Role | Resource Server (validates tokens) | Authorization Server (issues tokens) | +| Discovery | PRM (`/.well-known/oauth-protected-resource`) | AS metadata (`/.well-known/oauth-authorization-server`) | +| DCR | Handled by external IdP | Handled locally by OAuthProxy | +| Token flow | IDE → IdP → JWT → banksy validates | IDE → banksy → IdP → banksy issues FastMCP JWT | +| Upstream tokens | Not stored | Stored encrypted in `_upstream_token_store` | +| JWT issuer | External IdP | OAuthProxy (banksy) | + +**OAuthProxy token flow in detail:** + +1. IDE discovers banksy's AS metadata, registers via DCR (local to OAuthProxy) +2. IDE initiates OAuth → OAuthProxy redirects to upstream IdP +3. User authenticates with upstream IdP → IdP redirects back to OAuthProxy's callback +4. OAuthProxy exchanges authorization code for upstream tokens (server-side) +5. OAuthProxy stores upstream tokens encrypted, creates JTI mapping +6. OAuthProxy issues its own FastMCP JWT (with optional upstream claims embedded) +7. IDE receives FastMCP JWT, sends as Bearer on MCP requests +8. On each request: OAuthProxy verifies FastMCP JWT → looks up JTI → retrieves upstream token → verifies upstream token via `TokenVerifier` + +**Critical answer — `get_access_token()` semantics with OAuthProxy:** + +This resolves the open question from the prior auth strategy doc. 
When tools call `get_access_token()` or use `CurrentAccessToken()`: + +- The `AccessToken.token` field contains the **raw Bearer token string** that the client sent (the FastMCP JWT) +- The `AccessToken.claims` dict contains the **upstream claims** from the token verification step (populated by `_extract_upstream_claims()` override or from `TokenVerifier.verify_token()`) +- The `AccessToken.client_id` comes from the upstream verification +- The **raw upstream access token** (e.g., the Mural API token) is **not directly accessible** through `get_access_token()` — it is stored encrypted in `_upstream_token_store`, accessible only through the OAuthProxy's internal storage + +This means that even with OAuthProxy, tools cannot directly get the Mural API token via `get_access_token()`. Layer 2 token management must use a separate mechanism (see Q3). + +#### OAuthProxy Customization Surface (Deep Dive) + +**Class hierarchy:** + +``` +AuthProvider (fastmcp.server.auth.auth) + └── OAuthProvider (fastmcp.server.auth.auth) + └── OAuthProxy (fastmcp.server.auth.oauth_proxy.proxy) + └── OIDCProxy (fastmcp.server.auth.oidc_proxy) + └── Auth0Provider (fastmcp.server.auth.providers.auth0) + └── GoogleProvider (fastmcp.server.auth.providers.google) +``` + +**Overridable methods:** + +| Method | Signature | Purpose | +|--------|-----------|---------| +| `_extract_upstream_claims` | `async def _extract_upstream_claims(self, idp_tokens: dict[str, Any]) -> dict[str, Any] \| None` | Extract claims from upstream IdP tokens to embed in FastMCP JWT. Called during `exchange_authorization_code()`. Override to decode upstream JWTs, call userinfo endpoints, or include custom claims. | +| `_prepare_scopes_for_token_exchange` | `def _prepare_scopes_for_token_exchange(self, scopes: list[str]) -> list[str]` | Modify scopes before sending to upstream IdP during auth code → token exchange. Default: pass through unchanged. Azure overrides this to prefix scopes. 
| +| `_prepare_scopes_for_upstream_refresh` | `def _prepare_scopes_for_upstream_refresh(self, scopes: list[str]) -> list[str]` | Modify scopes for upstream token refresh. Default: pass through unchanged. | +| `_get_verification_token` | `def _get_verification_token(self, upstream_token_set: UpstreamTokenSet) -> str \| None` | Choose which upstream token to verify. Default: `access_token`. OIDCProxy overrides to use `id_token` when `verify_id_token=True`. | + +**Storage hooks:** OAuthProxy uses `AsyncKeyValue` protocol for all storage (via `key_value` library). The `client_storage` parameter accepts any `AsyncKeyValue` implementation. Default: encrypted file-based storage via `FileTreeStore` + `FernetEncryptionWrapper`. For production multi-instance deployment, pass a custom `AsyncKeyValue` adapter backed by PostgreSQL or Redis. + +**Storage collections (all using `PydanticAdapter`):** + +| Store | Model | Purpose | +|-------|-------|---------| +| `_upstream_token_store` | `UpstreamTokenSet` | Encrypted upstream access/refresh tokens | +| `_client_store` | `ProxyDCRClient` | DCR client registrations | +| `_transaction_store` | `OAuthTransaction` | OAuth flow state | +| `_code_store` | `ClientCode` | Authorization codes (one-time use) | +| `_jti_mapping_store` | `JTIMapping` | FastMCP JWT JTI → upstream token ID | +| `_refresh_token_store` | `RefreshTokenMetadata` | Refresh token metadata (keyed by hash) | + +**How built-in providers customize OAuthProxy:** + +- **GoogleProvider**: Pure config wrapper. Passes Google's OAuth endpoints, creates a `GoogleTokenVerifier` (calls Google's tokeninfo API since Google tokens are opaque, not JWTs), sets `access_type=offline` and `prompt=consent` as default authorize params. No method overrides. +- **Auth0Provider**: Extends `OIDCProxy` (which extends `OAuthProxy`). Pure config wrapper — passes Auth0's OIDC config URL, audience. `OIDCProxy` handles OIDC discovery, creates a `JWTVerifier` from JWKS/issuer. 
+- **OIDCProxy**: Adds OIDC discovery (fetches `.well-known/openid-configuration`), auto-creates `JWTVerifier` from discovered JWKS URI and issuer. Overrides `_get_verification_token` to optionally verify `id_token` instead of `access_token` (for providers with opaque access tokens).
+
+**Hypothetical `MuralOAuthProvider(OAuthProxy)`:**
+
+```python
+import httpx
+
+from fastmcp.server.auth.oauth_proxy import OAuthProxy
+from fastmcp.server.auth import AccessToken, TokenVerifier
+
+class MuralTokenVerifier(TokenVerifier):
+    """Verify Mural OAuth tokens via /api/v0/users/me."""
+
+    async def verify_token(self, token: str) -> AccessToken | None:
+        async with httpx.AsyncClient() as client:
+            resp = await client.get(
+                "https://app.mural.co/api/v0/users/me",
+                headers={"Authorization": f"Bearer {token}"},
+            )
+            if resp.status_code != 200:
+                return None
+            user = resp.json()
+            return AccessToken(
+                token=token,
+                client_id=user.get("id", "unknown"),
+                scopes=[],
+                claims={"sub": user["id"], "email": user.get("email")},
+            )
+
+class MuralOAuthProvider(OAuthProxy):
+    def __init__(self, *, client_id: str, client_secret: str, base_url: str):
+        super().__init__(
+            upstream_authorization_endpoint="https://app.mural.co/api/oauth/authorize",
+            upstream_token_endpoint="https://app.mural.co/api/oauth/token",
+            upstream_client_id=client_id,
+            upstream_client_secret=client_secret,
+            token_verifier=MuralTokenVerifier(),
+            base_url=base_url,
+        )
+
+    async def _extract_upstream_claims(self, idp_tokens: dict) -> dict | None:
+        """Embed Mural user info in the FastMCP JWT."""
+        access_token = idp_tokens.get("access_token")
+        if not access_token:
+            return None
+        async with httpx.AsyncClient() as client:
+            resp = await client.get(
+                "https://app.mural.co/api/v0/users/me",
+                headers={"Authorization": f"Bearer {access_token}"},
+            )
+            if resp.status_code == 200:
+                user = resp.json()
+                return {"mural_user_id": user["id"], "email": user.get("email")}
+        return None
+```
+
+However, this `MuralOAuthProvider`
approach **is not recommended** for the RS model. Under RS, Mural is not the IdP — a dedicated IdP (Auth0/Descope) is. The Mural OAuth token is a Layer 2 concern, not Layer 1. + +**Role of OAuthProxy under RS model:** In the RS model with `RemoteAuthProvider`, there is no role for `OAuthProxy` in Layer 1. However, if the IdP stores upstream Mural tokens (e.g., Auth0 Token Vault with Mural as a custom social connection), then the OAuthProxy pattern could conceptually describe the IdP's behavior — but banksy wouldn't implement it. Banksy is purely an RS. + +#### TokenVerifier vs AuthProvider + +`TokenVerifier` is a lightweight class that only verifies tokens — no routes, no middleware, no OAuth flows. `AuthProvider` is the full provider that adds routes, middleware, and can serve as an OAuth AS or RS. + +``` +TokenVerifier: + - verify_token(token) -> AccessToken | None + - required_scopes, scopes_supported (properties) + +AuthProvider (extends TokenVerifier): + - verify_token(token) -> AccessToken | None + - get_routes(mcp_path) -> list[Route] + - get_well_known_routes(mcp_path) -> list[Route] + - get_middleware() -> list[Middleware] +``` + +`TokenVerifier` composes into both `RemoteAuthProvider` and `OAuthProxy` — both take a `token_verifier` parameter. You can reuse the same `JWTVerifier` instance across different providers. + +#### m2m Mode (Machine-to-Machine) + +For m2m auth without social OAuth, use `RemoteAuthProvider` with a `JWTVerifier` that validates `client_credentials` JWTs: + +```python +jwt_verifier = JWTVerifier( + jwks_uri="https://idp.example.com/.well-known/jwks.json", + issuer="https://idp.example.com", + audience="banksy-mcp-server", + algorithm="RS256", +) + +auth = RemoteAuthProvider( + token_verifier=jwt_verifier, + authorization_servers=["https://idp.example.com"], + base_url="https://banksy.example.com", +) +``` + +The IdP issues a JWT via `client_credentials` grant. The JWT is validated identically to user JWTs. 
The `sub` claim identifies the machine client. + +#### Custom AuthProvider + +If neither `RemoteAuthProvider` nor `OAuthProxy` fits, subclass `AuthProvider` directly: + +```python +from fastmcp.server.auth import AuthProvider, AccessToken + +class CustomAuthProvider(AuthProvider): + def __init__(self, jwt_verifier: JWTVerifier, introspection_url: str): + super().__init__(base_url="https://banksy.example.com") + self._jwt_verifier = jwt_verifier + self._introspection_url = introspection_url + + async def verify_token(self, token: str) -> AccessToken | None: + # Try JWT verification first + result = await self._jwt_verifier.verify_token(token) + if result is not None: + return result + + # Fall back to opaque token introspection + async with httpx.AsyncClient() as client: + resp = await client.post( + self._introspection_url, + data={"token": token}, + ) + if resp.status_code == 200: + data = resp.json() + if data.get("active"): + return AccessToken( + token=token, + client_id=data.get("client_id", "unknown"), + scopes=data.get("scope", "").split(), + claims=data, + ) + return None +``` + +### 2.3 Layer 2: Banksy → Mural Token Management (Q3) + +**Finding: Layer 2 is entirely custom. FastMCP has no concept of downstream API tokens.** + +Neither `RemoteAuthProvider` nor `OAuthProxy` solves Layer 2. After Layer 1 authenticates the IDE user, banksy still needs per-user Mural API tokens to call mural-api on behalf of that user. + +#### Token Storage Strategy + +**With RemoteAuthProvider (recommended RS path):** + +No built-in upstream token storage. 
Banksy needs its own DB tables, similar to the current `muralSessionToken` / `muralOauthToken` tables:
+
+```sql
+-- Schema (PostgreSQL via asyncpg)
+CREATE TABLE mural_tokens (
+    user_id TEXT PRIMARY KEY,          -- maps to IdP "sub" claim
+    access_token TEXT NOT NULL,
+    refresh_token TEXT,
+    token_type TEXT NOT NULL,          -- 'session' or 'oauth'
+    expires_at TIMESTAMPTZ,
+    created_at TIMESTAMPTZ DEFAULT NOW(),
+    updated_at TIMESTAMPTZ DEFAULT NOW()
+);
+```
+
+**With OAuthProxy (if IdP is Mural directly — NOT recommended):**
+
+OAuthProxy stores upstream tokens in `_upstream_token_store`, keyed by `upstream_token_id`. However:
+- Default storage is file-based (`FileTreeStore`) — not suitable for multi-instance production
+- Storage is keyed by JTI, not user ID — you'd need to map user → JTI → upstream token
+- The storage API is internal to OAuthProxy; accessing it from tools requires subclassing
+
+**With IdP upstream token storage (Auth0 Token Vault / Descope Outbound Apps):**
+
+If the IdP stores upstream Mural tokens (acquired when the user authenticates with Mural as a custom social connection), banksy can retrieve them via the IdP's management API:
+
+```python
+# Auth0 Token Vault example (conceptual)
+import httpx
+
+async def get_mural_token_from_idp(idp_user_id: str) -> str:
+    async with httpx.AsyncClient() as client:
+        resp = await client.get(
+            f"https://your-tenant.auth0.com/api/v2/users/{idp_user_id}",
+            headers={"Authorization": f"Bearer {mgmt_api_token}"},
+        )
+        user = resp.json()
+        identities = user.get("identities", [])
+        for identity in identities:
+            if identity["provider"] == "mural":
+                return identity["access_token"]
+        raise MuralNotConnectedError(idp_user_id)
+```
+
+#### Per-Request Token Injection
+
+The current TS approach uses `AsyncLocalStorage` to inject per-user Mural tokens into outbound API calls.
In Python/FastMCP, use the DI system: + +```python +from fastmcp.server.auth import AccessToken +from fastmcp.server.dependencies import TokenClaim +import httpx + +# Module-level singleton (initialized at startup) +_db_pool = None +_http_client = None + +def get_db_pool(): + return _db_pool + +def get_http_client(): + return _http_client + +async def get_mural_tokens(user_id: str) -> MuralTokens: + """Load and refresh Mural tokens for a user. Equivalent to getAndRefreshTokens().""" + pool = get_db_pool() + async with pool.acquire() as conn: + row = await conn.fetchrow( + "SELECT access_token, refresh_token, expires_at, token_type " + "FROM mural_tokens WHERE user_id = $1", + user_id, + ) + if not row: + raise MuralNotConnectedError(user_id) + + tokens = MuralTokens(**dict(row)) + if tokens.is_expired(): + tokens = await refresh_mural_tokens(tokens) + await save_mural_tokens(pool, user_id, tokens) + + return tokens + +async def call_mural_api( + method: str, + path: str, + user_id: str, + **kwargs, +) -> httpx.Response: + """Make an authenticated Mural API call for the given user.""" + tokens = await get_mural_tokens(user_id) + client = get_http_client() + return await client.request( + method, + f"https://app.mural.co{path}", + headers={"Authorization": f"Bearer {tokens.access_token}"}, + **kwargs, + ) + +# In tools, use TokenClaim to get user_id and call Mural API: +@mcp.tool() +async def list_murals( + workspace_id: str, + user_id: str = TokenClaim("sub"), +): + resp = await call_mural_api("GET", f"/api/v0/workspaces/{workspace_id}/murals", user_id) + return resp.json() +``` + +**Can OAuthProxy's stores be repurposed for Layer 2?** + +Theoretically yes — `_upstream_token_store` uses the `AsyncKeyValue` protocol and stores `UpstreamTokenSet` models with `access_token`, `refresh_token`, `expires_at`. However, this is not recommended because: +1. The store is keyed by `upstream_token_id` (random), not by `user_id` +2. 
The store is internal to OAuthProxy with no public API for external access +3. In the RS model with `RemoteAuthProvider`, there is no OAuthProxy at all +4. Banksy's Layer 2 tokens may come from a different source than Layer 1 (e.g., Session Activation) + +**Recommendation:** Use banksy's own DB tables for Layer 2 tokens, following the same pattern as the current `muralSessionToken` / `muralOauthToken` tables. + +### 2.4 Session Activation Flow (Q4) + +**Finding: Session Activation routes work as `custom_route()` endpoints. No FastAPI required.** + +Session Activation (`/auth/mural-link/code` and `/auth/mural-link/claim`) is banksy-specific. These are standard HTTP POST endpoints, not MCP protocol endpoints. + +**Implementation with `custom_route()`:** + +```python +from starlette.requests import Request +from starlette.responses import JSONResponse +import secrets +import time + +@mcp.custom_route("/auth/mural-link/code", methods=["POST"]) +async def create_mural_link_code(request: Request) -> JSONResponse: + """Generate a session activation code for the authenticated user.""" + # Auth validation: BearerAuthBackend already ran on this request + user = request.scope.get("user") + if not user or not hasattr(user, "access_token"): + return JSONResponse({"error": "unauthorized"}, status_code=401) + + user_id = user.access_token.claims.get("sub") + if not user_id: + return JSONResponse({"error": "missing user identity"}, status_code=401) + + # Generate code and nonce + code = secrets.token_urlsafe(32) + nonce = secrets.token_urlsafe(32) + expires_at = time.time() + 300 # 5 minutes + + # Store pending connection in DB + pool = get_db_pool() + async with pool.acquire() as conn: + await conn.execute( + """INSERT INTO pending_mural_connections + (code, nonce, user_id, expires_at) + VALUES ($1, $2, $3, to_timestamp($4))""", + code, nonce, user_id, expires_at, + ) + + return JSONResponse({ + "code": code, + "muralApiHost": config.mural_api_host, + "expiresAt": int(expires_at 
* 1000), + }) + + +@mcp.custom_route("/auth/mural-link/claim", methods=["POST"]) +async def claim_mural_link(request: Request) -> JSONResponse: + """Claim Mural tokens using the session activation code.""" + user = request.scope.get("user") + if not user or not hasattr(user, "access_token"): + return JSONResponse({"error": "unauthorized"}, status_code=401) + + user_id = user.access_token.claims.get("sub") + body = await request.json() + code = body.get("code") + + if not code: + return JSONResponse({"error": "missing code"}, status_code=400) + + # Look up pending connection + pool = get_db_pool() + async with pool.acquire() as conn: + row = await conn.fetchrow( + """SELECT nonce, user_id, expires_at FROM pending_mural_connections + WHERE code = $1 AND user_id = $2 AND expires_at > NOW()""", + code, user_id, + ) + + if not row: + return JSONResponse({"error": "invalid or expired code"}, status_code=400) + + # Use nonce to claim tokens from mural-api + nonce = row["nonce"] + tokens = await claim_mural_session_tokens(nonce) + + # Store Mural tokens + async with pool.acquire() as conn: + await conn.execute( + """INSERT INTO mural_tokens (user_id, access_token, refresh_token, token_type, expires_at) + VALUES ($1, $2, $3, 'session', $4) + ON CONFLICT (user_id) DO UPDATE SET + access_token = EXCLUDED.access_token, + refresh_token = EXCLUDED.refresh_token, + expires_at = EXCLUDED.expires_at, + updated_at = NOW()""", + user_id, tokens.access_token, tokens.refresh_token, + tokens.expires_at, + ) + + # Clean up pending connection + async with pool.acquire() as conn: + await conn.execute( + "DELETE FROM pending_mural_connections WHERE code = $1", code + ) + + return JSONResponse({"success": True}) +``` + +**Key implementation details:** + +1. **Auth validation:** `custom_route()` handlers receive the same `BearerAuthBackend` middleware as MCP routes. The authenticated user is at `request.scope["user"]`, which is an `AuthenticatedUser` with an `access_token` property. + +2. 
**Database access:** Via module-level singleton `get_db_pool()`. No FastAPI `Depends()` needed — the pool is initialized at server startup and accessed directly. + +3. **User identity:** `request.scope["user"].access_token.claims["sub"]` gives the IdP user ID, replacing `getUserId()` from the current TS implementation. + +### 2.5 Server-Side vs. Client-Side OAuth Redirect Handling (Q5) + +**Finding: Server-side handling is native to both OAuthProxy and standard Starlette. The SPA can be eliminated.** + +#### OAuthProxy callback flow + +OAuthProxy registers a server-side callback route at `/auth/callback` (configurable via `redirect_path`). The flow: + +1. OAuthProxy redirects user to upstream IdP with `redirect_uri` pointing to its own `/auth/callback` +2. IdP authenticates user, redirects back to OAuthProxy's `/auth/callback?code=...&state=...` +3. OAuthProxy's `_handle_idp_callback()` method: + - Validates state parameter + - Exchanges authorization code for upstream tokens (server-side HTTP call) + - Stores upstream tokens encrypted + - Creates a client authorization code + - Redirects the client to its `redirect_uri` with the code +4. Client (IDE) exchanges the code for a FastMCP JWT via the `/token` endpoint + +This is **entirely server-side** — no SPA receives the callback. The authorization code never appears in browser URL history. + +#### RemoteAuthProvider callback flow + +With `RemoteAuthProvider`, banksy handles **no callbacks at all**. The flow is: + +1. IDE reads PRM → discovers IdP +2. IDE opens browser → IdP login page +3. User authenticates → IdP redirects back to **IDE's** redirect URI (e.g., `http://localhost:PORT/callback` or `cursor://auth/callback`) +4. IDE exchanges code with IdP for JWT +5. IDE sends JWT as Bearer token to banksy + +Banksy is completely passive — no callback route needed. 
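
To make step 1 concrete, here is a hedged sketch of the PRM document shape (per RFC 9728) and the selection logic an IDE applies to it. The URLs are hypothetical, and `discover_authorization_server` is an illustrative helper, not a FastMCP API; in practice `RemoteAuthProvider` emits this document automatically and banksy never constructs it by hand:

```python
import json

# Hypothetical values; RemoteAuthProvider serves this document automatically
# at /.well-known/oauth-protected-resource/mcp.
PRM_DOCUMENT = json.dumps({
    "resource": "https://banksy.example.com/mcp",
    "authorization_servers": ["https://idp.example.com"],
    "bearer_methods_supported": ["header"],
})

def discover_authorization_server(prm_json: str) -> str:
    """Step 1 of the flow above: parse the PRM and pick an authorization server."""
    doc = json.loads(prm_json)
    servers = doc.get("authorization_servers", [])
    if not servers:
        raise ValueError("PRM document lists no authorization servers")
    return servers[0]
```

In this sketch, `discover_authorization_server(PRM_DOCUMENT)` yields the IdP base URL; the IDE then fetches that server's AS metadata (step 2 of the flow) and proceeds with DCR and login on its own.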
+
+#### Session Activation browser flow
+
+The Session Activation flow (where the browser does Mural OAuth) can use a server-side Starlette route as the `redirect_uri`:
+
+```python
+import html
+
+from starlette.requests import Request
+from starlette.responses import HTMLResponse, Response
+
+@mcp.custom_route("/auth/mural-oauth/callback", methods=["GET"])
+async def mural_oauth_callback(request: Request) -> Response:
+    """Server-side handler for Mural OAuth callback during Session Activation."""
+    code = request.query_params.get("code")
+    state = request.query_params.get("state")
+    error = request.query_params.get("error")
+
+    if error:
+        # Escape the IdP-supplied value to avoid reflected XSS
+        return HTMLResponse(
+            f"<html><body><h1>Error: {html.escape(error)}</h1></body></html>"
+        )
+
+    if not code or not state:
+        return HTMLResponse("<html><body><h1>Missing parameters</h1></body></html>")
+
+    # Validate state, exchange code for Mural tokens, store them
+    # ... (server-side token exchange, no client-side JS needed)
+
+    # Redirect to a simple completion page or close the window
+    return HTMLResponse("""
+        <html>
+          <body>
+            <h1>Connected to Mural</h1>
+            <p>You can close this window.</p>
+          </body>
+        </html>
+    """)
+```
+
+**Security implications:** Server-side redirect handling eliminates:
+- **Audit Ticket 7 (code in URL history):** The authorization code is processed server-side and never stored in browser history via `history.replaceState`
+- **Error parameter reflection risk:** Error messages from the IdP are handled server-side; no XSS via reflected parameters
+- **SPA build complexity:** No React build, no `oauth-callback.tsx`, no client-side code scrubbing
+
+**Can we eliminate the SPA build entirely?**
+
+Yes, with caveats:
+- **OAuth callbacks:** Server-side routes handle these (as shown above)
+- **Session Activation UI:** The "Connect to Mural" page that opens in the browser could be a minimal server-rendered HTML page instead of an SPA
+- **Auth config page (`/auth/config`):** Can be a simple JSON endpoint via `custom_route()`
+- **Remaining browser pages:** The OAuthProxy consent screen is built-in (server-rendered HTML). Any other browser-facing UI can use simple `HTMLResponse` or template rendering
+
+The only browser-facing pages needed are:
+1. Session Activation completion page (simple HTML)
+2. OAuthProxy consent screen (built-in, only if using OAuthProxy)
+3. Error pages (simple HTML)
+
+### 2.6 Auth Middleware Architecture (Q6)
+
+**Finding: FastMCP handles auth middleware automatically.
MCP and HTTP routes share the same `BearerAuthBackend`.** + +FastMCP's middleware stack: + +``` +HTTP Request + └── AuthenticationMiddleware (BearerAuthBackend) + └── Sets request.scope["user"] = AuthenticatedUser(access_token) + └── AuthContextMiddleware + └── Sets MCP SDK auth context var (for get_access_token()) + └── MCP Protocol Handler (for /mcp routes) + └── Tools/Resources access token via CurrentAccessToken() / TokenClaim() + └── custom_route() handlers (for HTTP routes) + └── Access token via request.scope["user"] +``` + +**Which auth concerns go where:** + +| Concern | Layer | Access Pattern | +|---------|-------|----------------| +| JWT validation | HTTP middleware (automatic) | `BearerAuthBackend` calls `provider.verify_token()` | +| User identity in MCP tools | MCP DI | `TokenClaim("sub")` or `CurrentAccessToken()` | +| User identity in custom routes | HTTP request | `request.scope["user"].access_token.claims["sub"]` | +| Layer 2 Mural tokens | Application code | `get_mural_tokens(user_id)` helper function | + +**Sharing auth validation between MCP and HTTP routes:** + +Both layers use the same `BearerAuthBackend` automatically. There is no need to duplicate auth logic. + +```python +# This helper works in both MCP tools and custom_route handlers: +def get_user_id_from_request(request: Request) -> str: + """Extract user ID from authenticated request.""" + user = request.scope.get("user") + if not user or not isinstance(user, AuthenticatedUser): + raise ValueError("Request not authenticated") + return user.access_token.claims["sub"] + +# In MCP tools (via DI): +@mcp.tool() +async def my_tool(user_id: str = TokenClaim("sub")): + ... + +# In custom routes (via request scope): +@mcp.custom_route("/auth/mural-link/code", methods=["POST"]) +async def handler(request: Request): + user_id = get_user_id_from_request(request) + ... 
+``` + +**Selective auth bypass:** If certain routes (e.g., `/health`) should be unauthenticated, note that `custom_route()` handlers go through the same middleware. To bypass auth for specific routes, you would need to check the path in a custom middleware or mount the health route on a separate ASGI app that doesn't include the auth middleware. Alternatively, handle the 401 gracefully in the route handler. + +### 2.7 Do We Need FastAPI for Auth? (Q7) + +**Finding: No. `custom_route()` with `request.scope["user"]` is sufficient. FastAPI can be added later if complexity grows.** + +The Session Activation routes need: +1. **Auth validation** — provided by `BearerAuthBackend` (automatic) +2. **Request body parsing** — `await request.json()` (Starlette built-in) +3. **DB access** — module-level singleton (no DI needed) +4. **Response formatting** — `JSONResponse` (Starlette built-in) + +None of these require FastAPI's `Depends()` system. The `custom_route()` approach works cleanly: + +| Need | `custom_route()` Solution | FastAPI Solution | +|------|--------------------------|------------------| +| Auth validation | `request.scope["user"]` | `Depends(get_current_user)` | +| DB session | `get_db_pool()` module singleton | `Depends(get_db)` | +| Request body | `await request.json()` | `body: MyModel` (Pydantic) | +| Response | `JSONResponse({...})` | `return {...}` | + +**When to add FastAPI:** +- If auth routes grow to 10+ endpoints with complex request validation +- If you need OpenAPI schema generation for auth endpoints +- If you want per-route middleware (FastAPI's dependency system is more granular) +- If you need request body validation via Pydantic models + +**Risk of splitting auth across two frameworks:** + +If FastAPI is used for auth routes while FastMCP handles MCP routes, ensure: +1. **Same token verifier:** Both use the same `JWTVerifier` instance +2. 
**Shared middleware:** Mount the FastAPI app under the FastMCP app (or vice versa) so the same `BearerAuthBackend` applies +3. **Consistent user identity:** Both read `sub` from the same JWT claims + +The recommended approach: start with `custom_route()` for all auth routes (PR5-PR6). If complexity warrants it, refactor to FastAPI in a later PR. + +### 2.8 End-to-End Auth Flow Post-Migration (Q8) + +#### RS Model with External IdP (Primary Target) + +``` +┌─────────┐ ┌─────────────┐ ┌──────────┐ ┌───────────┐ +│ IDE │ │ Banksy │ │ IdP │ │ Mural API │ +│ (Cursor) │ │ (FastMCP) │ │(Auth0/ │ │ │ +│ │ │ │ │ Descope) │ │ │ +└────┬─────┘ └──────┬──────┘ └────┬─────┘ └─────┬─────┘ + │ │ │ │ + │ GET /.well-known/│ │ │ + │ oauth-protected- │ │ │ + │ resource/mcp │ │ │ + │─────────────────>│ │ │ + │ │ │ │ + │ PRM response: │ │ │ + │ {authorization_ │ │ │ + │ servers: [idp]} │ │ │ + │<─────────────────│ │ │ + │ │ │ │ + │ GET /.well-known/│ │ │ + │ oauth-authz-srv │ │ │ + │──────────────────────────────────>│ │ + │ │ │ │ + │ AS metadata │ │ │ + │ (authz, token, │ │ │ + │ DCR endpoints) │ │ │ + │<──────────────────────────────────│ │ + │ │ │ │ + │ POST /register │ │ │ + │ (DCR) │ │ │ + │──────────────────────────────────>│ │ + │ │ │ │ + │ {client_id, │ │ │ + │ client_secret} │ │ │ + │<──────────────────────────────────│ │ + │ │ │ │ + │ Browser: /authorize │ │ + │──────────────────────────────────>│ │ + │ │ │ │ + │ User logs in (Google/Mural social)│ │ + │<─────────────────────────── callback to IDE │ + │ │ │ │ + │ POST /token │ │ │ + │ (code exchange) │ │ │ + │──────────────────────────────────>│ │ + │ │ │ │ + │ {access_token: │ │ │ + │ RS256 JWT} │ │ │ + │<──────────────────────────────────│ │ + │ │ │ │ + │ MCP request │ │ │ + │ Authorization: │ │ │ + │ Bearer │ │ │ + │─────────────────>│ │ │ + │ │ │ │ + │ │ verify JWT │ │ + │ │ (JWKS/issuer/ │ │ + │ │ audience) │ │ + │ │ │ │ + │ │ Look up Mural │ │ + │ │ tokens for user │ │ + │ │ (from DB) │ │ + │ │ │ │ + │ │ Call Mural API │ 
│ + │ │ with user's │ │ + │ │ Mural token │ │ + │ │────────────────────────────────>│ + │ │ │ │ + │ │ API response │ │ + │ │<────────────────────────────────│ + │ │ │ │ + │ MCP response │ │ │ + │<─────────────────│ │ │ +``` + +#### What Replaces What + +| Current (TS/Better Auth) | Target (Python/FastMCP) | +|--------------------------|------------------------| +| Better Auth `mcp()` plugin | `RemoteAuthProvider` + `JWTVerifier` | +| `/.well-known/oauth-authorization-server` | `/.well-known/oauth-protected-resource` (PRM) | +| `getMcpSession()` (cookie validation) | `verify_token()` (JWT validation) | +| `banksyUserContextProvider` + `getUserId()` | `TokenClaim("sub")` / `CurrentAccessToken()` | +| `muralSessionToken` / `muralOauthToken` tables | `mural_tokens` table (similar schema) | +| `getAndRefreshTokens()` | `get_mural_tokens()` (similar logic) | +| `AsyncLocalStorage` token injection | Module-level helper + `TokenClaim("sub")` DI | +| Better Auth social providers | External IdP handles social login | +| SPA OAuth callback pages | Server-side callback routes (or none with RS) | +| SSO proxy Google OAuth plugin | IdP handles Google as social connection | +| Session Activation routes | `custom_route()` handlers | + +#### What Is New Custom Code vs. 
FastMCP Native + +**FastMCP provides natively:** +- Bearer token validation pipeline (`BearerAuthBackend`) +- JWT verification with JWKS (`JWTVerifier`) +- PRM endpoint generation (`RemoteAuthProvider`) +- DI for user identity (`TokenClaim`, `CurrentAccessToken`) +- Custom HTTP route registration (`custom_route()`) + +**New custom code needed:** +- `BanksyTokenVerifier` (optional: custom validation on top of JWT) +- `mural_tokens` DB schema and CRUD +- `get_mural_tokens()` / `refresh_mural_tokens()` (Layer 2 management) +- `call_mural_api()` helper (per-request Mural auth injection) +- Session Activation routes (`/auth/mural-link/code`, `/auth/mural-link/claim`) +- `MuralNotConnectedError` handling and user flow + +#### DB Schema Changes + +The current Better Auth tables (`user`, `session`, `account`, `verification`, `muralSessionToken`, `muralOauthToken`, `pendingMuralConnection`) are replaced: + +- **`user` / `session` / `account`** → Eliminated. IdP manages users and sessions. Banksy may keep a lightweight `users` table mapping `idp_sub` → banksy user ID for internal references. +- **`muralSessionToken` / `muralOauthToken`** → Consolidated into `mural_tokens` table. +- **`pendingMuralConnection`** → Kept for Session Activation (if Session Activation survives). +- **`verification`** → Eliminated. IdP handles email verification. 
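
The optional lightweight `users` table described above reduces to a get-or-create mapping from the IdP `sub` claim to an internal banksy user ID. A minimal in-memory sketch (the class, method names, and UUID format are illustrative and not part of the migration plan; production would back this with the PostgreSQL table):

```python
import uuid

class UserRegistry:
    """In-memory stand-in for the lightweight `users` table
    (idp_sub -> banksy user ID). Illustrative only."""

    def __init__(self) -> None:
        self._by_sub: dict[str, str] = {}

    def get_or_create(self, idp_sub: str) -> str:
        # Idempotent: repeated logins by the same IdP subject map to one user.
        if idp_sub not in self._by_sub:
            self._by_sub[idp_sub] = str(uuid.uuid4())
        return self._by_sub[idp_sub]
```

The point of the design is that banksy never stores credentials or sessions for these users; the table exists only so internal references (e.g., `mural_tokens.user_id`) survive an IdP migration.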
+ +#### Environment Variable Changes + +| Current | Target | +|---------|--------| +| `AUTH_MODE` | Possibly eliminated (single RS mode) or kept for dev/test | +| `BETTER_AUTH_SECRET` | Eliminated | +| `BETTER_AUTH_BASE_URL` | `BANKSY_BASE_URL` (used for PRM resource URL) | +| `GOOGLE_CLIENT_ID` / `GOOGLE_CLIENT_SECRET` | Eliminated (IdP handles Google) | +| `SSO_PROXY_URL` | Eliminated (SSO proxy not needed with dedicated IdP) | +| `MURAL_OAUTH_*` | Kept if Session Activation requires Mural OAuth | +| (new) `IDP_JWKS_URI` | JWKS URL for JWT verification | +| (new) `IDP_ISSUER` | Expected JWT issuer | +| (new) `IDP_AUDIENCE` | Expected JWT audience | +| (new) `IDP_AUTHORIZATION_SERVER` | IdP URL for PRM metadata | + +#### Layer 2 Token Acquisition + +How Mural tokens are acquired depends on the IdP strategy: + +**Option A — IdP stores upstream Mural tokens:** + +If the IdP (Auth0/Descope) has Mural as a custom social connection with upstream token storage, the user authenticates with Mural through the IdP during the Layer 1 OAuth flow. Banksy retrieves the Mural token from the IdP's management API after authentication. This eliminates Session Activation for mural-oauth equivalent flows. + +**Option B — Session Activation survives:** + +If the IdP does not store upstream Mural tokens, the current Session Activation flow persists (with code/claim endpoints). The browser-based Mural OAuth step happens after Layer 1 authentication. + +**Option C — Hybrid:** + +IdP handles Google/social auth for Layer 1; Session Activation handles Mural token acquisition for Layer 2. This is the most likely near-term path, as it decouples IdP selection from Layer 2 implementation. + +## 3. 
Architecture Recommendation + +### Recommended Auth Architecture for Python Banksy + +``` +Layer 1 (IDE → Banksy): + RemoteAuthProvider + JWTVerifier + - IdP issues RS256 JWTs with JWKS + - Banksy validates JWTs, serves PRM + - No session management, no token issuance + +Layer 2 (Banksy → Mural API): + Custom token management (banksy's own DB) + - mural_tokens table (user_id, access_token, refresh_token, expires_at) + - get_mural_tokens(user_id) with proactive refresh + - call_mural_api() helper for per-request auth injection + - Session Activation routes via custom_route() (if needed) +``` + +### Two-Layer Split + +**FastMCP handles (Layer 1):** +- Bearer token validation +- JWT signature verification + claims validation +- PRM endpoint serving +- Auth DI in tools (`TokenClaim`, `CurrentAccessToken`) +- Auth middleware for both MCP and custom routes + +**Custom code handles (Layer 2):** +- Mural token storage (PostgreSQL) +- Mural token refresh (session refresh via mural-api, or OAuth refresh) +- Per-request Mural auth header injection +- Session Activation flow (if IdP doesn't store upstream Mural tokens) +- User-facing error handling (MuralNotConnectedError → prompt for Session Activation) + +### SPA Strategy (Revised) + +> **Update:** The original recommendation here was to eliminate the SPA entirely and use inline `HTMLResponse` strings for the few remaining browser-facing pages. After further analysis, this recommendation has been **revised** — the React SPA is preserved. See [Serving a React SPA from FastMCP](fastmcp-react-spa-serving-research.md) for the full research. + +**Revised recommendation: Preserve the React SPA for browser-facing pages. Eliminate only the OAuth callback SPA pages.** + +The OAuth callback pages (`oauth-callback.tsx`, `sso-proxy-google-callback.tsx`, `sso-proxy-connect-callback.tsx`) are eliminated because OAuth callbacks are now handled server-side by `custom_route()` endpoints. 
However, banksy still needs browser-facing pages — the home/landing page (client connection instructions), the Session Activation login page, completion and error pages — and these are served as a Vite-built React SPA using Mural Design System components. + +The serving mechanism is a `SpaStaticFiles` subclass of Starlette's `StaticFiles` mounted at `/` on the FastMCP app after `http_app()`. This provides SPA catch-all routing (unrecognized paths fall back to `index.html` for React Router) while MCP, auth, and `custom_route()` API endpoints retain precedence via Starlette's positional route ordering. + +**What is eliminated (unchanged from original):** +- `oauth-callback.tsx` / `sso-proxy-google-callback.tsx` / `sso-proxy-connect-callback.tsx` SPA pages +- The `history.replaceState` code scrubbing workaround +- The error parameter reflection XSS risk (audit Ticket 7) +- Better Auth sign-in pages (`sign-in.tsx`) + +**What is preserved (revised):** +- Home page / client selector (connection instructions, copy-to-clipboard, deeplinks) +- Session Activation login page (email form → Mural OAuth redirect) +- Completion page (success/error after Mural OAuth callback) +- Error pages (styled with Mural DS, not inline HTML) +- All Mural Design System components and Tailwind styling + +**Key architectural details:** +- SPA source lives in a standalone `ui/` directory with its own `package.json` +- Vite builds to `ui/dist/`; Python server mounts via `app.mount("/", SpaStaticFiles(directory="ui/dist", html=True))` +- Docker multi-stage build: Node.js stage builds SPA, Python stage copies output (no Node.js in production image) +- Dev mode: Vite dev server proxies API requests to FastMCP (standard React + Python pattern) +- Session Activation auth: IDE pre-generates activation code, passes via URL to the SPA — no Bearer token needed in the browser + +## 4. End-to-End Auth Flow + +See Section 2.8 for the complete sequence diagram and component mapping. + +## 5. 
Impact on Migration Plan + +### PR5 (Auth — IDE to Banksy) + +**Changes from original PR5 scope:** + +| Original Scope | Updated Scope | +|----------------|---------------| +| Better Auth + Google SSO plugin | `RemoteAuthProvider` + `JWTVerifier` | +| Session cookie management | Eliminated (Bearer tokens) | +| `oauth.py` (Google OAuth config) | `auth.py` (IdP JWT verification config) | +| `session.py` (Better Auth sessions) | Eliminated | +| `middleware.ts` equivalent | Automatic (FastMCP's `BearerAuthBackend`) | +| User/Session DB models | Lightweight `users` table (optional) | +| OAuth discovery endpoint | PRM endpoint (auto-generated) | +| SPA callback pages | OAuth callback pages eliminated; home/Session Activation/error pages preserved as React SPA ([details](fastmcp-react-spa-serving-research.md)) | + +**PR5 deliverables:** +1. `RemoteAuthProvider` configuration with `JWTVerifier` +2. `config.py` with IdP env vars (`IDP_JWKS_URI`, `IDP_ISSUER`, `IDP_AUDIENCE`) +3. `BanksyTokenVerifier` (optional custom validation) +4. MCP Inspector test: complete OAuth flow with IdP → tool call succeeds + +**Dependencies:** Requires an IdP tenant configured. For development, use `StaticTokenVerifier` or a mock IdP. + +### PR6 (Auth — Banksy to Mural) + +**Changes from original PR6 scope:** + +| Original Scope | Updated Scope | +|----------------|---------------| +| Mural token acquisition via Better Auth plugin | Session Activation via `custom_route()` or IdP upstream tokens | +| `muralSessionToken` / `muralOauthToken` tables | `mural_tokens` table | +| `AsyncLocalStorage` token injection | Module-level helper + `TokenClaim("sub")` | +| SPA-based Session Activation UI | React SPA page (modified: IDE passes activation code via URL, no browser-to-server auth needed); server-side `custom_route()` handles Mural OAuth callback and redirects to SPA completion page | + +**PR6 deliverables:** +1. `mural_tokens` DB schema and CRUD +2. `get_mural_tokens()` with refresh logic +3. 
`call_mural_api()` helper +4. Session Activation routes (`/auth/mural-link/code`, `/auth/mural-link/claim`) +5. Token refresh for both session and OAuth token types +6. Integration test: Layer 1 auth → Session Activation → tool calls Mural API + +### Overall Strategy Impact + +- **Mode convergence accelerated:** The RS model unifies sso-proxy and mural-oauth into a single auth path. The Python banksy only needs one auth mode (RS with external IdP), simplifying the codebase. +- **SSO proxy elimination:** With a dedicated IdP handling Google as a social connection, the SSO proxy service is no longer needed. +- **IdP PoC in parallel:** Start an IdP PoC (Auth0 or Descope) in parallel with PRs 1-4. Use mock auth (`StaticTokenVerifier`) in early PRs. + +## 6. Risk Assessment + +### Resolved Open Questions + +**Q: Does `get_access_token()` return the upstream Mural token or the FastMCP JWT?** + +**A:** Neither directly gives you the Mural API token. +- With `RemoteAuthProvider`: `get_access_token()` returns an `AccessToken` with the IdP JWT as `token` and IdP claims in `claims`. No Mural token. +- With `OAuthProxy`: `get_access_token()` returns an `AccessToken` with the FastMCP JWT as `token` and upstream claims (from `_extract_upstream_claims()` or `TokenVerifier.verify_token()`) in `claims`. The raw upstream token is stored in `_upstream_token_store` but not exposed through the DI. +- **Implication:** Layer 2 Mural tokens must be managed separately, regardless of the Layer 1 auth provider choice. + +**Q: Can tools access upstream tokens?** + +**A:** Not through FastMCP's standard DI. Tools get `AccessToken` with claims, not raw upstream tokens. Layer 2 tokens require a custom lookup pattern (see Q3). 
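The custom lookup pattern referenced above can be sketched as plain Python. This is a minimal illustration, not FastMCP API: the in-memory `_MURAL_TOKENS` dict stands in for the `mural_tokens` table, and the `layer1_claims` dict stands in for the claims that a real tool would obtain from `get_access_token()` or `TokenClaim("sub")`.

```python
# Hypothetical sketch: resolve Layer 2 (Mural) credentials from the
# Layer 1 (IdP) subject claim. All names here are illustrative.
from dataclasses import dataclass


@dataclass
class MuralTokens:
    access_token: str
    refresh_token: str


# Stand-in for the mural_tokens DB table, keyed by IdP subject.
_MURAL_TOKENS: dict[str, MuralTokens] = {
    "auth0|user-123": MuralTokens("mural-at", "mural-rt"),
}


def get_mural_tokens(layer1_claims: dict[str, str]) -> MuralTokens:
    """Look up Mural API credentials for the authenticated caller."""
    subject = layer1_claims["sub"]
    tokens = _MURAL_TOKENS.get(subject)
    if tokens is None:
        # No Layer 2 link yet: the caller must complete Session Activation.
        raise LookupError(f"No Mural link for {subject}")
    return tokens


claims = {"sub": "auth0|user-123", "iss": "https://idp.example.com/"}
print(get_mural_tokens(claims).access_token)  # mural-at
```

The key point the sketch makes concrete: the Layer 1 `AccessToken` only supplies an identity (`sub`); the Mural credentials live in a separate store that the server manages itself.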
+ +### Remaining Risks + +| Risk | Severity | Mitigation | +|------|----------|------------| +| **IdP selection delays auth PRs** | High | Use `StaticTokenVerifier` for dev; IdP PoC in parallel | +| **Single auth provider limitation** | Medium | Not a blocker — all routes share the same JWT validation. Session Activation routes use the same `BearerAuthBackend`. | +| **Token refresh race conditions** | Medium | Same risk as current TS implementation. Mitigate with optimistic locking on `mural_tokens` table. Test under concurrent tool calls. | +| **OAuthProxy file-based storage in production** | Medium (only if using OAuthProxy) | Pass a PostgreSQL-backed `AsyncKeyValue` adapter to `client_storage`. Or use `RemoteAuthProvider` (no storage needed). | +| **IDE compatibility with PRM** | Low | PRM (RFC 9728) is the standard. Cursor and Claude Desktop support it. Verify with MCP Inspector. | +| **Session Activation UX change** | Low | SPA is preserved — UX is nearly identical to current. Main change: IDE pre-generates activation code and passes it via URL instead of browser calling `/auth/mural-link/code` directly. | +| **Auth0 Token Vault availability** | Medium | Token Vault may require Enterprise tier. If unavailable, keep Session Activation for Layer 2. Descope Outbound Apps is an alternative. | + +### What Needs Prototyping + +1. **IdP PoC (highest priority):** Configure Auth0 or Descope with Mural as a custom social connection. Verify: DCR works, JWTs have correct claims, upstream Mural tokens are stored (if Token Vault available). + +2. **`RemoteAuthProvider` end-to-end test:** Configure `RemoteAuthProvider` with the PoC IdP's JWKS. Verify: PRM served correctly, MCP Inspector completes OAuth, tool call with `TokenClaim("sub")` succeeds. + +3. **Session Activation with Bearer tokens:** Test Session Activation routes with `custom_route()`. Verify: auth validation works via `request.scope["user"]`, code/claim flow completes, Mural tokens stored. + +4. 
**Token refresh under concurrency:** Simulate concurrent tool calls that all need Mural token refresh. Verify: no duplicate refreshes, no token corruption, optimistic locking works. + +## 7. Gaps and Open Questions + +### Cannot Determine From Docs/Source Alone + +1. **IdP upstream token storage behavior:** How exactly does Auth0 Token Vault or Descope Outbound Apps store and expose upstream Mural tokens? What API calls retrieve them? What happens on token refresh? **Needs: IdP PoC.** + +2. **Mural OAuth token format:** Is Mural's access token a JWT or opaque? This affects whether `JWTVerifier` or an introspection-based verifier is needed for the `MuralTokenVerifier` (relevant only if using `OAuthProxy` with Mural directly, which is not the recommended path). **Needs: Test Mural OAuth flow.** + +3. **IDE PRM support matrix:** Which IDEs support PRM (RFC 9728) discovery vs only AS metadata discovery? If an IDE only supports AS metadata, it would need the IdP's `/.well-known/oauth-authorization-server` directly. **Needs: Testing with Cursor, Claude Desktop, VS Code.** + +4. **OAuthProxy `AsyncKeyValue` PostgreSQL adapter:** FastMCP's `OAuthProxy` uses the `key_value` library. Is there an existing PostgreSQL adapter, or does one need to be written? **Needs: Check `key_value` library docs.** (Only relevant if using OAuthProxy.) + +5. **Multiple MCP endpoints:** Can banksy expose multiple MCP endpoints (e.g., `/mcp` for authenticated, `/health` for unauthenticated) with selective auth? FastMCP supports one auth provider per server, but `custom_route()` handlers all go through the same middleware. **Needs: Test unauthenticated custom routes.** + +6. **`from_openapi()` auth injection:** How does FastMCP's `from_openapi()` (for auto-generating tools from OpenAPI specs) get per-request auth headers? The current TS approach uses a sidecar with `AsyncLocalStorage`. In Python, does `from_openapi()` support a per-request auth header callback? 
**Needs: Check `from_openapi()` docs/source.** + +### Recommendations for Next Steps + +1. **Start IdP PoC immediately** — this is the longest-lead-time item and blocks PR5/PR6 scope finalization +2. **Use `StaticTokenVerifier` in PRs 1-4** — allows auth-gated tool development without an IdP +3. **Prototype Session Activation with `custom_route()`** — validate the `request.scope["user"]` pattern works for non-MCP routes +4. **Defer FastAPI decision** — start with `custom_route()` for all auth routes; add FastAPI only if complexity warrants it +5. **Write `MuralTokenManager` class** — encapsulate Layer 2 logic (storage, refresh, injection) in a single class that can be tested independently +6. **Add SPA PR early in the migration sequence** — set up `ui/` directory, migrate surviving React components, configure Vite + `SpaStaticFiles` mount. This PR has no dependency on auth and can land as early as PR2 or PR3. See [SPA serving research](fastmcp-react-spa-serving-research.md) for full directory structure, Dockerfile, and dev workflow. 
+ +### Sub-Research Documents + +The following deeper research documents were produced during this investigation and are referenced above: + +- [Serving a React SPA from FastMCP](fastmcp-react-spa-serving-research.md) — Static file serving, Vite build integration, SPA routing, dev mode HMR, Docker multi-stage build, alternatives evaluation +- [FastMCP Custom Routes, Middleware, and Non-MCP HTTP Endpoints](fastmcp-custom-routes-research.md) — `custom_route()` capabilities, limitations, route precedence, FastAPI outer app pattern diff --git a/fastmcp-migration/banksy-research/fastmcp-custom-routes-research.md b/fastmcp-migration/banksy-research/fastmcp-custom-routes-research.md new file mode 100644 index 0000000..6b77a7f --- /dev/null +++ b/fastmcp-migration/banksy-research/fastmcp-custom-routes-research.md @@ -0,0 +1,894 @@ +# FastMCP Custom Routes, Middleware, and Non-MCP HTTP Endpoints + +## Executive Summary + +FastMCP's HTTP capabilities are sufficient for banksy's migration needs, but require understanding two distinct layers. For MCP protocol concerns (tool auth, logging, rate limiting), FastMCP provides a rich middleware system with operation-level hooks. For non-MCP HTTP endpoints (health checks, auth callbacks, webhooks), FastMCP offers `custom_route()` — a thin decorator over Starlette's routing that supports any HTTP method and the full Starlette request/response API. The main gap is that `custom_route()` does not participate in FastMCP's dependency injection system and provides no built-in route grouping. For banksy's 5+ custom HTTP routes with auth middleware requirements, the recommended architecture is to start with `custom_route()` for early PRs (health endpoint only), then evaluate graduating to a FastAPI outer app when auth routes land in PR5–PR6. This gives full FastAPI routing, DI, and selective middleware while keeping MCP protocol handling inside FastMCP. + +--- + +## 1. 
`custom_route()` Deep Dive + +### API Signature + +From `fastmcp/server/mixins/transport.py` (`TransportMixin`): + +```python +def custom_route( + self: FastMCP, + path: str, + methods: list[str], + name: str | None = None, + include_in_schema: bool = True, +) -> Callable[ + [Callable[[Request], Awaitable[Response]]], + Callable[[Request], Awaitable[Response]], +] +``` + +**Parameters:** + +| Parameter | Type | Description | +|-----------|------|-------------| +| `path` | `str` | URL path (e.g., `/health`, `/auth/mural-link/code`) | +| `methods` | `list[str]` | HTTP methods (e.g., `["GET"]`, `["POST"]`, `["GET", "POST"]`) | +| `name` | `str \| None` | Optional route name for URL reverse-lookup | +| `include_in_schema` | `bool` | Whether to include in OpenAPI schema (default `True`) | + +### HTTP Methods + +Any standard HTTP method is supported: GET, POST, PUT, DELETE, PATCH, HEAD, OPTIONS. Pass them as strings in the `methods` list. + +### Handler Signature + +Handlers receive a Starlette `Request` and must return a Starlette `Response`: + +```python +from starlette.requests import Request +from starlette.responses import JSONResponse + +@mcp.custom_route("/health", methods=["GET"]) +async def health(request: Request) -> JSONResponse: + return JSONResponse({"status": "ok"}) +``` + +### Request API + +The full Starlette `Request` API is available: + +```python +@mcp.custom_route("/auth/mural-link/claim", methods=["POST"]) +async def claim_tokens(request: Request) -> JSONResponse: + # Path parameters (if path contains {param}) + # param = request.path_params["param"] + + # Query parameters + code = request.query_params.get("code") + + # Request body (JSON) + body = await request.json() + + # Raw body bytes + raw = await request.body() + + # Headers + auth_header = request.headers.get("authorization") + + # Client info + client_ip = request.client.host if request.client else "unknown" + + # Cookies + session = request.cookies.get("session_id") + + return 
JSONResponse({"success": True}) +``` + +Path parameters use Starlette's standard syntax: + +```python +@mcp.custom_route("/users/{user_id}", methods=["GET"]) +async def get_user(request: Request) -> JSONResponse: + user_id = request.path_params["user_id"] + return JSONResponse({"user_id": user_id}) +``` + +### Response Types + +All Starlette response classes are available: + +```python +from starlette.responses import ( + JSONResponse, # JSON with content-type application/json + PlainTextResponse, # Plain text + HTMLResponse, # HTML content + StreamingResponse, # Streaming/chunked responses + RedirectResponse, # HTTP redirects + Response, # Base response (arbitrary content-type) +) +``` + +### Implementation Details + +Internally, `custom_route()` appends a `starlette.routing.Route` to `self._additional_http_routes`. When the HTTP app is created (via `http_app()` or `run(transport="http")`), these routes are added to the Starlette app with the lowest precedence — after MCP protocol routes and auth routes. + +--- + +## 2. Route Organization and Grouping + +### No Built-in Router Concept + +FastMCP does not provide a router, blueprint, or APIRouter abstraction. All custom routes are registered directly on the `FastMCP` instance via `@mcp.custom_route()` and stored in a flat list. + +### `mount()` Does Not Propagate Custom Routes + +FastMCP's `mount()` is designed for composing MCP sub-servers (tools, resources, prompts). Custom routes registered on a mounted server are **not** propagated to the parent: + +```python +child = FastMCP("child") + +@child.custom_route("/child-health", methods=["GET"]) +async def child_health(request): ... + +parent = FastMCP("parent") +parent.mount(child, namespace="child") +# /child-health is NOT accessible on the parent's HTTP app +``` + +The `_get_additional_http_routes()` method returns only the server's own routes, not those from providers or mounted servers. This is confirmed by tests in `tests/server/mount/test_advanced.py`. 
+ +### Organizing Many Routes + +For projects with 5+ custom HTTP routes, there are two patterns: + +**Pattern 1: Helper functions registering on the main server** + +```python +# src/banksy/routes/health.py +from starlette.requests import Request +from starlette.responses import JSONResponse + +async def health_handler(request: Request) -> JSONResponse: + return JSONResponse({"status": "ok"}) + +def register_health_routes(mcp): + mcp.custom_route("/health", methods=["GET"])(health_handler) +``` + +```python +# src/banksy/server.py +from banksy.routes.health import register_health_routes +mcp = FastMCP("banksy") +register_health_routes(mcp) +``` + +This keeps route handlers in separate files but still registers them flat on the main server. It works but provides no route grouping, prefix management, or per-group middleware. + +**Pattern 2: FastAPI/Starlette outer app (recommended for 5+ routes)** + +Mount FastMCP into a FastAPI app and define HTTP routes using FastAPI's `APIRouter`: + +```python +from fastapi import FastAPI, APIRouter +from fastmcp import FastMCP + +mcp = FastMCP("banksy") +mcp_app = mcp.http_app(path="/") + +auth_router = APIRouter(prefix="/auth") + +@auth_router.post("/mural-link/code") +async def create_code(): ... + +@auth_router.post("/mural-link/claim") +async def claim_tokens(): ... + +api = FastAPI(lifespan=mcp_app.lifespan) +api.include_router(auth_router) +api.mount("/mcp", mcp_app) +``` + +This provides full FastAPI features: `APIRouter` for grouping, `Depends()` for DI, per-router middleware, and automatic OpenAPI docs for the HTTP routes. + +### Nested Mounts + +Starlette supports nested `Mount()` for hierarchical route organization. 
FastMCP's docs show this pattern: + +```python +from starlette.applications import Starlette +from starlette.routing import Mount + +inner_app = Starlette(routes=[Mount("/inner", app=mcp_app)]) +app = Starlette( + routes=[Mount("/outer", app=inner_app)], + lifespan=mcp_app.lifespan, +) +# MCP accessible at /outer/inner/mcp/ +``` + +--- + +## 3. Middleware + +FastMCP has **two distinct middleware layers** that operate at different levels. Understanding this distinction is critical for banksy's architecture. + +### Layer 1: MCP Protocol Middleware + +**Scope:** Only MCP protocol traffic (tool calls, resource reads, prompt retrieval, list operations). Does **not** apply to custom HTTP routes. + +**API:** Subclass `fastmcp.server.middleware.Middleware` and override hooks: + +```python +from fastmcp.server.middleware import Middleware, MiddlewareContext, CallNext + +class AuthMiddleware(Middleware): + async def on_request(self, context: MiddlewareContext, call_next): + # Runs for all MCP requests (tool calls, resource reads, etc.) + # context.method: "tools/call", "resources/read", etc. 
+ # context.fastmcp_context: FastMCP Context object + result = await call_next(context) + return result + + async def on_call_tool(self, context: MiddlewareContext, call_next): + # Runs only for tool calls + tool_name = context.message.name + args = context.message.arguments + result = await call_next(context) + return result +``` + +**Available hooks** (from general to specific): + +| Level | Hook | Fires On | +|-------|------|----------| +| Message | `on_message` | All MCP traffic (requests + notifications) | +| Type | `on_request` | Requests expecting a response | +| Type | `on_notification` | Fire-and-forget notifications | +| Operation | `on_call_tool` | Tool execution | +| Operation | `on_read_resource` | Resource reads | +| Operation | `on_get_prompt` | Prompt retrieval | +| Operation | `on_list_tools` | Tool listing | +| Operation | `on_list_resources` | Resource listing | +| Operation | `on_list_prompts` | Prompt listing | +| Operation | `on_initialize` | Client connection | + +**Registration:** + +```python +mcp = FastMCP("banksy") +mcp.add_middleware(AuthMiddleware()) +# or at construction: +mcp = FastMCP("banksy", middleware=[AuthMiddleware()]) +``` + +**Execution order:** First added = first in, last out: + +``` +Request → ErrorHandling → RateLimiting → Logging → Handler +Response ← ErrorHandling ← RateLimiting ← Logging ← Handler +``` + +**Built-in middleware:** LoggingMiddleware, StructuredLoggingMiddleware, TimingMiddleware, DetailedTimingMiddleware, ResponseCachingMiddleware, RateLimitingMiddleware, SlidingWindowRateLimitingMiddleware, ErrorHandlingMiddleware, RetryMiddleware, PingMiddleware, ResponseLimitingMiddleware, ToolInjectionMiddleware. + +### Layer 2: HTTP/ASGI Middleware + +**Scope:** All HTTP traffic — both MCP endpoints (`/mcp`) and custom routes (`/health`, `/auth/*`). 
+ +**API:** Standard Starlette middleware, passed to `http_app()` or `run()`: + +```python +from starlette.middleware import Middleware +from starlette.middleware.cors import CORSMiddleware + +middleware = [ + Middleware( + CORSMiddleware, + allow_origins=["*"], + allow_methods=["*"], + allow_headers=["*"], + ) +] + +app = mcp.http_app(middleware=middleware) +# or +mcp.run(transport="http", middleware=middleware) +``` + +Custom ASGI middleware example: + +```python +from starlette.middleware.base import BaseHTTPMiddleware + +class RequestLoggingMiddleware(BaseHTTPMiddleware): + async def dispatch(self, request, call_next): + print(f"HTTP {request.method} {request.url.path}") + response = await call_next(request) + return response + +middleware = [Middleware(RequestLoggingMiddleware)] +``` + +### Selective Middleware Application + +MCP protocol middleware is inherently selective — hooks fire only for the relevant operation type. For HTTP/ASGI middleware, selective application requires one of: + +**Option 1: Path checking inside the middleware** + +```python +class AuthHTTPMiddleware(BaseHTTPMiddleware): + async def dispatch(self, request, call_next): + if request.url.path.startswith("/auth/") and request.url.path != "/auth/sign-in": + token = request.headers.get("authorization") + if not validate_token(token): + return JSONResponse({"error": "unauthorized"}, status_code=401) + return await call_next(request) +``` + +**Option 2: Outer app pattern with per-mount middleware** + +```python +from fastapi import FastAPI + +api = FastAPI(lifespan=mcp_app.lifespan) + +# Auth routes with auth middleware +auth_app = FastAPI() +auth_app.add_middleware(AuthHTTPMiddleware) +auth_app.include_router(auth_router) +api.mount("/auth", auth_app) + +# Health route — no auth +@api.get("/health") +async def health(): ... + +# MCP — uses its own MCP-level auth +api.mount("/mcp", mcp_app) +``` + +### Middleware Interaction: MCP vs HTTP + +For a tool call arriving over HTTP: + +1. 
HTTP/ASGI middleware runs first (on the raw HTTP request) +2. The MCP protocol layer parses the JSON-RPC message +3. MCP protocol middleware runs (on the parsed MCP message) +4. The tool handler executes +5. Response flows back through MCP middleware, then HTTP middleware + +Custom HTTP routes bypass step 2–4 entirely — they only go through HTTP/ASGI middleware. + +--- + +## 4. Dependency Injection and Request Context + +### MCP Tools: Full DI Support + +MCP tools have access to FastMCP's DI system (powered by Docket/uncalled-for): + +```python +from fastmcp import FastMCP +from fastmcp.dependencies import CurrentContext, CurrentRequest, Depends +from fastmcp.server.context import Context +from starlette.requests import Request + +mcp = FastMCP("banksy") + +async def get_db_session(): + session = await create_session() + try: + yield session + finally: + await session.close() + +@mcp.tool +async def list_murals( + workspace_id: str, + ctx: Context = CurrentContext(), + request: Request = CurrentRequest(), + db=Depends(get_db_session), +) -> str: + user_agent = request.headers.get("user-agent") + await ctx.info(f"Listing murals for workspace {workspace_id}") + # db is an async session, auto-closed after the tool returns + ... +``` + +Built-in injectable dependencies: + +| Dependency | Injection | Getter Function | +|-----------|-----------|-----------------| +| MCP Context | `CurrentContext()` or `Context` type hint | `get_context()` | +| FastMCP server | `CurrentFastMCP()` | `get_server()` | +| HTTP Request | `CurrentRequest()` | `get_http_request()` | +| HTTP Headers | `CurrentHeaders()` | `get_http_headers()` | +| Access Token | `CurrentAccessToken()` | `get_access_token()` | +| Token Claim | `TokenClaim("claim_name")` | N/A | +| Custom | `Depends(callable)` | N/A | + +DI parameters are automatically excluded from the MCP tool schema — clients never see them. 
+ +### Custom Routes: No FastMCP DI + +Custom HTTP routes receive a raw Starlette `Request` and do not participate in FastMCP's DI system: + +```python +@mcp.custom_route("/auth/mural-link/code", methods=["POST"]) +async def create_code(request: Request) -> JSONResponse: + # No Depends(), no Context, no CurrentRequest() + # Must manually extract what you need from the request + body = await request.json() + user_id = body["userId"] + ... +``` + +### Sharing State Between Middleware and Handlers + +**For MCP tools:** Middleware sets state via `context.fastmcp_context.set_state()`, tools read via `ctx.get_state()`: + +```python +class UserMiddleware(Middleware): + async def on_request(self, context: MiddlewareContext, call_next): + from fastmcp.server.dependencies import get_http_headers + headers = get_http_headers() or {} + user_id = headers.get("x-user-id", "anonymous") + if context.fastmcp_context: + await context.fastmcp_context.set_state("user_id", user_id) + return await call_next(context) + +@mcp.tool +async def get_data(ctx: Context = CurrentContext()) -> str: + user_id = await ctx.get_state("user_id") + return f"Data for {user_id}" +``` + +**For custom routes:** Use Starlette's `request.state` or module-level singletons: + +```python +class InjectUserMiddleware(BaseHTTPMiddleware): + async def dispatch(self, request, call_next): + request.state.user_id = extract_user(request) + return await call_next(request) + +@mcp.custom_route("/auth/mural-link/code", methods=["POST"]) +async def create_code(request: Request) -> JSONResponse: + user_id = request.state.user_id + ... 
+``` + +### FastAPI Alternative + +If using the FastAPI outer app pattern, custom routes get full FastAPI DI: + +```python +from fastapi import Depends + +async def get_current_user(request: Request) -> User: + token = request.headers.get("authorization") + return await validate_and_get_user(token) + +@auth_router.post("/mural-link/code") +async def create_code(user: User = Depends(get_current_user)): + # Full DI, automatic validation, OpenAPI docs + ... +``` + +--- + +## 5. FastMCP's Underlying Web Framework + +### Stack + +- **ASGI framework:** Starlette +- **ASGI server:** Uvicorn +- **SSE:** `sse-starlette` library + +### `http_app()` Return Type + +`mcp.http_app()` returns a `StarletteWithLifespan` instance (a Starlette subclass) — a standard ASGI app that can be served by any ASGI server or mounted into any ASGI-compatible framework. + +### Dropping Down to Starlette + +You can access the underlying Starlette app directly: + +```python +app = mcp.http_app() +# app is a Starlette ASGI application +# Can be served with: uvicorn module:app +``` + +### Mounting FastMCP into External Apps + +**Into Starlette:** + +```python +from starlette.applications import Starlette +from starlette.routing import Mount + +mcp_app = mcp.http_app(path="/mcp") +app = Starlette( + routes=[ + Mount("/mcp-server", app=mcp_app), + ], + lifespan=mcp_app.lifespan, # Required for session manager +) +``` + +**Into FastAPI:** + +```python +from fastapi import FastAPI + +mcp_app = mcp.http_app(path="/") +api = FastAPI(lifespan=mcp_app.lifespan) + +@api.get("/health") +async def health(): + return {"status": "ok"} + +api.mount("/mcp", mcp_app) +``` + +The lifespan must be passed from the MCP app to the outer app for the session manager to initialize properly. + +### Trade-offs: Native Routing vs. 
Outer App + +| Aspect | `custom_route()` | FastAPI Outer App | +|--------|------------------|-------------------| +| Setup complexity | Minimal | Moderate | +| Route grouping | None | `APIRouter` with prefixes | +| DI for HTTP routes | None | Full `Depends()` | +| Per-route middleware | Manual path checks | Per-router middleware | +| OpenAPI for HTTP routes | No | Automatic | +| Extra dependency | None | `fastapi` | +| Best for | 1–3 simple routes | 5+ routes, auth flows | + +--- + +## 6. Testing Custom Routes + +### MCP Protocol Testing + +FastMCP's `Client` test utility covers MCP protocol operations (tool calls, resource reads, prompt retrieval): + +```python +import pytest +from fastmcp.client import Client + +@pytest.fixture +async def mcp_client(): + async with Client(transport=mcp) as client: + yield client + +async def test_echo_tool(mcp_client: Client): + result = await mcp_client.call_tool("echo", {"message": "hello"}) + assert result.data == "hello" +``` + +This does **not** test custom HTTP routes. 
+ +### HTTP Route Testing + +Use `httpx.AsyncClient` with `httpx.ASGITransport` to test custom routes: + +```python +import httpx +import pytest + +@pytest.fixture +async def http_client(): + app = mcp.http_app() + transport = httpx.ASGITransport(app=app) + async with httpx.AsyncClient(transport=transport, base_url="http://test") as client: + yield client + +async def test_health_endpoint(http_client: httpx.AsyncClient): + response = await http_client.get("/health") + assert response.status_code == 200 + data = response.json() + assert data["status"] == "ok" +``` + +### Integration Testing Auth Flows + +For flows that span HTTP routes and MCP protocol (e.g., OAuth callback then MCP tool call), create both clients from the same app: + +```python +@pytest.fixture +async def app(): + return mcp.http_app() + +@pytest.fixture +async def http_client(app): + transport = httpx.ASGITransport(app=app) + async with httpx.AsyncClient(transport=transport, base_url="http://test") as client: + yield client + +@pytest.fixture +async def mcp_client(): + async with Client(transport=mcp) as client: + yield client + +async def test_auth_then_tool_call(http_client, mcp_client): + # Step 1: Create activation code via HTTP + resp = await http_client.post("/auth/mural-link/code", json={"userId": "u1"}) + assert resp.status_code == 200 + code = resp.json()["code"] + + # Step 2: Claim tokens via HTTP + resp = await http_client.post("/auth/mural-link/claim", json={"code": code}) + assert resp.status_code == 200 + + # Step 3: Call MCP tool (now authenticated) + result = await mcp_client.call_tool("list_murals", {"workspaceId": "w1"}) + assert result.data is not None +``` + +### Testing with FastAPI Outer App + +If using the FastAPI outer app pattern, the same `httpx.AsyncClient` approach works — just point at the FastAPI app instead: + +```python +@pytest.fixture +async def http_client(): + transport = httpx.ASGITransport(app=api) # FastAPI app + async with 
httpx.AsyncClient(transport=transport, base_url="http://test") as client: + yield client +``` + +--- + +## 7. Real-World Examples + +### canvas-mcp Prototype + +Located at `/Users/wkirkham/dev/canvas-mcp`, this prototype demonstrates: + +**Custom route — health endpoint:** + +```python +@mcp.custom_route("/health", methods=["GET"]) +async def health_route(request: Request) -> JSONResponse: + return JSONResponse(_health_payload()) +``` + +**MCP middleware — profiling:** + +```python +from fastmcp.server.middleware import Middleware, MiddlewareContext, CallNext + +class ProfilingMiddleware(Middleware): + async def on_message(self, context: MiddlewareContext, call_next: CallNext[object, object]): + if not _ENABLE_PROFILING: + return await call_next(context) + profiler = Profiler(async_mode="enabled") + profiler.start() + try: + return await call_next(context) + finally: + profiler.stop() + # Write HTML report... + +if _ENABLE_PROFILING: + mcp.add_middleware(ProfilingMiddleware()) +``` + +**Shared logic between MCP tool and HTTP route:** + +```python +def _health_payload(): + return {"status": "ok", "version": "0.1.0", "message": "..."} + +# MCP tool returns text +@mcp.tool(name="check_health") +def check_health() -> str: + p = _health_payload() + return f"Status: {p['status']}. {p['message']} Version: {p['version']}." 
+ +# HTTP route returns JSON +@mcp.custom_route("/health", methods=["GET"]) +async def health_route(request: Request) -> JSONResponse: + return JSONResponse(_health_payload()) +``` + +### FastMCP Documentation Examples + +The docs provide examples for: + +- **Health checks** via `custom_route()` (shown in the HTTP deployment guide) +- **FastAPI integration** with `mcp.http_app(path="/")` mounted into a FastAPI app +- **Starlette mounting** with nested `Mount()` for complex routing +- **ASGI middleware** for CORS, request logging + +### No OAuth Callback Examples Found + +No public FastMCP projects with OAuth callback flows (code/claim pattern) were found. Banksy's session activation routes (`/auth/mural-link/code` and `/auth/mural-link/claim`) will be a novel pattern in the FastMCP ecosystem. The implementation should be straightforward since these are standard POST routes that happen to interact with the database and external APIs. + +--- + +## Patterns and Recommendations + +### Recommended Architecture for Banksy + +**Phase 1 (PR1–PR4): `custom_route()` only** + +During the bootstrap and OpenAPI phases, banksy needs only a `/health` endpoint. 
Use `custom_route()` directly: + +```python +# src/banksy/server.py +from fastmcp import FastMCP +from starlette.requests import Request +from starlette.responses import JSONResponse + +mcp = FastMCP("banksy") + +@mcp.custom_route("/health", methods=["GET"]) +async def health(request: Request) -> JSONResponse: + return JSONResponse({"status": "ok", "version": "0.1.0"}) +``` + +**Phase 2 (PR5–PR6): Evaluate FastAPI outer app** + +When auth routes land, evaluate whether to: +- (a) Keep using `custom_route()` with manual middleware path checks, or +- (b) Graduate to a FastAPI outer app for full routing, DI, and grouped middleware + +Decision criteria: +- If auth routes remain simple (2 POST handlers with basic validation) → stay with `custom_route()` +- If auth routes need shared DI (DB sessions, config), per-route middleware, or OpenAPI docs → switch to FastAPI outer app + +**Recommended server.py structure for Phase 2 (FastAPI outer app):** + +```python +# src/banksy/server.py +from fastapi import FastAPI +from fastmcp import FastMCP + +from banksy.auth.mural_session_routes import auth_router +from banksy.routes.health import health_router + +mcp = FastMCP("banksy", auth=auth_provider) +mcp_app = mcp.http_app(path="/") + +app = FastAPI(lifespan=mcp_app.lifespan) +app.include_router(health_router) +app.include_router(auth_router, prefix="/auth") +app.mount("/mcp", mcp_app) +``` + +### Middleware Strategy + +| Concern | Layer | Applies To | +|---------|-------|-----------| +| MCP auth (session validation) | MCP `on_request` hook | MCP tool/resource calls | +| MCP logging | MCP `LoggingMiddleware` | MCP traffic | +| MCP rate limiting | MCP `RateLimitingMiddleware` | MCP traffic | +| HTTP CORS | ASGI `CORSMiddleware` | All HTTP traffic | +| HTTP request logging | ASGI custom middleware | All HTTP traffic | +| Auth for `/auth/*` routes | ASGI middleware or FastAPI `Depends()` | Custom HTTP routes only | +| Health endpoint | No auth | `/health` only | + +### Shared State 
Pattern
+
+For services needed by both MCP tools and HTTP routes (e.g., database sessions, Mural API client):
+
+```python
+# Module-level singleton (works for both MCP tools and HTTP routes)
+from banksy.db.session import get_async_session
+
+# In MCP tools: call the session factory directly
+@mcp.tool
+async def list_murals() -> str:
+    async with get_async_session() as db:
+        ...
+
+# In custom routes: call directly
+@mcp.custom_route("/auth/mural-link/claim", methods=["POST"])
+async def claim_tokens(request: Request) -> JSONResponse:
+    async with get_async_session() as db:
+        ...
+```
+
+If using FastAPI outer app, both can use `Depends()`.
+
+---
+
+## Limitations and Workarounds
+
+### 1. No DI in `custom_route()` Handlers
+
+**Limitation:** Custom routes receive raw Starlette `Request` — no `Depends()`, no `CurrentContext()`, no automatic schema generation.
+
+**Workaround:** Use module-level helper functions or the FastAPI outer app pattern for full DI.
+
+### 2. No Route Grouping in FastMCP
+
+**Limitation:** All `custom_route()` registrations are flat — no prefix grouping, no per-group middleware.
+
+**Workaround:** Organize handler functions in separate modules, register them on the main server with a helper function. Or use the FastAPI outer app pattern with `APIRouter`.
+
+### 3. Mounted Server Routes Not Propagated
+
+**Limitation:** Custom routes on mounted sub-servers are not accessible via the parent server's HTTP app.
+
+**Workaround:** Register all custom routes on the main server, or use the outer app pattern where route registration is independent of MCP server composition.
+
+### 4. ASGI Middleware Is All-or-Nothing
+
+**Limitation:** ASGI middleware passed to `http_app()` applies to all HTTP traffic — there's no built-in way to apply it only to certain paths.
+
+**Workaround:** Check `request.url.path` inside the middleware, or use the outer app pattern to apply different middleware to different mounts.
+
+### 5.
`custom_route()` Does Not Support Pydantic Request/Response Models + +**Limitation:** Unlike FastAPI, `custom_route()` handlers don't automatically validate request bodies against Pydantic models or generate OpenAPI schemas. + +**Workaround:** Validate manually using Pydantic inside the handler: + +```python +from pydantic import BaseModel, ValidationError + +class ClaimRequest(BaseModel): + code: str + nonce: str + +@mcp.custom_route("/auth/mural-link/claim", methods=["POST"]) +async def claim_tokens(request: Request) -> JSONResponse: + try: + body = ClaimRequest(**(await request.json())) + except ValidationError as e: + return JSONResponse({"error": e.errors()}, status_code=400) + ... +``` + +Or use the FastAPI outer app pattern where Pydantic validation is automatic. + +### 6. Testing Requires Separate Clients + +**Limitation:** FastMCP's `Client` only covers MCP protocol. HTTP routes need a separate `httpx.AsyncClient`. + +**Workaround:** Create both clients in test fixtures, as shown in the testing section above. 
+ +--- + +## Impact on Migration Plan + +### PR1: Project Bootstrap + +- Add `/health` via `@mcp.custom_route()` on the main FastMCP server +- No changes to the migration plan needed — this is a simple addition to `src/banksy/server.py` +- Test with `httpx.AsyncClient` + `ASGITransport` + +### PR5: Auth (IDE to Banksy) + +- **MCP auth:** Use FastMCP's MCP middleware (`on_request` hook) to validate sessions on MCP protocol traffic +- **HTTP auth:** Use Starlette ASGI middleware for auth on `/auth/*` HTTP routes +- Document the two-layer middleware architecture in `docs/AUTH.md` +- Decision point: if auth routes need shared DI or grow beyond 2–3 endpoints, consider switching to FastAPI outer app + +### PR6: Auth (Banksy to Mural) + +- Session activation routes (`/auth/mural-link/code`, `/auth/mural-link/claim`) are standard POST routes +- These need database access (token storage) and HTTP client access (Mural API) — evaluate if `custom_route()` with manual service construction is acceptable or if FastAPI `Depends()` is needed +- If graduating to FastAPI outer app: add `fastapi` to `pyproject.toml` dependencies, restructure `server.py` to use `FastAPI` as the outer app with `mcp.http_app()` mounted at `/mcp` + +### PR7: Testing Infrastructure + +- Add `httpx` test fixtures for HTTP route testing alongside FastMCP `Client` fixtures for MCP testing +- Pattern: shared ASGI app fixture, separate `httpx.AsyncClient` and `Client` fixtures +- Integration tests for auth flows should use both clients (HTTP for auth callbacks, MCP for tool calls) + +### General: Documentation + +- Add a section to the migration docs explaining the two middleware layers (MCP protocol vs. HTTP/ASGI) +- Document the architectural decision: when to use `custom_route()` vs. 
FastAPI outer app +- Update `docs/AUTH.md` to reflect the Python middleware architecture (replacing the xmcp/Express middleware described in the current TS codebase) + +### New Dependency Consideration + +If the FastAPI outer app pattern is adopted (likely at PR5 or PR6), add `fastapi` to `pyproject.toml`: + +```toml +dependencies = [ + "fastmcp>=3.1", + "fastapi>=0.115", # Only if using outer app pattern + ... +] +``` + +FastAPI is built on Starlette (which FastMCP already depends on) and adds minimal overhead. The version constraint should match FastMCP's Starlette dependency range. diff --git a/fastmcp-migration/banksy-research/fastmcp-react-spa-serving-research.md b/fastmcp-migration/banksy-research/fastmcp-react-spa-serving-research.md new file mode 100644 index 0000000..261ef2a --- /dev/null +++ b/fastmcp-migration/banksy-research/fastmcp-react-spa-serving-research.md @@ -0,0 +1,815 @@ +# Serving a React SPA from FastMCP (Starlette/ASGI) + +## 1. Executive Summary + +Serving a Vite-built React SPA from a FastMCP server is feasible and straightforward. FastMCP's HTTP layer is built on Starlette, which provides `StaticFiles` for serving static assets and `Mount` for composing sub-applications into a larger routing tree. Starlette does not have a built-in SPA catch-all mode, but a well-established community pattern — subclassing `StaticFiles` to fall back to `index.html` for unrecognized paths — solves this in under ten lines of Python. The resulting `SpaStaticFiles` class has been the standard Starlette SPA solution since 2019 and works reliably with any frontend framework that produces static build output. + +The key architectural insight is that FastMCP's `http_app()` returns a standard Starlette application instance. After obtaining this app, you can call `app.mount()` to attach the SPA's static file serving at any path prefix. 
Because Starlette evaluates routes in declaration order, the MCP protocol endpoint, auth routes, and `custom_route()` API endpoints are all registered before the SPA mount — guaranteeing they take precedence. The SPA mount acts as a catch-all for everything else, serving `index.html` for any path that does not match an API route or a static asset file on disk. + +The main complexity is not in the serving layer but in the build pipeline. Banksy becomes a hybrid project: a Python server with a Node.js-built frontend. This requires a `ui/` directory with its own `package.json` and Vite configuration, a build step that runs `npm run build` before the Python server can serve the output, and a Docker multi-stage build that uses a Node.js stage for the SPA and a Python stage for the server. The Mural Design System npm packages (`@muraldevkit/ui-toolkit`, `@muraldevkit/ds-foundation`, etc.) are published to a private registry and work independently of any parent TypeScript monorepo, so decoupling the SPA from the current xmcp workspace is clean. For development, the simplest approach is to run Vite's dev server and the FastMCP server as two separate processes, with Vite proxying API requests to FastMCP — the same pattern used by most React + Python projects. + +--- + +## 2. Detailed Findings + +### 2.1 Q1: Static File Serving from Starlette/ASGI + +Starlette provides `StaticFiles`, an ASGI sub-application purpose-built for serving files from a directory on disk. When mounted into a Starlette (or FastMCP) app, it handles GET requests by mapping the URL path to a filesystem path within the configured directory. If the file exists, it returns it with correct MIME types, ETags, and caching headers. If the file does not exist, it returns a 404 response — which is important because it means unmatched paths don't leak into other routes. 
+
+The `html=True` parameter adds a useful behavior: when a request targets a directory path, `StaticFiles` looks for an `index.html` file in that directory and serves it automatically. This handles the root `/` path but does not handle SPA client-side routing (e.g., `/auth/connect` resolving to `index.html` when no `auth/connect` file exists on disk).
+
+For SPA routing, the standard pattern is to subclass `StaticFiles` and override `lookup_path()` to fall back to `index.html` when the requested path does not match any file:
+
+```python
+from starlette.staticfiles import StaticFiles
+
+
+class SpaStaticFiles(StaticFiles):
+    """Serve static files with SPA catch-all fallback.
+
+    When a requested path does not match any file on disk, serve index.html
+    so the client-side router (React Router) can handle the route.
+    """
+
+    def lookup_path(self, path: str):
+        full_path, stat_result = super().lookup_path(path)
+        if stat_result is None:
+            return super().lookup_path("./index.html")
+        return full_path, stat_result
+```
+
+Note that `lookup_path()` is synchronous in current Starlette releases (it was async only in very old versions), so the override must be a plain `def`, not `async def`.
+
+This pattern is widely used in the Starlette ecosystem. A PR to add native `fallback_file` support to Starlette was opened in 2024 (#2591) but closed without merging — the maintainers consider the subclass approach sufficient and prefer not to add SPA-specific behavior to the core library.
+
+#### Mounting on a FastMCP App
+
+FastMCP's `http_app()` method returns a `StarletteWithLifespan` instance (a thin subclass of `Starlette`). This is a full Starlette application with standard `.mount()`, `.add_route()`, and other routing APIs. There are two approaches to adding static file serving:
+
+**Approach A: Mount after `http_app()` (recommended).** Call `http_app()` to get the Starlette app, then mount the SPA on it. This is the simplest approach and keeps SPA concerns outside of FastMCP's internal route assembly.
+ +```python +from fastmcp import FastMCP +from starlette.routing import Mount + +mcp = FastMCP("banksy") + +# Register custom_route() endpoints (health, Session Activation, etc.) +# ... (these are registered on the mcp instance) + +# Build the ASGI app +app = mcp.http_app(transport="streamable-http") + +# Mount the SPA with lowest precedence (after all MCP and custom routes) +app.mount("/", SpaStaticFiles(directory="ui/dist", html=True), name="spa") +``` + +**Approach B: Pass via `routes` parameter.** Both `create_streamable_http_app()` and `create_sse_app()` accept a `routes: list[BaseRoute]` parameter. You can pass a `Mount` with `SpaStaticFiles` through this parameter. However, this requires importing FastMCP's internal `create_streamable_http_app` function and bypassing the higher-level `http_app()` API — less clean for something that should be a simple extension. + +Approach A is recommended because it requires no knowledge of FastMCP internals, works with any transport type, and keeps the SPA concern clearly separated from MCP protocol configuration. + +#### Route Precedence + +Starlette evaluates routes in the order they appear in the routes list. `http_app()` builds the route list as: + +1. Auth routes (if auth is enabled) +2. MCP transport route (`/mcp`) +3. Extra routes passed via `routes` parameter +4. `custom_route()` endpoints (Session Activation, health, etc.) + +When you call `app.mount("/", ...)` after `http_app()`, the mount is appended to the end of the route list. Starlette's `Mount` with `StaticFiles` only matches if no earlier route matched first. If the request matches `/mcp`, a `custom_route()` path, or an auth endpoint, those handle it. If nothing matches, the SPA mount catches the request and either serves a static file or falls back to `index.html`. This is exactly the precedence order we need. 
+ +### 2.2 Q2: Vite Build Integration with a Python Project + +The current banksy project is a TypeScript monorepo where the SPA lives in `packages/banksy-core/ui/` and shares its `package.json` with the server code. In the Python migration, the SPA needs its own isolated Node.js project because the Python server has no `package.json` of its own. + +#### Recommended Directory Structure + +Place the SPA source in a `ui/` directory at the banksy project root, as a sibling to `src/` (where the Python source will live): + +``` +banksy/ +├── src/ +│ └── banksy/ # Python FastMCP server code +│ ├── __init__.py +│ ├── server.py +│ ├── auth.py +│ ├── mural_session.py +│ └── spa.py # SpaStaticFiles class +├── ui/ # React SPA (standalone Node.js project) +│ ├── package.json +│ ├── package-lock.json +│ ├── vite.config.ts +│ ├── tailwind.config.mjs +│ ├── tsconfig.json +│ ├── index.html +│ ├── public/ +│ │ └── favicon.svg +│ └── src/ +│ ├── main.tsx # SPA entry point +│ ├── index.css +│ └── components/ +│ ├── home-page.tsx +│ ├── claude-landing.tsx +│ ├── session-activation.tsx +│ ├── completion-page.tsx +│ ├── error-page.tsx +│ └── shared/ +│ ├── AuthPageLayout.tsx +│ ├── ClientLandingLayout.tsx +│ ├── ConnectionCard.tsx +│ ├── CopyUrlBox.tsx +│ ├── IdeIcons.tsx +│ ├── LoadingState.tsx +│ ├── MuralLogo.tsx +│ └── SuccessState.tsx +├── pyproject.toml # Python project config +├── Dockerfile +└── Makefile # Build orchestration +``` + +The `ui/` directory has its own `package.json` with all npm dependencies (React, Vite, Tailwind, Mural DS packages). It is completely independent of the Python project — `pip install` does not touch it, and `npm install` inside `ui/` does not affect the Python environment. + +#### Vite Configuration + +The current auth SPA uses `base: '/auth/'` because it is mounted under the `/auth/` prefix in Express. 
In the new architecture, the SPA serves both the root landing page (`/`) and the auth-related pages (`/auth/connect`, `/auth/complete`, `/auth/error`). The simplest approach is to set `base: '/'` and have a single SPA that handles all routes: + +```typescript +// ui/vite.config.ts +import { defineConfig } from 'vite'; +import react from '@vitejs/plugin-react'; +import path from 'path'; + +export default defineConfig({ + plugins: [react()], + build: { + outDir: 'dist', + emptyOutDir: true, + rollupOptions: { + output: { + assetFileNames: 'assets/[name].[hash][extname]', + chunkFileNames: 'assets/[name].[hash].js', + entryFileNames: 'assets/[name].[hash].js', + }, + }, + }, + resolve: { + alias: { + '@': path.resolve(__dirname, './src'), + }, + }, + server: { + port: 3002, + proxy: { + '/mcp': 'http://localhost:8000', + '/auth/mural-link': 'http://localhost:8000', + '/auth/mural-oauth': 'http://localhost:8000', + '/health': 'http://localhost:8000', + }, + }, +}); +``` + +Note the home page no longer needs `vite-plugin-singlefile`. In the current TS setup, the home page was built as a single-file HTML blob that xmcp inlined at build time and served via `res.send()`. That complexity existed because xmcp's `homePage` config required an HTML string. With Starlette's `SpaStaticFiles`, the home page is just another React Router route — the SPA mount serves `index.html` for `/`, and React Router renders the `HomePage` component. No special build step needed. + +#### Build Integration + +The Vite build produces a `ui/dist/` directory containing `index.html` and an `assets/` subdirectory with hashed JS and CSS bundles. 
The Python server references this directory at startup:
+
+```python
+# src/banksy/spa.py
+import os
+from pathlib import Path
+from starlette.staticfiles import StaticFiles
+
+SPA_DIR = Path(os.environ.get(
+    "BANKSY_SPA_DIR",
+    str(Path(__file__).parent.parent.parent / "ui" / "dist"),
+))
+
+
+class SpaStaticFiles(StaticFiles):
+    def lookup_path(self, path: str):
+        full_path, stat_result = super().lookup_path(path)
+        if stat_result is None:
+            return super().lookup_path("./index.html")
+        return full_path, stat_result
+```
+
+The `BANKSY_SPA_DIR` environment variable allows overriding the path in production (Docker) vs development. The default resolves to `ui/dist/` relative to the Python source.
+
+A `Makefile` or script orchestrates both builds:
+
+```makefile
+.PHONY: build build-ui build-server
+
+build: build-ui build-server
+
+build-ui:
+	cd ui && npm ci && npm run build
+
+build-server:
+	pip install -e .
+```
+
+#### pyproject.toml Considerations
+
+Python packaging tools like `setuptools` and `hatchling` support including non-Python data files in a package distribution. However, for banksy's deployment model (Docker image), this is unnecessary — the Dockerfile copies `ui/dist/` directly into the image. If banksy were distributed as a pip package, you would use `[tool.setuptools.package-data]` or hatch's `[tool.hatch.build.targets.wheel.shared-data]` to include the SPA build output. For now, Docker is the only deployment target, so the simpler approach of copying files in the Dockerfile is sufficient.
+
+### 2.3 Q3: SPA Client-Side Routing (Catch-All)
+
+The surviving SPA pages need both root-level routes (`/` for the home page) and prefixed routes (`/auth/connect` for Session Activation). React Router handles client-side navigation between these routes, but the server must return `index.html` for every route that the SPA owns — otherwise a browser refresh on `/auth/connect` returns a 404.
+ +#### Route Inventory + +The post-migration SPA has these routes: + +| Path | Component | Auth Required | +|------|-----------|---------------| +| `/` | `HomePage` | No | +| `/?client=claude-ai` | `ClaudeLanding` | No | +| `/auth/connect` | `SessionActivation` | No (code-based) | +| `/auth/complete` | `CompletionPage` | No | +| `/auth/error` | `ErrorPage` | No | + +The server-side API routes that must take precedence are: + +| Path | Method | Handler | +|------|--------|---------| +| `/mcp` | POST, GET, DELETE | FastMCP StreamableHTTP | +| `/auth/mural-link/code` | POST | `custom_route()` | +| `/auth/mural-link/claim` | POST | `custom_route()` | +| `/auth/mural-oauth/callback` | GET | `custom_route()` | +| `/health` | GET | `custom_route()` | +| `/.well-known/oauth-protected-resource` | GET | FastMCP auth (auto) | + +#### How Precedence Works + +When `SpaStaticFiles` is mounted at `/` as the last route, every request first checks all routes above it: + +1. FastMCP's auth routes match `/.well-known/*` paths. +2. FastMCP's MCP transport route matches `/mcp`. +3. `custom_route()` endpoints match their exact paths (`/auth/mural-link/code`, etc.). +4. Only if none of those match does the `SpaStaticFiles` mount handle the request. + +Within the SPA mount, `SpaStaticFiles` first checks if the requested path corresponds to an actual file in `ui/dist/`. If `/assets/main.abc123.js` is requested, it serves the file. If `/auth/connect` is requested and no such file exists, it falls back to `index.html`, and React Router takes over. + +There is one subtlety: Starlette's route matching for `Mount` uses prefix matching. A `Mount("/")` matches every path, but it only "wins" if no earlier route in the list matched. Because `custom_route()` endpoints are `Route` objects (which match exactly), they always take precedence over the prefix-based `Mount`. This is the correct behavior. 
+ +#### Mounting Strategy: Root vs Prefix + +Two options exist: + +**Root mount (`/`).** The SPA is mounted at the root and handles all unmatched paths. The home page is at `/`, auth pages are at `/auth/*`. This is simpler because the SPA owns the entire URL space except for the carved-out API routes. + +**Prefix mount (`/app/` or `/auth/`).** The SPA is mounted under a specific prefix. This isolates the SPA's URL space but requires the Vite build to use `base: '/app/'` and complicates routing for the home page (which should be at `/`, not `/app/`). + +The root mount is recommended. The SPA has few routes, and the API endpoints have distinct path patterns that don't collide. Having the home page at `/` is important for the user experience — it is the first page visitors see. + +```python +# In server.py, after building the ASGI app: +from banksy.spa import SpaStaticFiles, SPA_DIR + +app = mcp.http_app(transport="streamable-http") +app.mount("/", SpaStaticFiles(directory=str(SPA_DIR), html=True), name="spa") +``` + +#### Future Extensibility + +Adding a new SPA route (e.g., `/settings`) requires only adding the route to React Router's configuration. No server-side changes are needed because `SpaStaticFiles` already falls back to `index.html` for any unrecognized path. Adding a new API route requires registering it as a `custom_route()` — it automatically takes precedence over the SPA mount. + +### 2.4 Q4: Dev Mode with Vite HMR + +During development, Vite provides a dev server with Hot Module Replacement (HMR) — code changes are reflected in the browser instantly without a full page reload. The question is how to run both Vite's dev server (for the SPA) and FastMCP's server (for the MCP protocol and API routes) simultaneously. + +#### Current Setup + +The current TS banksy runs Vite on port 3002 and xmcp on port 3001 as completely separate processes. There is no proxy between them — developers must know which port to use for which purpose. 
This works but is awkward: the SPA calls `window.location.origin` to build API URLs, so the SPA served by Vite on port 3002 sends API requests to port 3002, which does not have the API endpoints. + +#### Recommended: Vite Proxies to FastMCP + +The simplest and most standard approach is to configure Vite's dev server to proxy API requests to the FastMCP server. Vite has built-in proxy support via the `server.proxy` configuration: + +```typescript +// ui/vite.config.ts (dev server section) +export default defineConfig({ + server: { + port: 3002, + proxy: { + '/mcp': { + target: 'http://localhost:8000', + changeOrigin: true, + }, + '/auth/mural-link': { + target: 'http://localhost:8000', + changeOrigin: true, + }, + '/auth/mural-oauth': { + target: 'http://localhost:8000', + changeOrigin: true, + }, + '/health': { + target: 'http://localhost:8000', + changeOrigin: true, + }, + '/.well-known': { + target: 'http://localhost:8000', + changeOrigin: true, + }, + }, + }, +}); +``` + +With this configuration, the developer workflow is: + +1. **Terminal 1:** Start FastMCP server — `fastmcp dev` or `uvicorn banksy.server:app --port 8000 --reload` +2. **Terminal 2:** Start Vite dev server — `cd ui && npm run dev` +3. Open `http://localhost:3002` in the browser. Vite serves the SPA with HMR. API requests are proxied to port 8000. + +The SPA code uses `window.location.origin` (which is `http://localhost:3002` in dev), and the proxy transparently forwards API requests. No code changes are needed between dev and production. + +#### Alternative: FastMCP Proxies to Vite + +The reverse direction — FastMCP proxies non-API requests to Vite's dev server — is possible but more complex. It would require adding an `httpx`-based reverse proxy middleware to the Starlette app that forwards unmatched requests (including WebSocket connections for HMR) to Vite. This is significantly more code, harder to debug, and fragile with WebSocket upgrades. 
It also couples the Python server to a dev-only concern. Not recommended. + +#### Alternative: Run Separately Without Proxy + +This is the current TS approach. Developers access the SPA on port 3002 and the API on port 8000 separately. The SPA's `window.location.origin` calls would point to port 3002, which has no API. You would need to hardcode the API base URL (e.g., `http://localhost:8000`) during development, adding environment-specific configuration. This works but is less ergonomic than the Vite proxy approach. + +#### WebSocket HMR Compatibility + +Vite's HMR uses WebSocket connections to push updates to the browser. The Vite proxy approach keeps HMR traffic entirely within Vite's domain (port 3002) — WebSocket connections go directly to Vite, not through any proxy. Only HTTP API requests are proxied to FastMCP. This means HMR works without any special WebSocket configuration. + +### 2.5 Q5: Auth Interaction Between SPA and FastMCP Server + +The SPA pages have fundamentally different auth requirements, and the recommended patterns differ by page type. + +#### Unauthenticated Pages + +The home page (`/`), client-specific landings (`/?client=claude-ai`), and error pages (`/auth/error`) are public. They display static content — connection instructions, copy-to-clipboard snippets, error messages. No API calls are made from these pages; the only dynamic data is `window.location.origin` to compute the MCP server URL. These pages need no auth mechanism whatsoever. The SPA serves them directly from `index.html` without any server-side auth check. + +#### Session Activation Flow + +Session Activation is the process by which a user links their Mural account to their authenticated MCP session. The flow involves a browser-based page (`/auth/connect`) that makes API calls to the FastMCP server. The challenge is that the browser does not have the user's Bearer token — only the IDE does. 
+ +The current TS implementation uses a "code is the auth" pattern, and this pattern should be preserved. Here is how it works: + +1. The IDE (authenticated via Bearer token) calls `POST /auth/mural-link/code` to generate a short-lived session activation code. This endpoint requires Bearer auth. +2. The IDE opens the browser to `/auth/connect`. The browser has no Bearer token and no session cookie. +3. The SPA on `/auth/connect` displays an email form. When the user submits, the SPA calls `POST /auth/mural-link/code` — but this time from the browser, without auth. + +There is a design decision here about how the browser-initiated code generation works. In the current TS implementation, the SPA calls `/auth/mural-link/code` with `credentials: 'include'` (session cookies from Better Auth). With FastMCP, there are no session cookies. Two approaches: + +**Option 1: IDE passes a short-lived token via URL parameter.** The IDE opens `http://banksy/auth/connect?token=abc123`. The SPA reads the token from the URL and includes it in the `POST /auth/mural-link/code` request as an `Authorization: Bearer abc123` header. This is simple but exposes the token in browser history and server access logs. The token should be very short-lived (5 minutes) and single-use. + +**Option 2: Two-phase code generation (recommended).** The IDE calls `POST /auth/mural-link/code` (authenticated with Bearer token) to pre-generate the activation code. The IDE then opens the browser to `/auth/connect?code=XXXX`. The SPA reads the code from the URL and skips the code-generation step — it already has the code. The SPA only needs to collect the user's email, call the Mural realm API, and redirect to Mural OAuth. No banksy API call from the browser requires auth. + +Option 2 is recommended because it avoids sending any auth credential to the browser. The activation code itself is the credential, and it was generated by the authenticated IDE. 
The SPA uses the code to complete the Mural OAuth redirect, and the server-side callback claims the Mural tokens using the code.
+
+#### Mural OAuth Callback
+
+After the user authenticates with Mural, the OAuth redirect returns to a server-side `custom_route()` handler — not to the SPA. The server-side handler performs the code exchange (getting Mural access and refresh tokens), stores them in the database keyed by the activation code, and then redirects the browser to a SPA completion page:
+
+```python
+from urllib.parse import quote
+
+from starlette.requests import Request
+from starlette.responses import RedirectResponse, Response
+
+
+@mcp.custom_route("/auth/mural-oauth/callback", methods=["GET"])
+async def mural_oauth_callback(request: Request) -> Response:
+    code = request.query_params.get("code")
+    state = request.query_params.get("state")
+
+    # Validate state, exchange code for tokens, store tokens
+    try:
+        await exchange_and_store_mural_tokens(code, state)
+        return RedirectResponse("/auth/complete?status=success")
+    except Exception as e:
+        logger.error(f"Mural OAuth callback failed: {e}")
+        return RedirectResponse(f"/auth/error?message={quote(str(e))}")
+```
+
+The SPA completion page (`/auth/complete`) reads the `status` query parameter and displays a success or error message. It makes no API calls — the server already did all the work. This pattern keeps the authorization code and tokens entirely server-side, matching the recommendation from the auth migration research (no reflected XSS risk, no code in browser history).
+ +#### Summary of Auth by Page + +| Page | Auth Mechanism | API Calls from Browser | +|------|----------------|----------------------| +| Home page (`/`) | None | None | +| Client landing (`/?client=claude-ai`) | None | None | +| Session Activation (`/auth/connect`) | Code passed via URL from IDE | `POST {muralApiHost}/api/v0/user/realm` (no banksy auth needed) | +| Completion (`/auth/complete`) | None | None (reads query params only) | +| Error (`/auth/error`) | None | None (reads query params only) | + +### 2.6 Q6: Mural Design System Dependencies + +The current SPA uses several `@muraldevkit` npm packages for consistent UI with the Mural product: + +- `@muraldevkit/ui-toolkit` (^4.61.3) — React components (MrlButton, MrlTextInput, MrlText, etc.) +- `@muraldevkit/ds-foundation` (^2.16.1) — Design tokens, CSS custom properties +- `@muraldevkit/ds-icons` — SVG icon components +- `@muraldevkit/ds-icons-animated` — Animated icon components +- `@muraldevkit/ds-pictograms` — Pictogram assets + +These are all published npm packages from Mural's private registry. They have no build-time dependency on the parent TypeScript monorepo — they are consumed as standard npm dependencies, the same way any external React project would use a component library. + +#### Standalone `ui/` Feasibility + +Moving the SPA to a standalone `ui/` directory with its own `package.json` is clean. The only requirement is that the `ui/` directory's `.npmrc` (or the CI/CD environment) has access to the private npm registry where `@muraldevkit` packages are published. 
A typical `.npmrc`: + +```ini +@muraldevkit:registry=https://npm.pkg.github.com +//npm.pkg.github.com/:_authToken=${NPM_TOKEN} +``` + +The SPA's `package.json` dependencies are self-contained: + +```json +{ + "dependencies": { + "@muraldevkit/ui-toolkit": "^4.61.3", + "@muraldevkit/ds-foundation": "^2.16.1", + "@muraldevkit/ds-icons": "^1.0.0", + "@muraldevkit/ds-icons-animated": "^1.0.0", + "react": "^18.3.1", + "react-dom": "^18.3.1", + "react-router-dom": "^7.10.1" + }, + "devDependencies": { + "@vitejs/plugin-react": "^4.7.0", + "autoprefixer": "^10.4.22", + "postcss": "^8.5.6", + "tailwindcss": "^3.4.18", + "typescript": "^5.0.0", + "vite": "^5.4.21" + } +} +``` + +#### Build-Time Dependencies + +The Mural DS packages ship compiled CSS and JS — they do not require the `ui-toolkit` repository to be co-located. The `@muraldevkit/ds-foundation` package provides CSS custom properties (e.g., `--mrl-color-text-primary`, `--mrl-gray-90`) that the SPA uses in inline styles and Tailwind classes. These are loaded via a CSS import in the SPA's `index.css`: + +```css +@import '@muraldevkit/ds-foundation/dist/foundation.css'; +``` + +Vite resolves this import from `node_modules` during build — no special configuration needed. + +#### CI/CD Registry Configuration + +The Docker build stage that runs `npm ci` needs the `NPM_TOKEN` secret for accessing the private registry. This is the same pattern used in the current Dockerfile, which already passes `NPM_TOKEN` as a build secret: + +```dockerfile +RUN --mount=type=secret,id=npm_token_ro,env=NPM_TOKEN \ + npm config set '//registry.npmjs.org/:_authToken=${NPM_TOKEN}' \ + && npm ci +``` + +The exact registry URL may need adjustment depending on whether `@muraldevkit` packages are on GitHub Packages or a custom registry. The `.npmrc` in the `ui/` directory handles this. 
+
+### 2.7 Q7: Docker and Deployment Considerations
+
+Banksy currently deploys as a Docker container with a multi-stage build: a dependency stage, a builder stage, and a production runner stage. The migration to Python + SPA requires a similar multi-stage approach, but with a Node.js stage for the SPA and a Python stage for the server.
+
+#### Multi-Stage Dockerfile
+
+```dockerfile
+# Stage 1: Build SPA
+FROM node:22-alpine AS spa-builder
+WORKDIR /app/ui
+
+# Copy only package files first for layer caching
+COPY ui/package.json ui/package-lock.json ./
+COPY ui/.npmrc ./
+
+RUN --mount=type=secret,id=npm_token_ro,env=NPM_TOKEN \
+    npm ci --ignore-scripts
+
+# Copy source and build
+COPY ui/ ./
+RUN npm run build
+
+# Stage 2: Python dependencies
+FROM python:3.12-slim AS python-deps
+WORKDIR /app
+
+# `pip install .` needs both the project metadata and the package source
+COPY pyproject.toml ./
+COPY src/ ./src/
+RUN pip install --no-cache-dir .
+
+# Stage 3: Production image
+FROM python:3.12-slim AS runner
+WORKDIR /app
+
+# Create non-root user
+RUN addgroup --system banksy && adduser --system --group banksy
+
+# Copy Python dependencies from deps stage
+COPY --from=python-deps /usr/local/lib/python3.12/site-packages /usr/local/lib/python3.12/site-packages
+COPY --from=python-deps /usr/local/bin /usr/local/bin
+
+# Copy application source
+COPY src/ ./src/
+
+# Copy SPA build output
+COPY --from=spa-builder /app/ui/dist ./ui/dist
+
+USER banksy
+
+ENV BANKSY_SPA_DIR=/app/ui/dist
+EXPOSE 8000
+
+CMD ["uvicorn", "banksy.server:app", "--host", "0.0.0.0", "--port", "8000"]
+```
+
+#### Key Design Decisions
+
+**No Node.js in production.** The final image contains only the Python runtime and the pre-built static files from the SPA. Node.js is used only in the `spa-builder` stage and is not present in the production image. This keeps the image size reasonable.
+
+**Layer caching.** The SPA build layer is cached based on `ui/package.json`, `ui/package-lock.json`, and `ui/` source files. If only Python code changes, Docker reuses the cached SPA build layer. 
Conversely, if only the SPA changes, the Python dependency layer is reused. This dramatically reduces CI/CD build times for the common case where only one side changes. + +**Image size.** The `python:3.12-slim` base image is approximately 130MB. The SPA build output (Vite-built React app) is typically 1-5MB. The total image size should be in the 200-300MB range — smaller than the current Node.js-based image (which includes the full `node_modules` tree). + +**SPA directory override.** The `BANKSY_SPA_DIR` environment variable allows the production image to point to `/app/ui/dist` while development uses the relative path from the Python source. This avoids hardcoding paths. + +#### CI/CD Build Caching + +For CI/CD systems (GitHub Actions, Azure Pipelines), the Docker build cache can be preserved between runs. The two independent build stages (SPA and Python) cache independently: + +- SPA stage: Invalidated only when `ui/` files change. The `npm ci` layer is cached separately from the `npm run build` layer, so dependency installs are reused when only source code changes. +- Python stage: Invalidated only when `pyproject.toml` or `src/` files change. + +### 2.8 Q8: Alternatives Evaluation + +Before committing to serving the SPA from FastMCP, three alternatives deserve consideration. Each was ultimately rejected in favor of the same-origin SPA approach, but they offer trade-offs worth documenting. + +#### Alternative A: Separate SPA Host + +Deploy the SPA independently on Azure Static Web Apps, Cloudflare Pages, or a CDN. The SPA makes cross-origin API calls to the FastMCP server. 
+ +**Pros:** +- Complete separation of frontend and backend deployment +- CDN-level caching and global distribution for static assets +- Frontend team can deploy independently + +**Cons:** +- CORS configuration required on the FastMCP server — every `custom_route()` must include CORS headers +- Session Activation's Mural OAuth callback redirect must point to the SPA host, not the API server — complicating the server-side callback pattern +- Two deployment targets to manage, two CI/CD pipelines +- Cookie-based auth (if ever needed) becomes complex with cross-origin SameSite restrictions +- Overkill for 5 static pages that change infrequently + +**Verdict:** The complexity of CORS, separate deployment, and cross-origin auth outweighs the benefits for banksy's small number of pages. Same-origin serving is simpler. + +#### Alternative B: HTMX or Alpine.js + +Replace the React SPA with lightweight server-rendered HTML pages enhanced with HTMX (for AJAX) or Alpine.js (for client-side interactivity). The server renders pages using Jinja2 templates with Tailwind CSS for styling. + +**Pros:** +- No Node.js build step — pages are Python templates +- Simpler deployment (Python only) +- Smaller payload to the browser + +**Cons:** +- Loses access to Mural Design System components (`MrlButton`, `MrlTextInput`, `MrlText`, etc.), which are React components +- Would need to manually replicate the Mural DS look and feel with raw HTML/CSS — essentially rebuilding components +- Different technology from the rest of Mural's frontend (React), creating a maintenance burden +- Team is experienced with React, not with HTMX/Alpine patterns +- The home page's `ConnectionCard` grid, copy-to-clipboard, and Cursor deeplink behavior would need to be reimplemented + +**Verdict:** Rejected. The Mural DS is React-based, and reimplementing its components in another paradigm creates more work than the Node.js build step saves. 
+ +#### Alternative C: Jinja2 + Tailwind (Server-Side Rendering) + +Use Python's Jinja2 template engine with Tailwind CSS (compiled at build time) for server-rendered pages. No client-side framework. + +**Pros:** +- No Node.js runtime needed at dev time (Tailwind CLI is a standalone binary) +- Server-side rendering means no SPA routing concerns +- Each page is a simple template + +**Cons:** +- Same Mural DS problem as Alternative B — React components are unavailable +- No client-side routing, so every navigation is a full page load +- The Session Activation page's multi-step flow (email input → realm check → OAuth redirect) would need to be implemented as a multi-page form or with inline JavaScript +- Tailwind alone does not provide the Mural design tokens — `@muraldevkit/ds-foundation` CSS custom properties would still need to be imported somehow + +**Verdict:** Rejected for the same reason as Alternative B. The Mural DS dependency on React makes a non-React approach impractical without significant rework. + +#### Comparison Matrix + +| Criterion | Same-Origin SPA (recommended) | Separate SPA Host | HTMX/Alpine | Jinja2 + Tailwind | +|-----------|------------------------------|-------------------|-------------|-------------------| +| Mural DS support | Full | Full | None | None | +| Build complexity | Medium (Node + Python) | Medium (separate deploys) | Low (Python only) | Low (Python only) | +| Dev experience | Good (Vite HMR) | Good (Vite HMR) | Different paradigm | Different paradigm | +| Deployment | Single container | Two targets | Single container | Single container | +| CORS concerns | None | Required | None | None | +| Team familiarity | High (React) | High (React) | Low | Low | + +--- + +## 3. 
Recommended Architecture + +### End-State Directory Layout + +``` +banksy/ +├── src/banksy/ # Python FastMCP server +│ ├── __init__.py +│ ├── server.py # FastMCP app + SPA mount +│ ├── auth.py # RemoteAuthProvider / JWTVerifier +│ ├── mural_session.py # Session Activation routes +│ ├── mural_tokens.py # Mural token storage/refresh +│ ├── spa.py # SpaStaticFiles class +│ └── tools/ # MCP tools +├── ui/ # React SPA (standalone Node.js project) +│ ├── package.json +│ ├── package-lock.json +│ ├── .npmrc # Private registry config +│ ├── vite.config.ts +│ ├── tailwind.config.mjs +│ ├── postcss.config.mjs +│ ├── tsconfig.json +│ ├── index.html +│ ├── public/ +│ │ └── favicon.svg +│ └── src/ +│ ├── main.tsx # Single entry point with React Router +│ ├── index.css # Tailwind + Mural DS foundation import +│ └── components/ +│ ├── home-page.tsx +│ ├── claude-landing.tsx +│ ├── session-activation.tsx +│ ├── completion-page.tsx +│ ├── error-page.tsx +│ └── shared/ # (copied from current codebase) +├── pyproject.toml +├── Dockerfile +├── Makefile +└── docs/ +``` + +### Server Integration Code + +The core integration is minimal — roughly 15 lines added to `server.py`: + +```python +# src/banksy/server.py +from fastmcp import FastMCP +from banksy.spa import SpaStaticFiles, SPA_DIR + +mcp = FastMCP("banksy") + +# ... tool registrations, custom_route() endpoints ... + +# Build the ASGI app +app = mcp.http_app(transport="streamable-http") + +# Mount SPA as catch-all (lowest precedence) +if SPA_DIR.exists(): + app.mount("/", SpaStaticFiles(directory=str(SPA_DIR), html=True), name="spa") +``` + +The `if SPA_DIR.exists()` guard allows the server to start without the SPA during early migration PRs or when running MCP-only tests. The SPA is optional — the MCP protocol and API endpoints work without it. 
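The `SpaStaticFiles` class imported above (from `banksy/spa.py`, per the directory layout) is the subclass of Starlette's `StaticFiles` that, as the risk assessment later notes, overrides the internal `lookup_path()` method: real build assets are served as-is, and any path that does not match a file falls back to `index.html` so React Router can handle client-side routes like `/auth/connect`. The decision rule can be sketched without Starlette. A minimal stdlib-only illustration follows; the `resolve_spa_path` name is hypothetical, not from the codebase:

```python
from pathlib import Path


def resolve_spa_path(spa_dir: Path, url_path: str) -> Path:
    """Map a request path to a file under the SPA build dir, falling back to index.html.

    Mirrors the intent of the lookup_path() override: real assets
    (e.g. /assets/app.js) resolve to themselves; client-routed paths
    (e.g. /auth/connect) and traversal attempts resolve to index.html.
    """
    root = spa_dir.resolve()
    candidate = (root / url_path.lstrip("/")).resolve()
    # Path-traversal guard: the resolved candidate must stay inside the SPA root.
    if candidate != root and root not in candidate.parents:
        return root / "index.html"
    if candidate.is_file():
        return candidate
    return root / "index.html"
```

In the real subclass the same rule lives inside `lookup_path()`: when the stat result for the requested path is missing, look up `index.html` instead and let the SPA's router interpret the URL.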
+
+### SPA Entry Point
+
+The unified `main.tsx` combines the current home page and auth SPA into a single React Router application:
+
+```tsx
+// ui/src/main.tsx
+import { createBrowserRouter, RouterProvider } from 'react-router-dom';
+import ReactDOM from 'react-dom/client';
+import { HomePage } from './components/home-page';
+import { ClaudeLanding } from './components/claude-landing';
+import { SessionActivation } from './components/session-activation';
+import { CompletionPage } from './components/completion-page';
+import { ErrorPage } from './components/error-page';
+import './index.css';
+
+const router = createBrowserRouter([
+  {
+    path: '/',
+    element: <HomePage />,
+  },
+  {
+    path: '/auth/connect',
+    element: <SessionActivation />,
+  },
+  {
+    path: '/auth/complete',
+    element: <CompletionPage />,
+  },
+  {
+    path: '/auth/error',
+    element: <ErrorPage />,
+  },
+]);
+
+ReactDOM.createRoot(document.getElementById('root')!).render(
+  <RouterProvider router={router} />
+);
+```
+
+The `HomePage` component internally checks `window.location.search` for `?client=claude-ai` and renders `ClaudeLanding` conditionally — preserving the current behavior without needing a separate route.
+
+### Build Pipeline
+
+```
+           ┌─────────────┐
+           │ ui/ source  │
+           └──────┬──────┘
+                  │
+        npm ci && npm run build
+                  │
+           ┌──────▼──────┐
+           │ ui/dist/    │
+           │ index.html  │
+           │ assets/     │
+           └──────┬──────┘
+                  │
+       ┌──────────┴──────────┐
+       │                     │
+┌──────▼─────────┐  ┌────────▼─────────┐
+│ Development    │  │ Production       │
+│ Vite dev server│  │ SpaStaticFiles   │
+│ proxies API to │  │ serves from      │
+│ FastMCP :8000  │  │ /app/ui/dist     │
+└────────────────┘  └──────────────────┘
+```
+
+### Dev Workflow
+
+```bash
+# Terminal 1: Start FastMCP server
+cd banksy
+uvicorn banksy.server:app --port 8000 --reload
+
+# Terminal 2: Start Vite dev server
+cd banksy/ui
+npm run dev
+
+# Open http://localhost:3002 in browser
+# SPA pages served by Vite with HMR
+# API requests proxied to FastMCP on port 8000
+```
+
+---
+
+## 4. 
Impact on Migration Plan + +### Changes from Prior Research + +The auth migration research (`docs/fastmcp-auth-migration-research.md`) recommended eliminating the SPA entirely, replacing all browser-facing pages with inline `HTMLResponse` strings. This research supersedes that recommendation. The revised approach preserves the React SPA and adds a build pipeline to serve it from the FastMCP server. + +Specific impacts on the migration PR plan: + +| Migration PR | Prior Plan | Revised Plan | +|-------------|------------|--------------| +| **PR5 (Auth — IDE to Banksy)** | "SPA callback pages → Eliminated" | SPA callback pages are still eliminated (OAuth callbacks are server-side). But the SPA itself survives for the home page, Session Activation, and completion/error pages. | +| **PR6 (Auth — Banksy to Mural)** | "SPA-based Session Activation UI → Server-side Starlette routes" | Session Activation UI remains a SPA page, modified to receive the activation code via URL parameter instead of generating it from the browser. The server-side callback handler (`/auth/mural-oauth/callback`) redirects to the SPA completion page. | +| **New: SPA PR** | Not planned | A new PR is needed to set up the `ui/` directory, migrate surviving components, configure Vite, and add the `SpaStaticFiles` mount. This can be done early (PR2 or PR3) since it has no dependency on auth implementation. | + +### Migration Sequence + +The SPA work can be done independently of the auth migration, which is a significant advantage: + +1. **PR-SPA (can be early, e.g., PR2):** Create `ui/` directory, copy surviving components from the TS codebase, configure Vite with Tailwind and Mural DS dependencies, add `SpaStaticFiles` to the Python server. The home page and static content pages work immediately. +2. **PR5 (Auth):** Add `RemoteAuthProvider` / `JWTVerifier`. Auth endpoints appear in the route list before the SPA mount — no SPA changes needed. +3. 
**PR6 (Session Activation):** Add `custom_route()` endpoints for `/auth/mural-link/code`, `/auth/mural-link/claim`, and `/auth/mural-oauth/callback`. Modify the `SessionActivation` SPA component to accept the activation code via URL parameter. Add the `CompletionPage` component. + +### Component Migration Checklist + +| Component | Source | Destination | Changes Needed | +|-----------|--------|-------------|----------------| +| `HomePage` | `ui/src/components/home-page.tsx` | `ui/src/components/home-page.tsx` | None (copy as-is) | +| `ClaudeLanding` | `ui/src/components/claude-landing.tsx` | `ui/src/components/claude-landing.tsx` | None (copy as-is) | +| `MuralLoginPage` | `ui/src/components/sso-proxy-connect.tsx` | `ui/src/components/session-activation.tsx` | Remove `credentials: 'include'` from fetch. Accept activation code via URL param instead of calling `/auth/mural-link/code` from the browser. | +| Shared components | `ui/src/components/shared/*` | `ui/src/components/shared/*` | None (copy as-is) | +| `CompletionPage` | New | `ui/src/components/completion-page.tsx` | New component. Reads `?status=success\|error` from URL. | +| `ErrorPage` | Inline HTML in provider.ts | `ui/src/components/error-page.tsx` | New component. Reads `?message=...` from URL. | +| `SignInPage` | `ui/src/components/sign-in.tsx` | Eliminated | Better Auth sign-in is replaced by IdP-handled login. | +| `GoogleCallback` | `ui/src/components/sso-proxy-google-callback.tsx` | Eliminated | SSO proxy Google callback is replaced by server-side handling. | +| `MuralCallbackPage` | `ui/src/components/sso-proxy-connect-callback.tsx` | Eliminated | Client-side callback handling is replaced by server-side redirect to `/auth/complete`. | +| `MuralOAuthCallbackPage` | `ui/src/components/oauth-callback.tsx` | Eliminated | Mural OAuth callback is handled server-side. | + +--- + +## 5. 
Risk Assessment + +### Low Risks + +**`SpaStaticFiles` is unofficial.** The subclass pattern is not part of Starlette's public API — it overrides `lookup_path()`, which is an internal method. However, this pattern has been stable since Starlette 0.12 (2019), is documented in blog posts and Stack Overflow answers, and is used in production by many projects. Starlette's maintainers are aware of it and have declined to add native SPA support, effectively endorsing the subclass approach. The risk of a breaking change is low, and if it happens, the fix would be trivial (updating the method signature). + +**Dev mode requires two processes.** Developers must run both the FastMCP server and the Vite dev server. This is standard for any React + Python project (e.g., Django + React, FastAPI + React) and is well-understood. A `Makefile` target or `docker-compose` setup can simplify startup to a single command. + +### Medium Risks + +**Build pipeline complexity.** Banksy becomes a hybrid Node.js + Python project. CI/CD must install both runtimes, run both build steps, and manage both dependency caches. Docker multi-stage builds handle this cleanly, but developers need both Node.js and Python installed locally. This is a modest increase in setup complexity. + +**Mural DS version compatibility.** When the SPA is decoupled from the TS monorepo, Mural DS package versions are managed independently. If `@muraldevkit/ui-toolkit` releases a breaking change, the banksy SPA must be updated separately. This is the standard npm dependency management workflow and is not unique to this architecture — but it is a new operational concern that did not exist when the SPA lived inside the monorepo. + +**SPA build output not present during early migration.** Until the SPA PR lands, the Python server will not have `ui/dist/`. The `if SPA_DIR.exists()` guard ensures the server starts without it, but developers should be aware that the home page will return a 404 until the SPA is built. 
During early PRs, the MCP protocol and API endpoints are accessible without the SPA. + +### Mitigations + +- Pin Mural DS package versions in `ui/package.json` with exact versions (not ranges) to avoid unexpected breakage. +- Add a CI step that builds the SPA and verifies the output directory exists before building the Docker image. +- Document the two-process dev workflow in the README with copy-pasteable commands. +- Consider a `Procfile` or `docker-compose.dev.yml` that starts both processes together for development convenience. diff --git a/fastmcp-migration/banksy-research/fastmcp-starlette-routing-research.md b/fastmcp-migration/banksy-research/fastmcp-starlette-routing-research.md new file mode 100644 index 0000000..83396dd --- /dev/null +++ b/fastmcp-migration/banksy-research/fastmcp-starlette-routing-research.md @@ -0,0 +1,475 @@ +# Research: Starlette App vs. custom_route() for Non-MCP HTTP Endpoints + +## 1. Executive Summary + +FastMCP's `custom_route()` is a transparent pass-through to Starlette routing — it creates a standard `starlette.routing.Route` with zero wrapping, no middleware injection, and no DI. The handler signature is identical to a raw Starlette route handler. Because FastMCP's `BearerAuthBackend` is registered as app-wide middleware that **permits** unauthenticated requests (setting `scope["user"] = UnauthenticatedUser()` rather than returning 401), both `custom_route()` handlers and raw Starlette routes share the exact same auth behavior. Given that banksy has only five non-MCP HTTP routes (health, three auth endpoints, SPA catch-all), all simple JSON request/response with no route grouping or per-route middleware needs, `custom_route()` is the right layer. It keeps route definitions co-located with the MCP server, matches FastMCP documentation and community patterns, and introduces no abstraction cost. 
The SPA mount uses `app.mount()` on the Starlette app returned by `http_app()`, which is the only operation that requires touching the app object directly. The prior research's "graduate to FastAPI" framing is eliminated — Starlette (already underneath FastMCP) is the graduation path if complexity ever grows. + +## 2. How `custom_route()` Works Internally + +### Source: `fastmcp/server/mixins/transport.py` + +```python +def custom_route( + self: FastMCP, + path: str, + methods: list[str], + name: str | None = None, + include_in_schema: bool = True, +) -> Callable[ + [Callable[[Request], Awaitable[Response]]], + Callable[[Request], Awaitable[Response]], +]: + def decorator( + fn: Callable[[Request], Awaitable[Response]], + ) -> Callable[[Request], Awaitable[Response]]: + self._additional_http_routes.append( + Route( + path, + endpoint=fn, + methods=methods, + name=name, + include_in_schema=include_in_schema, + ) + ) + return fn + + return decorator +``` + +Key observations: + +- **Storage**: Routes accumulate in `self._additional_http_routes`, a plain list initialized as `[]` in `FastMCP.__init__`. +- **No wrapping**: The handler `fn` is passed directly as `endpoint=fn` to `starlette.routing.Route`. No middleware, error handling, or context injection is applied. +- **Handler signature**: `Callable[[Request], Awaitable[Response]]` — identical to what `Route(endpoint=...)` expects. A `custom_route()` handler and a raw Starlette handler are interchangeable. +- **Timing**: Routes are stored at decoration time but only composed into the Starlette app when `http_app()` is called. + +### Route retrieval: `_get_additional_http_routes()` + +```python +def _get_additional_http_routes(self: FastMCP) -> list[BaseRoute]: + return list(self._additional_http_routes) +``` + +Returns only the current server's routes. Custom routes on mounted child servers are **not included** (see section 4.4). + +## 3. 
Auth Middleware Behavior + +### 3.1 The Two-Layer Auth Architecture + +FastMCP's auth protection operates at two distinct layers. Understanding this distinction is critical for banksy's auth routes. + +**Layer 1 — App-wide `AuthenticationMiddleware`:** + +Registered via `auth.get_middleware()` in `fastmcp/server/auth/auth.py`: + +```python +def get_middleware(self) -> list: + return [ + Middleware( + AuthenticationMiddleware, + backend=BearerAuthBackend(self), + ), + Middleware(AuthContextMiddleware), + ] +``` + +This middleware runs on **every** HTTP request to the Starlette app — MCP transport routes, `custom_route()` handlers, raw Starlette routes added via `app.add_route()`, and SPA mount routes. + +**Layer 2 — Per-route `RequireAuthMiddleware`:** + +Applied only to MCP transport routes in `create_streamable_http_app()`: + +```python +server_routes.append( + Route( + streamable_http_path, + endpoint=RequireAuthMiddleware( + streamable_http_app, + auth.required_scopes, + resource_metadata_url, + ), + methods=http_methods, + ) +) +``` + +Custom routes are added as plain routes without this wrapper: + +```python +server_routes.extend(server._get_additional_http_routes()) +``` + +### 3.2 What Happens When No Token Is Present + +**`BearerAuthBackend.authenticate()`** (from `mcp/server/auth/middleware/bearer_auth.py`): + +```python +async def authenticate(self, conn: HTTPConnection): + auth_header = next( + (conn.headers.get(key) for key in conn.headers if key.lower() == "authorization"), + None, + ) + if not auth_header or not auth_header.lower().startswith("bearer "): + return None # Does NOT raise, does NOT return 401 + + token = auth_header[7:] + auth_info = await self.token_verifier.verify_token(token) + + if not auth_info: + return None # Invalid/expired token: still None + + if auth_info.expires_at and auth_info.expires_at < int(time.time()): + return None + + return AuthCredentials(auth_info.scopes), AuthenticatedUser(auth_info) +``` + +When no token is 
present, `authenticate()` returns `None`. + +**Starlette's `AuthenticationMiddleware`** handles the `None` return: + +```python +if auth_result is None: + auth_result = AuthCredentials(), UnauthenticatedUser() +scope["auth"], scope["user"] = auth_result +await self.app(scope, receive, send) # Request continues +``` + +The request proceeds with `scope["user"] = UnauthenticatedUser()`. No 401 is returned. + +**`RequireAuthMiddleware`** is the enforcement layer: + +```python +async def __call__(self, scope, receive, send): + auth_user = scope.get("user") + if not isinstance(auth_user, AuthenticatedUser): + await self._send_auth_error( + send, status_code=401, error="invalid_token", + description="Authentication required" + ) + return + # ... scope check, then forward + await self.app(scope, receive, send) +``` + +This returns 401 when the user is not authenticated — but it wraps **only** the MCP transport route. + +### 3.3 Implications for Banksy's Routes + +| Route | Auth Required? | Behavior | +|---|---|---| +| `GET /health` | No | `request.user` is `UnauthenticatedUser()` — handler proceeds normally | +| `POST /auth/mural-link/code` | Yes | Handler must check `isinstance(request.user, AuthenticatedUser)` | +| `POST /auth/mural-link/claim` | Yes | Handler must check `isinstance(request.user, AuthenticatedUser)` | +| `GET /auth/mural-oauth/callback` | No (server-side) | OAuth callback from Mural; validates via `state` parameter, not Bearer token | +| `GET / (SPA catch-all)` | No | Static file serving, no auth needed | + +The auth guard pattern for protected `custom_route()` handlers: + +```python +from mcp.server.auth.middleware.bearer_auth import AuthenticatedUser + +@mcp.custom_route("/auth/mural-link/code", methods=["POST"]) +async def generate_code(request: Request) -> Response: + if not isinstance(request.user, AuthenticatedUser): + return JSONResponse({"error": "unauthorized"}, status_code=401) + access_token = request.user.token # The verified Bearer token + # 
... handler logic +``` + +This guard is identical whether the route is registered via `custom_route()` or via `app.add_route()` — both go through the same app-wide `AuthenticationMiddleware`. + +### 3.4 Auth Behavior Is Identical Across Approaches + +The middleware stack applies to the Starlette app, not to individual route registration methods: + +``` +HTTP Request + └── RequestContextMiddleware (outermost, always first) + └── AuthenticationMiddleware (BearerAuthBackend) + └── Token present & valid: scope["user"] = AuthenticatedUser(auth_info) + └── Token missing/invalid: scope["user"] = UnauthenticatedUser() + └── AuthContextMiddleware + └── Stores user in context var for MCP SDK + └── Router (first-match-wins) + └── MCP transport route → RequireAuthMiddleware → 401 if unauthenticated + └── custom_route() handlers → no RequireAuthMiddleware + └── app.add_route() handlers → no RequireAuthMiddleware + └── app.mount() SPA → no RequireAuthMiddleware +``` + +A route registered via `custom_route()` and a route registered via `app.add_route()` receive the same `request.user` object with the same authentication state. + +## 4. Side-by-Side Comparison + +### 4.1 Approach A — All `custom_route()` + `app.mount()` for SPA + +```python +# server.py +from fastmcp import FastMCP +from starlette.requests import Request +from starlette.responses import JSONResponse, Response, RedirectResponse +from mcp.server.auth.middleware.bearer_auth import AuthenticatedUser + +mcp = FastMCP("banksy", auth=RemoteAuthProvider(...)) + +# --- Non-MCP HTTP routes --- + +@mcp.custom_route("/health", methods=["GET"]) +async def health(request: Request) -> JSONResponse: + return JSONResponse({"status": "ok"}) + +@mcp.custom_route("/auth/mural-link/code", methods=["POST"]) +async def generate_code(request: Request) -> Response: + if not isinstance(request.user, AuthenticatedUser): + return JSONResponse({"error": "unauthorized"}, status_code=401) + body = await request.json() + # ... 
generate activation code + return JSONResponse({"code": activation_code}) + +@mcp.custom_route("/auth/mural-link/claim", methods=["POST"]) +async def claim_tokens(request: Request) -> Response: + if not isinstance(request.user, AuthenticatedUser): + return JSONResponse({"error": "unauthorized"}, status_code=401) + body = await request.json() + # ... claim Mural tokens with activation code + return JSONResponse({"status": "claimed"}) + +@mcp.custom_route("/auth/mural-oauth/callback", methods=["GET"]) +async def oauth_callback(request: Request) -> Response: + code = request.query_params.get("code") + state = request.query_params.get("state") + # ... exchange code for Mural tokens, store them + return RedirectResponse("/auth/complete") + +# --- App assembly --- + +app = mcp.http_app(transport="streamable-http") +app.mount("/", SpaStaticFiles(directory=str(SPA_DIR), html=True), name="spa") +``` + +### 4.2 Approach B — All Raw Starlette + +```python +# server.py +mcp = FastMCP("banksy", auth=RemoteAuthProvider(...)) +app = mcp.http_app(transport="streamable-http") + +async def health(request: Request) -> JSONResponse: + return JSONResponse({"status": "ok"}) + +async def generate_code(request: Request) -> Response: + if not isinstance(request.user, AuthenticatedUser): + return JSONResponse({"error": "unauthorized"}, status_code=401) + body = await request.json() + return JSONResponse({"code": activation_code}) + +async def claim_tokens(request: Request) -> Response: + if not isinstance(request.user, AuthenticatedUser): + return JSONResponse({"error": "unauthorized"}, status_code=401) + body = await request.json() + return JSONResponse({"status": "claimed"}) + +async def oauth_callback(request: Request) -> Response: + code = request.query_params.get("code") + state = request.query_params.get("state") + return RedirectResponse("/auth/complete") + +app.add_route("/health", health, methods=["GET"]) +app.add_route("/auth/mural-link/code", generate_code, methods=["POST"]) 
+app.add_route("/auth/mural-link/claim", claim_tokens, methods=["POST"]) +app.add_route("/auth/mural-oauth/callback", oauth_callback, methods=["GET"]) +app.mount("/", SpaStaticFiles(directory=str(SPA_DIR), html=True), name="spa") +``` + +### 4.3 Approach C — Mixed + +```python +# server.py +mcp = FastMCP("banksy", auth=RemoteAuthProvider(...)) + +# Auth routes on mcp instance +@mcp.custom_route("/auth/mural-link/code", methods=["POST"]) +async def generate_code(request: Request) -> Response: + if not isinstance(request.user, AuthenticatedUser): + return JSONResponse({"error": "unauthorized"}, status_code=401) + # ... + +@mcp.custom_route("/auth/mural-link/claim", methods=["POST"]) +async def claim_tokens(request: Request) -> Response: + if not isinstance(request.user, AuthenticatedUser): + return JSONResponse({"error": "unauthorized"}, status_code=401) + # ... + +@mcp.custom_route("/auth/mural-oauth/callback", methods=["GET"]) +async def oauth_callback(request: Request) -> Response: + # ... 
+ +# Non-auth routes on raw Starlette +app = mcp.http_app(transport="streamable-http") + +async def health(request: Request) -> JSONResponse: + return JSONResponse({"status": "ok"}) + +app.add_route("/health", health, methods=["GET"]) +app.mount("/", SpaStaticFiles(directory=str(SPA_DIR), html=True), name="spa") +``` + +### 4.4 Comparison Matrix + +| Criterion | A (custom_route) | B (raw Starlette) | C (mixed) | +|---|---|---|---| +| Auth behavior | Identical | Identical | Identical | +| Handler signature | Identical | Identical | Identical | +| Code clarity | Routes co-located with MCP server definition | Routes split between `mcp` and `app` objects | Two patterns for functionally identical behavior | +| Testability | `httpx.AsyncClient` + `ASGITransport` | Same | Same | +| Future flexibility | Can drop to raw Starlette anytime | Already there | Already mixed | +| Community alignment | Matches FastMCP docs and canvas-mcp | No real-world examples found | No real-world examples found | +| Route precedence | Before SPA mount (guaranteed by `http_app()` composition) | After custom_routes in route list, still before `Mount` | Mixed ordering across two registration points | +| Mounted child routes | N/A — banksy's HTTP routes are on the root server | Same | Same | + +### 4.5 Testability + +Both approaches test identically with `httpx.AsyncClient`: + +```python +import httpx + +async def test_health(): + async with httpx.AsyncClient( + transport=httpx.ASGITransport(app=app), + base_url="http://testserver", + ) as client: + resp = await client.get("/health") + assert resp.status_code == 200 + assert resp.json() == {"status": "ok"} + +async def test_auth_route_requires_token(): + async with httpx.AsyncClient( + transport=httpx.ASGITransport(app=app), + base_url="http://testserver", + ) as client: + # Without token + resp = await client.post("/auth/mural-link/code", json={"mural_id": "123"}) + assert resp.status_code == 401 + + # With valid token + resp = await client.post( 
+ "/auth/mural-link/code", + json={"mural_id": "123"}, + headers={"Authorization": "Bearer valid-test-token"}, + ) + assert resp.status_code == 200 +``` + +No difference in fixture setup, test client configuration, or assertion patterns between the three approaches. FastMCP's `Client` class covers MCP protocol testing only; HTTP routes always use `httpx.AsyncClient`. + +## 5. Recommendation + +**Approach A: All routes via `custom_route()`, SPA via `app.mount()`.** + +Rationale: + +1. **No abstraction cost.** `custom_route()` is a one-line pass-through to `starlette.routing.Route`. Using it costs nothing and gains code co-location — all route definitions live on the `mcp` instance alongside tools, resources, and prompts. + +2. **Auth works automatically.** The app-wide `AuthenticationMiddleware` populates `request.user` for all routes. Protected handlers add a one-liner `isinstance` guard. Unauthenticated handlers (health, SPA) work without any special configuration. + +3. **Community alignment.** FastMCP documentation recommends `custom_route()` for non-MCP endpoints. The canvas-mcp prototype uses it for `/health`. No real-world examples of raw Starlette routing instead of `custom_route()` were found. + +4. **No premature complexity.** Approach B splits route definitions across two objects (`mcp` and `app`) for zero functional benefit. Approach C introduces two patterns for the same thing, increasing cognitive load. + +5. **Escape hatch preserved.** If banksy ever needs Starlette features that `custom_route()` doesn't expose (route grouping via `Router`, sub-applications via `Mount`, scoped middleware), the `app` object from `http_app()` is always available. This is a direct path to Starlette's full routing capabilities — no additional framework needed. + +### What `custom_route()` Doesn't Give You (and Why It Doesn't Matter) + +| Missing capability | Starlette alternative | Banksy need? 
| +|---|---|---| +| Route grouping / prefixes | `starlette.routing.Router` | No — 4 routes, flat structure is fine | +| Pydantic request validation | Manual `await request.json()` + validate | No — simple JSON bodies | +| OpenAPI schema generation | Not needed for internal endpoints | No | +| Per-route middleware | Wrap handler or use `RequireAuthMiddleware` pattern | No — auth guard is a one-liner | +| Dependency injection | Module-level singletons (db pool, config) | No — banksy uses module-level imports | + +## 6. Impact on Migration Plan + +The current migration plan (`willik-notes/fastmcp-migration/execution-strategy-research/00-migration-execution-strategy.plan.md`) already recommends `custom_route()` for banksy's HTTP routes. This research validates that choice and identifies two corrections: + +### 6.1 Remove the "Graduate to FastAPI" Framing + +The plan states: + +> If auth routes grow beyond 5 endpoints or need shared DI, route grouping, or per-route middleware, graduate to a **FastAPI outer app** with `APIRouter` and `Depends()`, mounting `mcp.http_app()` at `/mcp`. + +This framing is wrong. FastAPI is built on Starlette, and FastMCP already provides a Starlette app. Adding FastAPI means adding a framework on top of the framework that's already there. The correct graduation path is: + +1. **Start**: `custom_route()` for all HTTP endpoints (current plan, validated) +2. **If route grouping needed**: Use `starlette.routing.Router` with prefix, pass as `routes` parameter to `http_app()` or compose with the app directly +3. **If per-route middleware needed**: Use Starlette's `Route` with middleware wrapper (same pattern as `RequireAuthMiddleware`) +4. **If sub-applications needed**: Use `starlette.routing.Mount` with a sub-app +5. **If Pydantic request validation + OpenAPI generation needed for HTTP endpoints**: Only then consider FastAPI + +Banksy's current and foreseeable needs (health, 3 auth endpoints, SPA) fall squarely in step 1. 
Steps 2-4 are available without adding any dependency. Step 5 is unlikely to be needed.
+
+### 6.2 Document the Auth Guard Pattern Explicitly
+
+The plan says:
+
+> Session Activation routes use `custom_route()` with auth validation via `request.scope["user"]`.
+
+This is correct but incomplete. The plan should clarify that `custom_route()` handlers are **not** wrapped with `RequireAuthMiddleware` — the handler must explicitly check for an authenticated user. The recommended pattern:
+
+```python
+from mcp.server.auth.middleware.bearer_auth import AuthenticatedUser
+
+def require_auth(request: Request) -> AuthenticatedUser:
+    """Extract authenticated user or raise."""
+    if not isinstance(request.user, AuthenticatedUser):
+        raise HTTPException(status_code=401, detail="Authentication required")
+    return request.user
+
+@mcp.custom_route("/auth/mural-link/code", methods=["POST"])
+async def generate_code(request: Request) -> Response:
+    user = require_auth(request)
+    # ... use user.token for the verified Bearer token
+```
+
+This `require_auth` helper should be defined in the auth module and reused across all protected `custom_route()` handlers. Note: unlike FastAPI, which renders `HTTPException` as a JSON body automatically, Starlette's default exception handler renders it as plain text; to return a JSON error body, the helper should return a `JSONResponse` with 401 instead:
+
+```python
+def get_authenticated_user(request: Request) -> AuthenticatedUser | None:
+    """Return the authenticated user, or None if the request is unauthenticated."""
+    if isinstance(request.user, AuthenticatedUser):
+        return request.user
+    return None
+
+@mcp.custom_route("/auth/mural-link/code", methods=["POST"])
+async def generate_code(request: Request) -> Response:
+    user = get_authenticated_user(request)
+    if user is None:
+        return JSONResponse({"error": "unauthorized"}, status_code=401)
+    # ... 
use user.token +``` + +### 6.3 No Other Changes Needed + +The rest of the migration plan's HTTP route guidance is accurate: + +- Route precedence with the SPA mount is correct (`custom_route()` routes are composed before `Mount`) +- The SPA `app.mount("/", SpaStaticFiles(...))` pattern is correct +- The two middleware layers (MCP protocol vs. HTTP/ASGI) distinction is correct +- The mounted child server caveat (custom routes don't propagate) is correct + +## Appendix: Source References + +| Component | Location | Key behavior | +|---|---|---| +| `custom_route()` | `fastmcp/server/mixins/transport.py:96-144` | Stores `Route` in `_additional_http_routes`, no wrapping | +| `_get_additional_http_routes()` | `fastmcp/server/mixins/transport.py:146-156` | Returns only current server's routes | +| `http_app()` | `fastmcp/server/mixins/transport.py:279-338` | Delegates to `create_streamable_http_app()` | +| `create_streamable_http_app()` | `fastmcp/server/http.py:265-382` | Composes routes: auth → MCP → optional → custom | +| `create_base_app()` | `fastmcp/server/http.py:109-135` | Adds `RequestContextMiddleware` outermost | +| `auth.get_middleware()` | `fastmcp/server/auth/auth.py:318-331` | Returns `AuthenticationMiddleware` + `AuthContextMiddleware` | +| `BearerAuthBackend.authenticate()` | `mcp/server/auth/middleware/bearer_auth.py` | Returns `None` when no token (never raises) | +| `RequireAuthMiddleware` | `mcp/server/auth/middleware/bearer_auth.py` | Returns 401 if user not `AuthenticatedUser`; wraps MCP route only | +| `AuthenticationMiddleware` | `starlette/middleware/authentication.py` | Sets `UnauthenticatedUser()` when backend returns `None` | +| canvas-mcp `/health` | `canvas-mcp/main.py:71-75` | Uses `@mcp.custom_route("/health", methods=["GET"])` | diff --git a/fastmcp-migration/banksy-research/monorepo-layout-agent-harness-research.md b/fastmcp-migration/banksy-research/monorepo-layout-agent-harness-research.md new file mode 100644 index 0000000..aaa4eab 
--- /dev/null +++ b/fastmcp-migration/banksy-research/monorepo-layout-agent-harness-research.md @@ -0,0 +1,1118 @@ +# Monorepo Layout for MCP Servers and Agent Harness + +## 1. Executive Summary + +- **Use uv workspaces from Phase 1** with three workspace members: `server` (FastMCP MCP server), `harness` (MCP client / agent orchestration), and `shared` (Pydantic models, database access, auth utilities). This avoids a mid-stream restructuring that the "start single, graduate later" approach would require. +- **Absorb canvas-mcp tools into the server package** under `domains/canvas/`, not as a separate workspace member. The canvas-mcp assessment already recommends copying tools directly; the domain concept within the server handles organizational separation. +- **The agent harness must be a separate workspace member** because it depends on LLM provider SDKs (`openai`, `anthropic`) that the MCP server does not need, and it may deploy independently. Colocating it as a module within the server package would force LLM dependencies into the server's install. +- **Python workspace members live under `pypackages/`** to coexist cleanly with the existing `packages/` directory (TypeScript npm workspaces) during the transition period. After TS cleanup, `pypackages/` can optionally be renamed to `packages/`. +- **The `domains/` concept remains valid inside the server package.** Package boundaries (workspace members) handle dependency isolation and deployment; domains handle tool organization within the server. + +--- + +## 2. uv Workspaces Deep Dive + +### 2.1 Core Mechanics + +A uv workspace is a collection of Python packages managed together in a single repository. It is declared via `[tool.uv.workspace]` in a root `pyproject.toml`: + +```toml +[tool.uv.workspace] +members = ["pypackages/*"] +``` + +Every directory matched by the `members` glob must contain its own `pyproject.toml`. 
The root `pyproject.toml` is also a workspace member (it can be an application or just a virtual root). + +### 2.2 Cross-Member Dependencies + +Workspace members declare dependencies on each other using `[tool.uv.sources]` with `workspace = true`: + +```toml +# pypackages/server/pyproject.toml +[project] +name = "banksy-server" +dependencies = ["banksy-shared"] + +[tool.uv.sources] +banksy-shared = { workspace = true } +``` + +This tells uv to resolve `banksy-shared` from the workspace rather than PyPI. Workspace member dependencies are installed as editable by default, meaning changes to `banksy-shared` source code are immediately visible to `banksy-server` without reinstallation. + +Sources defined in the workspace root `pyproject.toml` propagate to all members unless a member overrides the same source in its own `[tool.uv.sources]`. + +### 2.3 Single Lockfile + +The workspace produces a single `uv.lock` at the repository root. This lockfile resolves dependencies for all members together, guaranteeing version consistency across the entire workspace. If `banksy-server` and `banksy-harness` both depend on `httpx`, they will always get the same version. + +`uv lock` operates on the entire workspace at once. There is no per-member locking. + +### 2.4 Targeting Specific Members + +Commands can target a specific member from any directory in the workspace: + +```bash +# Run server tests +uv run --package banksy-server pytest + +# Sync only harness dependencies +uv sync --package banksy-harness + +# Run a script defined in server +uv run --package banksy-server banksy-server +``` + +Without `--package`, commands operate on the workspace root. + +### 2.5 `requires-python` Constraint + +All workspace members must share a compatible `requires-python`. uv takes the intersection of all members' `requires-python` values. For this project, all members will use `>=3.14`, which is trivially compatible. 
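The intersection rule can be sketched in plain Python. This is a minimal illustrative sketch, assuming every member declares only a `>=major.minor` lower bound (real `requires-python` specifiers may also carry upper bounds or exclusions); the function name and member list are hypothetical:

```python
def workspace_requires_python(lower_bounds: list[tuple[int, int]]) -> tuple[int, int]:
    """Intersect '>=major.minor' bounds: the effective workspace floor
    is the highest lower bound declared by any member."""
    return max(lower_bounds)

# Hypothetical members: two declaring >=3.14, one declaring >=3.13.
members = [(3, 14), (3, 14), (3, 13)]
# The stricter bound wins for the whole workspace, so 3.13 is excluded.
assert workspace_requires_python(members) == (3, 14)
```

In this project all members declare `>=3.14`, so the intersection is simply `>=3.14`.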
+ +If a member needed a different Python version, it would have to be extracted from the workspace and use path dependencies instead. + +### 2.6 Docker Builds + +uv workspaces have first-class Docker support. The key pattern uses intermediate layers to cache dependency installation: + +```dockerfile +FROM python:3.14-slim +COPY --from=ghcr.io/astral-sh/uv:0.10 /uv /uvx /bin/ + +WORKDIR /app + +# Layer 1: Install dependencies only (no workspace member source code). +# --frozen skips lockfile validation (member pyproject.tomls not yet copied). +# --no-install-workspace skips installing workspace members themselves. +RUN --mount=type=cache,target=/root/.cache/uv \ + --mount=type=bind,source=uv.lock,target=uv.lock \ + --mount=type=bind,source=pyproject.toml,target=pyproject.toml \ + uv sync --frozen --no-install-workspace --no-dev + +# Layer 2: Copy all source and install workspace members. +COPY . /app +RUN --mount=type=cache,target=/root/.cache/uv \ + uv sync --locked --no-dev +``` + +For deploying a specific member (e.g., only the server), you can add `--package banksy-server` to both `uv sync` commands to install only that member and its dependencies, excluding unrelated members like `banksy-harness`. + +For multi-image builds (server image vs harness image), each Dockerfile targets a different `--package`. + +### 2.7 CI Patterns + +```yaml +# Per-package test runs (each member's pyproject.toml defines testpaths = ["tests"]) +- name: Test server + run: uv run --package banksy-server pytest + +- name: Test harness + run: uv run --package banksy-harness pytest + +- name: Test shared + run: uv run --package banksy-shared pytest + +# Shared linting (whole workspace) +- name: Lint + run: uv run ruff check . + +- name: Type check + run: uv run pyright +``` + +Each member's `[tool.pytest.ini_options]` sets `testpaths = ["tests"]`, so `uv run --package X pytest` discovers the correct co-located tests automatically. 
For stricter dependency isolation, use `uv run --exact --package X pytest` to ensure only that member's declared dependencies are available. + +`uv sync` and `uv lock --check` validate the lockfile for the entire workspace in one operation. + +### 2.8 Limitations + +1. **No import boundary enforcement.** Python's module system does not prevent workspace member A from importing a transitive dependency installed for member B. Workspace boundaries are a build-time and dependency declaration concept, not a runtime isolation boundary. Linting rules or import restrictions (e.g., `ruff` import conventions) can partially enforce this. +2. **Single `requires-python`.** All members must target compatible Python versions. +3. **Single lockfile resolution.** If two members need incompatible versions of the same transitive dependency, the workspace cannot resolve. In practice this is rare and usually signals a design problem. +4. **All `pyproject.toml` files needed for `--locked`.** The initial Docker layer trick requires `--frozen` (skip validation) because member `pyproject.toml` files aren't copied yet. The second `uv sync` with `--locked` validates correctness after all files are present. + +--- + +## 3. Package Boundary Analysis + +### 3.1 Candidates + +Four potential workspace members were evaluated: + +| Candidate | Role | Key Dependencies | +|-----------|------|-----------------| +| **banksy-server** | FastMCP MCP server serving tools to IDE clients | `fastmcp`, `httpx`, `starlette`, `sqlalchemy`, `pydantic-settings` | +| **banksy-harness** | MCP client, agent orchestration loop | `openai`, `anthropic`, `mcp` (Python SDK client), `pydantic` | +| **banksy-shared** | Shared library consumed by server and harness | `pydantic`, `sqlalchemy`, `httpx` (Mural API client) | +| **canvas-mcp** | Second MCP server (canvas tools) | `fastmcp` | + +### 3.2 Evaluation + +#### banksy-server (separate member: yes) + +This is the primary deployment unit. 
It owns: +- FastMCP server instantiation and configuration +- Tool definitions organized by domain (internal, public, canvas, shared) +- Auth providers (SSO proxy, Mural OAuth) +- HTTP routes (health, session activation, OAuth callback) +- SPA static file serving + +It depends on `banksy-shared` for database models and auth utilities, but has no need for LLM provider SDKs. Keeping it as its own member ensures the server image stays lean. + +#### banksy-harness (separate member: yes) + +The agent harness is an MCP **client** that connects to MCP servers (banksy-server, external servers) and orchestrates agent loops with LLM providers. Its dependency profile diverges significantly from the server: + +- **LLM SDKs**: `openai`, `anthropic`, `azure-identity` (for Azure OpenAI). These are heavy dependencies (the `openai` package alone pulls in `httpx`, `pydantic`, `distro`, `jiter`, `sniffio`). The server does not need any of them. +- **MCP client SDK**: Uses `mcp` (the official Python SDK) or `fastmcp.Client` for connecting to servers. This is a different usage of the MCP protocol than the server side. +- **No FastMCP server-side deps**: The harness does not import `fastmcp`'s server machinery, middleware, or transport layer. + +This dependency divergence is the primary motivation for a separate workspace member. Without it, deploying the MCP server would install OpenAI and Anthropic SDKs unnecessarily. 
+ +#### banksy-shared (separate member: yes) + +Code consumed by both server and harness: +- **Pydantic models**: Shared request/response types, configuration models +- **SQLAlchemy models**: Database table definitions, session factories +- **Auth utilities**: Token management helpers, token verification logic +- **Mural API client**: httpx-based wrapper for calling Mural's REST APIs +- **Observability helpers**: Structured logging, metrics utilities + +Without a shared package, this code would either be duplicated or live in the server (forcing the harness to depend on the server). A shared library avoids both problems. + +The alternative of inlining shared code into each consumer was rejected because: +- Duplication leads to drift +- Harness depending on server couples their deployments and install footprints + +#### canvas-mcp (separate member: no) + +The canvas-mcp alignment assessment concluded that canvas-mcp's tools should be **copied into the server** rather than mounted as a sub-server. The reasons: + +1. canvas-mcp has only two trivial tools (`check_health`, `canvas_haiku`). The overhead of a separate workspace member for two small functions is not justified. +2. `mount()` in FastMCP ignores child auth providers — the parent's auth applies. This means a mounted canvas-mcp sub-server wouldn't have its own auth boundary anyway. +3. Custom routes and middleware from canvas-mcp would need to be re-registered on the parent server regardless. +4. The tool visibility research already designates `domains/canvas/` as the organizational home for canvas-related tools. + +Canvas-mcp tools will be absorbed into `banksy-server` under `domains/canvas/` with `register_canvas_tools(mcp)`. + +### 3.3 Dependency Flow + +``` +banksy-server ──depends-on──> banksy-shared +banksy-harness ──depends-on──> banksy-shared +``` + +Neither `banksy-server` nor `banksy-harness` depends on each other. Both depend only on `banksy-shared`. 
This creates a clean DAG with no cycles and clear isolation. + +### 3.4 Granularity Assessment + +Three members is the right granularity for this project: + +- **Too few (1-2)**: A single package forces LLM SDKs into the server. Two packages (server + harness, no shared) either duplicates code or couples the harness to the server. +- **Too many (4+)**: Splitting `banksy-shared` into sub-packages (e.g., `banksy-db`, `banksy-auth`, `banksy-models`) adds `pyproject.toml` overhead without meaningful isolation benefit — these sub-packages would almost always be installed together. +- **Three is the sweet spot**: Clear dependency divergence between server and harness, with a shared library that is small enough to be a single package but large enough to justify extraction. + +--- + +## 4. Agent Harness Workspace Constraints + +This section establishes the workspace-relevant constraints for the agent harness. The internal architecture of the agent system is out of scope — the question is whether it needs its own workspace member. + +### 4.1 Why a Separate Workspace Member + +**Dependency divergence is the deciding factor.** + +The MCP server's core dependencies are: +``` +fastmcp >= 3.1.0 +httpx +sqlalchemy[asyncio] +pydantic-settings +alembic +``` + +The agent harness's core dependencies are: +``` +openai +anthropic +azure-identity # Azure OpenAI auth +mcp # Official MCP Python SDK (client usage) +``` + +These two dependency sets are almost completely disjoint. The `openai` package alone brings in 6+ transitive dependencies. 
Including it in the server package means: +- Server Docker images grow by ~30-50MB of unnecessary packages +- Potential version conflicts between httpx versions (FastMCP and OpenAI both depend on httpx) +- Security surface area increases for a component that doesn't use the packages + +**Deployment flexibility is the secondary factor.** + +The agent harness may run as: +- A separate process in the same container (different entry point) +- A sidecar container in the same pod +- A completely separate deployment + +All three models are naturally supported if the harness is its own workspace member with its own entry point. If it's a module inside the server package, extracting it later requires restructuring imports and moving files. + +### 4.2 What the Harness Shares with the Server + +Despite being a separate package, the harness and server share code via `banksy-shared`: + +| Shared Code | Used By Server | Used By Harness | +|-------------|---------------|-----------------| +| SQLAlchemy models (token store) | Yes — stores OAuth tokens | Yes — reads token state | +| Pydantic config models | Yes — server settings | Yes — harness settings | +| Mural API client (httpx wrapper) | Yes — `from_openapi()` | Possibly — for context enrichment | +| Auth token utilities | Yes — token verification | Yes — token acquisition | +| Observability helpers | Yes | Yes | + +This shared code lives in `banksy-shared`, not duplicated in either consumer. + +### 4.3 Multiple Agents Over Time + +The prompt notes that multiple agents with different focuses may exist over time. The key design insight is: + +> Multiple agents will likely share the same base dependencies and differ in configuration/prompts, not in code structure. Think of them as instances of a harness, not separate packages. + +This means the `banksy-harness` package is a **single** workspace member that supports multiple agent configurations. 
Different agents are: +- Different prompt/instruction files +- Different configuration objects (which MCP servers to connect to, which LLM model to use) +- Potentially different entry points or CLI subcommands within the same package + +If a future agent has truly different dependencies (e.g., a computer-vision agent needing `torch`), it can be added as a new workspace member at that time. Adding a member to a uv workspace is trivial: create a directory, add `pyproject.toml`, and the workspace glob picks it up. + +--- + +## 5. Layout Options + +### 5.1 Option A: 3-Member Workspace with `pypackages/` (Recommended) + +#### Directory Tree + +``` +banksy/ +├── packages/ # EXISTING TS (untouched during transition) +│ ├── banksy-core/ +│ ├── banksy-mural-api/ +│ └── banksy-public-api/ +├── pypackages/ # Python workspace members +│ ├── server/ +│ │ ├── pyproject.toml +│ │ ├── src/ +│ │ │ └── banksy_server/ +│ │ │ ├── __init__.py +│ │ │ ├── server.py +│ │ │ ├── config.py +│ │ │ ├── mural_api.py +│ │ │ ├── spa.py +│ │ │ ├── auth/ +│ │ │ │ ├── __init__.py +│ │ │ │ ├── providers.py +│ │ │ │ ├── sso_proxy.py +│ │ │ │ ├── mural_oauth.py +│ │ │ │ ├── token_manager.py +│ │ │ │ └── token_verifier.py +│ │ │ ├── domains/ +│ │ │ │ ├── __init__.py +│ │ │ │ ├── internal/ +│ │ │ │ │ ├── __init__.py +│ │ │ │ │ └── tools.py +│ │ │ │ ├── public/ +│ │ │ │ │ ├── __init__.py +│ │ │ │ │ └── tools.py +│ │ │ │ ├── canvas/ +│ │ │ │ │ ├── __init__.py +│ │ │ │ │ └── tools.py +│ │ │ │ └── shared/ +│ │ │ │ ├── __init__.py +│ │ │ │ └── tools.py +│ │ │ ├── routes/ +│ │ │ │ ├── __init__.py +│ │ │ │ ├── health.py +│ │ │ │ ├── session_activation.py +│ │ │ │ └── mural_oauth_callback.py +│ │ │ ├── middleware/ +│ │ │ │ ├── __init__.py +│ │ │ │ ├── logging.py +│ │ │ │ └── metrics.py +│ │ │ └── db/ +│ │ │ ├── __init__.py +│ │ │ ├── engine.py +│ │ │ └── token_store.py +│ │ └── tests/ +│ │ ├── conftest.py +│ │ ├── test_tools/ +│ │ ├── test_auth/ +│ │ └── test_integration/ +│ ├── harness/ +│ │ ├── pyproject.toml 
+│ │ ├── src/ +│ │ │ └── banksy_harness/ +│ │ │ ├── __init__.py +│ │ │ ├── agent.py +│ │ │ ├── config.py +│ │ │ ├── llm/ +│ │ │ │ ├── __init__.py +│ │ │ │ ├── openai.py +│ │ │ │ └── anthropic.py +│ │ │ └── mcp_client/ +│ │ │ ├── __init__.py +│ │ │ └── client.py +│ │ └── tests/ +│ │ ├── conftest.py +│ │ ├── test_agent/ +│ │ └── test_llm/ +│ └── shared/ +│ ├── pyproject.toml +│ ├── src/ +│ │ └── banksy_shared/ +│ │ ├── __init__.py +│ │ ├── models/ +│ │ │ ├── __init__.py +│ │ │ └── tokens.py +│ │ ├── auth/ +│ │ │ ├── __init__.py +│ │ │ └── token_utils.py +│ │ ├── mural_client/ +│ │ │ ├── __init__.py +│ │ │ └── client.py +│ │ └── observability/ +│ │ ├── __init__.py +│ │ └── logging.py +│ └── tests/ +│ ├── conftest.py +│ ├── test_models/ +│ └── test_auth/ +├── conftest.py # Shared test fixtures (root level) +├── ui/ # React SPA (standalone Node.js project) +│ ├── package.json +│ └── src/ +├── migrations/ # Alembic (shared DB schema) +│ ├── alembic.ini +│ ├── env.py +│ └── versions/ +├── pyproject.toml # Workspace root +├── uv.lock # Single lockfile +├── .python-version # 3.14 +├── package.json # EXISTING TS workspace root +├── package-lock.json # EXISTING npm lockfile +└── .github/workflows/ + ├── build.yml # EXISTING TS Docker builds + ├── quality.yml # EXISTING TS lint/test + └── python.yml # NEW Python CI +``` + +#### Workspace Root `pyproject.toml` + +```toml +[project] +name = "banksy-workspace" +version = "0.0.0" +description = "Banksy monorepo workspace root" +requires-python = ">=3.14" + +[tool.uv.workspace] +members = ["pypackages/*"] + +[build-system] +requires = ["hatchling"] +build-backend = "hatchling.build" + +[dependency-groups] +dev = [ + "pyright>=1.1.0", + "pytest>=8.0.0", + "pytest-asyncio>=0.24.0", + "ruff>=0.8.0", + "pre-commit>=4.0.0", +] + +[tool.ruff] +target-version = "py314" +src = ["pypackages/*/src", "pypackages/*/tests"] + +[tool.ruff.lint] +select = ["E", "F", "W", "I", "UP", "B", "SIM", "TCH"] + +[tool.pyright] +pythonVersion = "3.14" 
+typeCheckingMode = "strict" +include = ["pypackages/*/src", "pypackages/*/tests"] + +[tool.pytest.ini_options] +asyncio_mode = "auto" +``` + +#### Server `pyproject.toml` + +```toml +[project] +name = "banksy-server" +version = "0.1.0" +description = "Banksy MCP server" +requires-python = ">=3.14" +dependencies = [ + "banksy-shared", + "fastmcp>=3.1.0", + "httpx>=0.28.0", + "pydantic-settings>=2.0.0", + "sqlalchemy[asyncio]>=2.0.0", + "asyncpg>=0.30.0", + "alembic>=1.14.0", +] + +[tool.uv.sources] +banksy-shared = { workspace = true } + +[project.scripts] +banksy-server = "banksy_server.server:main" + +[build-system] +requires = ["hatchling"] +build-backend = "hatchling.build" + +[tool.pytest.ini_options] +testpaths = ["tests"] +``` + +#### Harness `pyproject.toml` + +```toml +[project] +name = "banksy-harness" +version = "0.1.0" +description = "Banksy agent harness (MCP client)" +requires-python = ">=3.14" +dependencies = [ + "banksy-shared", + "openai>=1.0.0", + "anthropic>=0.40.0", + "azure-identity>=1.19.0", + "mcp>=1.0.0", +] + +[tool.uv.sources] +banksy-shared = { workspace = true } + +[project.scripts] +banksy-harness = "banksy_harness.agent:main" + +[build-system] +requires = ["hatchling"] +build-backend = "hatchling.build" + +[tool.pytest.ini_options] +testpaths = ["tests"] +``` + +#### Shared `pyproject.toml` + +```toml +[project] +name = "banksy-shared" +version = "0.1.0" +description = "Shared models, auth, and utilities for banksy" +requires-python = ">=3.14" +dependencies = [ + "pydantic>=2.0.0", + "sqlalchemy[asyncio]>=2.0.0", + "httpx>=0.28.0", +] + +[build-system] +requires = ["hatchling"] +build-backend = "hatchling.build" + +[tool.pytest.ini_options] +testpaths = ["tests"] +``` + +#### Docker Builds + +**Server Dockerfile** (`Dockerfile.server`): + +```dockerfile +FROM python:3.14-slim AS builder +COPY --from=ghcr.io/astral-sh/uv:0.10 /uv /uvx /bin/ +WORKDIR /app + +# Copy workspace root config and lockfile +COPY pyproject.toml uv.lock ./ + +# 
Copy only the member pyproject.toml files needed for resolution +COPY pypackages/server/pyproject.toml pypackages/server/pyproject.toml +COPY pypackages/shared/pyproject.toml pypackages/shared/pyproject.toml + +# Install dependencies (no source code yet) +RUN --mount=type=cache,target=/root/.cache/uv \ + uv sync --frozen --no-install-workspace --no-dev --package banksy-server + +# Copy source code +COPY pypackages/server/src pypackages/server/src +COPY pypackages/shared/src pypackages/shared/src + +# Install workspace members (non-editable for production) +RUN --mount=type=cache,target=/root/.cache/uv \ + uv sync --locked --no-dev --no-editable --package banksy-server + +# SPA build stage (Node.js) +FROM node:22 AS spa-builder +WORKDIR /app/ui +COPY ui/package.json ui/package-lock.json ./ +RUN npm ci +COPY ui/ . +RUN npm run build + +# Production image +FROM python:3.14-slim +COPY --from=builder /app/.venv /app/.venv +COPY --from=spa-builder /app/ui/dist /app/ui/dist +COPY migrations /app/migrations +ENV PATH="/app/.venv/bin:$PATH" +WORKDIR /app +CMD ["banksy-server"] +``` + +**Harness Dockerfile** (`Dockerfile.harness`): + +```dockerfile +FROM python:3.14-slim AS builder +COPY --from=ghcr.io/astral-sh/uv:0.10 /uv /uvx /bin/ +WORKDIR /app + +COPY pyproject.toml uv.lock ./ +COPY pypackages/harness/pyproject.toml pypackages/harness/pyproject.toml +COPY pypackages/shared/pyproject.toml pypackages/shared/pyproject.toml + +RUN --mount=type=cache,target=/root/.cache/uv \ + uv sync --frozen --no-install-workspace --no-dev --package banksy-harness + +COPY pypackages/harness/src pypackages/harness/src +COPY pypackages/shared/src pypackages/shared/src + +RUN --mount=type=cache,target=/root/.cache/uv \ + uv sync --locked --no-dev --no-editable --package banksy-harness + +FROM python:3.14-slim +COPY --from=builder /app/.venv /app/.venv +ENV PATH="/app/.venv/bin:$PATH" +WORKDIR /app +CMD ["banksy-harness"] +``` + +The harness image does not include FastMCP, the SPA, or Alembic 
migrations. The server image does not include OpenAI or Anthropic SDKs. Both share `banksy-shared`. + +#### CI + +```yaml +# .github/workflows/python.yml +name: Python CI + +on: [push, pull_request] + +jobs: + lint: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + - uses: astral-sh/setup-uv@v4 + - run: uv sync --frozen + - run: uv run ruff check . + - run: uv run ruff format --check . + - run: uv run pyright + + test-server: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + - uses: astral-sh/setup-uv@v4 + - run: uv sync --frozen --package banksy-server + - run: uv run --package banksy-server pytest + + test-harness: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + - uses: astral-sh/setup-uv@v4 + - run: uv sync --frozen --package banksy-harness + - run: uv run --package banksy-harness pytest + + test-shared: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + - uses: astral-sh/setup-uv@v4 + - run: uv sync --frozen --package banksy-shared + - run: uv run --package banksy-shared pytest +``` + +#### Transition Period Coexistence + +- `packages/` holds TypeScript npm workspace members (read-only reference during migration) +- `pypackages/` holds Python uv workspace members (new code) +- `package.json` (npm workspace root) and `pyproject.toml` (uv workspace root) coexist at the repo root +- `.github/workflows/quality.yml` handles TS CI; `.github/workflows/python.yml` handles Python CI +- No naming collision: npm workspaces use `packages/*`, uv workspaces use `pypackages/*` + +--- + +### 5.2 Option B: 2-Member Workspace (Server + Harness, No Shared Package) + +#### Directory Tree + +``` +banksy/ +├── packages/ # EXISTING TS +├── pypackages/ +│ ├── server/ +│ │ ├── pyproject.toml +│ │ └── src/banksy_server/ +│ │ ├── ... # Same as Option A +│ │ ├── shared/ # Shared code lives inside server +│ │ │ ├── models/ +│ │ │ ├── auth/ +│ │ │ ├── mural_client/ +│ │ │ └── observability/ +│ │ └── ... 
+│ └── harness/ +│ ├── pyproject.toml # depends on banksy-server +│ └── src/banksy_harness/ +├── tests/ +├── ui/ +├── migrations/ +├── pyproject.toml +└── uv.lock +``` + +#### Configuration + +The harness would depend on the server package to access shared code: + +```toml +# pypackages/harness/pyproject.toml +[project] +dependencies = [ + "banksy-server", # for shared models, auth, etc. + "openai>=1.0.0", + "anthropic>=0.40.0", +] + +[tool.uv.sources] +banksy-server = { workspace = true } +``` + +#### Tradeoffs + +| Aspect | Assessment | +|--------|-----------| +| Fewer packages | Positive: less `pyproject.toml` overhead | +| Harness depends on server | **Negative**: harness `pip install` pulls FastMCP and all server deps | +| Coupling | **Negative**: harness imports from `banksy_server.shared.*` — tight coupling to server internals | +| Deployment | **Negative**: cannot deploy harness without server package installed | +| Testing | Neutral: still separate test dirs, but harness tests install server deps | + +**Verdict**: The coupling and deployment problems outweigh the reduced overhead. Rejected. + +--- + +### 5.3 Option C: Single Package with Optional Dependency Groups + +#### Directory Tree + +``` +banksy/ +├── packages/ # EXISTING TS +├── src/ +│ └── banksy/ +│ ├── server/ +│ ├── harness/ +│ ├── shared/ +│ └── ... +├── tests/ +├── ui/ +├── migrations/ +├── pyproject.toml # Single pyproject, no workspace +└── uv.lock +``` + +#### Configuration + +```toml +[project] +name = "banksy" +dependencies = [ + "fastmcp>=3.1.0", + "httpx>=0.28.0", + "pydantic-settings>=2.0.0", + "sqlalchemy[asyncio]>=2.0.0", +] + +[project.optional-dependencies] +harness = [ + "openai>=1.0.0", + "anthropic>=0.40.0", + "mcp>=1.0.0", +] +``` + +Install server only: `uv sync`. Install with harness: `uv sync --extra harness`. 
+ +#### Tradeoffs + +| Aspect | Assessment | +|--------|-----------| +| Simplest setup | Positive: one `pyproject.toml`, no workspace config | +| No import isolation | **Negative**: `banksy.harness` can import `banksy.server` internals freely | +| No independent deployment | **Negative**: always one package, one image, one install | +| Optional deps not enforced | **Negative**: code that uses `openai` won't fail at import time if the extra isn't installed — it fails at runtime | +| Transition from migration plan | Positive: matches the original single-`pyproject.toml` plan | + +**Verdict**: Adequate if the harness were a small, tightly-coupled add-on. It isn't — the harness has distinct deployment needs and significant dependency divergence. Rejected. + +--- + +## 6. Domains vs. Packages Resolution + +### 6.1 Two Orthogonal Concepts + +The tool visibility research proposed `src/banksy/domains/` to organize tool groups within the MCP server. uv workspace members organize Python packages for dependency isolation and deployment. These are different axes of organization: + +| Concern | Mechanism | Granularity | +|---------|-----------|-------------| +| Tool organization within the server | `domains/` directories with `register_*_tools(mcp)` | Per-tool-group | +| Dependency isolation between services | uv workspace members | Per-deployable | + +They coexist without conflict. The server workspace member contains `domains/` internally. The workspace boundary wraps the server (and its domains) as a single unit. + +### 6.2 Canvas Domain, Not Canvas Package + +The canvas-mcp assessment recommended absorbing canvas-mcp tools directly into the server rather than mounting a sub-server. 
In the workspace layout, this means:
+
+- canvas-mcp tools live at `pypackages/server/src/banksy_server/domains/canvas/tools.py`
+- They are registered via `register_canvas_tools(mcp)` in the server's entry point
+- They receive the server's auth provider (per FastMCP mount semantics, child auth is ignored anyway)
+- They are tagged with `domain:canvas` per the tag taxonomy from the tool visibility research
+
+Canvas is a domain (an organizational grouping within the server), not a package (a workspace member with its own dependencies).
+
+### 6.3 `from_openapi()` Generated Tools
+
+The `FastMCP.from_openapi()` tools are generated at server startup from OpenAPI specs. They are not hand-written files in a directory. In the domain structure, they fit as:
+
+```python
+# banksy_server/domains/internal/__init__.py
+from fastmcp import FastMCP
+
+from banksy_server.mural_api import create_mural_api_tools
+
+def register_internal_tools(mcp: FastMCP) -> None:
+    # Hand-written composite tools
+    from .tools import my_composite_tool
+    mcp.add_tool(my_composite_tool)
+
+    # Generated tools from the OpenAPI spec, mounted as a sub-server
+    mural_api = create_mural_api_tools()
+    mcp.mount(mural_api, prefix="mural")
+```
+
+The `from_openapi()` call is a runtime operation that produces a FastMCP sub-server. It can be mounted, or its tools can be registered individually. Either way, it's invoked from within a domain's registration function, not represented as a file in the directory tree.
+
+### 6.4 Summary
+
+| Question | Answer |
+|----------|--------|
+| Does `domains/` still make sense inside a workspace member? | Yes. Domains organize tools within the server. |
+| Is canvas a domain or a package? | A domain inside banksy-server. |
+| Where do `from_openapi()` tools go? | Called from domain registration functions; not directory-based. |
+| Do workspace boundaries replace domains? | No. They operate at different levels. |
+
+---
+
+## 7.
Migration Path Recommendation
+
+### 7.1 Recommendation: Start with Workspaces from Phase 1
+
+The migration strategy's escape hatch says: "If a second Python service is needed in the same repo, graduate to uv workspaces." That trigger has arrived. The agent harness is a second Python service with distinct dependencies and potential for independent deployment.
+
+Rather than starting with a single `pyproject.toml` and restructuring later, **start with uv workspaces in Phase 1**.
+
+### 7.2 Cost Comparison
+
+#### Cost of starting with workspaces from Phase 1
+
+- Create 3 `pyproject.toml` files instead of 1 (~30 minutes)
+- Add `[tool.uv.workspace]` to the root `pyproject.toml` (~5 minutes)
+- Use `pypackages/server/src/banksy_server/` instead of `src/banksy/` (different path, same effort)
+- Docker builds use the `--package` flag (trivial addition)
+- CI uses `--package` for targeted test runs (trivial addition)
+
+**Total incremental cost: ~1 hour.**
+
+#### Cost of restructuring later (start single, graduate when needed)
+
+- Move `src/banksy/` to `pypackages/server/src/banksy_server/`
+- Extract shared code into `pypackages/shared/src/banksy_shared/`
+- Update every import statement in both the server and the extracted shared code
+- Restructure tests from `tests/` to `pypackages/server/tests/` and `pypackages/shared/tests/`
+- Update `pyproject.toml` from single project to workspace (a rewrite, not an edit)
+- Update Dockerfile(s) for workspace-aware builds
+- Update CI workflows for per-package targeting
+- Update any developer scripts, pre-commit config, and IDE settings
+- Risk of import breakage during restructuring
+
+**Total cost: 2-4 hours of churn plus regression risk.**
+
+### 7.3 Why Not Wait
+
+The "start single" recommendation from `05-toolchain-project-mgmt-monorepo.md` was based on the assumption that a second Python service was uncertain: "For 1-2 packages, a workspace is extra setup for little gain." That assumption no longer holds.
The agent harness is a confirmed requirement, not a hypothetical. + +Starting with workspaces also has a compounding benefit: every file, test, import, and Docker layer created during Phase 1 through Phase 8 will already be in the right place. The restructuring cost grows linearly with the amount of code written before the switch. + +### 7.4 Phasing Within Workspaces + +Workspaces don't require all members to exist from day one. The recommended approach: + +1. **Phase 1**: Create the workspace structure with `server` and `shared` as members. `harness` can be an empty placeholder or omitted entirely. +2. **Phase 2-8**: Migration proceeds as planned, but code goes into `pypackages/server/` instead of `src/banksy/`. +3. **When agent work begins**: Add `harness/` to `pypackages/` (or flesh out the placeholder). The workspace glob `pypackages/*` picks it up automatically. + +The workspace infrastructure is in place from the start, but the harness member is populated on its own timeline. + +--- + +## 8. Recommended Layout + +**Option A (3-Member Workspace with `pypackages/`)** is the recommended layout. + +### 8.1 Full Configuration + +The complete directory tree and all `pyproject.toml` configurations are provided in Section 5.1. 
Here is a summary of the key structural decisions: + +| Decision | Choice | Rationale | +|----------|--------|-----------| +| Workspace member directory | `pypackages/` | Avoids collision with TS `packages/` during transition | +| Number of members | 3 (server, harness, shared) | Matches dependency divergence exactly | +| Python package names | `banksy_server`, `banksy_harness`, `banksy_shared` | Underscore convention per PEP 8; `banksy-*` for distribution names | +| Test location | Co-located `tests/` inside each workspace member | Ecosystem norm for uv workspaces; tests move with code; per-member pytest config | +| Migrations location | Top-level `migrations/` | Shared DB schema, not member-specific | +| SPA location | Top-level `ui/` | Standalone Node.js project, not a Python package | +| Build backend | hatchling | Matches canvas-mcp alignment decision | +| Docker strategy | Separate Dockerfiles per deployable | Server and harness have different images | + +### 8.2 Import Examples + +```python +# From banksy-server code: +from banksy_shared.models.tokens import TokenRecord +from banksy_shared.auth.token_utils import verify_token +from banksy_server.domains.internal.tools import get_mural_content +from banksy_server.config import ServerSettings + +# From banksy-harness code: +from banksy_shared.models.tokens import TokenRecord +from banksy_shared.mural_client.client import MuralApiClient +from banksy_harness.llm.openai import call_openai +from banksy_harness.config import HarnessSettings +``` + +The harness never imports from `banksy_server`. The server never imports from `banksy_harness`. Both import from `banksy_shared`. + +### 8.3 Adding a New Workspace Member + +When a future service is needed (e.g., a webhook receiver, a data pipeline): + +```bash +cd pypackages +uv init new-service --package +``` + +This creates `pypackages/new-service/pyproject.toml` and `pypackages/new-service/src/new_service/__init__.py`. 
The workspace glob `pypackages/*` picks it up automatically. Add `banksy-shared` as a dependency if shared code is needed.
+
+---
+
+## 9. Impact on Migration Strategy
+
+Adopting this recommendation requires updates to specific sections of `00-migration-execution-strategy.md`.
+
+### 9.1 Repo Layout (Lines 97-196)
+
+**Current**: Shows `src/banksy/` as a single Python package at the repo root.
+
+**Updated**: Replace with the workspace layout from Section 5.1. Key changes:
+- `src/banksy/` becomes `pypackages/server/src/banksy_server/`
+- Add `pypackages/shared/src/banksy_shared/` for extracted shared code
+- Add `pypackages/harness/src/banksy_harness/` (placeholder or populated later)
+- `pyproject.toml` at the root becomes the workspace root instead of a single project
+- `tests/` moves from top-level to co-located inside each workspace member (e.g., `pypackages/server/tests/`)
+
+### 9.2 Escape Hatches (Lines 888-895)
+
+**Current**: Lists "Single pyproject.toml → uv workspaces: If a second Python service is needed in the same repo, graduate to uv workspaces."
+
+**Updated**: Remove this escape hatch. It has been exercised — we are starting with workspaces. Replace with: "3-member workspace → additional members: If a new Python service with distinct dependencies is needed, add a directory under `pypackages/` with its own `pyproject.toml`."
+
+### 9.3 Phase 1 Bootstrap (pyproject.toml Section)
+
+**Current**: Shows a single `pyproject.toml` with project metadata, dependencies, and build config for `banksy`.
+
+**Updated**: Show the workspace root `pyproject.toml` plus the server member's `pyproject.toml`. The server member's dependencies match what was previously in the single `pyproject.toml`. The shared member's `pyproject.toml` extracts the subset of deps needed by both server and harness.
+
+### 9.4 Dockerfile Section
+
+**Current**: Single Dockerfile builds one Python image.
+ +**Updated**: Two Dockerfiles (`Dockerfile.server`, `Dockerfile.harness`) as shown in Section 5.1. Both use the workspace-aware `--no-install-workspace` and `--package` flags. + +### 9.5 CI Section + +**Current**: Single `python.yml` workflow runs tests for the one Python package. + +**Updated**: `python.yml` runs per-package test jobs in parallel (server, harness, shared) plus a shared lint/typecheck job. See Section 5.1 CI configuration. + +### 9.6 "After Cleanup" Layout (Lines 166-196) + +**Current**: Shows `src/banksy/` as the final structure. + +**Updated**: Replace with workspace layout. After TS cleanup: + +``` +banksy/ +├── pypackages/ # Optionally renamed to packages/ +│ ├── server/ +│ │ ├── pyproject.toml +│ │ ├── src/banksy_server/ +│ │ └── tests/ +│ ├── harness/ +│ │ ├── pyproject.toml +│ │ ├── src/banksy_harness/ +│ │ └── tests/ +│ └── shared/ +│ ├── pyproject.toml +│ ├── src/banksy_shared/ +│ └── tests/ +├── conftest.py # Shared test fixtures +├── ui/ +├── migrations/ +├── pyproject.toml # Workspace root +├── uv.lock +├── .python-version +└── .github/workflows/ +``` + +Once `packages/` (TS) is deleted, `pypackages/` can optionally be renamed to `packages/` for simplicity, updating the workspace glob accordingly. + +### 9.7 Sections Not Affected + +The following sections of the migration strategy remain unchanged: +- Phase sequencing (Phases 1-9 order and scope) +- Auth architecture (SSO proxy, Mural OAuth, Session Activation) +- Tool migration approach (from_openapi, hand-written composites) +- SPA architecture (Vite, React, served via StaticFiles) +- Database schema (PostgreSQL, Alembic, token storage) +- Risk matrix (risks remain the same; workspace adds no new risks) + +The change is structural (where files live, how they're packaged) not architectural (what the code does). + +--- + +## 10. 
Test Location Analysis + +### 10.1 Context + +The original layout in Section 5.1 placed tests in a top-level `tests/` directory with per-member subdirectories (`tests/server/`, `tests/harness/`, `tests/shared/`). This section evaluates whether co-locating tests inside each workspace member is a better fit. + +### 10.2 Two Layouts Compared + +**Layout A: Top-level tests (original)** + +``` +banksy/ +├── pypackages/ +│ ├── server/ +│ │ ├── pyproject.toml +│ │ └── src/banksy_server/ +│ ├── harness/ +│ └── shared/ +├── tests/ +│ ├── server/ +│ ├── harness/ +│ └── shared/ +├── pyproject.toml +``` + +**Layout B: Co-located tests (revised recommendation)** + +``` +banksy/ +├── pypackages/ +│ ├── server/ +│ │ ├── pyproject.toml +│ │ ├── src/banksy_server/ +│ │ └── tests/ +│ │ ├── conftest.py +│ │ └── test_tools/ +│ ├── harness/ +│ │ ├── pyproject.toml +│ │ ├── src/banksy_harness/ +│ │ └── tests/ +│ └── shared/ +│ ├── pyproject.toml +│ ├── src/banksy_shared/ +│ └── tests/ +├── conftest.py # Shared root fixtures +├── pyproject.toml +``` + +### 10.3 Tradeoffs + +| Aspect | Layout A (top-level) | Layout B (co-located) | +|--------|---------------------|-----------------------| +| Cohesion | Tests divorced from code they verify | Tests live next to their package | +| Move/rename/delete a member | Must update test paths separately | Tests move with the member automatically | +| pytest config | Single root `[tool.pytest.ini_options]` | Per-member `testpaths = ["tests"]` in each `pyproject.toml`; root defines shared settings (markers, asyncio_mode) | +| Shared fixtures | Root `tests/conftest.py` | Root `conftest.py` + per-member `conftest.py` (pytest discovers up the tree) | +| CI targeting | `pytest tests/server/` | `pytest` (per-member `testpaths` handles it) or explicit `pytest pypackages/server/tests/` | +| Dependency isolation in CI | `uv run --package X pytest tests/X/` | `uv run --exact --package X pytest` catches undeclared deps | +| Docker impact | No impact — tests not 
in `src/` | No impact — Dockerfiles COPY only `src/` subdirectories | +| IDE/tooling (pyright, ruff) | `include` references `tests/` | `include` references `pypackages/*/tests` | +| Ecosystem alignment | Minority pattern for uv workspaces | Dominant pattern; matches Cargo convention | +| Grep/search scope | All tests in one subtree | Tests distributed across members | + +### 10.4 Ecosystem Evidence + +**pytest official docs** describe two layouts for single-package projects: "tests outside application code" and "tests as part of application code." The "outside" recommendation exists to prevent accidentally importing the working-directory copy instead of the installed package. With `src` layout, this concern is already resolved regardless of where tests sit — `pypackages/server/tests/` is outside `pypackages/server/src/banksy_server/`, so the pytest guidance is satisfied by both layouts. + +**Real-world uv workspace repos:** + +| Repo | Stars | Test Layout | +|------|-------|-------------| +| fedragon/uv-workspace-example | — | Co-located per member | +| AndreuCodina/python-monorepo | — | Co-located per member | +| gafda/uv-workspaces-a-python-example | 1 | Co-located per member | +| JasperHG90/uv-monorepo | 159 | Top-level `tests/` | +| FastMCP (not a workspace) | — | Top-level `tests/` | + +The co-located pattern is the norm for multi-member uv workspaces. The top-level pattern appears in single-package projects (FastMCP) and in one workspace repo that treats the root as the only real application. uv workspaces are modeled after Cargo, where each crate owns its `tests/` directory. + +**uv issue #8302** (best-practice workspace template) and **uv issue #10630** (per-package testing) both show community members using co-located `tests/` per member. The `--exact` flag was introduced specifically to support isolated per-member test runs. 
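+
+Under Layout B, the per-member pytest wiring from the table above can be sketched as follows (the section names are real pytest and uv options; the marker and asyncio settings shown are placeholders, not decided configuration):
+
+```toml
+# pypackages/server/pyproject.toml — each member points pytest at its own tests/
+[tool.pytest.ini_options]
+testpaths = ["tests"]
+
+# The root pyproject.toml carries the settings shared across members, e.g.:
+# [tool.pytest.ini_options]
+# asyncio_mode = "auto"
+# markers = ["integration: requires a live database"]
+
+# Isolated per-member run that also catches undeclared dependencies:
+#   uv run --exact --package banksy-server pytest
+```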
+ +### 10.5 Shared Fixture Strategy + +With co-located tests, shared fixtures use a two-level `conftest.py` approach: + +- **Root `conftest.py`** (at the workspace root): Shared fixtures available to all members — database factories, mock Mural API responses, common test configuration, `pytest` plugins. +- **Per-member `conftest.py`** (e.g., `pypackages/server/tests/conftest.py`): Fixtures specific to that member — mock FastMCP server instances, domain-specific test data. + +pytest auto-discovers `conftest.py` files up the directory tree when invoked from the workspace root. Running `uv run --package banksy-server pytest pypackages/server/tests/` from the repo root makes both the root and member-level `conftest.py` files available. + +### 10.6 Docker Implications + +No change needed. The existing Dockerfiles already COPY only `src/` subdirectories: + +```dockerfile +COPY pypackages/server/src pypackages/server/src +``` + +Co-located `tests/` directories are never included in production images. Even with a broader COPY, `--no-install-workspace` and `--no-editable` only install from `src/`. The `tests/` directory is not a Python package that gets installed. + +### 10.7 Recommendation + +**Layout B (co-located tests)** is the revised recommendation. It aligns with the dominant uv workspace ecosystem pattern, keeps tests coupled to the code they verify, simplifies workspace member lifecycle operations, and has no technical downsides with the existing Docker, CI, or tooling strategy. + +The directory trees, `pyproject.toml` configurations, CI patterns, and summary table in Sections 5.1, 2.7, 8.1, and 9 have been updated accordingly. 
diff --git a/fastmcp-migration/banksy-research/pyright-strict-dependency-typing-research.md b/fastmcp-migration/banksy-research/pyright-strict-dependency-typing-research.md new file mode 100644 index 0000000..6cc35c9 --- /dev/null +++ b/fastmcp-migration/banksy-research/pyright-strict-dependency-typing-research.md @@ -0,0 +1,597 @@ +# Pyright Strict Mode Typing Burden: Banksy Dependency Assessment + +## Context + +We are deciding whether to adopt Pyright `typeCheckingMode = "strict"` for banksy, a new FastMCP-based Python MCP server. A colleague on pdf-import (which uses strict with `pyrightconfig.json`) strongly recommends it. Strict mode flags not just missing annotations in your own code, but also unresolved/unknown types flowing in from third-party libraries. + +This document evaluates the typing quality of every production dependency banksy will use and assesses the real-world burden of strict mode. + +--- + +## 1. Per-Dependency Type Support Assessment + +### fastmcp `>=3.1` + +**Usage in banksy**: Core framework — server, tools, `from_openapi()`, middleware, `Client` testing. + +- **Inline types or stubs?** Ships `py.typed` marker at `src/fastmcp/py.typed`. Inline annotations throughout. +- **Stub completeness**: No external stubs needed — inline types provided. However, many public APIs use `Any` (see Section 2). +- **Pyright strict compatibility**: FastMCP does not use Pyright itself. Their codebase uses **ty** (`[tool.ty]` in `pyproject.toml`), with no `pyrightconfig.json` and no Pyright step in CI. Tests contain `# pyright: ignore[...]` comments in 3 files (`test_mcp_config.py`, `test_context.py`, `test_uv_transport.py`), suggesting some awareness of Pyright but not full compliance. +- **Known workarounds**: Typed wrapper functions around `from_openapi()` and `tool()` to constrain return types. For `Client` result access, a typed helper to extract and validate `CallToolResult.data`. 
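+
+A minimal, framework-agnostic sketch of that `Client` result helper — the `tool_data` name is illustrative, and a real version would likely validate with a Pydantic model rather than a bare `isinstance` check:
+
+```python
+from typing import Any, TypeVar
+
+T = TypeVar("T")
+
+def tool_data(data: Any, expected: type[T]) -> T:
+    """Narrow the Any-typed CallToolResult.data to a concrete type,
+    so strict-mode code downstream sees T instead of Any."""
+    if not isinstance(data, expected):
+        raise TypeError(
+            f"expected {expected.__name__}, got {type(data).__name__}"
+        )
+    return data
+
+# In a test (illustrative): rows = tool_data(result.data, list)
+```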
+ +### httpx `>=0.27` + +**Usage in banksy**: HTTP client for Mural API calls, passed to `from_openapi()`. + +- **Inline types or stubs?** Ships `py.typed` marker. Inline annotations with explicit field-level types added in PR [encode/httpx#2469](https://github.com/encode/httpx/pull/2469). +- **Stub completeness**: N/A — all types inline. +- **Pyright strict compatibility**: Type completeness improved from 26.9% to ~100% via targeted PRs ([#2435](https://github.com/encode/httpx/pull/2435), [#2469](https://github.com/encode/httpx/pull/2469), [#2840](https://github.com/encode/httpx/pull/2840)). EventHooks properly typed with explicit `Callable` parameters ([#2266](https://github.com/encode/httpx/pull/2266)). The Pyright issue [microsoft/pyright#2473](https://github.com/microsoft/pyright/issues/2473) ("Incomplete types for installed package httpx") was addressed by these PRs. +- **Known workarounds**: None expected for current versions. + +### httpx-retries `>=0.1` + +**Usage in banksy**: Retry transport wrapping httpx. + +- **Inline types or stubs?** Ships `py.typed` marker. Small focused API with inline annotations. +- **Stub completeness**: N/A — all types inline. +- **Pyright strict compatibility**: `RetryTransport` uses proper typed parameters including `Optional[Union[BaseTransport, AsyncBaseTransport]]` and `Optional[Retry]`. Active type-fix PRs ([will-ockmore/httpx-retries#6](https://github.com/will-ockmore/httpx-retries/pull/6)) indicate the maintainer uses type checking tooling. +- **Known workarounds**: None expected. + +### pydantic `>=2.0` + +**Usage in banksy**: Data validation, model definitions, used pervasively. + +- **Inline types or stubs?** Ships `py.typed` marker. Full inline annotations. +- **Stub completeness**: N/A — all types inline. Pydantic v2 maintains a dedicated `tests/pyright` directory for Pyright validation ([pydantic/pydantic#10078](https://github.com/pydantic/pydantic/issues/10078)). 
+- **Pyright strict compatibility**: Core `BaseModel` field definitions, validators, and serialization are well-typed. Known edge cases: + - `TypeAdapter` with `Sequence` types triggers `reportAbstractUsage` ([microsoft/pyright#7680](https://github.com/microsoft/pyright/issues/7680)) — addressed upstream. + - `cast()` with `Annotated` types triggers `reportUnnecessaryCast` ([microsoft/pyright#7294](https://github.com/microsoft/pyright/issues/7294)). + - Future improvements planned via PEP 747 (`TypeExpr`) for better type hint precision. +- **Known workarounds**: Standard field definitions and validation work cleanly. Edge cases with `TypeAdapter` may require occasional `# pyright: ignore`. + +### pydantic-settings `>=2.0` + +**Usage in banksy**: `BaseSettings` for env var config. + +- **Inline types or stubs?** Ships `py.typed` marker (pydantic ecosystem). Inline annotations. +- **Stub completeness**: N/A — all types inline. +- **Pyright strict compatibility**: Known issues: + - Constructor parameters like `_env_file`, `_case_sensitive`, `_env_prefix` are not recognized as valid init arguments by Pyright ([pydantic/pydantic-settings#334](https://github.com/pydantic/pydantic-settings/issues/334)). Pyright reports unexpected keyword argument errors, while mypy handles them via its pydantic plugin. + - Complex field types like `HttpUrl` or `PurePosixPath` with string defaults trigger type errors ([pydantic/pydantic-settings#514](https://github.com/pydantic/pydantic-settings/issues/514)). + - No Pyright plugin equivalent to the mypy pydantic plugin. +- **Known workarounds**: 1–2 `# type: ignore` comments on `BaseSettings()` instantiation calls that pass `_env_file` etc. Alternatively, use the `model_config = SettingsConfigDict(...)` class variable pattern (which avoids constructor kwargs entirely). + +### sqlalchemy[asyncio] `>=2.0` + +**Usage in banksy**: ORM models, async engine/session, token storage. + +- **Inline types or stubs?** Ships `py.typed` marker. 
Native inline types since SQLAlchemy 2.0 — the separate `sqlalchemy2-stubs` package is deprecated. +- **Stub completeness**: N/A — all types inline. The mypy plugin is deprecated and will be removed in SQLAlchemy 2.1 ([docs](https://docs.sqlalchemy.org/en/20/orm/extensions/mypy.html)). +- **Pyright strict compatibility**: See Section 3 for detailed analysis. Summary of issues: + - `Mapped[]` field validation on `__init__` is NOT enforced by Pyright — by design ([microsoft/pyright#9741](https://github.com/microsoft/pyright/issues/9741)). + - `async_sessionmaker` generic inference fixed in [sqlalchemy/sqlalchemy#8842](https://github.com/sqlalchemy/sqlalchemy/pull/8842). + - Relationship definitions with complex join conditions may produce `Unknown` member types. + - `--verifytypes` with `--ignoreexternal` can inconsistently report SQLAlchemy symbols as partially unknown ([microsoft/pyright#11196](https://github.com/microsoft/pyright/issues/11196)). +- **Known workarounds**: Accept that Pyright won't validate model constructors (use runtime validation). Occasional `# type: ignore` for complex relationship patterns. + +### asyncpg `>=0.30` + +**Usage in banksy**: PostgreSQL async driver (used by SQLAlchemy, not directly). + +- **Inline types or stubs?** Does NOT ship `py.typed`. Relies on external `asyncpg-stubs`. +- **Stub completeness**: `asyncpg-stubs` v0.31.2 (released Feb 2026) on PyPI. Actively maintained by [bryanforbes/asyncpg-stubs](https://github.com/bryanforbes/asyncpg-stubs), version-matched to asyncpg releases. Checked with both mypy and Pyright. +- **Pyright strict compatibility**: Range/RangeValue protocol typing issue fixed in [MagicStack/asyncpg#1196](https://github.com/MagicStack/asyncpg/pull/1196). Native typing PRs ([#577](https://github.com/MagicStack/asyncpg/pull/577), [#1199](https://github.com/MagicStack/asyncpg/pull/1199)) are in progress but not merged. +- **Known workarounds**: Install `asyncpg-stubs` as a dev dependency. 
Since banksy uses asyncpg only as a SQLAlchemy driver (not calling asyncpg APIs directly), exposure to typing gaps is minimal. + +### alembic `>=1.15` + +**Usage in banksy**: Database migrations. + +- **Inline types or stubs?** Ships `py.typed` since Alembic 1.7.0. Legacy `alembic-stubs` v1.0.0 package exists for versions < 1.7. +- **Stub completeness**: Native types are incomplete. See Section 4 for details. +- **Pyright strict compatibility**: See Section 4. Multiple open issues: + - `begin_transaction()` returns partially unknown type ([sqlalchemy/alembic#1201](https://github.com/sqlalchemy/alembic/issues/1201)). + - `revision()` in `alembic.command` uses bare `Callable` ([sqlalchemy/alembic#1110](https://github.com/sqlalchemy/alembic/issues/1110)). + - Re-export modules trigger missing export warnings ([sqlalchemy/alembic#1377](https://github.com/sqlalchemy/alembic/issues/1377)). + - mypy suite is not run in strict mode and many typing errors remain ([sqlalchemy/alembic#1377](https://github.com/sqlalchemy/alembic/issues/1377)). +- **Known workarounds**: Per-file Pyright overrides in `env.py` and migration template. Exclude `alembic/versions/` from strict checking. + +--- + +## 2. FastMCP Deep Dive + +FastMCP is the most critical dependency — banksy's entire server, tool registration, OpenAPI integration, middleware, and test harness flow through it. + +### Does FastMCP pass Pyright strict? + +**No.** FastMCP does not use Pyright: + +- No `pyrightconfig.json` in the repository. +- No `[tool.pyright]` section in `pyproject.toml`. +- Type checking is configured for **ty** (via `[tool.ty.src]` and `[tool.ty.terminal]` in `pyproject.toml`). +- CI runs **prek** (j178/prek-action) in `.github/workflows/run-static.yml`, not Pyright. +- Tests contain `# pyright: ignore[...]` in 3 files, suggesting the maintainers are aware of Pyright but don't enforce compliance. 
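+
+For reference, the strict-mode configuration under evaluation is small. A hedged sketch — the paths assume the workspace layout and a top-level `migrations/` directory, and are illustrative only:
+
+```toml
+# Root pyproject.toml
+[tool.pyright]
+typeCheckingMode = "strict"
+include = ["pypackages"]
+# Alembic's autogenerated revision files are excluded from strict checking,
+# per the workaround noted in Section 1.
+exclude = ["migrations/versions"]
+```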
+ +### ty vs Pyright: What FastMCP's choice means for banksy + +FastMCP uses **ty** (formerly Red Knot), Astral's Rust-based type checker released in beta December 2025. ty is built by the same team behind Ruff and uv, and claims 10-100x faster performance than Pyright on full projects and up to 80x faster incremental updates (4.7ms vs 386ms on PyTorch edits). It includes advanced features like first-class intersection types, deep reachability analysis, and fine-grained incremental analysis designed for IDE responsiveness. + +**Why Pyright over mypy in the first place:** The same speed argument that now favors ty over Pyright is what originally motivated choosing Pyright over mypy. Pyright is dramatically faster than mypy (especially for incremental IDE feedback via Pylance), and its strict mode is more comprehensive. mypy's gradual-by-default philosophy means it silently accepts untyped code, while Pyright strict actively rejects it. + +**What this means for banksy's Pyright choice:** + +- **FastMCP's `Any` usage is not a ty-specific problem.** The `dict[str, Any]` and `Callable[..., Any]` patterns in FastMCP's API would produce the same `Unknown` type leakage under ty as under Pyright. FastMCP's choice of type checker doesn't explain or excuse the `Any` usage — those are API design decisions, not tool limitations. +- **Pyright remains the pragmatic choice for banksy today.** ty is beta software. Astral notes to "expect bugs, missing features, and fatal errors." Its stable 1.0 release is targeted for 2026. Pyright is mature, has deep VS Code/Pylance integration, and is what pdf-import uses. Choosing Pyright keeps banksy aligned with the team's existing tooling and avoids beta-tool risk. +- **Starting with Pyright does not lock you in.** Type annotations are portable across all Python type checkers — they're standard PEP 484/604 syntax. 
The non-portable parts are configuration files and inline suppression comments (`# pyright: ignore[...]`), which are a small fixed translation cost for a project banksy's size. +- **The type checker landscape is shifting.** Both ty (Astral) and Pyrefly (Meta) were announced at PyCon 2025's Typing Summit as Rust-based alternatives to Pyright and mypy. This competition is improving typing quality across the ecosystem, benefiting banksy regardless of which checker it uses. + +See **Appendix A** for a detailed assessment of what migrating from Pyright to ty would look like in ~1 year. + +### API-by-API Typing Audit + +#### `FastMCP()` constructor + +```python +# src/fastmcp/server/server.py lines 219-242 +def __init__( + self, + name: str | None = None, + instructions: str | None = None, + ... + tools: Sequence[Tool | Callable[..., Any]] | None = None, + ... + **kwargs: Any, +): +``` + +- `tools` parameter accepts `Callable[..., Any]` — any callable. +- `**kwargs: Any` is used for removed-argument checks. +- `StateValue.value: Any` in the state management API. +- **Strict impact**: `reportUnknownVariableType` will fire when accessing values from state or unpacking kwargs. + +#### `@server.tool()` decorator + +```python +# src/fastmcp/server/server.py lines 1347-1369 +def tool( + self, + name_or_fn: str | AnyFunction | None = None, + *, + output_schema: dict[str, Any] | NotSetT | None = NotSet, + annotations: ToolAnnotations | dict[str, Any] | None = None, + meta: dict[str, Any] | None = None, + app: AppConfig | dict[str, Any] | bool | None = None, + ... +) -> ( + Callable[[AnyFunction], FunctionTool] + | FunctionTool + | partial[Callable[[AnyFunction], FunctionTool] | FunctionTool] +): +``` + +- `output_schema`, `annotations`, `meta`, `app` all use `dict[str, Any]`. +- **Strict impact**: If you pass dicts to these params, the values will be `Any`-typed. 
However, since these are configuration dicts (not data flow), the impact is limited to registration-time code, not runtime tool handlers. + +#### `FastMCP.from_openapi()` + +```python +# src/fastmcp/server/server.py lines 1927-1939 +@classmethod +def from_openapi( + cls, + openapi_spec: dict[str, Any], + client: httpx.AsyncClient | None = None, + ... + **settings: Any, +) -> Self: +``` + +- `openapi_spec: dict[str, Any]` — inherently untyped (OpenAPI specs are JSON blobs). +- `**settings: Any` passes through to `FastMCP(**settings)`. +- **Strict impact**: The `dict[str, Any]` for `openapi_spec` is reasonable (OpenAPI specs are unstructured). The `**settings: Any` pass-through will cause `reportUnknownVariableType` if you try to inspect settings after the call. In practice, you call `from_openapi()` once and use the returned `Self`-typed server, so leakage is contained. + +#### `server.mount()` + +```python +def mount( + self, + server: FastMCP[LifespanResultT], + namespace: str | None = None, + as_proxy: bool | None = None, + tool_names: dict[str, str] | None = None, + prefix: str | None = None, +) -> None: +``` + +- **Well-typed.** No `Any` in the signature. Generic `LifespanResultT` properly propagated. + +#### `Client(transport=server)` for testing + +- `CallToolResult.data: Any` — the primary result accessor returns `Any`. +- `CallToolResult.structured_content: dict[str, Any] | None`. +- `CallToolResult.meta: dict[str, Any] | None`. +- **Strict impact**: Every test that calls a tool and inspects the result will need to cast or validate `result.data`. This is the highest-friction `Any` in the API for banksy, since tests use `Client` extensively. + +#### `Middleware` and `MiddlewareContext` + +```python +T = TypeVar("T", default=Any) +R = TypeVar("R", covariant=True, default=Any) +``` + +- Generic type vars default to `Any`. +- `MiddlewareContext.copy(**kwargs: Any)` uses `Any`. +- `CallNext[T, R]` and `MiddlewareContext[T]` use these defaults. 
+- **Strict impact**: If you don't explicitly parameterize the generics, everything defaults to `Any`. Banksy should explicitly parameterize middleware types where possible. + +#### `custom_route()` decorator + +```python +# src/fastmcp/server/mixins/transport.py lines 97-105 +def custom_route( + self: FastMCP, + path: str, + methods: list[str], + name: str | None = None, + include_in_schema: bool = True, +) -> Callable[ + [Callable[[Request], Awaitable[Response]]], + Callable[[Request], Awaitable[Response]], +]: +``` + +- **Well-typed.** No `Any` in the signature. Callable types are fully specified. + +### Summary of `Any` Leakage in FastMCP Public API + +| API | `Any` usage | Impact on banksy | +|---|---|---| +| `FastMCP.__init__()` | `**kwargs: Any`, `Callable[..., Any]` in tools | Low — called once at startup | +| `@server.tool()` | `dict[str, Any]` in config params | Low — registration-time only | +| `from_openapi()` | `dict[str, Any]` spec, `**settings: Any` | Low — called once, returns typed `Self` | +| `Client` results | `.data: Any`, `.structured_content: dict[str, Any]` | **Medium** — every test touches this | +| Middleware generics | `TypeVar` defaults to `Any` | Low — explicitly parameterize to avoid | +| `mount()` | None | None | +| `custom_route()` | None | None | + +--- + +## 3. SQLAlchemy 2.0 Async Typing + +### Does SQLAlchemy 2.0+ have full Pyright strict support? + +**Partially.** SQLAlchemy 2.0 ships `py.typed` and includes native inline type annotations, eliminating the need for the deprecated `sqlalchemy2-stubs` package. The mypy plugin is deprecated and slated for removal in 2.1 ([docs](https://docs.sqlalchemy.org/en/20/orm/extensions/mypy.html)). + +However, full Pyright strict compliance is not a stated goal. 
SQLAlchemy and Pyright maintain a "best effort" relationship, with the Pyright team noting that SQLAlchemy's metaclass-driven ORM patterns fall outside standard PEP 484 type checking ([microsoft/pyright#9741](https://github.com/microsoft/pyright/issues/9741)). + +### Async session/engine types + +- **`AsyncEngine`**: Well-typed. `create_async_engine()` returns `AsyncEngine` with proper type annotations. +- **`AsyncSession`**: Well-typed for standard usage patterns (query, add, commit, etc.). +- **`async_sessionmaker`**: Generic inference was broken and fixed in [sqlalchemy/sqlalchemy#8842](https://github.com/sqlalchemy/sqlalchemy/pull/8842). The fix required careful ordering of overload definitions. Current versions (2.0.30+) should work correctly. + +```python +# This now works without type errors: +async_session = async_sessionmaker(engine, class_=AsyncSession) +async with async_session() as session: + result = await session.execute(select(User)) # properly typed +``` + +### Do `Mapped[]` column annotations work cleanly? + +**With caveats.** The `Mapped[]` annotation system works for defining column types: + +```python +class User(Base): + __tablename__ = "users" + id: Mapped[int] = mapped_column(primary_key=True) + name: Mapped[str] = mapped_column(String(100)) +``` + +However, Pyright does NOT validate that required `Mapped` fields are provided during instantiation: + +```python +user = User() # Pyright won't flag missing 'name' — by design +user = User(name=123) # Pyright won't flag wrong type — by design +``` + +This is documented as "as designed" by the Pyright team ([microsoft/pyright#9741](https://github.com/microsoft/pyright/issues/9741)). Pyright follows standard PEP 484 rules and has no special knowledge of SQLAlchemy's metaclass-generated `__init__`. This is a limitation, not a strict-mode blocker — it just means constructor validation relies on runtime checks. 
+ +### Relationship definitions + +Relationship typing with complex join conditions can produce `Unknown` member types: + +```python +# Simple relationships work: +posts: Mapped[list["Post"]] = relationship(back_populates="author") + +# Complex relationships may need type: ignore: +secondary_items: Mapped[list["Item"]] = relationship( + secondary=association_table, + lazy="selectin", +) # may produce reportUnknownMemberType in some configurations +``` + +### Query result typing + +`select()` patterns are generally well-typed: + +```python +result = await session.execute(select(User).where(User.id == 1)) +user = result.scalar_one() # typed as User +users = result.scalars().all() # typed as Sequence[User] +``` + +### SQLAlchemy CI type checking + +SQLAlchemy runs mypy in CI but does not run Pyright. Their mypy configuration uses the deprecated mypy plugin. They do not test against Pyright strict mode. + +--- + +## 4. Alembic Typing + +### Does Alembic ship `py.typed`? + +**Yes**, since Alembic 1.7.0. The legacy `alembic-stubs` v1.0.0 package on PyPI (last updated Feb 2023) is only needed for Alembic < 1.7. For banksy's target of `>=1.15`, native types are available. + +### Are there `types-alembic` stubs? + +No `types-alembic` package exists on PyPI or in typeshed. The `alembic-stubs` package is the legacy option; it is not needed for modern versions. + +### Known Pyright strict issues + +**`begin_transaction()` — partially unknown return type** + +The `EnvironmentContext.begin_transaction()` method returns `Union[_ProxyTransaction, ContextManager]` where `_ProxyTransaction` uses unspecified generics, causing Pyright to report "Type is partially unknown" ([sqlalchemy/alembic#1201](https://github.com/sqlalchemy/alembic/issues/1201)). 
+ +```python +# In env.py — triggers reportUnknownMemberType: +with connectable.connect() as connection: + context.configure(connection=connection) + with context.begin_transaction(): # <-- partially unknown + context.run_migrations() +``` + +**`revision()` — incomplete callable annotation** + +The `revision()` function in `alembic.command` uses `Callable` without full signature definitions, resulting in "Type is partially unknown" errors ([sqlalchemy/alembic#1110](https://github.com/sqlalchemy/alembic/issues/1110)). + +**Re-export module issues** + +Alembic's `__init__.py` and utility modules re-export symbols without using the `import X as X` pattern required by Pyright's `--no-implicit-reexport`. This triggers `reportMissingModuleSource` and related warnings ([sqlalchemy/alembic#1377](https://github.com/sqlalchemy/alembic/issues/1377), [sqlalchemy/alembic#897](https://github.com/sqlalchemy/alembic/issues/897)). + +**mypy strict mode gap** + +Alembic's own mypy suite is not run in strict mode, and many typing errors remain upstream ([sqlalchemy/alembic#1377](https://github.com/sqlalchemy/alembic/issues/1377)). This means new Alembic releases may introduce additional typing gaps. + +### Would strict mode require annotating `env.py` and migration files? + +**Yes, partially.** The auto-generated `env.py` contains untyped boilerplate. Under strict mode: + +- Function parameters in `run_migrations_offline()` and `run_migrations_online()` need return type annotations. +- The `context.begin_transaction()` call triggers partially-unknown warnings. +- Migration files (`alembic/versions/*.py`) contain `upgrade()` and `downgrade()` functions with `op.*` calls that are generally well-typed, but the migration template would need `-> None` return annotations. + +**This is a one-time setup cost**, not ongoing. Once the `env.py` template and migration template are annotated and the per-file override is added, new migrations auto-generated by Alembic will follow the template. 
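
The template edit itself is two lines. Recent Alembic releases may already ship an annotated template, so treat this as an illustrative sketch of the relevant portion of `script.py.mako` rather than a guaranteed diff (the mako placeholders are left as Alembic generates them):

```mako
def upgrade() -> None:
    ${upgrades if upgrades else "pass"}


def downgrade() -> None:
    ${downgrades if downgrades else "pass"}
```
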
+ +### Recommended approach + +Add a per-file Pyright override at the top of `env.py`: + +```python +# pyright: reportUnknownMemberType=false +``` + +And exclude auto-generated migration files from strict checking in `pyrightconfig.json`: + +```json +{ + "exclude": ["**/alembic/versions/**"] +} +``` + +--- + +## 5. Risk Summary + +| Dependency | Ships py.typed? | Stub package? | Pyright Strict Rating | Risk Level | Notes | +|---|---|---|---|---|---| +| **fastmcp** | Yes | N/A | Fair | Medium | `Any` in constructor, `tool()`, `from_openapi()`, `Client` results; no Pyright in CI | +| **httpx** | Yes | N/A | Good | Low | ~100% type completeness after PRs #2435, #2469, #2840 | +| **httpx-retries** | Yes | N/A | Good | Low | Small API, inline types, active maintenance | +| **pydantic** | Yes | N/A | Good | Low | Pyright test suite in repo; core `BaseModel` usage solid | +| **pydantic-settings** | Yes | N/A | Fair | Medium | `_env_file` init params unrecognized; no Pyright plugin | +| **sqlalchemy** | Yes | N/A (native since 2.0) | Fair | Medium | `Mapped[]` init not validated; async types fixed; relationship gaps | +| **asyncpg** | No | `asyncpg-stubs` 0.31.2 | Good | Low | External stubs actively maintained, version-matched | +| **alembic** | Yes (since 1.7) | `alembic-stubs` (legacy only) | Fair/Poor | Medium-High | Partially unknown types in core APIs; re-export issues; no strict CI | + +**Rating guide applied:** + +- **Excellent**: Full inline types, no known Pyright strict issues — *none qualify* +- **Good**: Inline types or complete stubs, minor gaps — httpx, httpx-retries, pydantic, asyncpg +- **Fair**: Partial types, some `Unknown` leakage, workarounds needed — fastmcp, pydantic-settings, sqlalchemy +- **Poor**: Missing types, significant stub work required — *none fully qualify, but alembic is borderline* + +--- + +## 6. 
Estimated Stub/Workaround Burden + +### Custom stub files needed: 0 + +All production dependencies either ship `py.typed` with inline types or have actively maintained stub packages on PyPI (`asyncpg-stubs`). No custom `typings/` directory is needed. + +### Comparison with pdf-import + +| Aspect | pdf-import | banksy (projected) | +|---|---|---| +| Custom stub files | 70 files across 6 libraries | 0 | +| Libraries needing stubs | cairosvg (15), svgpathtools (11), requests_toolbelt (27), pytesseract (2), lxml (13), fitz (3) | None | +| `typings/` directory | Yes, mandatory | No | +| `# type: ignore` comments | 14 across 8 files | ~5–10 estimated | +| External stub packages | `lxml-stubs` (dev dep) | `asyncpg-stubs` (dev dep) | + +Banksy's dependency stack is dramatically better-typed than pdf-import's. The ML/PDF processing libraries pdf-import uses (cairosvg, lxml, pymupdf, pytesseract) are classic "untyped C extension" libraries requiring hand-written stubs. Banksy's stack is modern Python with first-class typing support. 
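
For contrast, this is what one of those hand-written stubs looks like: a `.pyi` interface file placed under `typings/` that Pyright reads in place of the untyped library. The module name and signatures below are invented for illustration; banksy's stack needs no such files:

```python
# typings/rasterlib/__init__.pyi -- hypothetical stub for an untyped C-extension
# library. Only signatures are declared; bodies are always `...`.

def render(svg: bytes, dpi: int = 96) -> bytes: ...

class Converter:
    def __init__(self, *, timeout: float = 30.0) -> None: ...
    def convert(self, data: bytes) -> bytes: ...
```

Multiplied across six libraries, this is the 70-file burden pdf-import carries and banksy sidesteps entirely.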
+ +### Projected `# type: ignore` / `# pyright: ignore` comments + +| Source | Count | Nature | +|---|---|---| +| pydantic-settings `BaseSettings()` constructor kwargs | 1–2 | One-time: settings class definition | +| Alembic `env.py` boilerplate | 1–2 | One-time: initial migration setup | +| FastMCP `Client` result access in tests | 2–3 | Ongoing if no typed wrapper; one-time if wrapper created | +| SQLAlchemy complex relationships | 1–2 | Occasional: depends on schema complexity | +| **Total** | **~5–10** | | + +### Per-file Pyright overrides + +| File | Override | Reason | +|---|---|---| +| `alembic/env.py` | `# pyright: reportUnknownMemberType=false` | `begin_transaction()` partially unknown type | +| Alembic migration template | `-> None` return annotations | Template auto-generates untyped functions | + +### Burden type: Mostly one-time + +- **One-time setup**: Alembic `env.py` override, migration template annotations, `asyncpg-stubs` dev dep, optional FastMCP typed wrappers. +- **Ongoing per-feature**: Occasional `# type: ignore` for complex SQLAlchemy relationships or new pydantic-settings patterns. Estimated at 0–1 per feature. + +--- + +## 7. Recommendation + +### (b) Adopt strict with caveats + +Banksy's dependency stack is well-suited for Pyright strict mode with minor, well-defined accommodations. The burden is dramatically lower than pdf-import's experience, and the benefits of strict mode (catching missing annotations, preventing `Unknown` type propagation, enforcing return types) justify the small upfront cost. + +### Specific actions + +1. **Add `asyncpg-stubs` to dev dependencies:** + + ```toml + [project.optional-dependencies] + dev = [ + "asyncpg-stubs>=0.30", + "pyright>=1.1.390", + # ... + ] + ``` + +2. 
**Configure `pyrightconfig.json` with strict mode and targeted exclusions:** + + ```json + { + "include": ["./src"], + "exclude": [ + "**/alembic/versions/**", + "**/__pycache__", + ".venv" + ], + "typeCheckingMode": "strict", + "pythonVersion": "3.12", + "reportMissingTypeStubs": true, + "reportMissingParameterType": true, + "reportMissingReturnType": true, + "reportUnknownParameterType": true, + "reportUnknownArgumentType": true, + "reportUnknownLambdaType": true, + "reportUnknownVariableType": true, + "reportUnknownMemberType": true + } + ``` + + This mirrors pdf-import's configuration. The `alembic/versions/**` exclusion avoids annotating auto-generated migration files. + +3. **Add per-file override in `alembic/env.py`:** + + ```python + # pyright: reportUnknownMemberType=false + ``` + +4. **Create typed helpers for FastMCP `Client` test results (optional but recommended):** + + ```python + from typing import TypeVar + from pydantic import TypeAdapter + + T = TypeVar("T") + + def parse_tool_result(result: CallToolResult, model: type[T]) -> T: + """Extract and validate Client tool result data with type safety.""" + adapter = TypeAdapter(model) + return adapter.validate_python(result.data) + ``` + +5. **Use `model_config` pattern for pydantic-settings to avoid constructor kwargs:** + + ```python + class Settings(BaseSettings): + model_config = SettingsConfigDict( + env_file=".env", + case_sensitive=False, + ) + database_url: str + api_key: str + ``` + + This avoids the `_env_file` constructor param that Pyright doesn't recognize. + +6. **Accept ~5–10 targeted `# type: ignore` comments** for the remaining edge cases identified above. This is well within the range that pdf-import found acceptable (14 comments). + +### Why not (a) "adopt with confidence"? + +FastMCP's `Any` leakage in the `Client` API and alembic's partially-unknown types mean strict mode is not completely frictionless. Acknowledging these caveats upfront prevents surprises during implementation. 
+ +### Why not (c) "start with standard"? + +The dependency stack is too well-typed to justify deferring strict mode. Starting with standard and upgrading later means retrofitting annotations across existing code — exactly the problem pdf-import's colleague warned about. + +### Why not (d) "strict with targeted overrides"? + +Option (b) effectively includes the targeted overrides from (d) — the `alembic/versions/**` exclusion and `env.py` per-file override. The distinction is that (b) frames these as known caveats rather than a fundamentally different strategy. + +### Note on strict-to-standard reversibility + +Going from strict to standard is trivial — change one line in `pyrightconfig.json` and all constraints relax. Existing annotations remain, code still works. Going the other direction (standard to strict) is the painful path: every file written without annotations needs them, and every `Any` that silently propagated becomes a cascade of errors. This asymmetry means the downside risk of starting strict is low (worst case, relax later and lose nothing), while the downside risk of starting standard is high (accumulate untyped code that becomes expensive to fix). + +--- + +## Appendix A: Pyright to ty Migration Outlook (~1 Year) + +### What maps cleanly + +- **All type annotations.** Both tools read the same PEP 484/604 type syntax. Everything written for Pyright works under ty. Annotations are fully portable. +- **`# type: ignore` comments.** ty respects standard `# type: ignore` comments ([docs](https://docs.astral.sh/ty/suppression/)) and also supports its own `# ty: ignore[]` syntax with rule-specific suppression. +- **Editor integration.** ty has a full VS Code extension, Neovim/nvim-lspconfig support, built-in Zed support, and PyCharm integration (2025.3+). It provides full LSP with hover, completions, and auto-import. It can run alongside Pylance (`ty.disableLanguageServices: true` + `python.languageServer: "Pylance"`) or replace it entirely. 
+- **Configuration location.** ty uses `pyproject.toml` (`[tool.ty]`) or `ty.toml`, which is arguably cleaner than Pyright's separate `pyrightconfig.json`. The settings are simple enough to translate by hand. + +### What doesn't map cleanly + +- **No strict mode yet.** ty does not have an equivalent of `typeCheckingMode = "strict"`. There's an open issue ([astral-sh/ty#527](https://github.com/astral-sh/ty/issues/527)) milestoned to "Pre-stable 1". The maintainer says they plan to support opting into "stricter rule categories" but it will likely differ from a single `strict = true` toggle. The exact form is unknown. +- **Different rule names.** Pyright uses camelCase `reportUnknownMemberType`, `reportUnknownVariableType`, etc. ty uses kebab-case names like `possibly-unresolved-reference`, `unresolved-attribute`. There is no published 1:1 mapping. Inline `# pyright: ignore[reportUnknownMemberType]` comments would need to become `# ty: ignore[some-rule]` or generic `# type: ignore`. +- **Missing "Unknown" tracking rules.** ty does not yet have equivalents to Pyright's strict-mode `reportUnknown*` family. Issues [#2442](https://github.com/astral-sh/ty/issues/2442) (implicitly Unknown-specialized types) and [#2628](https://github.com/astral-sh/ty/issues/2628) (unknown length unpacking) are being added as strict diagnostics, but the full suite is not there yet. +- **Different type inference behavior.** ty has first-class intersection types and different narrowing semantics than Pyright. Code that type-checks clean under Pyright could produce new warnings under ty (or vice versa). This is not a bug — different type system interpretations. It does mean a migration produces a delta of diagnostics to triage. 
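
Translating the configuration is the mechanical part. A hedged sketch of what the `pyproject.toml` equivalent might look like, with the caveat that the rule names shown are illustrative (no published 1:1 mapping from Pyright's `reportUnknown*` family exists yet) and ty's config surface is still pre-1.0:

```toml
# pyproject.toml -- rough translation of pyrightconfig.json.
# ty has no single strict toggle yet (astral-sh/ty#527), so rules
# are opted into individually; names below are placeholders.
[tool.ty.environment]
python-version = "3.12"

[tool.ty.src]
include = ["src"]
exclude = ["**/alembic/versions/**"]

[tool.ty.rules]
possibly-unresolved-reference = "error"
unresolved-attribute = "error"
```
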
+ +### Estimated migration effort for banksy-sized project + +| Task | Effort | +|---|---| +| Translate `pyrightconfig.json` to `[tool.ty]` in `pyproject.toml` | Minutes | +| Replace `# pyright: ignore[...]` with `# ty: ignore[...]` or `# type: ignore` | ~5-10 occurrences, minutes | +| Run ty, triage new/different diagnostics | Half day | +| Update CI to run `ty check` instead of `pyright` | Minutes | +| **Total** | **Half day to one day** | + +The annotations themselves — the vast majority of the typing work — need no changes at all. + +### The speed progression + +The same reasoning chain that justified Pyright over mypy will likely justify ty over Pyright: + +| Generation | Tool | Speed | Status | +|---|---|---|---| +| 1st | mypy | Baseline (slow) | Mature, declining | +| 2nd | Pyright | Much faster than mypy | Mature, current standard | +| 3rd | ty | 10-100x faster than Pyright | Beta, targeting stable 2026 | + +### Recommendation + +Start with Pyright. It's mature, it's what pdf-import uses, and it has deep VS Code/Pylance integration. Revisit ty when it reaches stable 1.0 and ships a strict mode equivalent. The migration cost will be small and the annotations banksy writes today will carry forward unchanged. diff --git a/fastmcp-migration/banksy-research/python-314-compatibility-research.md b/fastmcp-migration/banksy-research/python-314-compatibility-research.md new file mode 100644 index 0000000..feea0aa --- /dev/null +++ b/fastmcp-migration/banksy-research/python-314-compatibility-research.md @@ -0,0 +1,384 @@ +# Python 3.14 Compatibility Research for Banksy Migration + +**Date:** March 2026 +**Context:** The banksy migration plan currently specifies Python 3.12. A colleague on the pdf-import team recommended Python 3.14, citing GIL improvements for concurrency and a security ticket motivating their own 3.13 → 3.14 migration. This document evaluates whether banksy should adopt Python 3.14 (and potentially the free-threaded build) or stay on 3.12. 
+ +--- + +## 1. GIL Background + +### What is the GIL? + +The Global Interpreter Lock (GIL) is a mutex in CPython that allows only one thread to execute Python bytecode at a time. Even on multi-core machines, Python threads cannot achieve true CPU parallelism — they must take turns holding the GIL. This matters for CPU-bound multi-threaded programs: they effectively run single-threaded despite using `threading`. I/O-bound threads are less affected because the GIL is released during blocking I/O operations (network calls, file reads, etc.). + +### PEP 703: Making the GIL Optional + +[PEP 703](https://peps.python.org/pep-0703/) defines a phased rollout to make the GIL optional in CPython: + +| Phase | Python Version | Status | +|-------|---------------|--------| +| Phase I | 3.13 | Free-threaded build available as **experimental** | +| Phase II | 3.14 | Free-threaded build **officially supported** (per [PEP 779](https://peps.python.org/pep-0779/)), but not the default | +| Phase III | Future | Free-threaded build becomes the default | + +The Python Steering Council accepted PEP 779 in June 2025, establishing clear criteria for Phase II: the free-threaded build must be desirable, stable, maintainable, and performant in both CPU and memory usage. + +### What does `python3.14t` change? + +The `t` suffix denotes the **free-threaded** CPython build. When the GIL is disabled: + +- **True parallelism**: Multiple Python threads can execute bytecode simultaneously on different cores. +- **Performance**: Near-linear scaling for CPU-bound multi-threaded workloads — up to ~7.5x speedup on 8 cores has been demonstrated. +- **Tradeoffs**: + - ~5-10% single-thread performance regression due to additional synchronization overhead replacing the GIL. + - Reduced C extension compatibility — libraries with C extensions must be updated to be thread-safe without the GIL. + - Smaller ecosystem of tested packages compared to the standard build. + +### Is free-threading relevant for banksy? 
+ +**No, given banksy's planned architecture.** Banksy will be a FastMCP server. FastMCP's server runtime is asyncio-based: it depends on `uvicorn>=0.35` (an ASGI server built on `asyncio`) and the MCP Python SDK (`mcp>=1.24.0`), which uses Starlette internally for HTTP/SSE transports. This can be verified from FastMCP's [`pyproject.toml`](https://github.com/PrefectHQ/fastmcp/blob/main/pyproject.toml) and its quickstart documentation showing `async with client:` / `await client.call_tool()` patterns. + +By default, uvicorn runs one worker process with a single asyncio event loop thread. Tool functions registered with FastMCP can be either sync or async — FastMCP accepts both — but either way they execute within that single-threaded event loop. The GIL is not a bottleneck because: + +1. There is only one thread running Python bytecode (the event loop thread). The GIL only contends when multiple OS threads try to execute Python bytecode simultaneously, which doesn't happen here. +2. I/O operations (HTTP calls to Mural API, database queries via asyncpg) release the GIL while waiting, so even in a hypothetical multi-threaded scenario, the GIL wouldn't be the bottleneck for I/O-bound work. + +**Caveats:** + +- FastMCP **does** run sync tool handlers in a threadpool. The [FastMCP Tools docs](https://gofastmcp.com/servers/tools) explicitly state: *"This sync function won't block other concurrent requests"* with the comment `# Runs in threadpool, not on the event loop`. So the GIL technically applies to those threadpool threads. But for banksy's I/O-bound workload (making HTTP calls to the Mural API), this is not a meaningful bottleneck — the threads spend most of their time waiting on network I/O, not holding the GIL. For I/O-bound operations, FastMCP recommends using `async` tools instead since they're more efficient than threadpool dispatch. +- If banksy later adds CPU-bound work (e.g., heavy local computation) offloaded to threads, the GIL would become relevant. 
The assessment here assumes banksy remains an I/O-bound API orchestrator, which is what the migration plan describes. +- Running uvicorn with multiple workers (`--workers N`) uses separate processes (not threads), each with its own GIL, so the GIL is irrelevant in that configuration regardless. + +Free-threading is relevant for services like pdf-import that use `ProcessPoolExecutor`, multi-threading, and CPU-bound image/PDF processing — but not for an async I/O-bound MCP server like banksy. + +**Bottom line:** The GIL improvement in 3.14 is not a reason for banksy to upgrade. The standard (GIL-enabled) build of 3.14 is sufficient. + +--- + +## 2. Security Context from pdf-import + +### What the git history shows + +Commit [`234ff8f`](https://github.com/tactivos/pdf-import/commit/234ff8f) (March 3, 2026, by Guido Zibecchi) upgraded pdf-import: + +``` +Update python to version 3.14.3 and libraries +``` + +- **PR:** `#299` from branch `tactivos/update/python-3-14` +- **Files changed:** `docker/app.Dockerfile`, `pyproject.toml`, `src/api/start.py`, `uv.lock` +- **Dockerfile change:** `ARG PYTHON_VERSION=3.12.12` → `ARG PYTHON_VERSION=3.14.3` +- **pyproject.toml change:** `requires-python = "==3.12.12"` → `requires-python = "==3.14.3"`, added `base_python = "python3.14t"` + +### Security ticket identification + +Git log searches across the pdf-import repo for "security", "CVE", "vulnerability", and "GIL" found **no commits directly referencing a specific CVE** in connection with the Python version upgrade. The branch `beta-fix/apply-security-recommendations` exists but predates the 3.14 upgrade and contains unrelated changes. + +### Likely explanation + +The "security ticket" almost certainly refers to Python 3.12's **lifecycle status**, not a specific CVE: + +- Python 3.12 entered **security-only** mode after 3.12.10 (April 8, 2025). 
Per [PEP 693](https://peps.python.org/pep-0693/), it no longer receives bugfix releases — only critical security patches, on an as-needed basis, until EOL in October 2028. +- Python 3.12.12 and 3.12.13 (March 2026) are security-only releases addressing email handling, XML parsing, and denial-of-service vulnerabilities. +- Staying on a security-only branch means no fixes for non-security bugs, degrading runtime quality over time. +- Python 3.14 (released October 7, 2025) is in **active bugfix** mode with regular releases every ~2 months. Latest: 3.14.3 (February 3, 2026). + +Recent CPython CVEs affecting 3.12/3.13/3.14 include: +- **CVE-2025-4138**: `tarfile` extraction filter bypass (symlink targets escape destination directory). Python 3.14 changed the default filter to `"data"`, making users relying on the new default affected. +- **CVE-2025-13837**: `plistlib` OOM/DoS from malicious plist files. +- **CVE-2026-2297**: `SourcelessFileLoader` bypasses `io.open_code()` validation (low practical impact). + +None of these are specific to 3.12 — they affect multiple versions and are patched across all supported branches. The motivation for pdf-import's upgrade is about being on an **actively maintained** branch, not escaping a specific 3.12 vulnerability. + +### Why pdf-import chose `python3.14t` + +pdf-import is a CPU-bound service that processes PDFs and images using multiprocessing (`ProcessPoolExecutor`, `pebble`), multi-threading, OpenCV, and Playwright. GIL removal directly benefits this workload by enabling true thread-level parallelism for CPU-bound tasks like image processing and SVG rasterization. + +--- + +## 3. 
FastMCP Compatibility with Python 3.14 + +### Official support status + +From FastMCP's [`pyproject.toml`](https://github.com/PrefectHQ/fastmcp/blob/main/pyproject.toml) (as of March 2026): + +```toml +requires-python = ">=3.10" +``` + +PyPI classifiers list: +- `Programming Language :: Python :: 3.10` +- `Programming Language :: Python :: 3.11` +- `Programming Language :: Python :: 3.12` +- `Programming Language :: Python :: 3.13` + +**Python 3.14 is not listed in classifiers**, meaning it's not officially declared as supported, but `>=3.10` does not exclude it. + +### CI test matrix + +From [`run-tests.yml`](https://github.com/PrefectHQ/fastmcp/blob/main/.github/workflows/run-tests.yml): + +```yaml +matrix: + os: [ubuntu-latest, windows-latest] + python-version: ["3.10"] + include: + - os: ubuntu-latest + python-version: "3.13" +``` + +FastMCP CI tests on **3.10 and 3.13 only**. There is no 3.14 in the matrix, and no free-threaded testing. + +### Empirical evidence + +The [`canvas-mcp`](/Users/wkirkham/dev/canvas-mcp/) repo (a sibling project at our company) uses: + +```toml +requires-python = ">=3.14" +dependencies = ["fastmcp>=3.1.0"] +``` + +This is a working FastMCP server running on Python 3.14, providing direct empirical evidence that the combination works for an MCP server use case similar to banksy's. + +### PEP 649 and Pydantic: detailed gap analysis + +The biggest compatibility concern for 3.14 is [PEP 649](https://peps.python.org/pep-0649/) (deferred evaluation of annotations). In previous Python versions, annotations were evaluated eagerly at class definition time. In 3.14, they are stored as lazy "annotate functions" and only evaluated when accessed. This is a fundamental change for Pydantic, whose core feature is reading type annotations at runtime to build validators and JSON schemas. 
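
Concretely, "reading type annotations at runtime" is the mechanism below: Pydantic compiles validators and JSON Schema from class annotations at class-creation time, which is exactly the step PEP 649 defers. A minimal sketch (the model is invented; module-level models like this behave identically on 3.12 through 3.14):

```python
from pydantic import BaseModel


class ToolInput(BaseModel):
    # Pydantic reads these annotations when the class is created and
    # builds validators plus a JSON schema from them.
    name: str
    retries: int = 3


validated = ToolInput.model_validate({"name": "sticky_note", "retries": "5"})
print(validated.retries)  # 5 -- coerced to int by the generated validator
print(ToolInput.model_json_schema()["properties"]["retries"]["type"])  # integer
```

Under PEP 649 the annotations on `ToolInput` are stored lazily, so Pydantic 2.12+ must explicitly force their evaluation during model construction, which is what PR #11991 implemented.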
+ +#### What Pydantic fixed + +[PR #11991](https://github.com/pydantic/pydantic/pull/11991) (merged July 10, 2025, shipped in Pydantic 2.12) added initial PEP 649 support. **Basic use cases work:** + +```python +from pydantic import BaseModel + +class Model(BaseModel): + a: Int # Forward reference — works after model_rebuild() + +Int = int +Model.model_rebuild() # OK +``` + +Standard `BaseModel` subclasses with simple types, `Annotated[...]`, `Optional[...]`, `list[str]`, etc. all work correctly on Python 3.14. + +#### What remains broken + +[Issue #12080](https://github.com/pydantic/pydantic/issues/12080) tracks the remaining gaps. As of March 2026, the known broken patterns are: + +**1. `Field()` in Pydantic dataclasses with forward references** + +```python +from pydantic import Field +from pydantic.dataclasses import dataclass + +@dataclass +class A: # Raises at declaration on 3.14 + a: Forward = Field(default=1) + +Forward = int +``` + +Pydantic's internal workaround writes directly to `__annotations__`, which is no longer safe under PEP 649. A different implementation is being tracked in [issue #12045](https://github.com/pydantic/pydantic/issues/12045). Note: this only affects `@pydantic.dataclasses.dataclass` with `Field()` *and* forward references. `BaseModel` subclasses are not affected. + +**2. Generic models with complex TypeVar/TypeAlias/Self patterns** + +```python +from typing import Generic, Self, TypeAlias, TypeVar, Union +from pydantic import BaseModel + +TBaseItem: TypeAlias = Union["Parent"] +TItem = TypeVar("TItem", bound=TBaseItem) + +class Parent(BaseModel, Generic[TItem]): + members: TItem | None = None + +class Child(Parent[TBaseItem]): + @classmethod + def new(cls) -> Self: + return cls() + +Child.new() # Fails on 3.14, works on 3.13 +``` + +This involves `TypeAlias` with forward-referencing unions, `TypeVar` bounds, and `Self` — a pattern found in type-heavy generic libraries. Reported March 2026, still open. + +**3. 
Models defined in deeply nested function scopes (resolved)** + +Classes defined inside nested functions where type names (`List`, `Dict`) are defined in outer scopes. This was initially broken but **was fixed in Python 3.14.1** per the Pydantic maintainer's comment (December 2025). Not a concern for banksy. + +**4. Pydantic V1 compatibility layer** + +`pydantic.v1` and legacy Pydantic V1 code does **not** work on Python 3.14. Minimal support was added in `pydantic==1.10.25` (December 2025), but the Pydantic team has explicitly stated they will not fully support V1 on 3.14. + +#### How this maps to banksy's planned Pydantic usage + +The migration plan identifies these Pydantic touch points for banksy: + +| Banksy use case | Pydantic pattern | Works on 3.14? | +|---|---|---| +| **`pydantic-settings` for config** (`config.py`) | `BaseSettings` subclass with simple field types (`str`, `int`, `bool`, `Optional[...]`) | Yes — basic `BaseModel` usage | +| **FastMCP tool parameter validation** | FastMCP inspects function type annotations, builds a Pydantic model from the signature, validates inputs. Types are `str`, `int`, `list[str]`, `dict`, `Optional[...]`. 
| Yes — FastMCP handles this internally, basic types | +| **FastMCP `from_openapi()` generated tools** (PR2) | Tool parameters generated from OpenAPI schema — straightforward Pydantic models | Yes — auto-generated, no exotic patterns | +| **Custom Pydantic models for tool parameters** | If banksy defines `BaseModel` subclasses for structured tool inputs (e.g., a `MuralFilter` model) | Yes — module-level `BaseModel` subclasses with simple types | +| **SQLAlchemy async models** (PR4) | `DeclarativeBase` models — these are SQLAlchemy, not Pydantic | N/A — SQLAlchemy addressed its own PEP 649 issues [separately](https://github.com/sqlalchemy/sqlalchemy/issues/12405) | +| **Auth token/session models** (PR5-6) | Likely `BaseModel` subclasses for JWT claims, token payloads | Yes — basic `BaseModel` usage | + +**Patterns banksy would need to avoid on 3.14:** + +| Pattern | Risk | Likelihood in banksy | +|---|---|---| +| `@pydantic.dataclasses.dataclass` with `Field()` + forward references | Broken | **Very low** — banksy will use `BaseModel`, not `@dataclass`. No reason to prefer Pydantic dataclasses for an MCP server. | +| Generic `BaseModel` with `TypeVar`/`TypeAlias`/`Self` | Broken | **Very low** — banksy is an application server, not a type-heavy library. Tool parameters and config models don't need parametric generics. | +| `pydantic.v1` imports or V1 compatibility shim | Broken | **Zero** — banksy is greenfield code, will use Pydantic V2 exclusively. | +| Models in nested function scopes | Was broken, **fixed in 3.14.1** | **Zero** — banksy models will be at module level. | + +#### Impact on FastMCP itself + +FastMCP uses Pydantic internally for tool registration, parameter validation, and schema generation. 
The relevant patterns are: +- Building a Pydantic model from a function signature (introspecting type annotations at runtime) +- Generating JSON Schema for tool parameters +- Validating incoming tool arguments against the schema + +These are all **basic Pydantic usage** — module-level models with standard types. FastMCP's own dev dependencies include `dirty-equals`, `inline-snapshot`, and `pytest-asyncio`, all running with Pydantic on 3.13 in CI. The canvas-mcp repo proves the full stack works on 3.14. + +#### Conclusion + +The Pydantic gaps on Python 3.14 are real but confined to patterns that banksy has no reason to use. The broken patterns are: (1) Pydantic dataclasses with `Field()` and forward references, (2) generic models with complex `TypeVar`/`TypeAlias`/`Self` combinations, and (3) Pydantic V1 code. Banksy is a greenfield MCP server using `BaseModel` subclasses, `pydantic-settings`, and FastMCP's built-in tool parameter validation — all of which work correctly. Pin `pydantic>=2.12` (not just `>=2.0`) to ensure PEP 649 support is present. + +### Free-threaded build compatibility + +Not tested in FastMCP CI. No known issues reported, but also no guarantees. **Not relevant for banksy** since we recommend the standard build. + +--- + +## 4. Banksy Toolchain Compatibility with Python 3.14 + +### Production dependencies + +| Dependency | Min Version | Python 3.14 Status | Notes | +|---|---|---|---| +| `fastmcp` | `>=3.1` | Works (`requires-python >=3.10`) | Proven on canvas-mcp. PEP 649 risk via Pydantic (see Section 3). | +| `httpx` | `>=0.27` | Works | Pure Python HTTP client. FastMCP itself depends on `httpx>=0.28.1`. | +| `httpx-retries` | `>=0.1` | Works | Thin wrapper around httpx. Pure Python. | +| `pydantic` | `>=2.0` | Works for banksy's patterns | PEP 649 gaps exist but only affect patterns banksy won't use (see Section 3 deep dive). Pin `>=2.12` for PEP 649 support. | +| `pydantic-settings` | `>=2.0` | Works | Follows Pydantic compatibility. 
| +| `sqlalchemy[asyncio]` | `>=2.0` | Works | PEP 649 issues [addressed](https://github.com/sqlalchemy/sqlalchemy/issues/12405) and closed (May 2025). | +| `asyncpg` | `>=0.30` | Likely works | C extension, actively maintained. No explicit 3.14 compatibility statement found, but no reported issues either. | +| `alembic` | `>=1.15` | Works | Pure Python migration tool. Depends on SQLAlchemy; follows its compatibility. | + +### Dev dependencies + +| Dependency | Min Version | Python 3.14 Status | Notes | +|---|---|---|---| +| `ruff` | `>=0.9` | Works | Rust binary, analyzes Python source. Python version doesn't affect execution. Supports `target-version = "py314"`. | +| `pyright` | `>=1.1` | Works | Node.js-based type checker. Python version doesn't affect execution. Supports `pythonVersion = "3.14"`. | +| `mypy` | `>=1.14` | Works | Actively maintained, adds new Python version support promptly. | +| `pytest` | `>=8.0` | Works | Actively maintained, 8.x series supports 3.14. | +| `pytest-asyncio` | `>=1.0` | Works | Pure Python pytest plugin. | +| `pytest-cov` | `>=6.0` | Works | Coverage measurement, actively maintained. | +| `inline-snapshot` | `>=0.15` | Works | FastMCP itself uses `inline-snapshot>=0.27.2` in dev deps on 3.13. | +| `dirty-equals` | `>=0.8` | Works | FastMCP itself uses `dirty-equals>=0.9.0` in dev deps. | + +### Infrastructure + +| Component | Python 3.14 Status | Notes | +|---|---|---| +| `python:3.14-slim` Docker image | **Available** | 3.14.3-slim on Docker Hub. Both bookworm and trixie variants. | +| `python:3.14t-slim` Docker image | **Not available** | [Issue #1082](https://github.com/docker-library/python/issues/1082) open on docker-library/python. No ETA. | +| `hatchling` build backend | Works | Pure Python, version-agnostic. | +| `uv` package manager | Works | Supports all Python versions. | +| `astral-sh/setup-uv` GitHub Action | Works | Version-agnostic, uses uv to install specified Python. 
| + | `pre-commit` / `pre-commit-uv` | Works | Version-agnostic tool runner. | + +--- + +## 5. Comparison: 3.12 vs 3.13 vs 3.14 for Banksy + +| Factor | 3.12 | 3.13 | 3.14 | +|--------|------|------|------| +| **Stability** | Mature, 2+ years in production | Stable, ~1.5 years old | Stable, ~5 months old (3.14.3) | +| **Support status** | Security-only since April 2025. EOL October 2028. | Active bugfix. EOL October 2029. | Active bugfix. EOL October 2030. | +| **GIL status** | Standard GIL, no free-threading | Experimental free-threaded build (`3.13t`) | Officially supported free-threaded build (`3.14t`, PEP 779) | +| **Docker `*-slim` image** | Available | Available | Available | +| **Docker `*t-slim` image** | N/A | Not available | Not available ([issue #1082](https://github.com/docker-library/python/issues/1082)) | +| **FastMCP CI-tested** | Not in matrix (3.10 and 3.13 tested) | Yes (in CI matrix) | Not in matrix | +| **FastMCP empirically verified** | Widely used | Yes | Yes (canvas-mcp) | +| **Pydantic support** | Full | Full | Works for standard patterns (see Section 3). Gaps in: Pydantic dataclasses + `Field()` + forward refs, complex generic `TypeVar`/`TypeAlias`/`Self` combos, V1 compat layer. | +| **SQLAlchemy support** | Full | Full | Full (PEP 649 issues resolved) | +| **PEP 649 (deferred annotations)** | N/A | N/A | Active — changes runtime annotation behavior | +| **Key new features** | Baseline | New interactive REPL, experimental free-threaded build and experimental JIT | Template strings, deferred annotations, subinterpreters, Zstandard | +| **Alignment with pdf-import** | No (they upgraded away from 3.12) | No | Yes (`==3.14.3`) | +| **Alignment with canvas-mcp** | No | No | Yes (`>=3.14`) | +| **Free-threading relevance to banksy** | N/A | N/A — experimental, and irrelevant for async servers | N/A — officially supported, but still irrelevant for async servers | + +--- + +## 6. 
Recommendation + +### Recommended: **(c) Move to Python 3.14 (standard build, not free-threaded)** + +Set `.python-version` to `3.14` and `requires-python = ">=3.14"` in `pyproject.toml`. Use the standard CPython build, not `python3.14t`. + +### Reasoning + +1. **Alignment with sibling projects.** Both canvas-mcp (`>=3.14`) and pdf-import (`==3.14.3`) are on Python 3.14. When banksy absorbs canvas-mcp's functionality, there will be no version mismatch. Staying on 3.12 would require canvas-mcp to downgrade or banksy to carry two Python version targets. + +2. **Active bugfix support.** Python 3.12 has been in security-only mode since April 2025. It receives no bugfix releases — only critical security patches. Python 3.14 is in active bugfix mode with releases every ~2 months (latest: 3.14.3, February 2026). This is the same concern that motivated pdf-import's migration. + +3. **Docker images are available.** `python:3.14-slim` (bookworm and trixie variants) is published on Docker Hub. No infrastructure blockers. + +4. **FastMCP works.** While not CI-tested on 3.14, FastMCP's `requires-python = ">=3.10"` permits it, and canvas-mcp provides empirical proof that `fastmcp>=3.1.0` works on 3.14 for an MCP server use case. + +5. **Pydantic works for our patterns.** The PEP 649 gaps in Pydantic are real but confined to patterns banksy won't use: Pydantic dataclasses with `Field()` + forward references, generic models with `TypeVar`/`TypeAlias`/`Self`, and V1 code. Banksy's use cases — `BaseModel` subclasses, `pydantic-settings`, FastMCP tool parameter validation, `from_openapi()` generated schemas — all work correctly. See Section 3 for the full analysis. + +6. **Longer support window.** Python 3.14 has security support until October 2030 (vs. 3.12 until October 2028), giving banksy a longer runway before requiring another version upgrade. + +### Why not 3.14t (free-threaded)? + +- Banksy's planned architecture is an asyncio-based FastMCP server (uvicorn + Starlette). 
The default runtime is a single event loop thread, so the GIL never contends. Free-threading provides no benefit for this I/O-bound workload. (See Section 1 caveats for scenarios where this assessment would change.) +- No official `python:3.14t-slim` Docker image exists ([issue #1082](https://github.com/docker-library/python/issues/1082)). +- FastMCP does not test against the free-threaded build in CI. +- Reduced library compatibility for C extensions (asyncpg, potential future dependencies). +- The standard build avoids the ~5-10% single-thread performance regression from free-threading overhead. + +### Why not 3.12 or 3.13? + +- **3.12:** Security-only mode. No bugfixes. Misaligned with canvas-mcp and pdf-import. Would require canvas-mcp to downgrade when absorbed. +- **3.13:** A reasonable middle ground, but doesn't align with either sibling project. If we're going to move past 3.12, there's no advantage to stopping at 3.13 when 3.14 is stable and aligned with the rest of the organization. + +### Caveats and mitigations + +| Risk | Mitigation | +|------|------------| +| PEP 649 (deferred annotations) — Pydantic gaps | Pin `pydantic>=2.12`. Use `BaseModel` (not `@pydantic.dataclasses.dataclass` with `Field()`). Avoid generic models with `TypeVar`/`TypeAlias`/`Self` combos. See Section 3 for full breakdown of which patterns work and which don't. | +| asyncpg C extension compatibility | Test database connectivity early in PR4 (DB schema PR). asyncpg is actively maintained and likely works. | +| FastMCP untested on 3.14 in CI | canvas-mcp proves it works. If issues arise, they'll be caught in PR1 when the echo tool is tested. | +| New Python version = less battle-tested | 3.14.3 is the third bugfix release. Major libraries have had 5+ months to adapt since 3.14.0 (October 2025). 
| + +### Concrete changes to the migration plan + +In PR1 (Python bootstrap), change: + +```toml +# .python-version +3.14 + +# pyproject.toml +requires-python = ">=3.14" + +# [tool.ruff] +target-version = "py314" + +# [tool.pyright] +pythonVersion = "3.14" + +# [tool.mypy] +python_version = "3.14" +``` + +In PR8 (Deploy), use: + +```dockerfile +FROM python:3.14-slim AS base +``` + +Do **not** use `python3.14t` or `base_python = "python3.14t"`. The standard build is correct for banksy. diff --git a/fastmcp-migration/banksy-research/tool-visibility-server-topology-research.md b/fastmcp-migration/banksy-research/tool-visibility-server-topology-research.md new file mode 100644 index 0000000..dc92151 --- /dev/null +++ b/fastmcp-migration/banksy-research/tool-visibility-server-topology-research.md @@ -0,0 +1,779 @@ +# Tool Visibility, Auth Modes, and Server Topology Research + +## 1. Executive Summary + +Banksy is a monorepo of MCP capabilities that serves different audiences with different +tool sets under different authentication modes. These three dimensions -- tool groups, +auth strategies, and server topology -- are deeply entangled and must be designed +together during the FastMCP migration. + +The current TypeScript/xmcp architecture solves this with two entirely separate +deployment modes: an internal mode (39 tools, SSO proxy + session JWTs) and a public mode +(87 tools, Mural OAuth tokens). Each mode is a separate Docker image, separate container, +separate Kubernetes deployment. The auth mode determines which tools are available because +it is a capability constraint: internal tools call mural-api's internal REST endpoints +with session JWTs, while public tools call mural-api's public REST API with OAuth access +tokens. These token types are incompatible across modes. + +Two hard constraints shape the FastMCP migration: + +1. **FastMCP enforces one auth provider per server instance.** The `auth=` parameter + accepts exactly one provider. 
All HTTP routes share the same auth middleware. When a + child server is mounted via `mount()`, the parent's auth applies -- the child's auth + is ignored. + +2. **The MCP protocol defines auth at the transport level, not per-tool.** There is no + mechanism in `tools/list` responses for a server to advertise per-tool auth + requirements or scopes. + +**Recommendation: Option E -- Deployment Mode Selection with Tag-Based Refinement.** +Build one Docker image. At runtime, `BANKSY_MODE` selects the auth provider and tool +set. Within each mode, tags provide finer-grained client-side filtering. This preserves +proven auth isolation, simplifies CI/CD to a single image, and provides a clean growth +path for new tool domains and auth modes. + + +## 2. Auth x Tool Group Matrix + +### Current State + +| Tool Group | Count | Auth Mode | Token Type | Backend Target | Why | +|---|---|---|---|---|---| +| Internal API tools | 39 | `sso-proxy` | Session JWT | `banksy-mural-api:5678` -> mural-api internal REST | Calls internal endpoints requiring corporate IdP session | +| Public API tools | 87 | `mural-oauth` | OAuth access token | `banksy-public-api:5679` -> mural-api `/api/public/v1` | Calls public API requiring user's Mural OAuth token | + +### Future Tool Groups + +| Tool Group | Expected Auth Mode | Rationale | +|---|---|---| +| canvas-mcp tools | `mural-oauth` (public) | Canvas operations use Mural's public API; no internal endpoints needed | +| Composite tools | Depends on downstream API | A tool calling both internal and public APIs would need both token types; no such tool exists today | +| Utility/diagnostic tools | None | Health checks, echo, version info need no auth | +| Machine-to-machine tools | `m2m` (future) | Worker-to-worker calls using `client_credentials` grant; already defined as a type in `auth-mode.ts` | + +### Cross-Mode Tool Overlap + +No tools currently work under both auth modes. The internal and public tool directories +are completely disjoint. 
Some tool names appear in both directories (e.g., +`get-mural-by-id`, `get-workspace`, `duplicate-mural`, `create-room`), but they import +from different generated clients (`client.banksy-mural-api` vs `client.banksy-public-api`) +and call different backend APIs. + +Both modes share infrastructure code: + +- `src/lib/call-mural-tool.ts` -- unified tool-calling function (backend URL differs per mode via `BANKSY_MURAL_API_URL`) +- `src/lib/auth/` -- Better Auth provider, session management, user context +- `src/lib/db/` -- PostgreSQL connection pool, token storage +- `src/lib/config.ts` -- centralized config with `AUTH_MODE` env var +- `src/lib/mural-session/` -- token refresh, content session creation +- `src/middleware.ts` -- Express middleware for auth and mural session + +### Auth Mode as Capability Constraint + +The auth mode is not merely a policy choice -- it is a capability constraint: + +- Internal tools call `banksy-mural-api` which proxies to mural-api's internal REST. + These endpoints require session JWTs obtained through Session Activation (SSO proxy -> + Google OAuth -> session creation). A Mural OAuth token cannot authenticate to these + endpoints. + +- Public tools call `banksy-public-api` which proxies to mural-api's `/api/public/v1`. + These endpoints require Mural OAuth access tokens with specific scopes + (`murals:read,murals:write,workspaces:read,...`). A session JWT cannot authenticate + to these endpoints. + +- The `callMuralTool` function retrieves the active token type via + `getActiveTokenType()`, which reads from `config().authMode`. The token type + (`session` vs `oauth`) determines which credentials are fetched from PostgreSQL and + sent to the backend. + +This means that even if both tool sets were registered on a single server, a tool from +the wrong auth mode would fail at the API call layer because the token type would be +wrong for the target endpoint. + + +## 3. 
FastMCP Capabilities and Constraints + +### 3.1 Single Auth Provider Per Server + +FastMCP accepts exactly one auth provider per server instance: + +```python +mcp = FastMCP("Banksy", auth=auth_provider) +``` + +The provider: +1. Registers discovery routes via `get_routes()` (e.g., `/.well-known/oauth-protected-resource`) +2. Registers HTTP middleware via `get_middleware()` (e.g., `BearerAuthBackend`, `AuthContextMiddleware`) +3. Supplies `verify_token()` for token validation + +All HTTP routes -- MCP transport, custom routes, and mounted sub-server routes -- share +the same auth middleware. There is no per-route or per-endpoint auth configuration built +into FastMCP. + +### 3.2 mount() Auth Inheritance + +Verified against FastMCP source (`server.py`, `transport.py`, `http.py`): + +- When `parent.mount(child)` is called, the child is wrapped as a `FastMCPProvider` + and added to the parent's provider list. +- Only the parent's `http_app()` is built. The child's `http_app()` is never + constructed. +- The parent's auth middleware applies to all traffic, including requests that invoke + child tools. +- The child's `auth=` parameter is completely ignored at runtime. + +| Scenario | Who controls auth? | Effect | +|---|---|---| +| `parent.mount(child)` where child has `auth=X` | Parent | Child's `auth=X` is ignored | +| `parent.mount(child)` where child has `auth=None` | Parent | Fine; parent's auth applies to all | +| Child tool has `tool.auth=AuthCheck` | Parent's token is used | `AuthCheck` runs against the parent's verified token | + +**Implication:** `mount()` cannot be used to create tool groups with different auth +strategies. All mounted sub-servers share the parent's single auth provider. + +### 3.3 Per-Tool Auth Checks (Component-Level) + +FastMCP supports `auth=AuthCheck` on individual tools: + +```python +@mcp.tool(auth=my_auth_check) +def admin_tool() -> str: ... 
+``` + +During `list_tools()` and `get_tool()`, the framework calls `run_auth_checks(tool.auth, +AuthContext(token=token))` where `token` is the token from the parent's auth middleware. + +This is a **visibility filter**, not a separate auth strategy. It can hide tools from +users who lack certain claims in their token, but it cannot change how the token was +obtained or validate a different token type. + +When tools are mounted from a child server, `FastMCPProviderTool.wrap()` does not copy +the child tool's `auth` field. The wrapped tool has `auth=None` on the parent. To add +per-tool auth checks to mounted tools, a custom Transform is needed. + +### 3.4 Transforms Pipeline + +Custom `Transform` subclasses filter or modify tools as they flow from providers to +clients: + +```python +class TagFilter(Transform): + def __init__(self, required_tags: set[str]): + self.required_tags = required_tags + + async def list_tools(self, tools: Sequence[Tool]) -> Sequence[Tool]: + return [t for t in tools if t.tags & self.required_tags] + + async def get_tool(self, name: str, call_next: GetToolNext) -> Tool | None: + tool = await call_next(name) + if tool and tool.tags & self.required_tags: + return tool + return None +``` + +Transforms can be applied at the server level (all providers) or provider level (specific +mount). They are a visibility mechanism, not a security boundary -- a client that knows a +tool's name can still attempt to call it unless a `get_tool` transform also blocks the +lookup. + +### 3.5 Server-Level Tag Filtering + +FastMCP provides `enable()` for startup-time tag filtering: + +```python +mcp = FastMCP("Production") +mcp.mount(api_server, namespace="api") +mcp.enable(tags={"production"}, only=True) +``` + +This applies tag filtering recursively to all mounted servers. Combined with +`tags={...}` on tool definitions, this provides a declarative way to control which +tools are visible at the server level. 
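The any-match semantics that both the `TagFilter` transform and `enable()` rely on can be exercised without FastMCP at all. A minimal standalone sketch -- `FakeTool` and `filter_by_tags` are illustrative stand-ins, not FastMCP APIs:

```python
from dataclasses import dataclass, field


@dataclass
class FakeTool:
    """Illustrative stand-in for a FastMCP Tool; only name and tags matter here."""

    name: str
    tags: set[str] = field(default_factory=set)


def filter_by_tags(tools: list[FakeTool], required: set[str]) -> list[FakeTool]:
    # Any-match: keep tools sharing at least one tag with `required`,
    # mirroring the `t.tags & self.required_tags` intersection check above.
    return [t for t in tools if t.tags & required]


tools = [
    FakeTool("get_mural_by_id", {"murals", "read"}),
    FakeTool("create_mural", {"murals", "write"}),
    FakeTool("echo", {"utility", "read"}),
]

read_only = [t.name for t in filter_by_tags(tools, {"read"})]
murals = [t.name for t in filter_by_tags(tools, {"murals"})]
# read_only -> ["get_mural_by_id", "echo"]
# murals   -> ["get_mural_by_id", "create_mural"]
```

One consequence of any-match filtering: a tool with an empty tag set matches no required-tag filter, so in this sketch untagged tools disappear from every filtered view -- worth keeping in mind when designing the tag taxonomy.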
+ +### 3.6 Custom Routes + +`custom_route()` registers non-MCP HTTP endpoints on the same Starlette app: + +```python +@mcp.custom_route("/health", methods=["GET"]) +async def health(request): + return JSONResponse({"status": "ok"}) +``` + +Custom routes from mounted sub-servers propagate to the parent (per current FastMCP +docs). Auth on custom routes comes from the parent's `AuthenticationMiddleware` but does +NOT get `RequireAuthMiddleware` (which only wraps the MCP transport route). Protected +custom routes must check `isinstance(request.user, AuthenticatedUser)` explicitly. + +### 3.7 MCP Protocol Constraints + +From the MCP specification (2025-03-26): + +- **Transport-level auth:** OAuth is defined at the HTTP transport layer. Every HTTP + request from client to server must include `Authorization: Bearer <token>`. +- **No per-tool auth:** `tools/list` responses contain tool name, description, + inputSchema, and annotations. There is no field for auth requirements, scopes, or + permissions. +- **No per-tool scopes:** The protocol has no mechanism for a server to advertise that + different tools need different OAuth scopes. +- **Clients support multiple servers:** Cursor, Claude Desktop, and VS Code Copilot all + support connecting to multiple MCP servers simultaneously. Each connection has its own + auth flow. Running separate servers per auth mode is transparent to the client. +- **No ecosystem precedent for multi-auth servers:** The MCP ecosystem uniformly assumes + one auth strategy per server. Multi-auth would require custom, non-standard solutions. + + +## 4. Architecture Options Analysis + +### Option A: Separate Servers per Auth Mode (Current Model Preserved) + +**Description:** One FastMCP server per auth mode, each with its own `auth=` provider. +Different Docker images, different deployments. Direct port of the current TS +architecture. 
+ +| Dimension | Assessment | +|---|---| +| Auth handling | Clean -- each server has exactly one auth strategy | +| Tool visibility | Hard isolation -- a tool only exists in one server | +| Operational model | Two Docker images, two deployments, two scaling configs | +| Code sharing | Shared library code in the monorepo, no runtime sharing | +| Growth path | New auth mode = new Docker image, new deployment | +| CI/CD | Duplicated build pipelines | + +**Pros:** +- Zero risk of auth cross-contamination +- Proven model; direct migration of current architecture +- Independent scaling per auth mode + +**Cons:** +- Duplicated Docker images and deployment configurations +- Changes to shared code require rebuilding both images +- Adding a new tool domain requires deciding which image it belongs to +- Operational overhead scales linearly with auth modes + +### Option B: Single Server with Auth Multiplexer + +**Description:** One FastMCP server with a custom `TokenVerifier` that inspects the +token format and dispatches to different verification logic (SSO JWT vs Mural OAuth +token). + +| Dimension | Assessment | +|---|---| +| Auth handling | Complex -- custom verifier must sniff token type | +| Tool visibility | Middleware-enforced -- verified auth mode determines callable tools | +| Security | Fragile -- token type detection by inspection is not reliable | +| Operational model | One deployment | + +**Pros:** +- Single deployment, single image +- All tools visible to clients that can authenticate + +**Cons:** +- Token type sniffing is fragile and a security risk. SSO JWTs and Mural OAuth tokens + are both JWTs; distinguishing them by format inspection is unreliable. 
+- A bug in the multiplexer could grant access to the wrong tool set +- FastMCP's auth middleware is designed for one verification path; subverting it adds + maintenance burden +- No ecosystem precedent; would be a novel, unsupported pattern + +**Verdict: Not recommended.** The security risk of token-sniffing outweighs the +operational simplicity. Auth mode is a capability constraint, not a policy choice, and +must be enforced by architecture, not middleware heuristics. + +### Option C: Single Image, Auth Mode as Runtime Config + +**Description:** One Docker image with a `BANKSY_MODE` environment variable +(`internal`, `public`, or future values). At startup, the server reads the mode, +configures the matching auth provider, and registers only the tools for that mode. + +| Dimension | Assessment | +|---|---| +| Auth handling | Clean -- each deployment instance has one auth provider | +| Tool visibility | Startup-time registration -- unregistered tools don't exist | +| Operational model | One Docker image, multiple deployments with different env vars | +| Code sharing | Full runtime sharing of infrastructure code | +| Growth path | New auth mode = new enum value, new registration function | +| CI/CD | Single build pipeline | + +**Pros:** +- Single Docker image simplifies CI/CD +- Auth isolation preserved per deployment +- Clean startup-time tool registration +- Shared infrastructure code (DB, config, middleware) runs once + +**Cons:** +- Composite tools that need multiple auth modes cannot run in any single mode +- Mode validation must happen at startup to fail fast +- Slightly more complex entrypoint logic + +### Option D: Server-per-Mount with Shared Infrastructure + +**Description:** Multiple FastMCP server instances in one process, each with its own +`auth=`. Shared database connections, HTTP clients, config. Different ports or path +prefixes behind a reverse proxy. 
+ +| Dimension | Assessment | +|---|---| +| Auth handling | Clean -- each ASGI app has its own auth | +| Tool visibility | Physical isolation per ASGI app | +| Operational model | One container, multiple ASGI apps, reverse proxy routing | +| Complexity | High -- reverse proxy config, port management, health checks per app | + +**Pros:** +- Shared infrastructure (DB pool, config) within one process +- Auth isolation per server instance +- Single container + +**Cons:** +- Significant operational complexity (reverse proxy, multi-port management) +- Health monitoring must cover multiple apps +- MCP clients would need separate server URLs per auth mode regardless +- Marginal benefit over Option C -- same isolation, more complexity + +**Verdict: Not recommended.** The added complexity of multi-ASGI routing within a single +container provides no meaningful advantage over Option C, where each deployment instance +is a clean, single-purpose server. + +### Option E: Hybrid -- Deployment Mode Selection with Tag-Based Refinement + +**Description:** Combines Option C's runtime mode selection with tag-based visibility +filtering within each mode. `BANKSY_MODE` selects the auth provider and broad tool set. +Tags and transforms provide finer-grained organization within that boundary. 
+ +| Dimension | Assessment | +|---|---| +| Auth handling | Clean -- deployment mode selects one auth provider | +| Tool visibility | Layered: deployment (coarse), tags/transforms (fine) | +| Operational model | One Docker image, multiple deployments | +| Growth path | New mode for new auth; new tags for new groupings within a mode | +| Flexibility | Tags enable client-side filtering without server changes | + +**Pros:** +- All of Option C's benefits (single image, clean auth, startup registration) +- Tags provide within-mode organization (e.g., "read-only", "admin", "canvas", "widgets") +- `enable(tags={...}, only=True)` can create specialized deployments within a mode +- Custom transforms can implement runtime visibility rules based on user claims +- Clients can request tool subsets via tag filtering +- Enterprise per-customer tool subsets achievable via tags + `enable()` at startup + +**Cons:** +- Tags are not a security boundary; they filter visibility, not access +- Tag taxonomy requires design and maintenance +- Slightly more concepts for developers to learn + +**Verdict: Recommended.** Option E provides the cleanest architecture for banksy's +current needs and future growth trajectory. + + +## 5. Recommended Architecture: Option E Detailed + +### 5.1 Runtime Mode Selection + +``` +BANKSY_MODE=internal -> FastMCP(auth=SSOProxyAuth) + internal tools + internal tags +BANKSY_MODE=public -> FastMCP(auth=MuralOAuthAuth) + public tools + public tags +BANKSY_MODE=dev -> FastMCP(auth=None) + all tools + all tags +``` + +At startup: + +1. Read `BANKSY_MODE` from environment (default: fail with clear error if unset) +2. Instantiate the matching auth provider (Layer 1) +3. Create the FastMCP server with `auth=provider` +4. Register tool domains for the mode (e.g., `register_internal_tools(mcp)` or + `register_public_tools(mcp)`) +5. Register common infrastructure: health endpoint, auth callback routes +6. 
Optionally apply tag filters via `mcp.enable(tags={...}, only=True)` if a + specialized deployment is needed + +### 5.2 Startup Flow + +```python +from banksy.config import settings + +def create_server() -> FastMCP: + auth = create_auth_provider(settings.banksy_mode) + mcp = FastMCP("Banksy", auth=auth) + + register_common_routes(mcp) # /health, /version + + match settings.banksy_mode: + case "internal": + register_internal_tools(mcp) + register_session_activation_routes(mcp) + case "public": + register_public_tools(mcp) + register_mural_oauth_routes(mcp) + case "dev": + register_internal_tools(mcp) + register_public_tools(mcp) + register_session_activation_routes(mcp) + register_mural_oauth_routes(mcp) + + if settings.enabled_tags: + mcp.enable(tags=settings.enabled_tags, only=True) + + return mcp +``` + +### 5.3 Auth Provider per Mode + +**Internal mode (`sso-proxy`):** +- Layer 1: `OAuthProxy` or `RemoteAuthProvider` with Google IdP via SSO proxy +- Layer 2: Session Activation flow stores session JWTs in `mural_tokens` table +- Tools call `banksy-mural-api` (internal REST) with session JWTs + +**Public mode (`mural-oauth`):** +- Layer 1: `OAuthProxy` wrapping Mural's OAuth authorization server +- Layer 2: Mural OAuth access/refresh tokens stored in `mural_tokens` table +- Tools call mural-api's public API with OAuth access tokens + +**Dev mode:** +- Layer 1: No auth (`auth=None` or `StaticTokenVerifier`) +- Layer 2: Tokens loaded from dev seed data or `DISABLE_AUTH=true` bypass +- Both tool sets registered; backend URLs configurable + +### 5.4 Tag-Based Refinement + +Within each mode, tags organize tools along orthogonal dimensions: + +```python +@mcp.tool(tags={"murals", "read"}) +def get_mural_by_id(mural_id: str) -> dict: ... + +@mcp.tool(tags={"murals", "write"}) +def create_mural(title: str, workspace_id: str) -> dict: ... + +@mcp.tool(tags={"widgets", "write"}) +def create_sticky_note(mural_id: str, text: str) -> dict: ... 
+``` + +Clients can filter by tags to get focused tool sets. A specialized deployment could +use `enable(tags={"murals"}, only=True)` to expose only mural-related tools. + +### 5.5 How MCP Clients Connect + +MCP clients (Cursor, Claude Desktop, VS Code Copilot) support multiple simultaneous +server connections. A typical user's MCP configuration: + +```json +{ + "mcpServers": { + "banksy-internal": { + "url": "https://banksy-internal.example.com/mcp" + }, + "banksy-public": { + "url": "https://banksy-public.example.com/mcp" + } + } +} +``` + +Each connection has its own independent OAuth flow. The client handles auth for each +server separately. From the user's perspective, all tools appear in a unified list +regardless of which server provides them. + + +## 6. Code Organization Proposal + +``` +src/banksy/ + __init__.py + server.py # Entry point: reads BANKSY_MODE, wires auth + domains + config.py # pydantic-settings with BANKSY_MODE, DB URLs, auth config + auth/ + __init__.py + providers.py # create_auth_provider(mode) -> AuthProvider | None + sso_proxy.py # OAuthProxy/RemoteAuthProvider for SSO proxy mode + mural_oauth.py # OAuthProxy wrapping Mural OAuth + token_manager.py # Layer 2: per-user Mural token CRUD and refresh + token_verifier.py # BanksyTokenVerifier (custom subclass of TokenVerifier) + domains/ + __init__.py + internal/ + __init__.py # register_internal_tools(mcp) function + tools.py # Tool definitions (from_openapi or manual) + public/ + __init__.py # register_public_tools(mcp) function + tools.py # Tool definitions (from_openapi or manual) + canvas/ + __init__.py # register_canvas_tools(mcp) function (future) + tools.py + shared/ + __init__.py # Utility tools shared across modes (echo, health) + tools.py + routes/ + __init__.py + health.py # GET /health custom_route + session_activation.py # POST /auth/mural-link/code, /claim (internal mode) + mural_oauth_callback.py # GET /auth/mural-oauth/callback (public mode) + middleware/ + __init__.py + 
logging.py # MCP protocol middleware for request logging + metrics.py # MCP protocol middleware for Datadog metrics + db/ + __init__.py + engine.py # SQLAlchemy async engine + models.py # mural_tokens, pending_connections tables + token_store.py # CRUD for mural_tokens + spa.py # SpaStaticFiles for serving the React auth UI +``` + +### Key Design Decisions + +**`domains/` directory:** Each tool domain is a self-contained package with a +`register_*_tools(mcp)` function. This function takes a `FastMCP` instance and registers +all tools for that domain, including tags and metadata. The domain owns its tool +definitions, schemas, and any domain-specific helpers. + +**`auth/providers.py` as factory:** A single `create_auth_provider(mode)` function +returns the correct `AuthProvider` for the given mode. This keeps the server entry point +clean and makes mode-specific auth configuration testable in isolation. + +**`routes/` for custom HTTP endpoints:** Non-MCP HTTP routes (health, auth callbacks, +OAuth flows) are organized by concern, not by mode. Mode-specific routes are registered +conditionally in `server.py` based on `BANKSY_MODE`. + +**`shared/` domain:** Tools that work under any auth mode (or no auth) live here. The +`echo` tool and future diagnostic tools belong in this domain. + + +## 7. Tag Taxonomy Proposal + +Tags are a client-side filtering mechanism. They organize tools for discoverability and +enable specialized deployments via `enable()`. Tags are NOT a security boundary. 
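To make the filtering semantics concrete, here is a minimal pure-Python sketch.
`filter_tools` is a hypothetical helper that mirrors the behavior described for
`enable(tags=..., only=True)`; it is not FastMCP's actual implementation, which
operates on registered server components:

```python
# Hypothetical model of tag filtering: a tool survives the filter if it
# carries at least one of the requested tags. Illustrative only.

def filter_tools(
    tools: dict[str, set[str]],  # tool name -> its tag set
    tags: set[str],
) -> set[str]:
    """Return the tool names that carry at least one of the given tags."""
    return {name for name, tool_tags in tools.items() if tool_tags & tags}

tools = {
    "get_mural_by_id": {"murals", "read"},
    "create_mural": {"murals", "write"},
    "create_sticky_note": {"widgets", "write"},
}

# A murals-only deployment keeps both mural tools and drops the widget tool.
assert filter_tools(tools, {"murals"}) == {"get_mural_by_id", "create_mural"}
```

Because filtering happens after registration, the same tool inventory serves every
deployment; only the visible subset changes.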
+ +### Taxonomy Dimensions + +Three orthogonal dimensions allow cross-cutting queries: + +**Domain tags** (which API surface): +- `internal-api` -- tools calling mural-api internal REST +- `public-api` -- tools calling mural-api public REST +- `canvas` -- future canvas-mcp tools +- `utility` -- diagnostic and infrastructure tools + +**Entity tags** (what resource type): +- `murals` -- mural CRUD and lifecycle +- `workspaces` -- workspace operations +- `rooms` -- room management +- `widgets` -- widget creation and manipulation +- `templates` -- template operations +- `users` -- user management and invitations +- `assets` -- file and image assets +- `voting` -- voting session management +- `labels` -- label/tag operations +- `search` -- search across entities + +**Capability tags** (what operation type): +- `read` -- read-only operations (GET) +- `write` -- create/update operations (POST/PUT/PATCH) +- `delete` -- destructive operations (DELETE) +- `admin` -- administrative operations (user creation, company setup) + +### Example Tool Tagging + +```python +# Internal mode tool +@mcp.tool(tags={"internal-api", "murals", "read"}) +def get_mural_by_id(mural_id: str) -> dict: ... + +# Public mode tool +@mcp.tool(tags={"public-api", "widgets", "write"}) +def create_sticky_note(mural_id: str, text: str) -> dict: ... + +# Utility tool (no auth needed) +@mcp.tool(tags={"utility", "read"}) +def echo(message: str) -> str: ... 
+```
+
+### Cross-Cutting Queries
+
+Tags enable queries like:
+- "All read-only mural tools" -> `{"murals", "read"}`
+- "All widget tools" -> `{"widgets"}`
+- "All admin tools" -> `{"admin"}`
+- "All public API tools for rooms" -> `{"public-api", "rooms"}`
+
+### Specialized Deployments
+
+Tags combined with `enable()` allow deployment-time subsetting:
+
+```python
+# Read-only deployment for auditors
+mcp.enable(tags={"read"}, only=True)
+
+# Mural-focused deployment
+mcp.enable(tags={"murals"}, only=True)
+
+# Full deployment (default -- no filtering)
+# Don't call enable() at all
+```
+
+### Tag Governance
+
+- Domain tags are mandatory: every tool must have exactly one domain tag
+- Entity tags are mandatory: every tool must have at least one entity tag
+- Capability tags are mandatory: every tool must have exactly one of `read`, `write`,
+  `delete`, or `admin`
+- New tags require updating this taxonomy document
+- Tags must use lowercase kebab-case
+
+
+## 8. Migration Plan Impact
+
+The following updates are needed to the migration execution strategy plan at
+`willik-notes/fastmcp-migration/execution-strategy-research/00-migration-execution-strategy.plan.md`:
+
+### 8.1 Resolution of the Mode Merging Open Question
+
+The open question at approximately line 318 ("A dedicated research prompt is needed to
+explore whether mode merging is feasible or whether mode selection should be preserved as
+a runtime configuration flag") is now resolved:
+
+**Mode merging is not recommended. Mode selection is preserved as a runtime
+configuration flag (`BANKSY_MODE`).**
+
+Rationale: Auth modes are capability constraints, not policy choices. Internal and public
+tools call different APIs with incompatible token types. FastMCP's one-auth-per-server
+constraint means a single server cannot cleanly handle multiple auth strategies. MCP
+clients support multiple servers, so running a separate deployment per auth mode is
+transparent to users.
+ +### 8.2 Server Topology Updates + +The plan's current server topology section describes `mount()` for composing sub-servers: + +```python +mcp.mount("internal", internal_api) +mcp.mount("public", public_api) +``` + +This should be updated to reflect that `mount()` is used within a single mode for +organizing tools by namespace, not across auth modes: + +```python +# In BANKSY_MODE=internal +mcp.mount(internal_api, namespace="mural") + +# In BANKSY_MODE=public +mcp.mount(public_api, namespace="mural") +mcp.mount(canvas_tools, namespace="canvas") # future +``` + +### 8.3 Docker Build Updates + +The plan should specify a single `Dockerfile` that produces one image. The two current +Dockerfiles (`Dockerfile` and `Dockerfile.mural-oauth`) are replaced by a single +Dockerfile. The deployment mode is selected at runtime via `BANKSY_MODE` env var, not at +build time via different configs. + +### 8.4 Config Schema Updates + +Add to the `config.py` pydantic-settings model: + +```python +class Settings(BaseSettings): + banksy_mode: Literal["internal", "public", "dev"] = "dev" + enabled_tags: set[str] | None = None # Optional tag filter at startup + # ... existing fields ... 
+``` + +### 8.5 Phase Updates + +**Phase 2 (Public API tools):** +- Register public tools when `BANKSY_MODE=public` or `BANKSY_MODE=dev` +- Apply `public-api` domain tag to all public tools +- Wire Mural OAuth auth provider for public mode + +**Phase 3 (Tool Curation):** +- Implement tag taxonomy from this document +- Add `enable(tags=..., only=True)` support for specialized deployments +- Tag-based visibility is a refinement layer, not a replacement for mode selection + +**Phase 4+ (Internal API tools):** +- Register internal tools when `BANKSY_MODE=internal` or `BANKSY_MODE=dev` +- Apply `internal-api` domain tag to all internal tools +- Wire SSO proxy auth provider for internal mode + +### 8.6 Decision 8 Resolution (Tool Tags and Meta) + +Decision 8 in the plan ("Tool tags and meta -- use as tag-based visibility in Phase 3, or +leave for implementation") should be resolved as: + +**Use tags as the primary tool organization mechanism.** Follow the three-dimensional +taxonomy (domain, entity, capability) defined in Section 7. Tags are mandatory on all +tools. `meta={}` can carry additional structured metadata (e.g., API version, rate limit +hints) but is not used for visibility filtering. + + +## 9. Open Questions + +### 9.1 Composite Tools Needing Multiple Auth Modes + +No composite tools exist today that call both internal and public APIs. If one is needed: + +- **Option A:** Run it in a mode that has access to one API, and call the other API via + a service-to-service token (machine-to-machine auth). +- **Option B:** Split the composite operation into two tools, one per mode, and let the + AI agent orchestrate them. +- **Option C:** Create a new mode that carries both token types (requires custom Layer 2 + token management). + +**Recommendation:** Defer until a concrete use case exists. Option B (agent orchestration) +is the most aligned with MCP's design philosophy of composable tools. 
+ +### 9.2 Enterprise Per-Customer Tool Subsets + +Some enterprise customers may need restricted tool sets (e.g., read-only, no admin +tools). This is achievable with the recommended architecture: + +- Deploy with `BANKSY_MODE=public` and `ENABLED_TAGS=read` to restrict to read-only tools +- Or implement a custom `Transform` that filters tools based on claims in the user's + token (e.g., organization membership, role) +- No architectural changes needed + +### 9.3 API Key Auth for Machine-to-Machine + +The `m2m` auth mode is already defined as a type in the current TS codebase +(`auth-mode.ts`). In the FastMCP migration: + +- Add `BANKSY_MODE=m2m` as a valid mode +- Implement a `TokenVerifier` that validates API keys (or client_credentials JWTs) +- Register the appropriate tool set for machine-to-machine use cases +- Same architecture, new mode value + +### 9.4 canvas-mcp Auth Requirements + +The canvas-mcp prototype currently has no auth. When absorbed into banksy: + +- Confirm that canvas operations use Mural's public API (most likely) +- If so, canvas tools belong in `BANKSY_MODE=public` with a `canvas` domain tag +- If canvas tools need internal API access, they belong in `BANKSY_MODE=internal` +- The `domains/canvas/` package structure is ready for either case + +### 9.5 Migration Ordering + +The migration plan currently targets public API tools first (Phase 2). This research +confirms that is the right sequencing: + +- Public mode is the primary external-facing deployment +- Public API tools (87) outnumber internal tools (39) by more than 2:1 +- canvas-mcp absorption (public mode) will happen alongside public tool migration +- Internal mode can be migrated later with the SSO proxy auth provider + +### 9.6 `from_openapi()` and Mode Selection + +The migration plan uses `FastMCP.from_openapi()` to generate tools from OpenAPI specs. 
+This needs to work with mode selection: + +- In `BANKSY_MODE=public`: load the public API OpenAPI spec, generate tools, tag with + `public-api` +- In `BANKSY_MODE=internal`: load the internal API OpenAPI spec, generate tools, tag with + `internal-api` +- In `BANKSY_MODE=dev`: load both specs, generate both tool sets with appropriate + namespace prefixes + +Verify that `from_openapi()` supports passing `tags=` to generated tools. If not, tags +can be applied via a `Transform` after generation. diff --git a/fastmcp-migration/execution-strategy-research/00-migration-execution-strategy.md b/fastmcp-migration/execution-strategy-research/00-migration-execution-strategy.md new file mode 100644 index 0000000..d6daa3e --- /dev/null +++ b/fastmcp-migration/execution-strategy-research/00-migration-execution-strategy.md @@ -0,0 +1,1198 @@ + +# Banksy xmcp-to-FastMCP Migration + +## Summary + +Rewrite banksy from a 3-process TypeScript/xmcp architecture to a Python/FastMCP server. `BANKSY_MODE` (internal/public/dev) selects the auth provider and tool set at runtime — one Docker image, multiple deployments. Two `FastMCP.from_openapi()` calls replace both `banksy-mural-api` (internal API, 39 tools) and `banksy-public-api` (Public API, 87 tools) code-gen pipelines. Auth uses FastMCP's built-in OAuth with Google as the initial IdP for Layer 1 (IDE to banksy) plus custom Python for Layer 2 (banksy to Mural API) token management. Database is a fresh PostgreSQL schema (no data migration). A React SPA is preserved for browser-facing pages (home, Session Activation, error) and served from the same process via Starlette's `StaticFiles`. + +The repo uses a uv workspace structure under `pypackages/` — only `banksy-server` is created now. The workspace is ready to expand with `banksy-shared` (extracted shared code) and `banksy-harness` (agent orchestration) when those consumers are needed. 
Existing TS code in `packages/` stays as read-only reference until the final cleanup removes all TypeScript artifacts. + +```mermaid +graph TD + subgraph before ["Current (xmcp, 3 processes, 2 images)"] + Client1["LLM Client"] -->|"MCP HTTP"| Core["banksy-core :3001"] + Core -->|"MCP HTTP"| MuralAPI["banksy-mural-api :5678"] + Core -->|"MCP HTTP"| PublicAPI["banksy-public-api :5679"] + MuralAPI -->|REST| MURAL1["Mural API (internal)"] + PublicAPI -->|REST| MURAL1P["Mural API (public)"] + Core -.->|"code-gen at build time"| MuralAPI + Core -.->|"code-gen at build time"| PublicAPI + end + + subgraph after ["Target (FastMCP, 1 image, BANKSY_MODE per deploy)"] + ClientInt["LLM Client"] -->|"MCP HTTP"| InternalDeploy["banksy BANKSY_MODE=internal"] + ClientPub["LLM Client"] -->|"MCP HTTP"| PublicDeploy["banksy BANKSY_MODE=public"] + InternalDeploy -->|REST| MURAL2I["Mural API (internal)"] + PublicDeploy -->|REST| MURAL2P["Mural API (public)"] + Browser["Browser"] -->|"SPA + auth routes"| PublicDeploy + end +``` + +**Branch**: `feat/transition-to-fast-mcp` — work merges into this branch. + +--- + +## Sequencing Overview + +Work is organized into phases with dependency awareness. Phases can be packaged into PRs however makes sense — the sequencing defines what must come before what, not delivery boundaries. 
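As a sanity check, the phase dependencies in this section can be validated mechanically. This sketch encodes the same edges with the stdlib's `graphlib`; the phase numbers match the sequencing table, which remains the source of truth:

```python
from graphlib import TopologicalSorter

# Phase dependencies from the sequencing table (phase number -> prerequisites).
deps = {
    1: set(), 2: {1}, 3: {2}, 4: {1}, 5: {1},
    6: {4}, 7: {2, 5, 6}, 8: {3, 7}, 9: {8},
}

order = list(TopologicalSorter(deps).static_order())

# Any valid schedule starts with bootstrap and ends with deploy + cleanup.
assert order[0] == 1 and order[-1] == 9
# Phases 2, 4, and 5 only require phase 1, so they can run in parallel.
assert all(deps[p] == {1} for p in (2, 4, 5))
```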
+ +```mermaid +graph LR + P1["1 Bootstrap"] --> P2["2 OpenAPI Tools"] + P2 --> P3["3 Tool Curation"] + P1 --> P4["4 Database"] + P1 --> P5["5 SPA Setup"] + P4 --> P6["6 Auth Layer 1"] + P6 --> P7["7 Auth Layer 2"] + P2 --> P7 + P3 --> P8["8 Testing"] + P7 --> P8 + P5 --> P7 + P8 --> P9["9 Deploy + Cleanup"] +``` + +| Phase | What It Delivers | Depends On | Parallelism | +|-------|-----------------|------------|-------------| +| 1 Bootstrap | uv workspace skeleton (root + `banksy-server` under `pypackages/`), echo tool, health endpoint, `BANKSY_MODE` config, CI | Nothing | -- | +| 2 OpenAPI Tools | `from_openapi()` integration, Mural API tools | 1 | Parallel with 4, 5 | +| 3 Tool Curation | LLM-friendly names, descriptions, transforms, composites | 2 | -- | +| 4 Database | PostgreSQL schema, Alembic migrations, token storage | 1 | Parallel with 2, 5 | +| 5 SPA Setup | `ui/` directory, Vite build, `SpaStaticFiles` mount | 1 | Parallel with 2, 4 | +| 6 Auth Layer 1 | IDE-to-banksy auth (RemoteAuthProvider/OAuthProxy, JWT) | 4 | -- | +| 7 Auth Layer 2 | Banksy-to-Mural tokens, Session Activation, injection | 2, 5, 6 | -- | +| 8 Testing | Comprehensive test suite, CI integration | 3, 7 | -- | +| 9 Deploy + Cleanup | Python Dockerfile, TS removal, README rewrite | 8 | -- | + +--- + +## Server Topology + +The current TS architecture uses two separate deployment modes: **sso-proxy** (internal, 39 tools, session JWTs) and **mural-oauth** (public, 87 tools, OAuth tokens). Each mode is a separate Docker image. The auth mode is a capability constraint, not a policy choice — internal tools call mural-api's internal REST with session JWTs, while public tools call the public REST API with OAuth access tokens. These token types are incompatible across modes. + +Two hard constraints shape the FastMCP migration: + +1. **FastMCP enforces one auth provider per server instance.** The `auth=` parameter accepts exactly one provider. 
When a child server is mounted via `mount()`, the parent's auth applies — the child's auth is ignored. +2. **The MCP protocol defines auth at the transport level, not per-tool.** There is no mechanism in `tools/list` responses for a server to advertise per-tool auth requirements. + +### Deployment Mode Selection (Option E) + +Build one Docker image. At runtime, `BANKSY_MODE` selects the auth provider and tool set. Within each mode, tags provide finer-grained client-side filtering. + +``` +BANKSY_MODE=internal -> FastMCP(auth=SSOProxyAuth) + internal tools + internal tags +BANKSY_MODE=public -> FastMCP(auth=MuralOAuthAuth) + public tools + public tags +BANKSY_MODE=dev -> FastMCP(auth=None) + all tools + all tags +``` + +### Startup Flow + +```python +from banksy_server.config import settings + +def create_server() -> FastMCP: + auth = create_auth_provider(settings.banksy_mode) + mcp = FastMCP("Banksy", auth=auth) + + register_common_routes(mcp) # /health, /version + + match settings.banksy_mode: + case "internal": + register_internal_tools(mcp) + register_session_activation_routes(mcp) + case "public": + register_public_tools(mcp) + register_mural_oauth_routes(mcp) + case "dev": + register_internal_tools(mcp) + register_public_tools(mcp) + register_session_activation_routes(mcp) + register_mural_oauth_routes(mcp) + + if settings.enabled_tags: + mcp.enable(tags=settings.enabled_tags, only=True) + + return mcp +``` + +### Auth Provider per Mode + +**Internal mode (`sso-proxy`):** Layer 1 uses `OAuthProxy` or `RemoteAuthProvider` with Google IdP via SSO proxy. Layer 2 stores session JWTs in `mural_tokens`. Tools call `banksy-mural-api` (internal REST) with session JWTs. + +**Public mode (`mural-oauth`):** Layer 1 uses `OAuthProxy` wrapping Mural's OAuth authorization server. Layer 2 stores Mural OAuth access/refresh tokens in `mural_tokens`. Tools call mural-api's public API with OAuth access tokens. 
+ +**Dev mode:** Layer 1 has no auth (`auth=None` or `StaticTokenVerifier`). Layer 2 tokens loaded from dev seed data. Both tool sets registered; backend URLs configurable. + +### MCP Client Connection Model + +MCP clients (Cursor, Claude Desktop, VS Code Copilot) support multiple simultaneous server connections. Each connection has its own independent OAuth flow. A typical user's MCP configuration: + +```json +{ + "mcpServers": { + "banksy-internal": { + "url": "https://banksy-internal.example.com/mcp" + }, + "banksy-public": { + "url": "https://banksy-public.example.com/mcp" + } + } +} +``` + +From the user's perspective, all tools appear in a unified list regardless of which server provides them. + +**Deep dive**: [banksy-architecture-research.md](../banksy-research/banksy-architecture-research.md) Sections 2–5; [tool-visibility-server-topology-research.md](../banksy-research/tool-visibility-server-topology-research.md) + +--- + +## Repo Layout + +The Python project uses a uv workspace under `pypackages/`. Only `banksy-server` is created now — the workspace glob (`pypackages/*`) auto-discovers future members when they're added. Existing TS code in `packages/` stays untouched as read-only reference. The final cleanup removes all TS artifacts. 
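A sketch of the new workspace root `pyproject.toml`. The member glob comes from this plan; whether the root needs a `[project]` table of its own is left open here, since uv supports a "virtual" root that only declares members:

```toml
# Workspace root -- declares members; each member under pypackages/ has its
# own pyproject.toml (only banksy-server exists initially).
[tool.uv.workspace]
members = ["pypackages/*"]
```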
+ +### During Transition + +``` +banksy/ +├── packages/ # EXISTING TS (untouched, read-only reference) +│ ├── banksy-core/ +│ ├── banksy-mural-api/ +│ └── banksy-public-api/ +├── pypackages/ # Python workspace members +│ └── server/ # banksy-server (only member created now) +│ ├── pyproject.toml +│ ├── src/ +│ │ └── banksy_server/ +│ │ ├── __init__.py +│ │ ├── server.py # Entry point: reads BANKSY_MODE, wires auth + domains +│ │ ├── config.py # pydantic-settings with BANKSY_MODE, DB URLs, auth +│ │ ├── mural_api.py # FastMCP.from_openapi() integration +│ │ ├── spa.py # SpaStaticFiles class +│ │ ├── auth/ # providers.py, sso_proxy.py, mural_oauth.py, token_manager.py +│ │ ├── domains/ # Tool organization (see Code Organization section) +│ │ │ ├── internal/ # register_internal_tools(mcp) +│ │ │ ├── public/ # register_public_tools(mcp) +│ │ │ ├── canvas/ # register_canvas_tools(mcp) (future) +│ │ │ └── shared/ # Utility tools (echo, health) +│ │ ├── routes/ # health.py, session_activation.py, mural_oauth_callback.py +│ │ ├── middleware/ # logging.py, metrics.py +│ │ ├── models/ # SQLAlchemy models (extract to banksy-shared later) +│ │ └── db/ # Engine, session factory, token storage +│ └── tests/ +│ ├── conftest.py +│ ├── test_tools/ +│ ├── test_auth/ +│ └── test_integration/ +│ # Future members (not created yet — workspace glob picks them up when added): +│ # └── shared/ # banksy-shared: Pydantic models, auth utils, Mural client +│ # └── harness/ # banksy-harness: MCP client, agent orchestration +├── ui/ # React SPA (standalone Node.js project) +│ ├── package.json +│ ├── package-lock.json +│ ├── vite.config.ts +│ ├── tailwind.config.mjs +│ ├── tsconfig.json +│ ├── index.html +│ ├── .npmrc +│ └── src/ +│ ├── main.tsx +│ ├── index.css +│ └── components/ +│ ├── home-page.tsx +│ ├── claude-landing.tsx +│ ├── session-activation.tsx +│ ├── completion-page.tsx +│ ├── error-page.tsx +│ └── shared/ # AuthPageLayout, MuralLogo, etc. 
+├── conftest.py # Root-level shared test fixtures +├── migrations/ # Alembic migrations +│ ├── alembic.ini +│ ├── env.py +│ └── versions/ +├── pyproject.toml # NEW: Workspace root ([tool.uv.workspace] members = ["pypackages/*"]) +├── uv.lock # NEW: Single lockfile for all workspace members +├── .python-version # NEW: Python version pin (3.14) +├── .pre-commit-config.yaml # NEW: Python git hooks +├── .github/workflows/ +│ ├── build.yml # EXISTING: TS Docker builds (kept until cleanup) +│ ├── quality.yml # EXISTING: TS lint/test (kept until cleanup) +│ └── python.yml # NEW: Python CI (uv run --package banksy-server) +├── Dockerfile # EXISTING: TS sso-proxy build (replaced at cleanup) +├── Dockerfile.mural-oauth # EXISTING: TS mural-oauth (deleted at cleanup) +├── package.json # EXISTING: TS workspace root (deleted at cleanup) +├── package-lock.json # EXISTING: npm lockfile (deleted at cleanup) +├── eslint.config.mjs # EXISTING: TS linting (deleted at cleanup) +├── .prettierrc # EXISTING: TS formatting (deleted at cleanup) +├── knip.jsonc # EXISTING: TS dead code (deleted at cleanup) +├── .nvmrc # EXISTING: Node version (deleted at cleanup) +├── .husky/ # EXISTING: TS hooks (replaced by pre-commit in Phase 1) +└── .gitignore # UPDATED: Add Python patterns +``` + +### After Cleanup (Phase 9) + +``` +banksy/ +├── pypackages/ # Can optionally rename to packages/ after TS removal +│ └── server/ +│ ├── pyproject.toml +│ ├── src/banksy_server/ +│ │ ├── server.py +│ │ ├── config.py +│ │ ├── mural_api.py +│ │ ├── spa.py +│ │ ├── auth/ +│ │ ├── domains/ +│ │ ├── routes/ +│ │ ├── middleware/ +│ │ ├── models/ +│ │ └── db/ +│ └── tests/ +│ # Add shared/ and harness/ here when needed +├── conftest.py +├── ui/ +│ ├── package.json +│ ├── package-lock.json +│ ├── vite.config.ts +│ ├── .npmrc +│ └── src/ +├── migrations/ +├── pyproject.toml # Workspace root +├── uv.lock +├── .python-version +├── .pre-commit-config.yaml +├── .github/workflows/ +│ ├── build.yml # Rewritten for Python + 
SPA Docker +│ └── quality.yml # Rewritten for Python lint/test +├── Dockerfile.server # Workspace-aware uv build (replaces both TS Dockerfiles) +├── Makefile +├── .gitignore +├── docs/ +│ ├── AUTH.md +│ └── SECURITY_POSTURE.md +└── README.md +``` + +### Workspace Expansion Path + +When a second consumer needs shared code (e.g., the agent harness), extract `banksy-shared` from `banksy-server`: + +1. Create `pypackages/shared/pyproject.toml` with `pydantic`, `sqlalchemy`, `httpx` +2. Move `models/`, `auth/token_utils.py`, `mural_client/` into `banksy_shared` +3. Add `banksy-shared` as a workspace dependency in the server's `pyproject.toml` + +When agent orchestration work begins, create `pypackages/harness/pyproject.toml` with `openai`, `anthropic`, `mcp` SDK. The workspace glob auto-discovers both. + +### CI During Transition + +Both pipelines run in parallel during the transition: + +- `quality.yml` (TS) — passes because TS code is untouched +- `python.yml` (new) — `uv run --package banksy-server` for `ruff check`, `ruff format --check`, `pyright`, `pytest` +- `build.yml` (TS Docker) — continues building TS images; not meaningful but not harmful + +At cleanup: delete `quality.yml`, rewrite `build.yml` for Python + SPA Docker. When workspace members are added, extend `python.yml` with per-package test jobs. + +--- + +## Toolchain + +| Category | Tool | Notes | +|---|---|---| +| Project/Dependency | **uv** | FastMCP-aligned, built-in Python version mgmt, PEP 621 | +| Linting + Formatting | **Ruff** | Same vendor as uv, sub-second, used by FastMCP itself | +| Type Checking | **Pyright** (strict mode) | Editor + CI; strict mode catches Pydantic and SQLAlchemy issues. mypy was evaluated (see doc 06) but dropped per the canvas-mcp alignment assessment — Pyright strict is sufficient without a second checker. 
| +| Testing | **pytest + pytest-asyncio** | Required by FastMCP's test utilities (`Client`, `HeadlessOAuth`) | +| Database ORM | **SQLAlchemy 2.0 async** (asyncpg driver) | Alembic integration, auto-migration generation | +| Migrations | **Alembic** | Reads SQLAlchemy models for auto-diff | +| HTTP Client | **httpx + httpx-retries** | Mandatory — `from_openapi()` takes an `httpx.AsyncClient`. Tool-level retries via `stamina` deferred — add if needed after initial integration. | +| Configuration | **pydantic-settings** | Type-safe env vars; FastMCP uses it internally | +| Git Hooks | **pre-commit + pre-commit-uv** | Ruff hooks for linting/formatting on commit | +| SPA Build | **Vite + React + Tailwind** | Standalone `ui/` project with Mural Design System packages | + +### Key Integration Details + +- `from_openapi()` takes an `httpx.AsyncClient` directly — use lifespan-managed connection pool +- `Client(transport=server)` provides in-memory testing; `asyncio_mode = "auto"` eliminates decorator boilerplate +- `create_async_engine("postgresql+asyncpg://...")` for async DB access +- Alembic async requires `run_async()` in `env.py` +- `pre-commit-uv` plugin lets pre-commit use uv for hook environments +- Pyright strict mode handles both Pydantic and SQLAlchemy models without additional plugins +- SPA builds with `npm run build` in `ui/`, output goes to `ui/dist/` + +--- + +## OpenAPI Integration + +Replace both `banksy-mural-api` (internal API, 39 tools) and `banksy-public-api` (Public API, 87 tools) with `FastMCP.from_openapi()` calls. Each sub-server wraps a different OpenAPI spec and mounts onto the main server. + +### from_openapi() Setup + +Two separate `from_openapi()` sub-servers, one per API spec: + +1. 
**Internal API sub-server** — wraps the Mural internal API (base path `/api/v0/`) + - Load spec via `MURAL_INTERNAL_API_SPEC` env var + - Filter to the operation IDs in `packages/banksy-mural-api/.tools` (the tool allowlist) + - Some tools require content session tokens (see Content Session Tokens below) + +2. **Public API sub-server** — wraps the Mural Public API (base path `/api/public/v1/`) + - Load spec via `MURAL_PUBLIC_API_SPEC` env var + - Filter to the operation IDs currently exposed by `banksy-public-api` + - Uses standard OAuth tokens for all operations + +Both use `RouteMap`: GET → RESOURCE, POST/PUT/DELETE → TOOL, deprecated/internal → EXCLUDE. Each mounts onto the server within its respective `BANKSY_MODE` — `mount()` organizes tools by namespace within a single mode, not across auth modes (see Server Topology). + +**Phasing**: Start with the Public API spec in Phase 2 (when `BANKSY_MODE=public` or `dev`). Add the internal API spec as a follow-on (when `BANKSY_MODE=internal` or `dev`). The plumbing is identical — `from_openapi()` is called with different specs and different httpx clients (different base URLs, different auth injection per mode). + +```python +# In BANKSY_MODE=public (or dev) +public_api = FastMCP.from_openapi( + openapi_spec=public_spec, + client=public_http_client, + route_map=ROUTE_MAP, + name="mural-public-api", +) +mcp.mount(public_api, namespace="mural") + +# In BANKSY_MODE=internal (or dev) +internal_api = FastMCP.from_openapi( + openapi_spec=internal_spec, + client=internal_http_client, + route_map=ROUTE_MAP, + name="mural-internal-api", +) +mcp.mount(internal_api, namespace="mural") +``` + +No code generation scripts are needed — `from_openapi()` generates tools at server startup, not at build time. This eliminates the entire `create-tool-handlers.mjs` / `@ivotoby/openapi-mcp-server` pipeline. 
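The route-mapping rule reduces to a small pure function. This models the classification policy only (FastMCP's real `RouteMap` objects have a different shape), and it assumes PATCH maps like POST/PUT:

```python
# Model of the route-mapping policy: GET -> RESOURCE, mutating methods ->
# TOOL, deprecated/internal routes -> EXCLUDE. Illustrative, not the
# FastMCP RouteMap API.

def classify_route(method: str, *, deprecated: bool = False, internal: bool = False) -> str:
    if deprecated or internal:
        return "EXCLUDE"
    method = method.upper()
    if method == "GET":
        return "RESOURCE"
    if method in {"POST", "PUT", "PATCH", "DELETE"}:  # PATCH assumed mutating
        return "TOOL"
    return "EXCLUDE"  # anything unrecognized is left out

assert classify_route("GET") == "RESOURCE"
assert classify_route("delete") == "TOOL"
assert classify_route("GET", deprecated=True) == "EXCLUDE"
```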
+ +### Tool Curation and Transforms + +After the raw OpenAPI tools are generated, reshape them for LLM consumption: + +- Rename from operationId conventions to LLM-friendly names +- Improve descriptions (tuned for LLM consumption, not developer docs) +- Hide parameters injected from context (auth headers, etc.) +- Sensible defaults where applicable +- Hand-write composite tools combining multiple API calls (e.g., `create_workshop` orchestrating several endpoints) + +### Tag Taxonomy + +Tags organize tools for discoverability and enable specialized deployments via `enable()`. Tags are a client-side filtering mechanism, **not a security boundary**. Three orthogonal dimensions allow cross-cutting queries: + +**Domain tags** (which API surface) — exactly one per tool: +- `internal-api` — tools calling mural-api internal REST +- `public-api` — tools calling mural-api public REST +- `canvas` — future canvas-mcp tools +- `utility` — diagnostic and infrastructure tools + +**Entity tags** (what resource type) — at least one per tool: +- `murals`, `workspaces`, `rooms`, `widgets`, `templates`, `users`, `assets`, `voting`, `labels`, `search` + +**Capability tags** (what operation type) — exactly one per tool: +- `read` — read-only operations (GET) +- `write` — create/update operations (POST/PUT/PATCH) +- `delete` — destructive operations (DELETE) +- `admin` — administrative operations + +```python +@mcp.tool(tags={"public-api", "murals", "read"}) +def get_mural_by_id(mural_id: str) -> dict: ... + +@mcp.tool(tags={"public-api", "widgets", "write"}) +def create_sticky_note(mural_id: str, text: str) -> dict: ... +``` + +**Specialized deployments** combine tags with `enable()`: + +```python +mcp.enable(tags={"read"}, only=True) # Read-only deployment for auditors +mcp.enable(tags={"murals"}, only=True) # Mural-focused deployment +``` + +**Tag governance**: Every tool must have exactly one domain tag, at least one entity tag, and exactly one capability tag. 
New tags require updating this taxonomy. Tags use lowercase kebab-case.
+
+**Key TS references**:
+
+- `packages/banksy-mural-api/src/index.ts` — internal API spec loading and tool filtering
+- `packages/banksy-mural-api/src/helpers.ts` — `mapOperationIdsToToolNames()` naming
+- `packages/banksy-mural-api/.tools` — internal API tool allowlist
+- `packages/banksy-public-api/src/index.ts` — public API spec loading and tool filtering
+- `packages/banksy-public-api/src/helpers.ts` — public API naming helpers
+- `packages/banksy-core/src/tools/internal/` — 39 internal API tool handlers (some use content sessions)
+- `packages/banksy-core/src/tools/public/` — 87 public API tool handlers
+- `packages/banksy-core/xmcp.config.ts` — sso-proxy mode config (internal tools)
+- `packages/banksy-core/xmcp.config.mural-oauth.ts` — mural-oauth mode config (public tools)
+
+**Gotcha**: `from_openapi()` can struggle with polymorphism, nested `$ref`, and very large specs. If the Mural spec hits issues, fall back to manual tool definitions for problematic endpoints.
+
+### Deployment Modes (Resolved)
+
+Mode merging is not recommended. `BANKSY_MODE` is preserved as a runtime configuration flag. Auth modes are capability constraints — internal and public tools call different APIs with incompatible token types. FastMCP's one-auth-per-server constraint means a single server cannot cleanly handle multiple auth strategies. MCP clients support multiple servers, so running a separate deployment per auth mode is transparent to users.
+
+The current two TS Dockerfiles (`Dockerfile` for sso-proxy, `Dockerfile.mural-oauth` for mural-oauth) are replaced by a single `Dockerfile.server` — the mode is runtime config (`BANKSY_MODE` env var), not build-time. See Server Topology for the full design.
+
+---
+
+## Auth
+
+Banksy has a two-layer auth architecture. Layer 1 authenticates the IDE user to banksy. Layer 2 authenticates banksy to the Mural API on behalf of that user.
+ +### Two-Layer Architecture + +``` +Layer 1 (IDE → Banksy): + RemoteAuthProvider + JWTVerifier + - Google as initial IdP; enterprise customers use their own IdP + - Banksy validates RS256 JWTs via JWKS + - Banksy serves PRM (RFC 9728) + - No session management, no token issuance + - Bearer tokens replace session cookies (MCP spec requirement) + +Layer 2 (Banksy → Mural API): + Custom token management (banksy's own DB) + - mural_tokens table (user_id, access_token, refresh_token, expires_at) + - get_mural_tokens(user_id) with proactive refresh + - call_mural_api() helper for per-request auth injection + - Session Activation routes via custom_route() +``` + +Layer 1 maps cleanly to FastMCP primitives. Layer 2 is entirely custom Python regardless of framework. The auth provider for Layer 1 is selected per deployment by `create_auth_provider(settings.banksy_mode)` (see Server Topology). + +### Layer 1: IDE to Banksy + +**Primary path — `RemoteAuthProvider` + `JWTVerifier`**: + +Google is the initial IdP. Enterprise Mural customers who use their own IdP (e.g., Okta, Azure AD) will configure their IdP directly. Banksy only validates JWTs — it doesn't care which IdP issued them as long as the JWKS, issuer, and audience match. Configuration is under 20 lines of Python: + +```python +jwt_verifier = JWTVerifier( + jwks_uri="https://idp.example.com/.well-known/jwks.json", + issuer="https://idp.example.com", + audience="banksy-mcp-server", + algorithm="RS256", + required_scopes=["openid", "mural"], +) + +auth = RemoteAuthProvider( + token_verifier=jwt_verifier, + authorization_servers=["https://idp.example.com"], + base_url="https://banksy.example.com", + resource_name="Banksy MCP Server", +) + +mcp = FastMCP("Banksy", auth=auth) +``` + +This auto-generates `/.well-known/oauth-protected-resource` (RFC 9728 PRM endpoint). The IDE reads PRM, discovers the IdP, performs OAuth directly with the IdP, and sends the resulting JWT as a Bearer token to banksy. 
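The issuer/audience/expiry checks reduce to a few comparisons on the decoded claims. This sketch omits signature verification, which `JWTVerifier` performs against the JWKS first; the helper name and claim handling are illustrative:

```python
import time

# Claim checks performed after RS256 signature verification succeeds.
# Illustrative helper -- not JWTVerifier's actual code.

def claims_valid(claims: dict, *, issuer: str, audience: str) -> bool:
    aud = claims.get("aud")
    aud_list = [aud] if isinstance(aud, str) else (aud or [])
    return (
        claims.get("iss") == issuer             # issued by the configured IdP
        and audience in aud_list                # minted for this resource server
        and claims.get("exp", 0) > time.time()  # not expired
    )

claims = {
    "iss": "https://idp.example.com",
    "aud": "banksy-mcp-server",
    "sub": "user-123",
    "exp": time.time() + 3600,
}
assert claims_valid(claims, issuer="https://idp.example.com", audience="banksy-mcp-server")
assert not claims_valid(claims, issuer="https://other.example.com", audience="banksy-mcp-server")
```

This is why swapping Google for an enterprise IdP is a configuration change: only `issuer`, `audience`, and the JWKS URI move.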
+ +**Fallback — `OAuthProxy`** (when IdP lacks DCR): + +If the chosen IdP does not support Dynamic Client Registration, banksy acts as a local Authorization Server that proxies to the upstream IdP. OAuthProxy handles OAuth flows server-side, stores upstream tokens encrypted, and issues its own JWTs. The IDE talks only to banksy's AS metadata endpoint. + +**`StaticTokenVerifier` for early development:** + +Before the IdP PoC is complete, use `StaticTokenVerifier` with a known test token. This allows auth-gated tool development without an IdP dependency. + +**What replaces what:** + +| Current (TS/Better Auth) | Target (Python/FastMCP) | +|---|---| +| Better Auth `mcp()` plugin | `RemoteAuthProvider` + `JWTVerifier` | +| `/.well-known/oauth-authorization-server` | `/.well-known/oauth-protected-resource` (PRM) | +| `getMcpSession()` (cookie validation) | `verify_token()` (JWT validation) | +| `banksyUserContextProvider` + `getUserId()` | `TokenClaim("sub")` / `CurrentAccessToken()` | +| Better Auth social providers | External IdP handles social login | +| SPA OAuth callback pages | Server-side callback routes (or none with RS) | +| SSO proxy Google OAuth plugin | IdP handles Google as social connection | + +### Layer 2: Banksy to Mural + +Neither `RemoteAuthProvider` nor `OAuthProxy` solves Layer 2. After Layer 1 authenticates the IDE user, banksy still needs per-user Mural API tokens. 
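Concretely, every Layer 2 lookup is keyed by the IDE user's identity, which arrives as the `sub` claim of the Layer 1 JWT. A stdlib-only sketch of what claim extraction amounts to — illustrative only, since `JWTVerifier` performs the signature check and `TokenClaim("sub")` / `CurrentAccessToken()` hand tools this value directly:

```python
import base64
import json

def jwt_claims(token: str) -> dict:
    """Decode a JWT's payload segment. No signature check -- that is JWTVerifier's job."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

# Fabricated token for illustration; a real one is issued and signed by the IdP.
token = ".".join([
    b64url(b'{"alg":"RS256","typ":"JWT"}'),
    b64url(json.dumps({"sub": "user-123", "aud": "banksy-mcp-server"}).encode()),
    "signature",
])
print(jwt_claims(token)["sub"])  # → user-123
```

The `sub` value is what keys the `mural_tokens` table described below.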
+ +**Token storage**: Custom `mural_tokens` table in PostgreSQL (similar schema to current `muralSessionToken` / `muralOauthToken` tables): + +```sql +CREATE TABLE mural_tokens ( + user_id TEXT PRIMARY KEY, + access_token TEXT NOT NULL, + refresh_token TEXT, + token_type TEXT NOT NULL, -- 'session' or 'oauth' + expires_at TIMESTAMPTZ, + created_at TIMESTAMPTZ DEFAULT NOW(), + updated_at TIMESTAMPTZ DEFAULT NOW() +); +``` + +**Per-request injection**: Replace `AsyncLocalStorage` pattern with an `httpx.Auth` subclass that transparently handles per-user token lookup, injection, and refresh (pattern from research doc 07): + +```python +class MuralOAuth(httpx.Auth): + def __init__(self, token_store: MuralTokenStore, user_id: str): + self.token_store = token_store + self.user_id = user_id + + async def async_auth_flow(self, request): + token = await self.token_store.get_valid_token(self.user_id) + request.headers["Authorization"] = f"Bearer {token}" + yield request +``` + +The httpx.Auth subclass is instantiated per-request with the user's ID (from `TokenClaim("sub")`), and passed to the httpx client. This keeps token management transparent to tool implementations — tools just make HTTP calls without handling auth. + +**Token refresh**: Proactive refresh inside `get_valid_token()` with optimistic locking to prevent duplicate refreshes under concurrent tool calls. + +**Token acquisition options**: +- **Option A**: IdP stores upstream Mural tokens (Auth0 Token Vault / Descope Outbound Apps) — eliminates Session Activation for this flow +- **Option B**: Session Activation survives (code/claim endpoints) — browser-based Mural OAuth after Layer 1 auth +- **Option C (most likely near-term)**: Hybrid — IdP handles Google/social for Layer 1; Session Activation handles Mural tokens for Layer 2 + +### Session Activation + +Session Activation links a user's Mural account to their authenticated MCP session. It uses a two-phase code flow: + +1. 
The IDE (authenticated via Bearer token) calls `POST /auth/mural-link/code` to pre-generate an activation code
2. The IDE opens the browser to `/auth/connect?code=XXXX` (the SPA reads the code from the URL)
3. The SPA collects the user's email, calls the Mural realm API, and redirects to Mural OAuth
4. Mural OAuth callback returns to a server-side `custom_route()` handler — not to the SPA
5. Server-side handler exchanges code for Mural tokens, stores them, redirects browser to `/auth/complete`

This pattern keeps the IDE's Bearer token out of the browser entirely. The only credential the browser carries is the short-lived, single-use activation code, minted server-side for the already-authenticated IDE.

**Implementation**: Session Activation routes use `custom_route()` with the explicit `get_authenticated_user()` guard (see Auth Middleware Architecture). The app-wide `BearerAuthBackend` populates `request.user` but does not enforce auth — the handler must check.

### Auth Middleware Architecture

FastMCP registers `BearerAuthBackend` as **app-wide** Starlette middleware. It runs on every HTTP request — MCP transport, `custom_route()` handlers, and raw Starlette routes alike. Critically, it **permits** unauthenticated requests rather than rejecting them: when no Bearer token is present, it sets `request.user = UnauthenticatedUser()` and lets the request proceed. Auth *enforcement* happens at a second layer — `RequireAuthMiddleware` — which wraps only the MCP transport route and returns 401 for unauthenticated users.
+ +``` +HTTP Request + └── RequestContextMiddleware (outermost) + └── AuthenticationMiddleware (BearerAuthBackend) + └── Token present & valid: scope["user"] = AuthenticatedUser(access_token) + └── Token missing/invalid: scope["user"] = UnauthenticatedUser() + └── AuthContextMiddleware + └── Stores user in context var for MCP SDK + └── Router (first-match-wins) + └── MCP transport route → RequireAuthMiddleware → 401 if unauthenticated + └── custom_route() handlers → NO RequireAuthMiddleware (handler must check) + └── app.mount() SPA → NO RequireAuthMiddleware +``` + +This means `custom_route()` handlers that need auth **must explicitly guard**: + +```python +from mcp.server.auth.middleware.bearer_auth import AuthenticatedUser + +def get_authenticated_user(request: Request) -> AuthenticatedUser | None: + if isinstance(request.user, AuthenticatedUser): + return request.user + return None + +@mcp.custom_route("/auth/mural-link/code", methods=["POST"]) +async def generate_code(request: Request) -> Response: + user = get_authenticated_user(request) + if user is None: + return JSONResponse({"error": "unauthorized"}, status_code=401) + user_id = user.token.claims["sub"] + # ... handler logic +``` + +This `get_authenticated_user` helper belongs in the auth module and is reused across all protected `custom_route()` handlers. Unauthenticated routes (`/health`, SPA mount) need no guard — they work naturally because the middleware lets them through. 
+ +### Environment Variable Changes + +| Current | Target | +|---|---| +| `BETTER_AUTH_SECRET` | Eliminated | +| `BETTER_AUTH_BASE_URL` | `BANKSY_BASE_URL` (used for PRM resource URL) | +| `GOOGLE_CLIENT_ID` / `GOOGLE_CLIENT_SECRET` | Eliminated (IdP handles Google) | +| `SSO_PROXY_URL` | Eliminated (SSO proxy not needed with dedicated IdP) | +| `MURAL_OAUTH_*` | Kept if Session Activation requires Mural OAuth | +| (new) `IDP_JWKS_URI` | JWKS URL for JWT verification | +| (new) `IDP_ISSUER` | Expected JWT issuer | +| (new) `IDP_AUDIENCE` | Expected JWT audience | +| (new) `IDP_AUTHORIZATION_SERVER` | IdP URL for PRM metadata | +| (new) `BANKSY_MODE` | `internal`, `public`, or `dev` — selects auth provider and tool set (see Server Topology) | +| (new) `ENABLED_TAGS` | Optional comma-separated tag filter for specialized deployments (e.g., `read`) | + +**Key TS reference**: + +- `packages/banksy-core/src/lib/auth/provider.ts` — Better Auth config +- `packages/banksy-core/src/lib/mural-session/router.ts` — activation code/claim routes +- `packages/banksy-core/src/lib/mural-tool-caller.ts` — per-user token injection +- `packages/banksy-core/src/lib/mural-session/token-refresh.ts` — refresh logic +- `packages/banksy-mural-api/src/per-request-auth.ts` — AsyncLocalStorage pattern (replaced by DI) + +--- + +## SPA and UI + +Banksy serves a React SPA for browser-facing pages from the same FastMCP process via Starlette's `StaticFiles`. + +### SpaStaticFiles + +Starlette does not have a built-in SPA catch-all mode. 
The standard pattern is to subclass `StaticFiles` to fall back to `index.html` for unrecognized paths:

```python
class SpaStaticFiles(StaticFiles):
    def lookup_path(self, path: str):
        full_path, stat_result = super().lookup_path(path)
        if stat_result is None:
            # Unknown path: serve the SPA entry point so client-side routing runs.
            return super().lookup_path("./index.html")
        return full_path, stat_result
```

(`lookup_path` is synchronous in current Starlette releases; very old versions had an async variant — match the signature of the installed version.)

Mount after `http_app()` so API routes take precedence:

```python
app = mcp.http_app(transport="streamable-http")
app.mount("/", SpaStaticFiles(directory=str(SPA_DIR), html=True), name="spa")
```

Starlette evaluates routes in declaration order. MCP transport, auth routes, and `custom_route()` endpoints are all registered before the SPA mount — they always take precedence. The SPA catch-all only handles what nothing else matched.

### Surviving vs. Eliminated Pages

| Page | Status | Notes |
|---|---|---|
| `HomePage` | **Kept** | Landing page with AI assistant client selection |
| `ClaudeLanding` | **Kept** | Client-specific landing (`/?client=claude-ai`) |
| `SessionActivation` | **Kept** | Mural account linking (`/auth/connect?code=...`) |
| `CompletionPage` | **New** | Success/error display after OAuth (`/auth/complete`) |
| `ErrorPage` | **New** | Error display (`/auth/error?message=...`) |
| `SignInPage` | **Eliminated** | Better Auth sign-in replaced by IdP login |
| `GoogleCallback` | **Eliminated** | SSO proxy Google callback is server-side |
| `MuralCallbackPage` | **Eliminated** | Client-side Mural callback replaced by server-side handler |
| `MuralOAuthCallbackPage` | **Eliminated** | Mural OAuth callback is server-side |

### Development Workflow

Two processes in dev mode:

1. **Terminal 1**: FastMCP server — `fastmcp dev` or `uvicorn banksy.server:app --port 8000 --reload`
2. **Terminal 2**: Vite dev server — `cd ui && npm run dev` (port 3002)

Vite proxies API requests (`/mcp`, `/auth/*`, `/health`, `/.well-known`) to FastMCP on port 8000.
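A sketch of that proxy configuration — hypothetical `ui/vite.config.mjs` excerpt, with the prefix list taken from the dev workflow above and ports assumed:

```js
// Hypothetical ui/vite.config.mjs (dev-server proxy only).
export default {
  server: {
    port: 3002,
    proxy: {
      // API-shaped prefixes go to FastMCP; everything else is the SPA + HMR.
      "/mcp": "http://localhost:8000",
      "/auth": "http://localhost:8000",
      "/health": "http://localhost:8000",
      "/.well-known": "http://localhost:8000",
      // Note: if SPA routes live under /auth (e.g. /auth/connect), narrow these
      // keys to the API sub-paths so Vite keeps serving those pages in dev.
    },
  },
};
```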
HMR works without special configuration because WebSocket connections stay within Vite's domain. + +### Mural Design System + +The SPA uses `@muraldevkit/ui-toolkit`, `@muraldevkit/ds-foundation`, and related packages. These are published npm packages from Mural's private registry — they work independently of any parent TypeScript monorepo. The `ui/` directory's `.npmrc` configures private registry access. + +**Note**: The current codebase has two separate Vite builds: `vite.config.mjs` (auth UI SPA at `/auth/`) and `vite.home.config.mjs` (home page, single-file build to `home.html`). These are intentionally consolidated into one SPA in the target `ui/` directory, since most auth callback pages are eliminated and the remaining pages (home, Session Activation, completion, error) fit naturally in a single SPA with client-side routing. + +**Key TS reference**: + +- `packages/banksy-core/ui/src/components/` — current SPA components +- `packages/banksy-core/ui/vite.config.mjs` — auth UI Vite configuration +- `packages/banksy-core/ui/vite.home.config.mjs` — home page Vite configuration + +--- + +## Database + +Fresh PostgreSQL schema — no data migration from the TS version. + +### Schema and Models + +SQLAlchemy async models for token persistence: + +- `MuralToken` — userId, accessToken, refreshToken, tokenType, expiresAt +- `PendingMuralConnection` — code, nonce, userId, expiresAt (for Session Activation) +- Optional lightweight `User` table mapping `idp_sub` → banksy user ID (if needed for internal references) + +The current Better Auth tables (`user`, `session`, `account`, `verification`) are eliminated — the IdP manages users and sessions. 
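A hedged DDL sketch of the Session Activation table implied by the `PendingMuralConnection` model above — column names and types are assumptions; the authoritative schema comes from the SQLAlchemy models via Alembic:

```sql
CREATE TABLE pending_mural_connections (
    code TEXT PRIMARY KEY,           -- single-use activation code
    nonce TEXT NOT NULL,             -- binds the browser flow to the IDE request
    user_id TEXT NOT NULL,           -- Layer 1 identity (JWT sub claim)
    expires_at TIMESTAMPTZ NOT NULL, -- codes are short-lived
    created_at TIMESTAMPTZ DEFAULT NOW()
);
```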

### Alembic Migrations

- Async Alembic needs the async `env.py` template: `asyncio.run()` drives migrations through `connection.run_sync()` on an async engine
- `alembic revision --autogenerate` diffs the SQLAlchemy models against the live schema to generate migrations
- Initial migration creates `mural_tokens` and `pending_mural_connections`

### Configuration

- `DATABASE_URL` env var for async engine: `postgresql+asyncpg://...`
- Lifespan-managed connection pool (created at startup, closed at shutdown)

**Escape hatch**: If the schema stays at 3–5 tables with pure CRUD, raw asyncpg + dbmate migrations may be simpler than SQLAlchemy. Evaluate after the schema stabilizes.

### Content Session Tokens

Some internal API tools (7 of 39, including `update-widgets`, `store-image`, `update-label`, and the `post-murals-*` tools) require a **content session token** — a short-lived token representing an active editing session within a specific mural. These are distinct from standard Mural access tokens.

The current TS implementation uses a strategy pattern (`ContentSessionProvider`) that routes through different endpoints based on auth mode:
- **sso-proxy mode**: `POST /api/v0/content/murals/:muralId/sessions` (internal endpoint, session JWT)
- **mural-oauth mode**: `POST /api/public/v1/murals/:muralId/content-session` (public endpoint, OAuth token)

**Python port**: This is straightforward to replicate. A `get_content_session(user_id, mural_id)` async function calls the appropriate endpoint using the user's stored Mural token, returns a short-lived content session token, and the tool passes it as an auth header on subsequent widget-manipulation calls. Content session tokens are **not stored in the database** — they're ephemeral, acquired per-tool-call, and used immediately.
+ +```python +async def get_content_session(user_id: str, mural_id: str) -> str: + mural_token = await token_store.get_valid_token(user_id) + resp = await http_client.post( + f"/api/public/v1/murals/{mural_id}/content-session", + headers={"Authorization": f"Bearer {mural_token}"}, + ) + return resp.json()["value"]["token"] +``` + +**Key TS reference**: + +- `packages/banksy-core/src/lib/mural-session/content-session-provider.ts` — strategy pattern for content sessions +- `packages/banksy-core/src/lib/auth/mural-session.ts` — Better Auth plugin schema +- `packages/banksy-core/src/lib/db/mural-tokens.ts` — token CRUD operations + +--- + +## Code Organization + +Within the `banksy-server` workspace member, tools are organized by **domain** — self-contained directories with `register_*_tools(mcp)` functions. Workspace boundaries (future) handle dependency isolation between services; domains handle tool organization within the server. + +### Domains + +``` +pypackages/server/src/banksy_server/domains/ +├── internal/ # register_internal_tools(mcp) — 39 tools calling mural-api internal REST +│ ├── __init__.py +│ └── tools.py +├── public/ # register_public_tools(mcp) — 87 tools calling mural-api public REST +│ ├── __init__.py +│ └── tools.py +├── canvas/ # register_canvas_tools(mcp) — future canvas-mcp tools (absorbed, not mounted) +│ ├── __init__.py +│ └── tools.py +└── shared/ # Utility tools available in all modes (echo, version) + ├── __init__.py + └── tools.py +``` + +Each domain's `register_*_tools(mcp)` function takes a `FastMCP` instance and registers all tools for that domain, including tags and metadata. The domain owns its tool definitions, schemas, and any domain-specific helpers. `server.py` calls the appropriate registration functions based on `BANKSY_MODE` (see Server Topology). + +### from_openapi() in Domain Context + +`FastMCP.from_openapi()` tools are generated at startup, not as files in the directory tree. 
They are invoked from within a domain's registration function: + +```python +# banksy_server/domains/public/__init__.py +from banksy_server.mural_api import create_public_api_tools + +def register_public_tools(mcp: FastMCP) -> None: + from .tools import my_composite_tool + mcp.add_tool(my_composite_tool) + + public_api = create_public_api_tools() + mcp.mount(public_api, namespace="mural") +``` + +### Auth Provider Factory + +`auth/providers.py` contains a single `create_auth_provider(mode)` function that returns the correct `AuthProvider` for the given mode. This keeps `server.py` clean and makes mode-specific auth configuration testable in isolation. + +### Routes by Concern + +Non-MCP HTTP routes (`routes/`) are organized by concern, not by mode. Mode-specific routes are registered conditionally in `server.py` based on `BANKSY_MODE` — for example, Session Activation routes are only registered in `internal` and `dev` modes. + +### Canvas-MCP Absorption + +The canvas-mcp prototype's tools are absorbed into `banksy-server` under `domains/canvas/`, not as a separate workspace member. Rationale: canvas-mcp currently has only two trivial tools, `mount()` ignores child auth anyway, and the domain concept provides sufficient organizational separation. + +### Future Extraction to banksy-shared + +Code that will eventually be shared — SQLAlchemy models (`models/`), auth utilities, Mural API client — lives inside `banksy-server` for now. When a second consumer (the agent harness) arrives, extract this code into a new `banksy-shared` workspace member under `pypackages/shared/`. Until then, keeping it in the server avoids premature abstraction. + +**Deep dive**: [banksy-architecture-research.md](../banksy-research/banksy-architecture-research.md) Section 8 + +--- + +## HTTP Routes + +Banksy needs non-MCP HTTP endpoints (health, auth callbacks, Session Activation). FastMCP provides `custom_route()` for this — a zero-cost pass-through to `starlette.routing.Route`. 
Since FastMCP is built on Starlette, the underlying framework is already there if more routing power is ever needed. + +### Current Route Inventory + +Routes in the TS codebase that need consideration: + +| Route | Method | Status | Notes | +|---|---|---|---| +| `/health` | GET | **Port** | Health check; `custom_route()` in Phase 1 | +| `/auth/mural-link/code` | POST | **Port** | Session Activation: generate activation code | +| `/auth/mural-link/claim` | POST | **Port** | Session Activation: claim Mural tokens | +| `/auth/mural-oauth/callback` | GET | **Port** | Mural OAuth callback (server-side) | +| `/auth/config` | GET | **Eliminate** | Returns auth mode config; replaced by IdP discovery (PRM) | +| `/api/auth/*` | * | **Eliminate** | Better Auth routes; replaced by IdP | +| `/favicon.svg`, `/favicon.ico` | GET | **SPA handles** | Served by `SpaStaticFiles` static mount | +| `/.well-known/oauth-protected-resource` | GET | **Auto-generated** | FastMCP generates PRM from `RemoteAuthProvider` config | + +### custom_route() + +A thin decorator over Starlette's routing. Handlers receive a `Request`, return a `Response`. Supports any HTTP method. + +```python +@mcp.custom_route("/health", methods=["GET"]) +async def health(request: Request) -> JSONResponse: + return JSONResponse({"status": "ok"}) +``` + +**Limitations**: No FastMCP dependency injection. No built-in route grouping. No request body validation via Pydantic. For a handful of routes (health, 2-3 auth endpoints), this is fine. + +### Two Middleware Layers + +FastMCP has two distinct middleware layers: + +1. **MCP protocol middleware** — operates on tool/resource calls within the MCP protocol. Hooks: `on_request`, `on_after_initialize`. Used for tool-level auth, logging, rate limiting. +2. **HTTP/ASGI middleware** — operates on all HTTP traffic including `custom_route()` endpoints. Standard Starlette middleware. Includes `BearerAuthBackend` for auth. 

Both MCP tools and `custom_route()` handlers share the same `BearerAuthBackend`, so token *validation* and `request.user` population are automatic for both. Auth *enforcement* is not: the MCP transport route rejects unauthenticated requests via `RequireAuthMiddleware`, while `custom_route()` handlers must guard explicitly (see Auth Middleware Architecture).

### Graduated Approach

1. **Phase 1**: `custom_route("/health")` for the health endpoint
2. **Phases 6-7**: `custom_route()` for Session Activation routes (`/auth/mural-link/code`, `/auth/mural-link/claim`, `/auth/mural-oauth/callback`) with explicit auth guard
3. **If route grouping needed**: Use `starlette.routing.Router` with prefix, pass as `routes` parameter to `http_app()` or compose with the app directly
4. **If per-route middleware needed**: Use Starlette's `Route` with middleware wrapper (same pattern as `RequireAuthMiddleware`)
5. **If sub-applications needed**: Use `starlette.routing.Mount` with a sub-app
6. **If Pydantic request validation + OpenAPI generation needed for HTTP endpoints**: Only then consider FastAPI

Banksy's current and foreseeable needs (health, 3 auth endpoints, SPA) fall squarely in steps 1–2. Steps 3–5 are available without adding any dependency — Starlette is already underneath FastMCP. Step 6 is unlikely to be needed.

### Route Precedence with SPA

All `custom_route()` endpoints are registered before the SPA mount. Starlette's `Route` objects (exact match) always take precedence over `Mount` objects (prefix match). The SPA catch-all only fires when no API route matched.

**mount() caveat**: `mount()` on child FastMCP servers does not expose their `custom_route()` endpoints on the parent. Custom routes must be registered on the main server directly.

---

## Testing

Comprehensive test suite using FastMCP's testing utilities alongside standard httpx for HTTP route testing.

### FastMCP Client Fixtures

In-memory testing via `Client(transport=server)`:

```python
async with Client(transport=mcp) as client:
    result = await client.call_tool("echo", {"message": "hello"})
```

Use `asyncio_mode = "auto"` in pytest config to eliminate `@pytest.mark.asyncio` boilerplate.
+ +### HTTP Route Testing + +`httpx.AsyncClient` with ASGI transport for testing `custom_route()` endpoints: + +```python +async with httpx.AsyncClient(transport=httpx.ASGITransport(app=app)) as client: + resp = await client.get("/health") + assert resp.status_code == 200 +``` + +### Mock Strategies + +- **respx** for mocking httpx requests to Mural API (transport-level interception, no monkeypatching) +- In-memory SQLite or test PostgreSQL for database tests +- `StaticTokenVerifier` with known test tokens for auth-gated tests +- FastMCP's `HeadlessOAuth` for OAuth flow testing + +### Test Organization + +Tests are co-located inside the server workspace member. A root-level `conftest.py` provides shared fixtures that will be reusable by future members. + +``` +conftest.py # Root-level shared fixtures (DB engine, mock HTTP) +pypackages/server/tests/ +├── conftest.py # Server-specific fixtures (FastMCP server, auth mocks) +├── test_echo.py # Basic tool tests +├── test_mural_tools.py # OpenAPI-generated tool tests +├── test_token_storage.py # DB token CRUD +├── test_token_refresh.py # Token refresh logic +├── test_auth_flow.py # OAuth flow (HeadlessOAuth) +├── test_session_activation.py # Session Activation routes +├── test_mode_selection.py # BANKSY_MODE startup paths +└── test_integration/ # End-to-end tests +``` + +Run with: `uv run --package banksy-server pytest` + +When `banksy-shared` or `banksy-harness` are added, they get their own `tests/` directories under `pypackages/`. The root `conftest.py` is automatically inherited. + +### Migration from Existing Tests + +The TS codebase has ~15 Vitest test files. These should be reviewed as reference for Python test coverage targets. Many will be redundant since the auth callback SPA pages they test are being eliminated. The remaining relevant tests (token storage CRUD, Session Activation flow, content session provider logic) should be ported to their pytest equivalents. 
+ +--- + +## Deployment and Cleanup + +### Dockerfile.server + +Workspace-aware multi-stage build using uv. One Docker image serves all modes — `BANKSY_MODE` is a runtime env var, not build-time. This replaces the two current TS Dockerfiles (`Dockerfile` for sso-proxy, `Dockerfile.mural-oauth` for mural-oauth). + +```dockerfile +# Stage 1: Build SPA +FROM node:22-alpine AS spa-builder +WORKDIR /app/ui +COPY ui/package.json ui/package-lock.json ui/.npmrc ./ +RUN --mount=type=secret,id=npm_token_ro,env=NPM_TOKEN npm ci --ignore-scripts +COPY ui/ ./ +RUN npm run build + +# Stage 2: Python dependencies +FROM python:3.14-slim AS builder +COPY --from=ghcr.io/astral-sh/uv:0.10 /uv /uvx /bin/ +WORKDIR /app + +COPY pyproject.toml uv.lock ./ +COPY pypackages/server/pyproject.toml pypackages/server/pyproject.toml + +RUN --mount=type=cache,target=/root/.cache/uv \ + uv sync --frozen --no-install-workspace --no-dev --package banksy-server + +COPY pypackages/server/src pypackages/server/src + +RUN --mount=type=cache,target=/root/.cache/uv \ + uv sync --locked --no-dev --no-editable --package banksy-server + +# Stage 3: Production image +FROM python:3.14-slim +COPY --from=builder /app/.venv /app/.venv +COPY --from=spa-builder /app/ui/dist /app/ui/dist +COPY migrations /app/migrations +ENV PATH="/app/.venv/bin:$PATH" +ENV BANKSY_SPA_DIR=/app/ui/dist +WORKDIR /app +EXPOSE 8000 +CMD ["banksy-server"] +``` + +A `Dockerfile.harness` for the agent harness is documented in the architecture research ([Section 7.6](../banksy-research/banksy-architecture-research.md)) but is not created until the harness workspace member exists. 
+ +### Entrypoint + +Alembic migrations run before server start: + +```bash +alembic upgrade head && banksy-server +``` + +### TS Artifact Removal (Phase 9) + +Delete all TypeScript artifacts: + +- `packages/` (entire directory) +- `package.json`, `package-lock.json`, `tsconfig.json` +- `node_modules`, `.npmrc` +- `xmcp.config.ts`, `scripts/create-tool-handlers.mjs` +- `eslint.config.mjs`, `.prettierrc`, `.prettierignore`, `knip.jsonc` +- `.nvmrc`, `.husky/` +- `Dockerfile.mural-oauth` +- `.github/workflows/quality.yml` (TS version) + +Rewrite `README.md`, `docs/AUTH.md`. Update `.env.example` for Python env vars. + +**Verify**: `rg -l 'node_modules|package\.json|\.ts\b' --type-not py` finds nothing. + +--- + +## Bootstrap pyproject.toml Files + +Two `pyproject.toml` files are created in Phase 1. The architecture research documents the planned `banksy-shared` and `banksy-harness` configs ([Sections 7.4–7.5](../banksy-research/banksy-architecture-research.md)) for reference when those members are needed, but they are not created now. 

### Workspace Root (`pyproject.toml`)

```toml
[project]
name = "banksy-workspace"
version = "0.0.0"
description = "Banksy monorepo workspace root"
requires-python = ">=3.14"

[tool.uv.workspace]
members = ["pypackages/*"]
# No [build-system] here: without one, uv treats the workspace root as a
# virtual (non-packaged) project, so only the members are built and installed.

[dependency-groups]
dev = [
    "pyright>=1.1.0",
    "pytest>=8.0.0",
    "pytest-asyncio>=0.24.0",
    "pytest-cov>=6.0",
    "inline-snapshot>=0.15",
    "dirty-equals>=0.8",
    "respx>=0.22",
    "ruff>=0.8.0",
    "pre-commit>=4.0.0",
]

[tool.ruff]
target-version = "py314"
line-length = 88
src = ["pypackages/*/src", "pypackages/*/tests"]

[tool.ruff.lint]
select = ["E", "W", "F", "I", "B", "UP", "S", "ASYNC", "RUF"]
ignore = ["B008"]

[tool.ruff.format]
quote-style = "double"
docstring-code-format = true

[tool.pyright]
pythonVersion = "3.14"
typeCheckingMode = "strict"
reportUnnecessaryTypeIgnoreComment = true
include = ["pypackages/*/src", "pypackages/*/tests"]

[tool.pytest.ini_options]
asyncio_mode = "auto"
addopts = "-v --tb=short --strict-markers"
markers = [
    "integration: marks tests requiring database or external services",
    "slow: marks tests that take >1s",
]
```

### Server Member (`pypackages/server/pyproject.toml`)

```toml
[project]
name = "banksy-server"
version = "0.1.0"
description = "Banksy MCP server"
requires-python = ">=3.14"
dependencies = [
    "fastmcp>=3.1",
    "httpx>=0.27",
    "httpx-retries>=0.1",
    "pydantic>=2.0",
    "pydantic-settings>=2.0",
    "sqlalchemy[asyncio]>=2.0",
    "asyncpg>=0.30",
    "alembic>=1.15",
]

[project.scripts]
banksy-server = "banksy_server.server:main"

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.pytest.ini_options]
testpaths = ["tests"]
```

When `banksy-shared` is needed, add it as a workspace dependency:

```toml
# Add to pypackages/server/pyproject.toml
dependencies = ["banksy-shared", ...]
+ +[tool.uv.sources] +banksy-shared = { workspace = true } +``` + +--- + +## Risks + +| Risk | Likelihood | Impact | Mitigation | +|---|---|---|---| +| Husky hooks fire on Python commits | High | Low | Replace `.husky/pre-commit` with `pre-commit run` in Phase 1 | +| `from_openapi()` struggles with Mural spec | Medium | High | Test early in Phase 2; fall back to manual tool definitions for problematic endpoints | +| IdP integration delays auth phases | Low | Medium | Google is the initial IdP (decision made); use `StaticTokenVerifier` in early phases; enterprise IdP support is configuration, not code | +| `structuredContent` missing in FastMCP | Known | Medium | Return JSON as text; monitor FastMCP releases | +| SSO Proxy integration | Medium | Medium | Proxy is a separate service; Python must replicate the interaction or confirm it's framework-agnostic | +| Content session tokens | Medium | Medium | Separate flow from standard Mural tokens; port `ContentSessionProvider` pattern (see Content Session Tokens section) in Phase 7. Straightforward — ephemeral tokens, no storage. 
|
| Tool naming parity | Low | Medium | If consumers reference tools by current names, rename is breaking; consider compatibility mapping |
| Dual config at root confuses IDE | Low | Low | Python tools ignore `package.json`; npm ignores `pyproject.toml`; verify Cursor behavior in Phase 1 |
| uv pre-1.0 API instability | Low | Medium | `pyproject.toml` is PEP 621 portable; pin uv version in CI |
| SQLAlchemy async + Alembic friction | Low | Medium | Needs the async `env.py` template (`connection.run_sync()` on an async engine); well-documented pattern |
| `custom_route()` lacks DI and route grouping | Medium | Low | Start simple; Starlette `Router` and `Mount` are already available underneath FastMCP if routes grow |
| SPA build adds Node.js as build dependency | Low | Low | Docker multi-stage isolates Node to build stage; no Node in production image |
| Mural DS version drift when SPA is decoupled | Low | Medium | Pin exact versions in `ui/package.json`; CI step verifies SPA builds |
| No import boundary enforcement in uv workspaces | Low | Low | Enforce via ruff import conventions; workspace members get true isolation when extracted |
| Tag taxonomy maintenance burden | Low | Low | domain+entity+capability taxonomy is stable; new tags documented in Tag Taxonomy section |

### Escape Hatches

- **SQLAlchemy → raw asyncpg + dbmate**: If the schema stays at 3–5 tables with pure CRUD, drop the ORM.
- **Pyright strict → standard**: If strict mode causes excessive friction with third-party library types, relax to standard temporarily and tighten later.
- **Single workspace member → multi-member**: When a second Python service is needed (e.g., agent harness), add a directory under `pypackages/` with its own `pyproject.toml`. Extract shared code into `banksy-shared` at that time. The workspace glob auto-discovers new members.
- **pre-commit → CI only**: If hooks cause friction during rapid iteration and the team is 1–2 developers, rely on CI alone.
+- **custom_route() → raw Starlette routing**: If HTTP routes grow complex, use `starlette.routing.Router` for grouping, `Mount` for sub-apps, or Starlette middleware wrappers for per-route concerns. FastAPI is a last resort — Starlette is already underneath FastMCP. +- **`BANKSY_MODE` per-deployment → mode merging**: If a future need requires multi-auth in a single process, revisit Option B (protocol-level routing) or Option D (middleware-based auth) from the [server topology analysis](../banksy-research/tool-visibility-server-topology-research.md). + +--- + +## Open Decisions + +Items from the canvas-mcp alignment assessment and architecture research that are resolved in direction but not yet finalized in implementation detail: + +| # | Topic | Options | Status | +|---|---|---|---| +| 5 | Health endpoint timing | (a) Add in Phase 1 alongside echo tool (b) Add in Phase 2 alongside OpenAPI tools | Open | +| 6 | Profiling middleware from canvas-mcp | (a) Adopt in Phase 1 as opt-in dev tool (b) Skip — add only at absorption time | Open | +| 7 | Profiling dev deps (`pyinstrument`, `memray`) | (a) Add to banksy dev deps (b) Skip — canvas-mcp-only concern | Open | +| 8 | Tool tags and meta (`tags={}`, `meta={}`) | Three-dimensional taxonomy (domain, entity, capability) adopted. Tags mandatory on all tools. `meta={}` carries additional structured metadata but is not used for visibility filtering. See Tag Taxonomy section. 
| **Resolved** | +| 9 | Composite tools needing multiple auth modes | (a) Machine-to-machine token (b) Agent orchestration via harness (c) Dual-token mode — defer until a concrete use case exists | Open | +| 10 | Enterprise per-customer tool subsets | Achievable via `ENABLED_TAGS` or custom Transform — no architectural changes needed | Resolved (direction) | +| 11 | `from_openapi()` + mode selection | Verify `tags=` support on generated tools; fall back to Transform if not | Open | +| 12 | When to extract `banksy-shared` | Trigger: when a second consumer (agent harness) needs shared code (models, auth utils, Mural client) | Open (deferred) | +| 13 | When to create `banksy-harness` | Trigger: when agent orchestration work begins | Open (deferred) | + +--- + +## Deep Research Index + +All supporting research documents produced during this migration planning effort, organized by topic. Each document contains detailed findings, code samples, and analysis that informed the decisions in this plan. 
+ +### Architecture and Framework + +- [xmcp vs FastMCP deep dive](../research-xmcp-vs-fastmcp-deep-dive.md) — Technical comparison: tools, schema, DI, composition, transforms, new capabilities +- [MCP server frameworks landscape](../research-mcp-server-frameworks.md) — Survey of official SDKs and community frameworks across languages +- [FastMCP project structure and patterns](../research-fastmcp-project-structure.md) — Layout, testing, config, `from_openapi()`, DI, uv workspaces +- [FastMCP project patterns (condensed)](03-fastmcp-project-patterns.md) — Summary of conventions used in the execution strategy + +### Auth + +- [FastMCP auth strategy](../fastmcp-auth-strategy.md) — Initial auth mapping: OAuthProxy, Mural vs SSO-proxy, risk assessment +- [FastMCP auth migration research](../banksy-research/fastmcp-auth-migration-research.md) — Deep dive: RemoteAuthProvider vs OAuthProxy, Layer 2 token management, Session Activation, middleware architecture +- [Resource server migration evaluation](../resource-server-migration-eval.md) — RS vs AS design, Layer 1 scope, IdP requirements, sequencing +- [Security audit analysis](../security-audit-analysis.md) — 8-ticket audit review, mural-oauth extrapolation, hardening sequence +- [Auth provider alternatives](../auth-provider-alternatives.md) — 12 IdP evaluation: Auth0, Descope, Keycloak, and others + +### Toolchain + +- [Project management and monorepo](05-toolchain-project-mgmt-monorepo.md) — uv vs Poetry vs PDM; uv workspaces vs single pyproject +- [Linting, typing, testing, and hooks](06-toolchain-linting-typing-testing-hooks.md) — Ruff, Pyright, mypy, pytest, pre-commit evaluation +- [Database, HTTP, and config](07-toolchain-db-http-config.md) — SQLAlchemy vs asyncpg, Alembic vs dbmate, httpx, pydantic-settings +- [Python 3.14 compatibility](../banksy-research/python-314-compatibility-research.md) — GIL background, free-threading relevance, dependency compatibility, Pydantic partial support +- [Pyright strict 
dependency typing](../banksy-research/pyright-strict-dependency-typing-research.md) — Per-dependency type support assessment, strict mode burden evaluation + +### Repo and Migration + +- [Banksy repo exploration](02-banksy-repo-exploration.md) — Repo map: packages, config, CI/CD, Docker, TS/Python config conflicts +- [Repo organization evaluation](04-repo-organization-evaluation.md) — Four approaches compared; Approach D (hybrid) chosen +- [Supporting docs digest](01-supporting-docs-digest.md) — Consolidated digest of all pre-strategy research + +### Implementation Details + +- [FastMCP custom routes](../banksy-research/fastmcp-custom-routes-research.md) — `custom_route()` API, limitations, two middleware layers (initial research; framing updated by Starlette routing research) +- [Starlette vs custom_route() for HTTP endpoints](../banksy-research/fastmcp-starlette-routing-research.md) — Source-level analysis of `custom_route()` internals, auth middleware behavior (permissive `BearerAuthBackend` vs enforcing `RequireAuthMiddleware`), auth guard pattern, eliminates "graduate to FastAPI" framing +- [Serving a React SPA from FastMCP](../banksy-research/fastmcp-react-spa-serving-research.md) — SpaStaticFiles, Vite build, dev mode HMR, Docker multi-stage, route precedence +- [structuredContent support](../research-fastmcp-structuredContent-support.md) — FastMCP v3.34.0 does not support structuredContent on success path; JSON-as-text workaround + +### Server Topology and Workspace Architecture + +- [Banksy architecture research](../banksy-research/banksy-architecture-research.md) — Combined: server topology (Option E) + uv workspace layout + code organization (domains, tag taxonomy) + migration impact. Rolls up the two feeder documents below. 
+- [Tool visibility and server topology research](../banksy-research/tool-visibility-server-topology-research.md) — Auth × tool matrix, FastMCP auth/mount constraints, Options A–E analysis, tag taxonomy design +- [Monorepo layout and agent harness research](../banksy-research/monorepo-layout-agent-harness-research.md) — uv workspaces, 3-member package design (server/shared/harness), agent harness constraints, Docker builds, CI + +### Alignment + +- [Canvas-MCP alignment assessment](../banksy-research/canvas-mcp-alignment-assessment.md) — Toolchain comparison, structure comparison, absorption feasibility, 8 alignment decisions (4 resolved, 4 open) + diff --git a/fastmcp-migration/execution-strategy-research/01-supporting-docs-digest.md b/fastmcp-migration/execution-strategy-research/01-supporting-docs-digest.md new file mode 100644 index 0000000..573c775 --- /dev/null +++ b/fastmcp-migration/execution-strategy-research/01-supporting-docs-digest.md @@ -0,0 +1,127 @@ +# Banksy xmcp → Python FastMCP Migration Research — Structured Summary + +## Confirmed Decisions + +- **Architecture**: Single-process via `FastMCP.from_openapi()`; remove `banksy-mural-api` entirely. +- **Auth model**: Banksy becomes an OAuth Resource Server (RS) validating IdP-issued JWTs; no longer an Authorization Server. +- **Auth**: FastMCP OAuth plus custom logic for Mural token management; external IdP required. +- **IdP choice**: Dedicated IdP (Auth0 or Descope) with Mural as custom social connection is the recommended approach. +- **Mural as IdP**: Not viable (HS256 tokens/no JWKS, no discovery, MCP token passthrough prohibition). +- **Project structure**: Replace in-place on the feature branch; keep existing TS as reference until final cleanup. +- **Database**: Fresh PostgreSQL schema; no data migration. +- **Python tooling**: `uv` for project management, deps, and lockfile. +- **Branch strategy**: `feat/transition-to-fast-mcp` with each chunk as a mergeable PR into that branch. 
+- **Security hardening**: 7 of 8 audit tickets are backward-compatible and can ship before the migration.
+- **Mode convergence**: sso-proxy and mural-oauth converge to the same RS architecture post-migration.
+
+---
+
+## Open Questions
+
+- **IdP selection**: Auth0 vs Descope vs Keycloak; PoC needed to validate custom social connection and upstream token storage.
+- **Auth0 Token Vault pricing**: Enterprise-only (likely $1,000+/mo); unclear if required.
+- **Descope custom OAuth + Outbound Apps**: Unclear if they integrate cleanly; needs PoC.
+- **FastMCP auth strategy risks**: Risks 2 and 4 (`get_access_token()` return value, token refresh lifecycle) still open.
+- **SSO Proxy**: Current auth uses `SSO_PROXY_URL` for Google OAuth; Python must replicate or proxy must become framework-agnostic.
+- **Content session tokens**: Some widget endpoints use separate content session flow; needs focused handling in PR 6.
+- **Tool naming compatibility**: Renames in PR 3 may break LLM prompts/configs; need mapping or compatibility plan.
+
+---
+
+## Key Architectural Constraints
+
+- **Two-layer auth**: Layer 1 (IDE → Banksy) via IdP JWT; Layer 2 (Banksy → Mural) via stored Mural tokens; layers must not collapse.
+- **PRM (Protected Resource Metadata)**: Serve `/.well-known/oauth-protected-resource` (RFC 9728) pointing to the external IdP.
+- **FastMCP auth classes**: Use `RemoteAuthProvider` (DCR) or `OAuthProxy` (non-DCR); bare `JWTVerifier` is not enough.
+- **User coverage**: IdP must support Mural as custom upstream OAuth so email/password and non-Google users are supported.
+- **IDE targets**: Cursor, VS Code, Claude Desktop support PRM; Zed and Continue.dev do not.
+
+---
+
+## Tooling Assumptions Made (needing verification)
+
+- **`uv`**: Used as build/packaging tool; not explicitly validated for the full workflow.
+- **`ruff`** and **`mypy`**: Listed as dev deps; assumed adequate without comparison to alternatives.
+- **`asyncpg` + SQLAlchemy (async) + Alembic**: Assumed sufficient for token storage and migrations. +- **FastMCP `from_openapi()` maturity**: May struggle with Mural spec (polymorphism, nested refs); needs early testing. +- **Auth0 Token Vault**: Mentioned as if decided; actually depends on PoC and pricing. +- **Descope Outbound Apps**: Suggested as Token Vault alternative; integration with custom OAuth needs PoC. + +--- + +## Auth Architecture Summary + +### Layer 1 (IDE → Banksy) + +- Banksy as Resource Server: validates IdP-issued JWT (signature via JWKS, issuer, audience, expiration). +- **`RemoteAuthProvider`**: for DCR-capable IdPs (Auth0, Descope); pure RS. +- **`OAuthProxy`**: for non-DCR IdPs (Google, Azure AD); reintroduces some AS-like surface. +- **IdP choice**: Dedicated IdP with Mural as custom social connection so all user segments (email/password, Google, Microsoft, SAML, etc.) are covered. +- **Token storage**: Optional upstream token storage in IdP (Auth0 Token Vault, Descope Outbound Apps) to keep single-step UX; otherwise separate Mural OAuth step. + +### Layer 2 (Banksy → Mural) + +- Mural tokens stored in PostgreSQL (access, refresh, expiry). +- Tokens obtained via activation code/nonce flow (sso-proxy) or embedded in Mural OAuth (mural-oauth). +- Per-request injection via FastMCP DI into `httpx.AsyncClient`; no AsyncLocalStorage. +- Token refresh with expiry buffer; transparent refresh before use. + +### Token Security + +- Plaintext refresh tokens in Postgres must be fixed (encrypted via Azure Key Vault or similar). +- Envelope encryption; decrypt-on-read, encrypt-on-write. + +--- + +## FastMCP-Specific Patterns + +### In Use + +- **`FastMCP.from_openapi()`**: replaces code-gen and `banksy-mural-api`; tools from spec, HTTP via `httpx.AsyncClient`. +- **`RouteMap`**: GETs as RESOURCE, POST/PUT/DELETE as TOOL, deprecated/internal as EXCLUDE. +- **`mount()`**: mount OpenAPI sub-server onto main app. 
+- **Tool transforms**: `ArgTransform` to hide, rename, describe, default; curate for LLM use.
+- **DI**: per-request auth headers from session/context.
+- **Tag-based visibility**: optional dynamic enable/disable by tags.
+
+### Limitations
+
+- **`structuredContent`**: not supported on success path in fastmcp v3.34.0; `outputSchema` only on listing.
+- **Workaround**: return `JSON.stringify()` in text; clients parse; no typed validation or `structuredContent` semantics.
+- **Impact**: `get-upload-url`-style tools with SAS URLs can still use text; lower reliability but not a hard blocker.
+- **Spec gap**: 2025-06-18 adds `structuredContent`; fastmcp follows 2025-03-26-style result shape.
+
+---
+
+## PR/Implementation Sequence (from transition doc)
+
+```
+PR1 (Bootstrap) ──► PR2 (OpenAPI Tools) ──► PR3 (Tool Curation) ──► PR7 (Testing)
+PR1 (Bootstrap) ──► PR4 (DB Schema) ──► PR5 (Auth IDE-Banksy) ──► PR6 (Auth Banksy-Mural) ──► PR7 (Testing)
+PR2 (OpenAPI Tools) ──► PR6 (Auth Banksy-Mural)
+PR7 (Testing) ──► PR8 (Deploy + Cleanup)
+```
+
+- **PR 1**: Bootstrap — pyproject.toml, uv, FastMCP skeleton, echo tool.
+- **PR 2**: OpenAPI tools — `from_openapi()`, RouteMap, filter by `.tools` allowlist.
+- **PR 3**: Tool curation — transforms, LLM-friendly names/descriptions.
+- **PR 4**: DB schema — models, Alembic, token storage.
+- **PR 5**: Auth (IDE → Banksy) — OAuth, Google SSO, session/user models.
+- **PR 6**: Auth (Banksy → Mural) — token exchange, refresh, injection via DI.
+- **PR 7**: Tests — fixtures, mocks, `run_server_async`, `HeadlessOAuth`.
+- **PR 8**: Deploy and cleanup — Dockerfile, remove TS, docs.
+
+**Parallel**: PR 2 and PR 4 after PR 1.
+
+---
+
+## Additional Notes
+
+- **Security hardening order**: Tickets 2 (open redirect) and 6 (plaintext tokens) are highest priority; then 1, 4, 5, 7, 8.
+- **IdP PoC sequence**: Auth0 first, then Descope, then Keycloak for self-hosted fallback. +- **Mural platform evolution**: JWKS, discovery, RFC 8693 would help long-term but are outside Banksy's control; treat as parallel roadmap discussion. diff --git a/fastmcp-migration/execution-strategy-research/02-banksy-repo-exploration.md b/fastmcp-migration/execution-strategy-research/02-banksy-repo-exploration.md new file mode 100644 index 0000000..2aec67a --- /dev/null +++ b/fastmcp-migration/execution-strategy-research/02-banksy-repo-exploration.md @@ -0,0 +1,351 @@ +# Banksy Repository Map + +## 1. Root-Level Files and Directories + +### Root Directory Tree (2-3 levels) + +``` +banksy/ +├── .cursor/ +│ ├── decisions/ +│ │ └── auth-context-propagation.md +│ ├── plans/ +│ │ ├── ai_foundry_tracing_setup_76a65f11.plan.md +│ │ ├── caddy_https_architecture_589a1276.plan.md +│ │ └── tool-proxying-implementation.plan.md +│ ├── prompts/ (20+ prompt files) +│ ├── research/ +│ │ ├── mural-api-internal-widget-endpoints.md +│ │ ├── mural-api-list-workspaces-findings.md +│ │ ├── auth-provider-alternatives.md +│ │ ├── resource-server-migration-eval.md +│ │ ├── security-audit-analysis.md +│ │ ├── ui-toolkit-bundle-bloat.md +│ │ └── prompt-investigate-list-workspaces-endpoint.md +│ ├── rules/ +│ │ └── lockfile-dependency-management.mdc +│ ├── EVAL_ARCHITECTURE_MURAL_INDEX.md +│ ├── research-*.md (xmcp/fastmcp, npm, etc.) 
+│ └── prompt-*.md (20+ prompt files) +├── .github/ +│ └── workflows/ +│ ├── build.yml +│ └── quality.yml +├── .husky/ +│ └── pre-commit → npx lint-staged +├── .vscode/ +│ ├── mcp.json +│ └── settings.json +├── docs/ +│ ├── AUTH.md +│ ├── SECURITY_POSTURE.md +│ └── local/ (docs/.local/ in .gitignore) +├── packages/ +│ ├── banksy-core/ +│ ├── banksy-mural-api/ +│ └── banksy-public-api/ +├── .dockerignore +├── .gitignore +├── .nvmrc → 22.21.0 +├── .prettierignore +├── .prettierrc +├── Dockerfile +├── Dockerfile.mural-oauth +├── eslint.config.mjs +├── knip.jsonc +├── package.json +├── package-lock.json +├── README.md +└── xmcp-fix-a.tgz (dev artifact) +``` + +### Root Config Files + +| File | Purpose | +|------|---------| +| **eslint.config.mjs** | ESLint 9 flat config; TypeScript + Prettier | +| **.prettierrc** | Prettier (semi, trailingComma, singleQuote, printWidth 100) | +| **.prettierignore** | Ignores node_modules, dist, lockfile, logs | +| **knip.jsonc** | Unused code; workspaces for banksy-core, mural-api, public-api | +| **.nvmrc** | Node 22.21.0 | +| **.gitignore** | node_modules, dist, .env, .tools, .xmcp, .cursor, mcp.config.json | +| **.dockerignore** | node_modules, dist, .git, .husky, .vscode, .idea | + +No root `tsconfig.json`; each package defines its own. + +--- + +## 2. packages/ Directory + +### banksy-core (`@banksy/core`) + +Role: OAuth-based xmcp MCP server; orchestrates tools from mural-api/public-api. 
+ +Structure: + +``` +packages/banksy-core/ +├── src/ +│ ├── clients.ts +│ ├── middleware.ts +│ ├── lib/ +│ │ ├── config.ts +│ │ ├── auth/ +│ │ ├── db/ +│ │ └── mural-session/ +│ ├── generated/ +│ │ ├── client.index.ts +│ │ ├── client.banksy-mural-api.ts +│ │ └── client.banksy-public-api.ts +│ ├── scripts/ +│ │ └── migrate.ts +│ └── tools/ +│ ├── internal/ (39 tool handlers) +│ └── public/ (87 tool handlers) +├── ui/ +│ ├── src/ +│ │ ├── main.tsx +│ │ ├── home-main.tsx +│ │ ├── components/ +│ │ │ ├── sign-in.tsx +│ │ │ ├── oauth-callback.tsx +│ │ │ ├── sso-proxy-*.tsx +│ │ │ └── shared/ +│ │ └── lib/ +│ ├── vite.config.mjs +│ ├── vite.home.config.mjs +│ ├── tailwind.config.mjs +│ └── postcss.config.mjs +├── xmcp.config.ts +├── xmcp.config.mural-oauth.ts +├── entrypoint.sh +├── .env.example +├── .env.production +├── .env.production.mural-oauth +├── package.json +└── tsconfig.json +``` + +Key `package.json` details: + +- Scripts: `dev`, `build`, `start`, `migrate`, `generate`, `create-tool-handlers` +- Dependencies: `xmcp`, `better-auth`, `react`, `pg`, `zod`, `@muraldevkit/*`, `@tactivos/mural-shared` +- Entry: `node -r dotenv/config dist/http.js` + +Size: ~126 tool files (39 internal + 87 public), plus UI and libs. Largest package. + +--- + +### banksy-mural-api (`@banksy/mural-api`) + +Role: OpenAPI-based MCP server for internal Mural API (email/password + token refresh). 
+ +Structure: + +``` +packages/banksy-mural-api/ +├── src/ +│ ├── index.ts +│ ├── login.ts +│ ├── credentials.ts +│ ├── env.ts +│ ├── per-request-auth.ts +│ ├── token-aware-transport.ts +│ ├── helpers.ts +│ └── helpers.test.ts +├── .env.example +├── .env.production +├── .tools / .tools.example +├── package.json +├── tsconfig.json +├── vitest.config.ts +└── declarations.d.ts +``` + +Key `package.json` details: + +- Main: `dist/index.js` +- Dependencies: `@ivotoby/openapi-mcp-server`, `express`, `axios`, `openapi-typescript`, `dotenv` +- Scripts: `dev` (tsx watch), `build` (tsc), `start`, `login`, `logout` + +Size: ~8 src files. Small, OpenAPI-driven. + +--- + +### banksy-public-api (`@banksy/public-api`) + +Role: OpenAPI-based MCP server for Mural Public API (OAuth-based). + +Structure: + +``` +packages/banksy-public-api/ +├── src/ +│ ├── index.ts +│ ├── login.ts +│ ├── credentials.ts +│ ├── env.ts +│ ├── per-request-auth.ts +│ ├── token-aware-transport.ts +│ ├── helpers.ts +│ └── helpers.test.ts +├── specs/ +│ └── public-api-openapi-bundle.yaml +├── .env.example +├── .env.production +├── .tools / .tools.example +├── package.json +├── tsconfig.json +└── vitest.config.ts +``` + +Key `package.json` details: + +- Same dependency pattern as `banksy-mural-api` +- Spec bundled in repo: `specs/public-api-openapi-bundle.yaml` + +Size: ~8 src files. Very similar to mural-api. + +--- + +## 3. CI/CD Pipeline + +### `.github/workflows/build.yml` + +- Runs on push to `main` and all PRs. +- Runner: `[self-hosted, linux, small]`. + +Jobs: + +1. **build-sso-proxy** + - Checkout → Node version check → DockerHub login → ACR login + - Build image (default Dockerfile) and push to `${{ secrets.DOCKER_REGISTRY_HOST }}/tactivos/banksy` + - Tags: branch, SHA, PR head SHA + - Uses secret `npm_token_ro` for install + +2. 
**build-mural-oauth** + - Same flow with `Dockerfile.mural-oauth` + - Push to `${{ secrets.DOCKER_REGISTRY_HOST }}/tactivos/banksy-mural-oauth` + +### `.github/workflows/quality.yml` + +- Same triggers. +- Jobs: Lint, Test, Knip. +- All use `tactivos/private-actions/node-consistency@main` and `NPM_TOKEN_RO`. + +--- + +## 4. Docker Builds + +### Dockerfile (sso-proxy) + +- Base: `node:22.21.0-alpine` +- Stages: `deps` → `builder` → `runner` +- Packages: `banksy-core`, `banksy-mural-api` +- Ports: 3001 (core), 5678 (mural-api) +- Entry: `entrypoint.sh` (migrate → start) +- `.env.production` from each package copied into image + +### Dockerfile.mural-oauth + +- Same base and stage pattern. +- Packages: `banksy-core`, `banksy-public-api` +- Ports: 3001 (core), 5679 (public-api) +- Swaps config: `cp xmcp.config.mural-oauth.ts xmcp.config.ts` before build +- Uses `.env.production.mural-oauth` for core + +--- + +## 5. Root package.json Summary + +- **Workspaces**: `packages/*` +- **Override**: `"xmcp": "0.6.2"` +- **Scripts**: build/dev/start for each package; lint, format, knip, test, migrate/generate +- **lint-staged**: eslint + prettier on `*.{ts,tsx}`, prettier on `*.{json,md}` +- **Husky**: `prepare` → husky install +- **Engine**: Node 22.21.0 + +--- + +## 6. docs/ Directory + +| File | Purpose | +|------|---------| +| **AUTH.md** | Two-layer auth (IDE→Banksy, Banksy→Mural), SSO proxy, OAuth flows | +| **SECURITY_POSTURE.md** | Security posture, risk assessment; K8s test env deployment, references `SECURITY_HARDENING_PLAN.md` | +| **local/** | In `.gitignore` (`.local/`) | + +--- + +## 7. 
.cursor/ Directory + +### Rules + +- `rules/lockfile-dependency-management.mdc` — lockfile handling, overrides, OOM, upgrades + +### Decisions + +- `decisions/auth-context-propagation.md` — how `userId` flows from auth to tool handlers + +### Plans + +- `plans/ai_foundry_tracing_setup_76a65f11.plan.md` +- `plans/caddy_https_architecture_589a1276.plan.md` +- `plans/tool-proxying-implementation.plan.md` + +### Research + +- Auth, resource server migration, Mural API, security, UI toolkit, etc. + +### Prompts + +- Many `prompt-*.md` and `prompts/*.md` files for OAuth, tools, eval pipeline, etc. + +--- + +## 8. Other Notable Files + +- **README.md** — Architecture, setup, env vars, development +- **LICENSE** — UNLICENSED +- **.env.example** (per package) — Auth, DB, API host, ports +- **entrypoint.sh** — Runs DB migration before start + +--- + +## Root Config Files That Would Conflict With a Python Project + +| File | Conflict | +|------|----------| +| **package.json** | npm workspace root; Python would typically use pyproject.toml or setup.py | +| **package-lock.json** | npm lockfile; Python uses pip/poetry/uv lockfiles | +| **eslint.config.mjs** | JS/TS linting; Python uses Ruff, flake8, pylint | +| **.prettierrc** | JS/TS formatting; Python uses Black, Ruff, autopep8 | +| **.prettierignore** | Prettier ignores | +| **knip.jsonc** | JS/TS unused-code detection | +| **.nvmrc** | Node version; Python uses .python-version or pyenv | +| **tsconfig.json** (in packages) | TypeScript; Python uses py.typed and type checkers | +| **lint-staged** (in package.json) | Runs eslint/prettier on commit | +| **.husky/pre-commit** | JS/TS lint-staged; Python would use different hooks | + +--- + +## Files Referencing External Systems + +| Location | Reference | +|----------|-----------| +| **.github/workflows/build.yml** | `secrets.DOCKER_REGISTRY_HOST`, DockerHub, ACR, `tactivos/banksy`, `tactivos/banksy-mural-oauth` | +| **.github/workflows/quality.yml** | 
`tactivos/private-actions/node-consistency@main`, `NPM_TOKEN_RO` | +| **docs/AUTH.md** | SSO proxy URL: `mural-testing-sso-proxy.mural.engineering` | +| **packages/banksy-core/.env.example** | `SSO_PROXY_URL`, `MURAL_OAUTH_*` URLs, `app.mural.co` | +| **docs/SECURITY_POSTURE.md** | K8s namespaces, PVCs, sidecar DB, test clusters | +| **packages/banksy-core** | `@tactivos/mural-shared` dependency | + +--- + +## Summary Table + +| Package | Role | Complexity | Source Files | Entry | +|---------|------|------------|--------------|-------| +| **banksy-core** | OAuth xmcp server, tool orchestration | High | ~200+ (tools, UI, lib) | dist/http.js | +| **banksy-mural-api** | Internal Mural API MCP server | Low | ~8 | dist/index.js | +| **banksy-public-api** | Public Mural API MCP server | Low | ~8 | dist/index.js | diff --git a/fastmcp-migration/execution-strategy-research/03-fastmcp-project-patterns.md b/fastmcp-migration/execution-strategy-research/03-fastmcp-project-patterns.md new file mode 100644 index 0000000..927ac5a --- /dev/null +++ b/fastmcp-migration/execution-strategy-research/03-fastmcp-project-patterns.md @@ -0,0 +1,33 @@ +# FastMCP Project Structure Research — Key Findings + +## 1. Project Layout + +FastMCP projects follow a `src/` layout convention. The framework auto-discovers entry points named `mcp`, `server`, or `app`. Tools, resources, and prompts are registered via decorators (`@mcp.tool`, `@mcp.resource`, `@mcp.prompt`). The official `fastmcp.json` config file handles deployment concerns (source location, environment, transport). + +## 2. Testing + +FastMCP is all-in on **pytest + pytest-asyncio** with `asyncio_mode = "auto"`. The primary pattern is **in-memory client testing** — you pass the server instance directly to `Client()` and test without any network overhead. FastMCP ships test utilities in `fastmcp.utilities.tests`: `run_server_async`, `run_server_in_process`, `temporary_settings`, and `HeadlessOAuth`. 
Tests mirror the `src/` directory structure and should complete in <1 second. + +## 3. Configuration + +Two layers: **`fastmcp.json`** handles deployment config (transport, host, port, env vars with `${VAR}` interpolation), while **pydantic-settings** is the standard for application-level config (API keys, base URLs). FastMCP internally uses Pydantic Settings with priority: explicit params > env vars > `.env` > defaults. + +## 4. `from_openapi()` + +Requires **httpx** (`httpx.AsyncClient`). Supports OpenAPI 3.0.0+. Default: all endpoints become Tools. Customizable via `RouteMap` objects to map endpoints to Tools/Resources/ResourceTemplates/Exclude. Authentication goes on the httpx client. Important caveat from docs: auto-converted APIs perform worse than curated MCP servers — use `from_openapi()` for bootstrapping, then curate. + +## 5. Dependency Injection + +Powered by Docket/uncalled-for. Built-in deps: `Context`, `CurrentFastMCP()`, `CurrentRequest()`, `CurrentHeaders()`, `CurrentAccessToken()`, `TokenClaim()`. Custom deps via `Depends()` with per-request caching, nesting, and async context manager cleanup. Lifespans handle server-level resources (DB pools, HTTP clients), composable with `|` operator. + +## 6. uv Workspaces for Monorepos + +Root `pyproject.toml` defines `[tool.uv.workspace]` with `members` globs. Each member has its own `pyproject.toml`. Cross-member dependencies use `tool.uv.sources` with `workspace = true`. Single lockfile, editable installs between members, targeted execution with `uv run --package`. Key limitation: single `requires-python` intersection across all members. + +## 7. 
Key Gotchas + +- Don't open `Client` in pytest fixtures (event loop issues) +- `from_openapi()` is for prototyping, not production mirroring +- DI parameters are automatically excluded from MCP tool schemas +- uv workspace sources in root apply to all members (override, not merge) +- Workspace members can accidentally import each other's deps (no isolation) diff --git a/fastmcp-migration/execution-strategy-research/04-repo-organization-evaluation.md b/fastmcp-migration/execution-strategy-research/04-repo-organization-evaluation.md new file mode 100644 index 0000000..bf4ba2f --- /dev/null +++ b/fastmcp-migration/execution-strategy-research/04-repo-organization-evaluation.md @@ -0,0 +1,184 @@ +# Repo Organization Evaluation: TS→Python/FastMCP Migration + +## Grounding: What the Repo Actually Looks Like + +Before scoring, some observations that materially affect the analysis: + +- **167 source files** in `banksy-core`, plus ~8 each in the two API packages. This is a modest codebase. +- **Root-level TS artifacts** are a known, finite set: `package.json`, `package-lock.json`, `eslint.config.mjs`, `.prettierrc`, `knip.jsonc`, `.nvmrc`, `.husky/`, two Dockerfiles. +- **CI (`quality.yml`)** runs ESLint, Prettier, tests, and Knip — all via `npm ci` then `npm run`. Targets `packages/**/*.ts`. +- **CI (`build.yml`)** builds two Docker images: `tactivos/banksy` and `tactivos/banksy-mural-oauth`. +- **Lint-staged** only triggers on `*.{ts,tsx}` and `*.{json,md}` — Python files are invisible to it. +- **No production deployment.** No users. No data. No operational continuity concern. +- **The feature branch strategy is decided.** All four approaches play out on `feat/transition-to-fast-mcp`. + +The critical constraint shaping this decision: **the LLM-assisted rewrite workflow**. An agent needs to read `packages/banksy-core/src/tools/echo.ts`, understand its intent, and write `src/banksy/tools/echo.py` — ideally in the same workspace, same file tree, same context window. 
+ +--- + +## Approach-by-Approach Evaluation + +### Approach A: Nest-and-Build-Alongside + +Move all TS into `legacy/`. Build Python at root. + +| Criterion | Score | Rationale | +|-----------|:-----:|-----------| +| LLM workflow efficiency | 4 | Both codebases visible, but TS now lives at `legacy/packages/banksy-core/src/...` — deeper nesting, less intuitive paths for cross-referencing. | +| Incremental progress | 4 | After the initial "move" commit, each subsequent PR is clean. Python at root is immediately testable. | +| CI/CD transition | 3 | The "move" commit breaks all TS CI paths (`packages/` → `legacy/packages/`). Must update or disable TS CI in the same commit. Adding Python CI is straightforward after. | +| Cleanup simplicity | 4 | `rm -rf legacy/` is clean, but must also audit root for any lingering references. | +| External system impact | 4 | Dockerfiles break in the "move" commit (they reference `packages/banksy-core/` directly). Must rewrite Dockerfiles as part of the move. | +| Cognitive overhead | 3 | The `legacy/` convention is one more thing to explain. "Where's the TS code?" → "In `legacy/`." Adds a navigation layer. | +| Risk of interference | 4 | Good isolation. TS tooling is contained within `legacy/`. Root is Python-only. But the initial "move everything" commit is a 167+ file rename — a messy diff that obscures the real changes. | +| **Total** | **26** | | + +**Pros:** +- Clear physical separation between old and new +- Root is "clean Python" from the start of the build +- `legacy/` name clearly communicates intent + +**Cons:** +- The "move everything" commit is a massive rename touching every file +- Breaks every relative path in Dockerfiles, CI workflows, and TS config +- Adds a navigation layer for LLM cross-referencing +- Feels like ceremony for a pre-deployment project that nobody else is using + +--- + +### Approach B: Fork-and-Blank-Slate + +Fork to `banksy-legacy`. Strip TS from original. Start fresh. 
+ +| Criterion | Score | Rationale | +|-----------|:-----:|-----------| +| LLM workflow efficiency | 2 | Two separate repos. Agent must open both in a multi-root workspace or `git clone` the legacy repo alongside. Cross-referencing requires switching context. This is the biggest penalty — it directly degrades the core workflow. | +| Incremental progress | 4 | Every PR is pure Python, testable in isolation. No hybrid states. | +| CI/CD transition | 4 | Clean break. Delete TS CI, write Python CI. No coexistence period. | +| Cleanup simplicity | 5 | Nothing to clean up — TS was never in the new repo. | +| External system impact | 3 | The "strip all TS" commit is a drastic content change. Same repo name, but Dockerfile goes from Node to Python in one step. Fork needs to remain accessible for the rewrite duration. | +| Cognitive overhead | 3 | Repo itself is clean, but developers/agents need to know the legacy fork exists and how to reference it. Two-repo mental model. | +| Risk of interference | 5 | Zero. Only Python in the repo. | +| **Total** | **26** | | + +**Pros:** +- Cleanest possible repo state from day one +- No tooling interference +- No cleanup phase + +**Cons:** +- Fundamentally breaks the LLM cross-referencing workflow — the primary use case for keeping TS around +- Fork administration overhead (permissions, visibility, eventual archival) +- Git history discontinuity — the repo's history becomes meaningless noise +- Overkill "cleanliness" for a project with zero users + +--- + +### Approach C: Branch-Based Separation + +TS on `main`. Feature branch progressively replaces TS with Python. + +| Criterion | Score | Rationale | +|-----------|:-----:|-----------| +| LLM workflow efficiency | 3 | Feature branch starts with all TS code (inherited from main), so initially both are visible. But as TS files get deleted/replaced, reference material disappears mid-migration. Agent must `git show main:path/to/file` to recover deleted TS — friction every time. 
| +| Incremental progress | 3 | The "progressive replacement" model means the branch is a non-functional hybrid during mid-transition. Neither TS nor Python fully works. Each PR may be individually coherent, but the branch state between PRs is ambiguous. | +| CI/CD transition | 3 | TS CI inherited from main. It starts passing (TS unchanged), then starts failing as TS files are deleted. Awkward intermediate state where CI is partially broken. | +| Cleanup simplicity | 3 | TS files are deleted incrementally throughout the migration, not in a single cleanup step. This spreads the cleanup across many PRs, making it harder to verify completeness. "Did we get everything?" requires auditing the whole tree. | +| External system impact | 4 | Main stays untouched. Feature branch changes are internal until merge. | +| Cognitive overhead | 3 | "Which files on this branch are active Python and which are leftover TS?" is unclear during mid-migration. The intent of individual files is ambiguous. | +| Risk of interference | 3 | Feature branch has both `package.json` and `pyproject.toml`. Husky hooks (from TS) fire on commits. `npm ci` in existing CI tries to install Node deps alongside Python deps. Must actively manage the conflict. | +| **Total** | **22** | | + +**Pros:** +- No repo restructuring needed — simplest to start +- Main always has the last-known-working TS version +- Natural git workflow (branch and merge) + +**Cons:** +- TS reference material erodes as migration progresses +- Ambiguous branch state — neither fully TS nor fully Python at any point +- CI in a broken intermediate state for most of the migration +- Cleanup is scattered, not atomic + +--- + +### Approach D: Hybrid — Python at Root, TS Preserved as Reference + +Add Python project at root. TS code stays in `packages/`. Delete TS at the end. 
+ +| Criterion | Score | Rationale | +|-----------|:-----:|-----------| +| LLM workflow efficiency | **5** | Both codebases in same workspace, same branch, natural paths throughout the entire migration. Agent reads `packages/banksy-core/src/lib/auth/provider.ts` → writes `src/banksy/auth/oauth.py`. Everything visible simultaneously, start to finish. This is the ideal setup for the LLM-assisted rewrite. | +| Incremental progress | **5** | PR1 adds `pyproject.toml` + `src/banksy/server.py` — immediately testable with `uv run`. PR2 adds OpenAPI tools — testable. TS code is inert reference material, never interfering with Python testing. Each PR adds Python functionality that can be independently verified. | +| CI/CD transition | 4 | Add a new Python CI workflow (`python.yml`) alongside existing TS CI. Both run during transition — TS CI passes because TS code is unchanged, Python CI validates new code. At cleanup, delete TS CI. Clean parallel operation. | +| Cleanup simplicity | **5** | Atomic cleanup in one PR: delete `packages/`, `package.json`, `package-lock.json`, `eslint.config.mjs`, `.prettierrc`, `knip.jsonc`, `.nvmrc`, `.husky/`, old Dockerfiles. Known, finite list of paths. Single `rm -rf packages/ && rm package.json package-lock.json ...` commit. | +| External system impact | **5** | TS code is never moved, broken, or modified. Dockerfiles are untouched until the cleanup PR deliberately replaces them. External systems (K8s manifests, image names) reference `tactivos/banksy` — nothing changes until the final PR swaps the Dockerfile to Python, producing the same image name. | +| Cognitive overhead | 4 | Clear mental model: `packages/` = TS reference (read-only), `src/banksy/` = Python active work. Root has both `package.json` and `pyproject.toml`, which is mildly unusual but immediately understandable. | +| Risk of interference | 4 | npm and uv/Python tooling are completely independent — `npm install` ignores `pyproject.toml`, `uv sync` ignores `package.json`. 
ESLint targets `packages/**/*.ts`, won't touch `.py` files. Lint-staged only fires on `*.{ts,tsx}`. The one interference point: Husky hooks run on every commit — easily addressed in PR1 by replacing `.husky/` with `pre-commit` or disabling TS hooks on the feature branch. | +| **Total** | **32** | | + +**Pros:** +- Optimizes for the primary workflow (LLM reads TS, writes Python) +- Zero "ceremony" commits — no file moves, no renames, no path breakage +- Both CI systems can run in parallel without conflict +- TS code is pristine reference material throughout the entire migration +- Cleanup is a single, auditable commit at the very end + +**Cons:** +- Root-level config duplication (`package.json` + `pyproject.toml`) during transition +- Husky hooks need to be handled in PR1 (minor) +- A developer running `npm install` at root would create `node_modules/` alongside `.venv` (cosmetic; both are gitignored) + +--- + +## Novel Approach Considered + +**Approach E: TS as a Git Worktree or Submodule.** Archive TS in a separate branch, mount it as a worktree for reference. Rejected — adds git complexity for no benefit over D, since D already preserves TS in-tree with zero maintenance. + +No novel approach meaningfully improves on D for this specific situation (pre-deployment, LLM-assisted, same-repo constraint). + +--- + +## Ranked Recommendation + +| Rank | Approach | Total Score | Verdict | +|:----:|----------|:-----------:|---------| +| **1** | **D: Hybrid — Python at Root, TS Preserved** | **32** | **Recommended** | +| 2 | A: Nest-and-Build-Alongside | 26 | Viable but unnecessary ceremony | +| 2 | B: Fork-and-Blank-Slate | 26 | Clean but cripples LLM workflow | +| 4 | C: Branch-Based Separation | 22 | Worst of both worlds | + +### Why D Wins Decisively + +**The deciding factor is the LLM workflow.** This migration will be executed by Cursor agents reading TypeScript and writing Python. 
Every other approach introduces friction in that core loop: + +- **A** makes the agent navigate into `legacy/` — minor friction, but friction multiplied by hundreds of cross-references adds up. +- **B** forces the agent to switch between repos — major friction, fundamentally breaking the workflow. +- **C** progressively destroys the reference material the agent needs. +- **D** keeps everything visible, at natural paths, for the entire duration. + +The secondary factors reinforce D: no ceremony commits, parallel CI, atomic cleanup, zero external system impact during transition. The only costs (dual config at root, managing Husky hooks) are trivial one-time tasks in PR1. + +### How to Execute D + +The first commit on `feat/transition-to-fast-mcp` should: + +1. Add `pyproject.toml`, `src/banksy/__init__.py`, `src/banksy/server.py` (echo tool) +2. Add `ruff.toml` (or equivalent Python linting config) +3. Update `.gitignore` to include Python artifacts (`__pycache__/`, `.venv/`, `*.pyc`, `.mypy_cache/`) +4. Replace `.husky/pre-commit` with a Python-aware hook (or disable TS hooks on this branch) +5. Add `.github/workflows/python.yml` for Python CI +6. 
Leave **all TS files untouched** + +The last PR (PR8 in the existing plan) is the cleanup: + +```bash +rm -rf packages/ node_modules/ +rm package.json package-lock.json eslint.config.mjs .prettierrc knip.jsonc .nvmrc +rm -rf .husky/ +rm Dockerfile.mural-oauth # or rename Dockerfile.python → Dockerfile +# Delete .github/workflows/quality.yml (TS CI) +# Update .github/workflows/build.yml to build Python Docker image +``` + +This is the approach that treats the TS codebase as what it actually is: **a specification document for the rewrite, not a living system that needs to keep working.** diff --git a/fastmcp-migration/execution-strategy-research/05-toolchain-project-mgmt-monorepo.md b/fastmcp-migration/execution-strategy-research/05-toolchain-project-mgmt-monorepo.md new file mode 100644 index 0000000..a21ba88 --- /dev/null +++ b/fastmcp-migration/execution-strategy-research/05-toolchain-project-mgmt-monorepo.md @@ -0,0 +1,187 @@ +# Python Project Management & Monorepo Strategy Evaluation + +## Category 1: Project & Dependency Management + +### Candidates + +Narrowed from five to **three realistic candidates**. Hatch's workspace support is too new (v1.16.0, Nov 2025) with documented issues around non-TTY environments (Docker/CI), race conditions, and no cryptographic verification. pip+pip-tools is the baseline but lacks lockfiles, workspaces, and Python version management — too much manual glue for a greenfield project. PDM has a solid standards story but significantly lower community adoption, meaning fewer examples, fewer StackOverflow answers, and more friction onboarding contributors. 
+ +**Realistic candidates:** + +| | **uv** | **Poetry** | **PDM** | +|---|---|---|---| +| **Speed** | 10-100x faster than Poetry (Rust) | Slowest of the three | Middle ground | +| **Lockfile** | Universal cross-platform TOML (`uv.lock`), PEP 751 export support | `poetry.lock` — mature, single-platform resolution | `pdm.lock` — PEP 665 compliant | +| **Workspace/Monorepo** | Native, Cargo-inspired, single lockfile, editable installs | No native support. Open issue since 2020, no roadmap | Native workspace support | +| **Docker caching** | Excellent. Official multi-stage patterns, BuildKit cache mounts, separate dep/code layers | Good. Copy `poetry.lock` first pattern works | Decent. Similar patterns possible | +| **Python version mgmt** | Built-in (`uv python install 3.12`) | None (use pyenv separately) | None (use pyenv separately) | +| **PEP 621 compliance** | Yes, standard `pyproject.toml` | No (uses `[tool.poetry]` sections) | Yes | +| **CI/CD (GitHub Actions)** | Official `astral-sh/setup-uv` action with auto-caching | `snok/install-poetry` community action | Community actions | +| **IDE / Cursor** | Works with standard `.venv`; Cursor rules documented; Pyright config manual | Mature VSCode/Cursor integration | Less documented | +| **Community (2026)** | Dominant for new projects, rapidly growing | Large established base, migration trend away | Niche, knowledgeable but small | +| **Maturity** | v0.10+ (pre-1.0 but production-used widely) | v1.8+ (mature, stable) | v2.x (mature) | + +### Key Differentiators + +**uv wins on:** +- Raw speed (6s vs 47s for equivalent installs in benchmarks) +- Native workspace support with single lockfile +- Docker story (official patterns, cache mount docs) +- PEP 621 compliance (standard `pyproject.toml`) +- Python version management built-in (no separate pyenv) +- GitHub Actions integration (official action with auto-caching) +- Cross-platform universal lockfile + +**Poetry wins on:** +- Battle-tested maturity (years of 
production use) +- PyPI publishing workflow +- IDE integration depth (longer history of tooling support) +- Ecosystem of plugins + +**The uv risk:** pre-1.0 versioning means API surface could shift. In practice, Astral (the company behind uv and Ruff) has strong backing and the tool is already used in production by major teams. FastMCP itself recommends `uv add fastmcp` in its docs. + +### Recommendation: **uv** + +**Rationale:** +1. **FastMCP alignment** — FastMCP's own documentation uses `uv` for installation and project setup. Swimming with the current, not against it. +2. **Workspace support** — Native, mature, single-lockfile workspaces are exactly what a 1-3 package project needs. Poetry doesn't have this at all. +3. **Docker-first** — Documented multi-stage patterns with BuildKit cache mounts. The Rust binary doesn't need to be in the production image (37%+ size reduction). +4. **CI speed** — 10-100x faster dependency resolution directly translates to faster GitHub Actions runs and cheaper CI minutes. +5. **PEP 621** — Standard `pyproject.toml` means no vendor lock-in. If uv disappears, the project metadata is portable. +6. **Single tool** — Replaces pip, pip-tools, pyenv, virtualenv, and workspace management. Fewer moving parts. + +### Runner-up: **Poetry** + +**When to fall back:** +- If the project needs to publish multiple packages to PyPI with complex release workflows (Poetry's publishing story is more mature) +- If a team member has deep Poetry expertise and no appetite for learning uv +- If uv's pre-1.0 status causes organizational compliance concerns + +**Migration path:** Since uv reads standard `pyproject.toml`, migrating from Poetry to uv later is straightforward. The reverse (uv to Poetry) would require restructuring `[tool.poetry]` sections. Starting with uv preserves more optionality. 
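The Docker claims above can be made concrete with a sketch of uv's documented multi-stage pattern — hedged: the base image tags follow Astral's published examples, and the `banksy_mcp.server` module path is an assumption, not the project's actual entrypoint:

```dockerfile
# Build stage: uv exists only here, keeping the Rust binary out of the runtime image
FROM ghcr.io/astral-sh/uv:python3.12-bookworm-slim AS builder
WORKDIR /app

# Dependency layer first — cached until pyproject.toml/uv.lock change
COPY pyproject.toml uv.lock ./
RUN --mount=type=cache,target=/root/.cache/uv \
    uv sync --frozen --no-install-project --no-dev

# Project code — source changes invalidate only this layer
COPY . .
RUN --mount=type=cache,target=/root/.cache/uv \
    uv sync --frozen --no-dev

# Runtime stage: plain Python, no uv, no build tooling
FROM python:3.12-slim-bookworm
COPY --from=builder /app /app
ENV PATH="/app/.venv/bin:$PATH"
CMD ["python", "-m", "banksy_mcp.server"]
```

Copying the lockfile before the source is what makes the "separate dep/code layers" caching work: dependency resolution reruns only when the lockfile itself changes.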
+ +--- + +## Category 2: Monorepo / Workspace Strategy + +### Candidates + +Eliminating upfront: **pants/bazel** — massive overhead for a 1-3 package Python project. These are for hundreds-of-packages monorepos. **nx for Python** — immature Python support, adds Node.js as a dependency for a Python project. **Independent pyproject.toml per package** — no shared lockfile, dependency drift, more CI complexity. Not worth it for this scale. + +**Realistic candidates:** + +| | **uv workspaces** | **Single pyproject.toml with extras** | **Poetry path dependencies** | +|---|---|---|---| +| **Structure** | Root `pyproject.toml` + member `pyproject.toml` per package | One `pyproject.toml`, optional dependency groups | Per-package `pyproject.toml` with `develop = true` path deps | +| **Lockfile** | Single `uv.lock` for entire workspace | Single lockfile | Per-package `poetry.lock` (no unified lock) | +| **Cross-package deps** | `tool.uv.sources` with `workspace = true`, editable by default | Implicit (same package) | Manual path dependency wiring | +| **Targeted execution** | `uv run --package server` | N/A (single package) | `cd package && poetry run` | +| **Scalability** | Grows from 1 to N packages naturally | Collapses at 3+ packages (one bloated pyproject) | Fragile at scale (lock sync issues) | +| **Docker** | Build specific packages from workspace, shared lockfile for caching | Simple — one image, one lockfile | Complex — each package needs its own build context | +| **CI complexity** | Low — single lock, targeted test runs | Lowest — but tests everything always | High — multiple locks, complex matrix | +| **Key limitation** | All members must share compatible `requires-python` | No real package boundaries | No native workspace support, plugin required | + +### Analysis for This Project + +The project needs: a main FastMCP server + potentially shared DB models/utils + potentially a shared library. That's 1-3 packages. 
+ +**uv workspaces** — The structure would look like: + +``` +banksy-mcp/ +├── pyproject.toml # workspace root +├── uv.lock # single lockfile +├── packages/ +│ ├── server/ +│ │ ├── pyproject.toml # fastmcp, httpx deps +│ │ └── src/server/ +│ ├── db/ +│ │ ├── pyproject.toml # sqlalchemy, asyncpg deps +│ │ └── src/db/ +│ └── shared/ +│ ├── pyproject.toml # pydantic, shared types +│ └── src/shared/ +``` + +The `requires-python` limitation is a non-issue here — all packages target the same Python version. Cross-package imports work via editable installs. Docker builds copy `uv.lock` first for layer caching, then install specific packages. + +**Single pyproject.toml with extras** — The structure: + +``` +banksy-mcp/ +├── pyproject.toml # everything in one file +├── uv.lock +└── src/ + ├── server/ + ├── db/ + └── shared/ +``` + +With dependency groups like `[project.optional-dependencies]` having `db = ["sqlalchemy", "asyncpg"]` and `server = ["fastmcp", "httpx"]`. This is simpler but loses package boundaries — you can't `uv run --package db` to run just the DB tests, and imports aren't enforced across boundaries. + +### Recommendation: **Start with single pyproject.toml, graduate to uv workspaces when needed** + +**Rationale:** + +This is a **pragmatic two-phase approach**: + +**Phase 1 (start here):** Single `pyproject.toml` with well-organized `src/` directory structure and optional dependency groups. For 1 package (just the server), a workspace is overhead. Even for the DB layer, it can start as a module within the same package. 
+ +```toml +[project] +name = "banksy-mcp" +requires-python = ">=3.12" +dependencies = [ + "fastmcp>=3.1", + "pydantic>=2.0", + "httpx>=0.27", + "sqlalchemy[asyncio]>=2.0", + "asyncpg>=0.30", +] + +[project.optional-dependencies] +dev = ["pytest", "pytest-asyncio", "ruff", "mypy"] +``` + +``` +banksy-mcp/ +├── pyproject.toml +├── uv.lock +├── src/ +│ └── banksy_mcp/ +│ ├── __init__.py +│ ├── server.py # FastMCP server +│ ├── tools/ # MCP tools +│ ├── db/ # Database models, sessions +│ └── shared/ # Shared types, utils +├── tests/ +├── Dockerfile +└── .github/workflows/ +``` + +**Phase 2 (when complexity demands it):** When you genuinely need separate packages (e.g., the DB layer is shared with another service, or you want independent versioning), promote to uv workspaces. The migration is straightforward because: +- uv is already the package manager +- `pyproject.toml` is PEP 621 compliant +- You're just splitting one package into members + +The trigger to graduate: when you find yourself wanting `import db_models` from a second, separate service, or when the single `pyproject.toml` dependencies list exceeds ~30 entries and test isolation becomes painful. + +### Runner-up: **uv workspaces from day one** + +**When to use instead:** +- If you're already confident you'll have 3 packages from the start +- If team members will work on different packages independently +- If you want to enforce import boundaries from the beginning + +The cost is ~30 minutes of initial setup and slightly more `pyproject.toml` files to maintain. The benefit is cleaner boundaries. For a solo developer or small team on a new project, this is premature structure — but it's not wrong, just early. 
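When the graduation trigger fires, the Phase 2 change is mostly a `pyproject.toml` reshuffle. A hedged sketch of the workspace wiring, assuming the `packages/` layout from the earlier tree (member names are illustrative):

```toml
# Root pyproject.toml — declares the workspace
[tool.uv.workspace]
members = ["packages/*"]

# packages/server/pyproject.toml — depends on sibling members
# (shown here inline for illustration; it lives in its own file)
# [project]
# name = "server"
# dependencies = ["db", "shared"]
#
# [tool.uv.sources]
# db = { workspace = true }
# shared = { workspace = true }
```

With `workspace = true` sources, members are installed editable, so cross-package imports resolve without publishing anything.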
+ +--- + +## Summary Table + +| Decision | Primary | Runner-up | Avoid | +|---|---|---|---| +| **Package Manager** | uv | Poetry | pip+pip-tools, hatch | +| **Monorepo Strategy** | Single pyproject.toml → uv workspaces | uv workspaces from day one | pants, bazel, nx | +| **Lockfile** | `uv.lock` (universal, cross-platform) | — | No lockfile | +| **Python Version** | `>=3.12` (managed by uv) | — | 3.10 (missing asyncio improvements) | +| **CI Setup** | `astral-sh/setup-uv@v7` | — | Manual pip caching | diff --git a/fastmcp-migration/execution-strategy-research/06-toolchain-linting-typing-testing-hooks.md b/fastmcp-migration/execution-strategy-research/06-toolchain-linting-typing-testing-hooks.md new file mode 100644 index 0000000..b78f65e --- /dev/null +++ b/fastmcp-migration/execution-strategy-research/06-toolchain-linting-typing-testing-hooks.md @@ -0,0 +1,290 @@ +# Python Development Tooling Evaluation: Categories 3, 4, 5, 10 + +## Category 3: Linting & Formatting + +### Candidates + +| Tool | Role | Language | Speed | Maturity | +|---|---|---|---|---| +| **Ruff (lint + format)** | All-in-one linter + formatter | Rust | 10-100x faster than alternatives | Production-ready, 46k+ GitHub stars | +| **Black + Flake8 + isort** | Traditional trio | Python | Baseline | Mature, widely adopted | +| **Black + Ruff (lint only)** | Black formats, Ruff lints | Mixed | Fast linting, normal formatting | Both mature | +| **Pylint** | Deep analysis linter | Python | Slow | Mature but declining mindshare | + +### Comparison + +**Ruff (lint + format)** has crossed the maturity threshold decisively. It's used in production by FastAPI, Pandas, Apache Airflow, Hugging Face, and SciPy. It implements 800+ lint rules from 50+ rule sets (pycodestyle, pyflakes, flake8-bugbear, isort, pyupgrade, bandit, pydocstyle, and more). The formatter is >99.9% Black-compatible. Critically for this project, Ruff is built by Astral — the same team behind `uv` — so the toolchain cohesion is tight. 
Configuration lives in a single `[tool.ruff]` section in `pyproject.toml`. + +**Black + Flake8 + isort** requires three separate tools, three configs, three CI steps. Each is Python-based and slower. The only advantage is "this is what everyone has used for years," which is diminishing as Ruff adoption accelerates. + +**Black + Ruff (lint only)** is a hedge: "I trust Ruff for linting but not formatting." This made sense in 2023 when the Ruff formatter was new. By 2026, the formatter has been stable for 2+ years and is used by projects far larger than this one. + +**Pylint** does deeper analysis (including some things Ruff doesn't cover, like checking method resolution order or certain design patterns), but it's dramatically slower and the extra coverage rarely justifies the developer friction. Ruff's `PLC`/`PLE`/`PLW` rule families import Pylint's most valuable rules anyway. + +**IDE integration**: Ruff has a first-party VS Code extension (works in Cursor), providing real-time diagnostics, auto-fix on save, and format-on-save. It replaces the need for separate Black, Flake8, and isort extensions. + +**FastMCP compatibility**: FastMCP itself uses Ruff. Following the same tooling means consistent style when reading FastMCP source code or contributing upstream. + +**Starter configuration for this project:** + +```toml +[tool.ruff] +target-version = "py312" +line-length = 88 + +[tool.ruff.lint] +select = [ + "E", "W", # pycodestyle + "F", # pyflakes + "I", # isort + "B", # flake8-bugbear + "UP", # pyupgrade + "S", # bandit (security) + "ASYNC", # flake8-async + "RUF", # ruff-specific rules +] +ignore = ["B008"] # FastAPI/FastMCP use function calls in defaults + +[tool.ruff.format] +quote-style = "double" +docstring-code-format = true +``` + +### Recommendation: **Ruff (lint + format)** + +Single tool, single config, 10-100x faster, same vendor as `uv`, actively used by FastMCP itself. There is no compelling reason to introduce the complexity of multiple tools. 
+ +### Runner-up: **Black + Ruff (lint only)** + +If the team has strong muscle-memory around Black and doesn't want to trust Ruff's formatter, use Black for formatting and Ruff for linting. The cost is two tools instead of one and a minor config overhead, but both are well-supported. + +--- + +## Category 4: Type Checking + +### Candidates + +| Tool | Author | Language | Speed | Pydantic V2 Support | +|---|---|---|---|---| +| **Pyright** | Microsoft | TypeScript/Node | Fast | Native (dataclass transforms), some gaps | +| **mypy** | Python community | Python (mypyc) | Slower | Via official plugin, comprehensive | +| **ty** | Astral | Rust | 10-60x faster than mypy/Pyright | Planned, not yet available | +| **No type checker** | — | — | — | — | + +### Comparison + +**Pyright** is fast, has excellent VS Code/Cursor integration (it powers Pylance), and does aggressive type inference. It handles standard Python typing well and is the best option for real-time editor feedback. However, it has **known limitations with Pydantic V2**: it uses Python's standard `dataclass_transform` support, which doesn't fully capture Pydantic's dynamic `__init__` generation. This means false positives on valid Pydantic patterns (e.g., settings classes with env-sourced defaults, some `model_construct()` usage). Pydantic maintains a Pyright test suite, but acknowledges gaps. + +**mypy** with the Pydantic plugin is the most accurate type checker for Pydantic-heavy code. The plugin generates correct `__init__` signatures, respects `Config.frozen`, `Config.extra`, field aliases, and `model_construct()`. This matters significantly for a project that uses Pydantic models extensively (which FastMCP does — tool parameters, resource models, server configuration are all Pydantic). The downside is speed: mypy is slower, especially without caching. However, for a project of this size (1-3 packages), mypy runs in seconds, not minutes. 
The speed difference is material for 100k+ line codebases, not for this project. + +**ty (Astral)** entered beta in December 2025. It's blazingly fast (10-60x faster than mypy/Pyright) and has a language server. Astral has announced planned first-class Pydantic support, but it's not available yet. It's too early for a project that needs solid Pydantic checking from day one. Worth revisiting in 6-12 months. + +**No type checker** is a false economy. This project uses Pydantic extensively, has async code where type errors are subtle, and will be maintained by a team. The cost of adding types is low (Pydantic models already define them), and the payoff in bug prevention is high. + +**The "use both" strategy**: Pyright/Pylance runs in the editor for instant feedback. mypy runs in CI as the authoritative gate. This is the approach recommended by the Pydantic team and by practitioners working with Pydantic V2. The two tools catch slightly different things, and the overlap is not a problem — it's defense in depth. + +**Practical concern**: Running two type checkers in CI doubles type-checking time. For this project's size, that's seconds, not minutes. If it becomes a concern, drop mypy from CI and rely on Pyright alone, accepting the Pydantic edge-case gaps. 
+ +### Recommendation: **Pyright (primary) + mypy with Pydantic plugin (CI gate)** + +- Pyright/Pylance for local development: instant feedback, excellent Cursor integration, aggressive inference +- mypy with `pydantic.mypy` plugin in CI: catches Pydantic-specific issues Pyright misses +- If forced to pick one: **Pyright**, because developer experience matters more than catching Pydantic edge cases that are also caught by runtime validation + +**Configuration sketch:** + +```toml +# pyproject.toml + +[tool.pyright] +pythonVersion = "3.12" +typeCheckingMode = "standard" +reportUnnecessaryTypeIgnoreComment = true + +[tool.mypy] +python_version = "3.12" +plugins = ["pydantic.mypy"] +strict = false +warn_return_any = true +warn_unused_configs = true +disallow_untyped_defs = true + +[tool.pydantic-mypy] +init_forbid_extra = true +init_typed = true +warn_required_dynamic_aliases = true +``` + +### Runner-up: **Pyright only** + +If maintaining two type checkers feels like overhead, Pyright alone is good enough. It catches the vast majority of real bugs, has the best editor experience, and Pydantic V2's `dataclass_transform` support covers the common cases. Accept that a few Pydantic-specific patterns won't be fully checked statically — Pydantic's runtime validation catches them anyway. + +**Future watch**: **ty** from Astral. When it ships Pydantic support, it could replace both mypy and Pyright with a single, fast, Ruff-ecosystem-integrated tool. Monitor for stable release. 
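To make the "runtime validation catches them anyway" point concrete, a small sketch (the model is hypothetical, not from the project):

```python
from typing import Any

from pydantic import BaseModel, ValidationError


class ToolParams(BaseModel):
    mural_id: str
    limit: int


# An `Any`-typed payload defeats static checking entirely — neither
# Pyright nor mypy can flag the bad `limit` value below.
payload: Any = {"mural_id": "m-123", "limit": "ten"}

try:
    ToolParams(**payload)
    validated = True
except ValidationError:
    # Pydantic rejects "ten" for an int field at runtime
    validated = False
```

Static gaps in Pydantic edge cases are annoying, not fatal: the model boundary still enforces the types when the code runs.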
+ +--- + +## Category 5: Testing Framework + +### Candidates + +| Tool | Role | Async Support | FastMCP Integration | +|---|---|---|---| +| **pytest + pytest-asyncio** | Full test framework | First-class (`auto` mode) | Native — FastMCP ships test utilities for it | +| **unittest** | Standard library | `IsolatedAsyncioTestCase` | None | +| **ward** | Alternative framework | Limited | None | +| **hypothesis** | Property-based testing | Via pytest plugin | Complement, not replacement | + +### Comparison + +**pytest + pytest-asyncio** is the only realistic choice, and this evaluation is largely about confirming that and documenting the setup. + +FastMCP's testing story is explicitly built on pytest. The framework ships: +- `Client(transport=server)` for in-memory testing without network overhead +- `fastmcp.utilities.tests` module with `run_server_async`, `run_server_in_process`, `temporary_settings`, `HeadlessOAuth` +- Documentation assumes `asyncio_mode = "auto"` configuration +- Thousands of tests in the FastMCP repo itself as reference material + +The in-memory `Client` pattern is the core of the testing approach: instantiate your `FastMCP` server, pass it directly to `Client()`, and test tools/resources/prompts without subprocess overhead. This is elegant and fast. + +**pytest-asyncio** reached v1.3.0 (November 2025) with stable `auto` mode. Key recent changes: +- v1.0.0 removed the deprecated `event_loop` fixture (clean break) +- `asyncio_mode = "auto"` eliminates boilerplate `@pytest.mark.asyncio` decorators +- `asyncio_default_test_loop_scope` config option available for controlling loop lifecycle + +**unittest** would work technically but requires fighting FastMCP's testing utilities, loses pytest's fixture system (critical for async setup/teardown), loses `parametrize`, and gains nothing. There is no reason to choose it. + +**ward** has a small community and no FastMCP integration. It would mean rewriting all testing patterns from scratch. Not viable. 
+ +**hypothesis** is a valuable complement to pytest for property-based testing of Pydantic model validation, tool parameter edge cases, and data transformation logic. Add it when the test suite matures, not at project bootstrap. + +**Recommended test structure:** + +``` +tests/ +├── conftest.py # Shared fixtures (client, server, db) +├── test_tools/ +│ ├── test_mural_tools.py +│ └── test_workspace_tools.py +├── test_resources/ +│ └── test_mural_resources.py +├── test_auth/ +│ └── test_oauth_flow.py +└── test_integration/ + └── test_end_to_end.py +``` + +**Recommended libraries alongside pytest:** + +| Library | Purpose | +|---|---| +| `pytest-asyncio` | Async test support | +| `pytest-cov` | Coverage reporting | +| `inline-snapshot` | Readable assertions on complex structures (recommended by FastMCP docs) | +| `dirty-equals` | Flexible equality for dynamic values like timestamps/UUIDs | +| `pytest-xdist` | Parallel test execution (when suite grows) | +| `respx` or `aioresponses` | HTTP mocking for async code | + +**Configuration:** + +```toml +[tool.pytest.ini_options] +asyncio_mode = "auto" +testpaths = ["tests"] +addopts = "-v --tb=short --strict-markers" +markers = [ + "integration: marks tests requiring database/external services", + "slow: marks tests that take >1s", +] +``` + +### Recommendation: **pytest + pytest-asyncio** + +This is not a close call. FastMCP's entire testing infrastructure assumes pytest. The `Client(transport=server)` in-memory pattern is clean and fast. `asyncio_mode = "auto"` eliminates async boilerplate. Add `inline-snapshot` and `dirty-equals` per FastMCP's own documentation recommendations. + +### Runner-up: N/A + +There is no realistic runner-up. If pytest somehow became unavailable, you'd use unittest's `IsolatedAsyncioTestCase` and lose significant productivity. 
+ +**Future addition**: **hypothesis** for property-based testing of Pydantic model validation boundaries and tool parameter edge cases, once the core test suite is established. + +--- + +## Category 10: Pre-commit / Git Hooks + +### Candidates + +| Tool | Language | Parallel Execution | Community Hooks | uv Compatibility | +|---|---|---|---|---| +| **pre-commit** | Python | No (sequential) | Extensive ecosystem | Via `pre-commit-uv` plugin | +| **lefthook** | Go | Yes | No shared ecosystem | Language-agnostic | +| **No hooks** | — | — | — | — | + +### Comparison + +**pre-commit** is the standard in the Python ecosystem. The key advantage is its hook ecosystem: ruff, mypy, pyright, and most Python tools ship pre-commit hook definitions. Configuration is declarative YAML. The main friction points: +1. **uv compatibility**: Not native. The `pre-commit-uv` plugin patches pre-commit to use uv for hook environment creation, which works but is a community workaround, not an official integration. Install via `uv tool install pre-commit --with pre-commit-uv`. +2. **Sequential execution**: Hooks run one after another. For a project this size, the total hook time is under 5 seconds with Ruff (sub-second) + type checking (2-3 seconds), so parallelism doesn't matter. +3. **Environment isolation**: pre-commit creates isolated virtualenvs per hook, which is a strength (hooks don't pollute your dev env) but adds first-run setup time. + +**lefthook** is faster (parallel execution, Go binary, no runtime deps) and language-agnostic. However, it doesn't have a hook ecosystem — you write shell commands that invoke your tools directly. For a pure Python project where all tools are already configured in `pyproject.toml`, this is fine. 
The configuration is arguably simpler: + +```yaml +# lefthook.yml +pre-commit: + parallel: true + commands: + lint: + glob: "*.py" + run: ruff check --fix {staged_files} && ruff format {staged_files} + typecheck: + run: pyright +``` + +The downside: you're responsible for ensuring the tools are installed in the dev environment, whereas pre-commit manages that automatically. + +**No hooks** is viable if CI is fast and the team is disciplined. The argument: hooks add developer friction (slow commits, fighting the hook when you want to commit WIP), and CI catches everything anyway. The counter-argument: catching lint errors before push saves a CI round-trip (2-5 minutes). + +**A middle ground**: Run only fast checks (Ruff lint + format) as a pre-commit hook, and leave type checking to CI. Ruff completes in under a second even on large projects, so the friction is negligible. + +### Recommendation: **pre-commit (with pre-commit-uv)** + +For a Python project using uv, pre-commit with the `pre-commit-uv` plugin is the pragmatic choice. The ecosystem integration is unmatched — ruff, mypy, and pyright all ship pre-commit hook configs. The isolated environments mean new contributors run `pre-commit install` and get consistent tooling without manual setup. + +**Keep hooks fast**: Only run Ruff (lint + format) in the pre-commit hook. Leave type checking for CI. This keeps commit-time friction under 1 second. + +**Configuration:** + +```yaml +# .pre-commit-config.yaml +repos: + - repo: https://github.com/astral-sh/ruff-pre-commit + rev: v0.9.0 # pin to current version + hooks: + - id: ruff + args: [--fix] + - id: ruff-format +``` + +**Installation for contributors:** + +```bash +uv tool install pre-commit --with pre-commit-uv +pre-commit install +``` + +### Runner-up: **lefthook** + +If the team prefers a language-agnostic, faster tool and doesn't need the pre-commit hook ecosystem (since Ruff is the only hook that matters), lefthook is cleaner. 
It's a single Go binary, supports parallel execution, and the configuration is straightforward. The tradeoff is that you lose automatic tool isolation — developers need Ruff installed in their environment, which `uv` handles anyway via dev dependencies. + +**Also acceptable: No hooks** if the team commits to fast CI feedback loops and doesn't mind the occasional "oops, forgot to format" push. Add hooks later when the team grows or the friction becomes worth it. + +--- + +## Summary Table + +| Category | Recommendation | Runner-up | Rationale | +|---|---|---|---| +| **3: Lint + Format** | Ruff (all-in-one) | Black + Ruff (lint) | Single tool, fastest, same vendor as uv, used by FastMCP | +| **4: Type Checking** | Pyright (editor) + mypy (CI) | Pyright only | Pyright for speed/DX, mypy plugin for Pydantic accuracy | +| **5: Testing** | pytest + pytest-asyncio | N/A | FastMCP's testing utilities require it; no viable alternative | +| **10: Git Hooks** | pre-commit + pre-commit-uv | lefthook | Ecosystem hooks, auto-isolation; keep hooks fast (Ruff only) | diff --git a/fastmcp-migration/execution-strategy-research/07-toolchain-db-http-config.md b/fastmcp-migration/execution-strategy-research/07-toolchain-db-http-config.md new file mode 100644 index 0000000..0b12370 --- /dev/null +++ b/fastmcp-migration/execution-strategy-research/07-toolchain-db-http-config.md @@ -0,0 +1,232 @@ +# Library Evaluation: Categories 6-9 for FastMCP Server + +## Category 6: Database ORM / Query Layer + +### Candidates + +| Library | Async Native | Pydantic Integration | Maturity | Migration Story | +|---|---|---|---|---| +| **SQLAlchemy 2.0 (async)** | Via asyncpg driver | Manual mapping | Industrial | Alembic (gold standard) | +| **SQLModel** | Via SQLAlchemy async | Models *are* Pydantic | Moderate | Alembic (same engine) | +| **asyncpg (raw)** | Native | None (manual) | Very high | Manual / external tool | +| **Tortoise ORM** | Native | Pydantic export only | Moderate | Built-in (basic) | 
+ +**Piccolo** and **databases** are excluded — Piccolo's community is too small for a production bet, and `databases` (encode) has been effectively unmaintained since 2023. + +### Comparison + +**SQLAlchemy 2.0 async** — The safe default. Uses asyncpg under the hood (`postgresql+asyncpg://`). Full relationship mapping, identity map, unit-of-work. Alembic auto-generates migrations from model diffs. The cost: you maintain *two* model layers — SQLAlchemy models for the DB and Pydantic models for your API/tools. For 3-5 tables, that duplication is a mild annoyance, not a real problem. + +**SQLModel** — Tiangolo's bridge: a single class is both a Pydantic `BaseModel` and a SQLAlchemy `Table`. Eliminates the two-model problem entirely. Async works (`sqlmodel.ext.asyncio.session.AsyncSession` + asyncpg), though the docs are sparse and you'll need the `greenlet` dependency. The model-is-Pydantic story is compelling for a FastMCP project where every tool input/output is already Pydantic. The risk: SQLModel is a relatively thin wrapper, and when you hit an edge case (complex queries, custom column types), you drop down to raw SQLAlchemy anyway — at which point the abstraction cost is net-negative. For a simple schema this rarely matters. + +**asyncpg (raw)** — Maximum performance, zero abstraction. You write SQL, you get tuples/Records back. For 3-5 tables with straightforward CRUD (insert token, lookup by user, delete expired sessions), raw asyncpg is arguably the *right* level of abstraction. The lifespan pattern fits perfectly: + +```python +@lifespan +async def db_lifespan(server): + pool = await asyncpg.create_pool(dsn=settings.database_url) + yield {"db": pool} + await pool.close() +``` + +The cost: no auto-migrations, no model validation at the DB layer, and you're writing SQL strings (or using a query builder like `pypika`). + +**Tortoise ORM** — Async-native with a Django-like API. 
Simpler than SQLAlchemy, but the community is significantly smaller and the Pydantic integration is export-only (`.from_tortoise_orm()` / `pydantic_model_creator()`), not native. For a Pydantic-first project, this mismatch is friction you don't need. + +### FastMCP DI/Lifespan Integration + +All four candidates integrate cleanly with FastMCP's lifespan pattern. The lifespan yields a context dict; tools access it via `ctx.lifespan_context["db"]`. The key difference is *what* you yield: + +- **SQLAlchemy/SQLModel**: Yield an `async_sessionmaker`; tools create sessions per-call +- **asyncpg**: Yield the connection pool; tools do `async with pool.acquire() as conn` +- **Tortoise**: `Tortoise.init()` in lifespan setup, `Tortoise.close_connections()` in teardown + +### Recommendation: **SQLAlchemy 2.0 async** + +**Rationale**: The schema is simple *now*, but OAuth token storage, user records, and sessions have a way of growing — refresh token rotation needs foreign keys back to users, sessions need expiry indexes, and you'll likely want to join across tables. SQLAlchemy handles all of this without friction, Alembic gives you real migration tooling from day one, and the async story with asyncpg is fully production-ready. The two-model overhead (SQLAlchemy model + Pydantic model) is trivial for 3-5 tables and buys you clean separation between DB concerns and API concerns. + +**Runner-up: asyncpg (raw)** + +If the schema stays truly simple (token CRUD, user lookup, session check) and you're confident it won't grow meaningfully, raw asyncpg with hand-written SQL is the lightest-weight, fastest, and most debuggable option. Pair with dbmate for migrations (see Category 7). This is the "YAGNI" pick — no ORM overhead for a problem that may never need an ORM. 
+ +**Why not SQLModel?** It's tempting for the Pydantic unification, but the async documentation gaps, the `greenlet` dependency, and the tendency to drop into raw SQLAlchemy at the first non-trivial query make it a leaky abstraction. For a 3-5 table schema, maintaining separate Pydantic and SQLAlchemy models is less work than debugging SQLModel's edge cases. + +--- + +## Category 7: Database Migrations + +### Candidates + +| Tool | ORM Coupling | Auto-generate | Language | Complexity | +|---|---|---|---|---| +| **Alembic** | SQLAlchemy required | Yes (model diff) | Python | Medium | +| **dbmate** | None | No (manual SQL) | Go binary | Low | +| **yoyo-migrations** | None | No (manual SQL) | Python | Low | +| **Manual SQL files** | None | No | N/A | Minimal | + +### Comparison + +**Alembic** — The natural pairing with SQLAlchemy. `alembic revision --autogenerate` diffs your models against the DB and produces migration scripts. For a greenfield project, this is a powerful workflow: change the model, auto-generate, review, apply. The downside is setup overhead (alembic.ini, env.py, version table) and the fact that autogeneration isn't magic — it misses data migrations, index changes can be wrong, and you always need to review the generated code. For 3-5 tables, the initial setup cost is the same as for 50 tables, but the ongoing benefit is proportionally smaller. + +**dbmate** — A single Go binary (`brew install dbmate`), no Python dependency, uses plain `.sql` files with `-- migrate:up` / `-- migrate:down` markers. Runs migrations atomically in transactions. Docker-friendly (just mount the migrations dir). Tracks applied migrations in a `schema_migrations` table. For a simple schema, this is the lowest-friction option: write SQL, run `dbmate up`. The cost: no auto-generation, so you write every migration by hand — which for 3-5 initial tables and occasional additions is perfectly fine. + +**yoyo-migrations** — Similar philosophy to dbmate but Python-native. 
Supports both raw SQL and Python migration files. Slightly more setup than dbmate, slightly less community adoption. The Python-native angle is nice if you want to avoid a Go binary in your toolchain. + +**Manual SQL files** — Version-numbered `.sql` files applied by a script or `psql`. Viable for the initial schema, but you'll reinvent migration tracking (which migrations have been applied?) and lose idempotency guarantees. Not recommended even for simple schemas — the tools above cost almost nothing and give you tracking for free. + +### Recommendation: **Alembic** (if SQLAlchemy is the ORM choice) / **dbmate** (if asyncpg raw) + +**If you chose SQLAlchemy (Category 6 primary)**: Use **Alembic**. The auto-generation workflow pays for itself immediately, even with 3-5 tables. Model-migration consistency is valuable, and Alembic is the undisputed standard in the Python ecosystem. The initial setup is a one-time cost. + +**If you chose asyncpg raw (Category 6 runner-up)**: Use **dbmate**. No ORM means Alembic's auto-generation can't help you. dbmate gives you tracked, versioned, transactional SQL migrations with minimal ceremony. It runs as a standalone binary, so it works in Docker `initContainers` or as a pre-start step. + +**Runner-up: yoyo-migrations** + +If you prefer an all-Python toolchain and don't want a Go binary, yoyo is the equivalent of dbmate within the Python ecosystem. Slightly less polished but functionally equivalent. + +--- + +## Category 8: HTTP Client + +### Candidates + +| Library | Async | FastMCP Compat | Retry Story | Auth Story | +|---|---|---|---|---| +| **httpx** | Native | **Required** | Via transport/stamina | Built-in `Auth` class | +| **aiohttp** | Native | Incompatible | Built-in | Manual | +| **requests** | No | Incompatible | urllib3 built-in | Prepared requests | + +### The Constraint + +**This is not a real choice.** FastMCP's `from_openapi()` requires `httpx.AsyncClient`. Full stop. 
The evaluation here is about *how to use httpx effectively*, not whether to use it. + +### httpx: What You Get + +**Connection pooling**: Built-in. `httpx.AsyncClient` maintains a connection pool by default (`max_connections=100`, `max_keepalive_connections=20`). Create one client in the lifespan, reuse across all tool calls: + +```python +@lifespan +async def http_lifespan(server): + client = httpx.AsyncClient(timeout=30.0) + yield {"http": client} + await client.aclose() +``` + +**Timeout configuration**: Granular — `httpx.Timeout(connect=5.0, read=30.0, write=10.0, pool=5.0)` or a single float for all. + +**Auth token injection**: httpx has a first-class `Auth` class with `async_auth_flow()`. This is the correct pattern for injecting OAuth tokens and handling refresh: + +```python +class MuralOAuth(httpx.Auth): + async def async_auth_flow(self, request): + token = await self.get_valid_token() # Checks expiry, refreshes if needed + request.headers["Authorization"] = f"Bearer {token}" + yield request +``` + +The `Auth` class supports response-aware retries too — you can `yield request` again after a 401 to retry with a fresh token. + +**Retry patterns** — Three options: + +1. **httpx built-in transport retries**: `httpx.AsyncHTTPTransport(retries=3)` — retries only on connection failures, not HTTP errors. Too limited for production. + +2. **httpx-retries** (actively maintained, ~2.3M downloads/month): Transport-level retry with backoff, jitter, and configurable status codes. API mirrors urllib3's `Retry`: + + ```python + from httpx_retries import Retry, AsyncRetryTransport + retry = Retry(total=3, backoff_factor=0.5, status_forcelist=[502, 503, 504]) + transport = AsyncRetryTransport(retry=retry) + client = httpx.AsyncClient(transport=transport) + ``` + +3. **stamina** (by Hynek Schlawack): General-purpose retry decorator wrapping Tenacity. Works with any async function. Built-in Prometheus/structlog instrumentation. 
Better for retrying at the *application* level (retry the whole tool operation) rather than at the transport level. + +### Recommendation: **httpx** (mandatory) with **httpx-retries** for transport retries + **httpx.Auth** for token injection + +**Configuration**: +- Create a single `httpx.AsyncClient` in the lifespan with connection pool settings +- Use `AsyncRetryTransport` for 502/503/504 retries with exponential backoff +- Implement a custom `httpx.Auth` subclass with `async_auth_flow()` for per-user OAuth token injection and 401-triggered refresh +- Consider **stamina** as an *additional* layer for tool-level retries on transient errors (distinct from HTTP transport retries) + +**There is no runner-up.** httpx is the only option. The supplementary libraries (httpx-retries, stamina) are complementary, not alternatives. + +--- + +## Category 9: Configuration Management + +### Candidates + +| Library | Type Safety | Env Vars | .env Files | Validation | Pydantic Native | +|---|---|---|---|---|---| +| **pydantic-settings** | Full | Yes | Yes | Full Pydantic | Yes | +| **python-dotenv** | None | No (loads only) | Yes | None | No | +| **dynaconf** | Partial | Yes | Yes | Basic | No | +| **environs** | Partial | Yes | Yes | Marshmallow | No | +| **os.environ** | None | Yes | No | None | No | + +### The Constraint (Again) + +FastMCP internally uses Pydantic Settings with this priority: **explicit params > env vars > .env > defaults**. Your project is Pydantic-native by definition. Using anything *other* than pydantic-settings would create a parallel, incompatible configuration system. 
+ +### Comparison + +**pydantic-settings** (v2.13+) — `BaseSettings` gives you: +- Type-safe field definitions with validation +- Automatic env var loading (with configurable prefix) +- `.env` file support (via python-dotenv integration) +- Nested model support with `env_nested_delimiter` +- Secret file support (`_secrets_dir`) +- CLI argument parsing (new in v2) +- Default value validation (unique to Settings, not BaseModel) + +This is the obvious, correct, and only reasonable choice for a Pydantic-native project. + +**python-dotenv** — Not a configuration *manager*, just a `.env` file loader. pydantic-settings uses it internally when `env_file` is set. You'll install it as a dependency of pydantic-settings, not as a standalone tool. + +**dynaconf** — A powerful multi-layer config system (env vars, TOML, YAML, Redis, Vault). Overkill for this project and introduces a non-Pydantic type system. Would make sense if you needed runtime config switching or multi-environment TOML files, but Docker + env vars + pydantic-settings covers this. + +**environs** — Marshmallow-based env parsing. In a Pydantic project, using Marshmallow for config is a pointless dependency mismatch. + +**os.environ** — No validation, no typing, no defaults, no .env support. Never appropriate when pydantic-settings exists. + +### Recommendation: **pydantic-settings** + +**Rationale**: There is no meaningful alternative in a Pydantic-native project. 
pydantic-settings is the canonical solution, it's what FastMCP uses internally, and it provides everything you need with zero friction: + +```python +from pydantic_settings import BaseSettings, SettingsConfigDict + +class Settings(BaseSettings): + model_config = SettingsConfigDict( + env_prefix="BANKSY_", + env_file=".env", + env_file_encoding="utf-8", + ) + + database_url: str + mural_api_base_url: str = "https://app.mural.co/api" + mural_client_id: str + mural_client_secret: str # from env, never default + server_port: int = 3001 + log_level: str = "INFO" +``` + +**Dependencies**: `pydantic-settings` (pulls in `python-dotenv` automatically for `.env` support). + +**Runner-up**: None. This is a settled question. If you need multi-environment TOML/YAML layering *in addition* to pydantic-settings, you could add dynaconf, but that's an additive concern, not an alternative. + +--- + +## Summary Table + +| Category | Primary | Runner-up | Locked? | +|---|---|---|---| +| **6. ORM/Query** | SQLAlchemy 2.0 async | asyncpg raw | No — real choice | +| **7. Migrations** | Alembic (with SQLAlchemy) / dbmate (with asyncpg) | yoyo-migrations | Follows Cat. 6 | +| **8. HTTP Client** | httpx + httpx-retries + Auth class | — | Yes (FastMCP constraint) | +| **9. Configuration** | pydantic-settings | — | Yes (Pydantic constraint) | + +Categories 8 and 9 are effectively locked by FastMCP's architecture — the evaluation confirms the constrained choices and identifies the supplementary tools (httpx-retries, stamina, python-dotenv) that round them out. + +The real decision is Category 6, and it cascades into Category 7. The choice between SQLAlchemy and raw asyncpg depends on how much you trust the "3-5 tables, simple schema" promise to hold long-term. 
diff --git a/fastmcp-migration/fastmcp-auth-strategy.md b/fastmcp-migration/fastmcp-auth-strategy.md new file mode 100644 index 0000000..092aefc --- /dev/null +++ b/fastmcp-migration/fastmcp-auth-strategy.md @@ -0,0 +1,168 @@ +# FastMCP Auth Strategy Assessment + +## Current Banksy Auth Architecture + +Banksy has a **two-layer auth model**: + +- **Layer 1 (User session):** Better Auth manages browser sessions (HTTP-only cookies). Two modes select the upstream identity provider: + - `sso-proxy` -- Google OAuth routed through an SSO proxy that rewrites redirect URIs + - `mural-oauth` -- Direct Mural OAuth (authorization_code grant) +- **Layer 2 (API tokens):** After the user session is established, Mural API access/refresh tokens are stored in Postgres (`muralOauthToken` / `muralSessionToken` tables) and refreshed automatically before expiry. + +Token flow to downstream MCP tool servers: + +```mermaid +sequenceDiagram + participant Client as MCP Client + participant Core as banksy-core + participant DB as Postgres + participant Sidecar as banksy-mural-api + participant API as Mural API + + Client->>Core: MCP request (session cookie) + Core->>Core: Better Auth session check + Core->>DB: getAndRefreshTokens(userId) + Core->>Sidecar: MCP call + x-mural-access-token header + Sidecar->>Sidecar: AsyncLocalStorage.enterWith(tokens) + Sidecar->>API: Authorization Bearer {accessToken} + API-->>Sidecar: response + Sidecar-->>Core: MCP response + Core-->>Client: result +``` + +Key files (in banksy repo): + +- Auth mode selection: `packages/banksy-core/src/lib/auth/auth-mode.ts` +- Better Auth provider: `packages/banksy-core/src/lib/auth/provider.ts` +- Mural OAuth plugin: `packages/banksy-core/src/lib/auth/mural-oauth.ts` +- SSO proxy plugin: `packages/banksy-core/src/lib/auth/sso-google-oauth.ts` +- Token refresh: `packages/banksy-core/src/lib/mural-session/token-refresh.ts` +- Token-aware transport: `packages/banksy-mural-api/src/token-aware-transport.ts` +- Per-request auth: 
`packages/banksy-mural-api/src/per-request-auth.ts` + +## What FastMCP Provides + +FastMCP (Python) has mature, built-in auth that maps well to banksy's needs: + +- **OAuthProxy** -- Bridges non-DCR providers (Google, custom Mural OAuth) to MCP's DCR-compliant flow. Built-in `GitHubProvider`, `GoogleProvider`. +- **Full OAuthProvider** -- Abstract class for self-contained OAuth server (advanced, likely not needed). +- **TokenVerifier / JWTVerifier** -- Validates tokens from upstream; supports JWKS, HMAC, static keys, introspection. +- **`get_access_token()`** -- Tools can access the current user's token at runtime -- key for making Mural API calls. +- **Authorization system** -- Scope-based, tag-based, and custom auth checks on tools/resources. +- **DI / Context** -- Session-scoped state, dependency injection -- replaces AsyncLocalStorage pattern. +- **Token factory** -- OAuthProxy issues its own JWTs, stores upstream tokens encrypted, handles refresh. + +## Proposed Direction: OAuthProxy with Custom Mural Provider + +### Mural-OAuth Mode (primary path) + +FastMCP's `OAuthProxy` is a near-direct replacement for banksy's mural-oauth flow: + +```python +from fastmcp import FastMCP +from fastmcp.server.auth import OAuthProxy +from fastmcp.server.auth.providers.jwt import JWTVerifier + +auth = OAuthProxy( + upstream_authorization_endpoint=MURAL_AUTHORIZE_URL, + upstream_token_endpoint=MURAL_TOKEN_URL, + upstream_client_id=MURAL_CLIENT_ID, + upstream_client_secret=MURAL_CLIENT_SECRET, + token_verifier=..., # See risk #1 below + base_url="https://banksy.example.com", +) + +mcp = FastMCP(name="Banksy MCP", auth=auth) +``` + +The OAuthProxy handles: DCR emulation, PKCE, callback forwarding, token exchange, refresh token mapping, encrypted token storage, and consent screens. + +Inside tools, `get_access_token()` returns the FastMCP JWT. 
The **upstream Mural token** is stored encrypted by the proxy and used for refresh -- but accessing it directly from within a tool to call the Mural API is a key open question (see risk #2). + +### SSO-Proxy Mode + +The SSO proxy pattern (Google OAuth via redirect-rewriting proxy) is architecturally different from what `OAuthProxy` does. Two possible approaches: + +- **Option A: OAuthProxy pointed at Google** -- Use FastMCP's built-in `GoogleProvider` (or raw `OAuthProxy` with Google endpoints). This eliminates the SSO proxy entirely, but means the MCP server needs a Google OAuth app registration directly. +- **Option B: OAuthProxy pointed at the SSO proxy** -- Treat the SSO proxy as the upstream OAuth provider. The proxy already has Google credentials; the `OAuthProxy` would use the SSO proxy's authorize/token endpoints. This preserves the existing SSO proxy infrastructure. + +Either way, the SSO-proxy mode faces an additional problem: after Google auth, banksy currently does a **Session Activation** step to obtain Mural API tokens. This secondary token acquisition has no equivalent in OAuthProxy and would need custom logic. + +## Risks and Open Questions + +### Risk 1: Token Verification for Mural OAuth + +FastMCP's `OAuthProxy` requires a `TokenVerifier` to validate upstream tokens. Mural's OAuth tokens may be **opaque** (not JWTs), meaning: + +- `JWTVerifier` won't work +- `IntrospectionTokenVerifier` requires a Mural introspection endpoint (does one exist?) +- May need `DebugTokenVerifier` or custom verifier that validates against Mural's `/users/me` endpoint + +**Action needed:** Determine Mural OAuth token format (JWT vs opaque) and available validation endpoints. + +### Risk 2: Upstream Token Access from Tools + +Banksy's tools need the **Mural access token** (not the FastMCP JWT) to call the Mural API. FastMCP's `get_access_token()` returns the FastMCP-issued JWT, not the upstream provider token. The upstream token is stored encrypted by the proxy. 
+ +The OAuthProxy documentation mentions that "the upstream token is available in your tool functions via `get_access_token()` or the `CurrentAccessToken` dependency" for token exchange flows -- but it's unclear whether this refers to the upstream token or the FastMCP JWT. This needs verification. + +**If upstream token is NOT directly accessible:** We'd need to either: + +1. Subclass `OAuthProxy` to expose upstream tokens through a custom mechanism +2. Look up the upstream token from storage using the JTI from the FastMCP JWT +3. Implement the Full `OAuthProvider` pattern to have full control over token issuance + +**Action needed:** Build a minimal spike to confirm whether `get_access_token()` in an OAuthProxy setup returns the upstream Mural token or the FastMCP JWT, and whether there's a supported path to retrieve the upstream token. + +### Risk 3: SSO Proxy + Session Activation Flow + +The sso-proxy mode involves a two-step auth: + +1. Google OAuth (via SSO proxy) to establish identity +2. Session Activation to obtain Mural API tokens + +OAuthProxy handles step 1 naturally, but step 2 is custom business logic that runs after the OAuth flow completes. This means: + +- The tool server would authenticate the user via Google, but still wouldn't have Mural API tokens +- A post-auth hook or custom middleware would need to perform Session Activation and store the resulting tokens + +**Action needed:** Determine if the sso-proxy mode is still a requirement for the FastMCP rewrite, or if mural-oauth alone is sufficient. + +### Risk 4: Token Refresh Lifecycle + +Banksy's current refresh logic (`getAndRefreshTokens` with 60s buffer) runs proactively before each MCP tool call. FastMCP's OAuthProxy handles refresh at the transport layer when a client presents an expired FastMCP JWT -- the proxy refreshes the upstream token and issues a new FastMCP JWT. 
+ +This is a different model: + +- **Banksy today:** Server-side proactive refresh before each API call +- **FastMCP:** Client-initiated refresh via standard OAuth refresh_token grant + +The risk is timing: if the upstream Mural token expires between the client's last refresh and a tool execution, the Mural API call would fail. FastMCP's token expiry alignment (FastMCP token lifetime matches upstream lifetime) mitigates this, but the exact behavior under race conditions needs testing. + +### Risk 5: Eliminating the Sidecar Architecture + +Banksy currently runs tools in a separate sidecar process (`banksy-mural-api` / `banksy-public-api`) with tokens passed via HTTP headers. FastMCP Python would run tools in-process. This simplifies the architecture (no AsyncLocalStorage, no header-based token passing) but means: + +- All tools and auth logic share one Python process +- Token context is available via `get_access_token()` or DI instead of transport headers +- The PerRequestAuthProvider pattern is unnecessary + +This is a **positive simplification** but should be validated that it doesn't introduce concurrency issues under load (Python's asyncio vs Node.js event loop). + +### Risk 6: No Python Better Auth Equivalent + +Better Auth (user/session management, DB-backed sessions, MCP plugin) has no Python equivalent. FastMCP's OAuthProxy handles the session aspects that Better Auth covered, but: + +- User management (create/find users in DB) would need custom implementation or a library like `authlib` +- Session persistence uses FastMCP's `client_storage` backends (memory, disk, Redis) instead of Postgres tables +- The Better Auth MCP plugin's session-checking middleware is replaced by FastMCP's built-in transport-level auth + +This is acceptable if we're okay with FastMCP managing sessions its own way rather than replicating Better Auth's exact behavior. + +## Recommended Assessment Steps + +1. 
**Spike: OAuthProxy + Mural OAuth** -- Build a minimal FastMCP server with OAuthProxy pointed at Mural's OAuth endpoints. Confirm token format, verify that tools can obtain usable Mural API tokens, and test the refresh flow. +2. **Determine SSO-proxy requirement** -- Clarify whether the FastMCP rewrite needs to support the sso-proxy mode or if mural-oauth is sufficient; if mural-oauth alone suffices, the auth strategy simplifies significantly. +3. **Spike: Upstream token retrieval** -- If the spike in step 1 confirms that `get_access_token()` returns the FastMCP JWT (not the upstream token), investigate the supported path to retrieve the upstream Mural token from within tool functions. +4. **Evaluate token storage** -- Decide on FastMCP's `client_storage` backend for production (Redis is recommended). Compare with banksy's current Postgres-based token storage. +5. **Document the delta** -- Produce a matrix of current banksy auth behaviors vs FastMCP equivalents, flagging any behaviors that require custom code beyond FastMCP's built-in capabilities. diff --git a/fastmcp-migration/research-fastmcp-project-structure.md b/fastmcp-migration/research-fastmcp-project-structure.md new file mode 100644 index 0000000..a15605c --- /dev/null +++ b/fastmcp-migration/research-fastmcp-project-structure.md @@ -0,0 +1,624 @@ +# FastMCP Project Structure Research + +> Research date: 2026-03-11 +> FastMCP version: v3.1.0 (latest as of research date) +> Sources: gofastmcp.com docs, GitHub PrefectHQ/fastmcp, community projects, uv docs + +--- + +## 1.
FastMCP Project Layout Conventions + +### Official Repo Structure (PrefectHQ/fastmcp) + +FastMCP itself uses a `src/` layout with mirrored test directories: + +``` +fastmcp/ +├── pyproject.toml +├── src/ +│ └── fastmcp/ +│ ├── server/ +│ │ ├── auth.py +│ │ ├── context.py +│ │ ├── lifespan.py +│ │ ├── openapi.py # from_openapi() lives here +│ │ ├── openapi_server.py # deprecated, use from_openapi() +│ │ └── providers/ +│ ├── client/ +│ │ └── transports/ +│ ├── dependencies.py # CurrentContext, Depends, etc. +│ ├── utilities/ +│ │ ├── tests.py # Test utilities (run_server_async, etc.) +│ │ └── lifespan.py +│ └── ... +├── tests/ +│ ├── server/ +│ │ ├── test_auth.py +│ │ └── ... +│ ├── client/ +│ └── conftest.py +└── examples/ +``` + +### Recommended Application Layout + +Based on official docs, examples, and community conventions, a FastMCP server project should follow: + +#### Minimal (single-file server) +``` +my-mcp-server/ +├── pyproject.toml +├── fastmcp.json # Optional: declarative config +├── server.py # Server entry point with `mcp = FastMCP("Name")` +└── README.md +``` + +#### Standard (multi-module server) +``` +my-mcp-server/ +├── pyproject.toml +├── fastmcp.json +├── src/ +│ └── my_mcp_server/ +│ ├── __init__.py +│ ├── server.py # FastMCP instance + run() +│ ├── tools/ # Tool definitions grouped by domain +│ │ ├── __init__.py +│ │ ├── users.py +│ │ └── projects.py +│ ├── resources/ # Resource definitions +│ │ └── __init__.py +│ ├── prompts/ # Prompt definitions +│ │ └── __init__.py +│ ├── config.py # Application config (pydantic-settings) +│ └── deps.py # Custom Depends() factories +├── tests/ +│ ├── conftest.py # Shared fixtures +│ ├── test_tools.py +│ └── test_resources.py +└── README.md +``` + +#### Production (from_openapi + custom tools) +``` +my-mcp-server/ +├── pyproject.toml +├── fastmcp.json +├── src/ +│ └── my_mcp_server/ +│ ├── __init__.py +│ ├── server.py # Composes from_openapi + custom tools +│ ├── openapi/ +│ │ ├── __init__.py +│ │ ├── 
spec.py # Load/fetch OpenAPI spec +│ │ └── routes.py # RouteMap customizations +│ ├── tools/ +│ │ └── custom.py # Hand-crafted tools beyond API +│ ├── config.py +│ └── deps.py +├── tests/ +│ ├── conftest.py +│ ├── test_openapi_tools.py +│ └── test_custom_tools.py +└── README.md +``` + +### Key Conventions + +- **Entry point naming**: FastMCP auto-discovers `mcp`, `server`, or `app` variable names +- **Decorators**: `@mcp.tool`, `@mcp.resource`, `@mcp.prompt` for registration +- **fastmcp.json**: Canonical config file for deployment (source, environment, deployment) +- **Build system**: Hatchling is common; `uv_build` for workspace setups + +--- + +## 2. Testing Patterns + +### Framework: pytest + pytest-asyncio + +FastMCP assumes **pytest** with **pytest-asyncio**. This is the only officially supported testing approach. + +#### Required dev dependencies +```toml +[dependency-groups] +dev = [ + "pytest>=8.0", + "pytest-asyncio>=0.24", + "inline-snapshot", # Optional: snapshot testing + "dirty-equals", # Optional: flexible assertions +] +``` + +#### Required pytest config +```toml +[tool.pytest.ini_options] +asyncio_mode = "auto" # Eliminates @pytest.mark.asyncio on every test +``` + +### Core Testing Pattern: In-Memory Client + +The primary testing approach passes the FastMCP server directly to a Client, running the full MCP protocol in-process without network overhead: + +```python +import pytest +from fastmcp import FastMCP, Client + +@pytest.fixture +def server(): + mcp = FastMCP("TestServer") + + @mcp.tool + def add(a: int, b: int) -> int: + return a + b + + return mcp + +async def test_tool_execution(server): + async with Client(server) as client: + result = await client.call_tool("add", {"a": 1, "b": 2}) + assert result.data == 3 +``` + +**Important**: Do NOT open Client instances in fixtures — create the server in the fixture, open the client in the test body. This avoids event loop issues. 
+ +### Built-in Test Utilities (`fastmcp.utilities.tests`) + +| Utility | Purpose | When to use | +|---------|---------|-------------| +| `run_server_async(server)` | Starts server as asyncio task, returns URL | In-process network transport testing (preferred) | +| `run_server_in_process(fn)` | Starts server in subprocess, returns URL | STDIO transport or full process isolation testing | +| `temporary_settings(**kw)` | Context manager to override FastMCP settings | Testing config variations | +| `HeadlessOAuth` | Simulates OAuth flow without browser | Testing auth flows programmatically | + +### Network Transport Testing + +```python +from fastmcp.utilities.tests import run_server_async +from fastmcp.client.transports import StreamableHttpTransport + +@pytest.fixture +async def http_url(server): + async with run_server_async(server) as url: + yield url + +async def test_http_transport(http_url): + async with Client(transport=StreamableHttpTransport(http_url)) as client: + result = await client.ping() + assert result is True +``` + +### Test Markers + +```python +@pytest.mark.integration # Requires external services +@pytest.mark.client_process # Spawns subprocesses +``` + +### Best Practices from FastMCP docs + +1. **Single behavior per test** — one assertion focus per test +2. **Self-contained setup** — every test creates its own server/state +3. **Mirror `src/` in `tests/`** — `src/fastmcp/server/auth.py` → `tests/server/test_auth.py` +4. **Tests should complete in <1 second** unless marked integration +5. **Use `inline-snapshot`** for complex data structures (schemas, API responses) +6. **Use `dirty-equals`** for flexible assertions on dynamic values +7. **Mock external deps** with standard `unittest.mock.AsyncMock` + +--- + +## 3. 
Configuration Management + +### FastMCP's Own Config: `fastmcp.json` + +FastMCP provides a declarative config file for **deployment concerns** (not application config): + +```json +{ + "$schema": "https://gofastmcp.com/public/schemas/fastmcp.json/v1.json", + "source": { + "path": "src/my_server/server.py", + "entrypoint": "mcp" + }, + "environment": { + "type": "uv", + "python": ">=3.10", + "dependencies": ["httpx", "pydantic-settings"], + "editable": ["."] + }, + "deployment": { + "transport": "http", + "host": "0.0.0.0", + "port": 8000, + "log_level": "INFO", + "env": { + "API_BASE_URL": "https://api.${ENVIRONMENT}.example.com" + } + } +} +``` + +Supports `${VAR_NAME}` interpolation for env vars in the deployment section. + +### Application Config: pydantic-settings + +FastMCP internally uses Pydantic Settings for its own settings. For **application-level config** (API keys, base URLs, feature flags), the standard pattern is pydantic-settings: + +```python +from pydantic_settings import BaseSettings + +class Settings(BaseSettings): + api_base_url: str = "https://api.example.com" + api_token: str = "" + debug: bool = False + + model_config = {"env_prefix": "MY_SERVER_"} + +settings = Settings() +``` + +Priority order (FastMCP convention): +1. Explicit constructor params +2. Environment variables (`FASTMCP_*` prefix for FastMCP's own settings) +3. `.env` file values +4. Defaults + +### Injecting Config via DI + +```python +from fastmcp import FastMCP +from fastmcp.dependencies import Depends + +def get_settings() -> Settings: + return Settings() + +@mcp.tool +async def fetch_data( + query: str, + settings: Settings = Depends(get_settings), +) -> str: + # settings is injected, excluded from MCP schema + ... +``` + +--- + +## 4. 
FastMCP `from_openapi()` Details + +### Requirements + +- **httpx**: Required — `from_openapi()` expects an `httpx.AsyncClient` +- **OpenAPI 3.0.0+**: Supported spec versions +- No swagger 2.0 support + +### Basic Setup + +```python +import httpx +from fastmcp import FastMCP + +client = httpx.AsyncClient( + base_url="https://api.example.com", + headers={"Authorization": "Bearer TOKEN"} +) + +spec = httpx.get("https://api.example.com/openapi.json").json() + +mcp = FastMCP.from_openapi( + openapi_spec=spec, + client=client, + name="My API", + timeout=30.0, +) +``` + +### Route Mapping + +Default: **all endpoints become Tools** (for maximum LLM client compatibility). + +Custom mapping with `RouteMap`: + +```python +from fastmcp.server.openapi import RouteMap, MCPType + +mcp = FastMCP.from_openapi( + openapi_spec=spec, + client=client, + route_maps=[ + RouteMap(methods=["GET"], pattern=r".*\{.*\}.*", mcp_type=MCPType.RESOURCE_TEMPLATE), + RouteMap(methods=["GET"], mcp_type=MCPType.RESOURCE), + RouteMap(pattern=r"^/admin/.*", mcp_type=MCPType.EXCLUDE), + RouteMap(tags={"internal"}, mcp_type=MCPType.EXCLUDE), + ], +) +``` + +### Advanced Customization + +- **`mcp_names`**: Dict mapping `operationId` → custom MCP tool name +- **`mcp_component_fn`**: Callback to modify components in-place after creation +- **`route_map_fn`**: Advanced callable for per-route type decisions +- **`tags`**: Global tags applied to all components + +### Parameter Handling + +- **Query params**: Only non-empty values sent (None/empty filtered out) +- **Path params**: Required, raises on missing +- **Array params**: Supports `explode` parameter per OpenAPI spec +- **Headers**: Auto-converted to strings + +### Important Caveat + +From the docs: *"LLMs achieve significantly better performance with well-designed, curated MCP servers than with auto-converted OpenAPI servers."* The recommendation is to use `from_openapi()` for bootstrapping/prototyping, then curate tools for production. + +--- + +## 5. 
Dependency Injection System + +### Powered by Docket + uncalled-for + +FastMCP's DI is built on [Docket](https://github.com/chrisguidry/docket) and [uncalled-for](https://github.com/chrisguidry/uncalled-for). Core DI works without installing Docket. Background tasks need `fastmcp[tasks]`. + +### Built-in Dependencies + +| Dependency | Injection | Function API | Notes | +|---|---|---|---| +| `Context` | Type annotation `ctx: Context` | `get_context()` | Logging, progress, resource access | +| `FastMCP` | `CurrentFastMCP()` | `get_server()` | Server introspection | +| `Request` | `CurrentRequest()` | `get_http_request()` | HTTP-only, raises on STDIO | +| `Headers` | `CurrentHeaders()` | `get_http_headers()` | Safe fallback (empty dict on STDIO) | +| `AccessToken` | `CurrentAccessToken()` | `get_access_token()` | Auth required, raises if none | +| `TokenClaim` | `TokenClaim("oid")` | — | Extract specific JWT claim | + +### Custom Dependencies with `Depends()` + +```python +from fastmcp.dependencies import Depends + +def get_db_connection(): + return {"connection": "active"} + +@mcp.tool +async def query(sql: str, db=Depends(get_db_connection)) -> list: + # db is injected, excluded from MCP schema + ... 
+``` + +Key behaviors: +- **Per-request caching**: Same dependency resolved once per request, shared across consumers +- **Nested dependencies**: Dependencies can depend on other dependencies +- **Resource cleanup**: Use `@asynccontextmanager` for dependencies needing teardown +- **Schema exclusion**: DI params are automatically hidden from MCP tool schema + +### Lifespan Pattern (Server-level DI) + +For resources that live for the server's lifetime (DB pools, HTTP clients): + +```python +from fastmcp import FastMCP, Context +from fastmcp.server.lifespan import lifespan + +@lifespan +async def app_lifespan(server): + db = await create_db_pool() + try: + yield {"db": db} + finally: + await db.close() + +mcp = FastMCP("MyServer", lifespan=app_lifespan) + +@mcp.tool +def query_users(ctx: Context) -> list: + db = ctx.lifespan_context["db"] + return db.fetch_all("SELECT * FROM users") +``` + +Lifespans can be composed with `|` operator: `lifespan_a | lifespan_b`. + +--- + +## 6. Monorepo / Multi-Package Patterns + +### No Official FastMCP Monorepo Pattern + +There is no official FastMCP guidance on monorepo structure. Community projects are typically single-package. However, `fastmcp.json` does support `editable` paths for workspace setups: + +```json +{ + "environment": { + "editable": [".", "../shared-lib", "/path/to/another-package"] + } +} +``` + +### Python MCP Server Monorepo Approaches + +For a project with multiple MCP servers sharing common code, the recommended approach is a **uv workspace** (see section 7). 
+ +Typical layout: +``` +mcp-servers/ +├── pyproject.toml # Workspace root +├── uv.lock # Single lockfile +├── packages/ +│ ├── common/ # Shared utilities, auth, config +│ │ ├── pyproject.toml +│ │ └── src/common/ +│ ├── server-api/ # API-wrapping MCP server +│ │ ├── pyproject.toml +│ │ ├── fastmcp.json +│ │ └── src/server_api/ +│ └── server-custom/ # Custom tools MCP server +│ ├── pyproject.toml +│ ├── fastmcp.json +│ └── src/server_custom/ +└── tests/ # Or tests within each package +``` + +### Reference: JasperHG90/uv-monorepo + +A practical example showing: +- `shared/` package for common utils +- `src/` package for core application +- Cross-package dependencies via `tool.uv.sources` +- Single lockfile management + +--- + +## 7. uv Workspace Patterns + +### Root `pyproject.toml` + +```toml +[project] +name = "my-mcp-workspace" +version = "0.1.0" +requires-python = ">=3.12" + +[build-system] +requires = ["uv_build>=0.10.9,<0.11.0"] +build-backend = "uv_build" + +[tool.uv.workspace] +members = ["packages/*"] +exclude = ["packages/experimental"] + +# Workspace-wide source overrides +[tool.uv.sources] +common = { workspace = true } +``` + +### Member `pyproject.toml` + +```toml +[project] +name = "server-api" +version = "0.1.0" +requires-python = ">=3.12" +dependencies = [ + "fastmcp>=3.0", + "httpx>=0.27", + "common", # Workspace member dependency +] + +[tool.uv.sources] +common = { workspace = true } + +[build-system] +requires = ["hatchling"] +build-backend = "hatchling.build" + +[tool.hatch.build.targets.wheel] +packages = ["src/server_api"] +``` + +### Key uv Workspace Features + +- **Single lockfile**: `uv lock` resolves all workspace members together +- **Editable installs**: Workspace member dependencies are automatically editable +- **Shared sources**: `tool.uv.sources` in root apply to all members (unless overridden) +- **Targeted execution**: `uv run --package server-api` runs commands in specific member +- **Single `requires-python`**: Intersection of 
all members (can be limiting) + +### Common Commands + +```bash +uv lock # Lock all workspace dependencies +uv sync # Install all deps (single venv by default) +uv run --package server-api pytest # Run tests for specific package +uv run --package server-api fastmcp run # Run specific server +uv add httpx --package server-api # Add dep to specific member +``` + +### Workspace Layout Example + +``` +my-mcp-workspace/ +├── pyproject.toml # Workspace root +├── uv.lock +├── packages/ +│ ├── common/ +│ │ ├── pyproject.toml +│ │ └── src/ +│ │ └── common/ +│ │ ├── __init__.py +│ │ ├── auth.py # Shared auth helpers +│ │ ├── config.py # Shared pydantic-settings +│ │ └── types.py # Shared types +│ ├── server-mural-api/ +│ │ ├── pyproject.toml +│ │ ├── fastmcp.json +│ │ └── src/ +│ │ └── server_mural_api/ +│ │ ├── __init__.py +│ │ ├── server.py +│ │ ├── routes.py # RouteMap config +│ │ └── tools/ +│ └── server-tools/ +│ ├── pyproject.toml +│ ├── fastmcp.json +│ └── src/ +│ └── server_tools/ +│ ├── __init__.py +│ ├── server.py +│ └── tools/ +└── tests/ + ├── conftest.py # Shared fixtures + ├── test_common/ + ├── test_server_mural_api/ + └── test_server_tools/ +``` + +--- + +## 8. Gotchas & Best Practices + +### FastMCP-Specific + +1. **Don't open Client in fixtures** — Create the server in a fixture, open the client in test body to avoid event loop issues +2. **`from_openapi()` is for bootstrapping** — Curate tools for production; auto-converted APIs have worse LLM performance +3. **`fastmcp.json` is for deployment, not app config** — Use pydantic-settings for app-level configuration +4. **Entry point naming matters** — FastMCP auto-discovers `mcp`, `server`, or `app` variable names in modules +5. **DI params are schema-invisible** — Any param using `Depends()`, `Context` annotation, or `Current*()` is excluded from the MCP tool schema clients see +6. 
**Lifespan runs once** — Unlike per-session handlers, lifespans execute once at server start/stop regardless of client connections + +### uv Workspace-Specific + +1. **Single `requires-python`** — The workspace takes the intersection of all members; can't have member-specific Python version ranges +2. **No dependency isolation** — Python can't prevent packages from importing deps declared by other workspace members +3. **Build backend matters** — Use `uv_build` or `hatchling`; the workspace root needs a valid build system +4. **Workspace sources are inherited** — `tool.uv.sources` in root apply to all members; member-level sources override completely (not merge) + +### Testing Best Practices + +1. **Always use `asyncio_mode = "auto"`** in pytest config +2. **In-memory transport is the default** — Only use network transports when testing transports themselves +3. **Tests should be <1 second** unless marked `@pytest.mark.integration` +4. **Use `temporary_settings()`** to test config variations without side effects +5. **`inline-snapshot`** is the recommended way to test complex schemas + +### Configuration Best Practices + +1. **Layer configs**: `fastmcp.json` (deployment) + pydantic-settings (app) + `.env` (secrets) +2. **Multi-environment**: Use `dev.fastmcp.json`, `prod.fastmcp.json` for different environments +3. **Env var interpolation**: `${VAR_NAME}` in `fastmcp.json` deployment.env section +4. **CLI overrides**: Any `fastmcp.json` value can be overridden via CLI args + +--- + +## 9. 
Recommended Stack for New FastMCP Project + +| Concern | Tool | Notes | +|---------|------|-------| +| Package manager | uv | Workspace support, fast resolution | +| Build system | hatchling or uv_build | hatchling for single pkg, uv_build for workspaces | +| Framework | FastMCP >= 3.0 | `from_openapi()`, DI, lifespans | +| HTTP client | httpx | Required by `from_openapi()`, async support | +| App config | pydantic-settings | Env var loading, .env support, type-safe | +| Deployment config | fastmcp.json | Declarative, portable | +| Testing | pytest + pytest-asyncio | In-memory client testing | +| Snapshot testing | inline-snapshot | Complex data structure assertions | +| Flexible assertions | dirty-equals | Dynamic/non-deterministic values | +| Monorepo | uv workspace | Single lockfile, cross-package deps | +| Linting | ruff | Fast, comprehensive | +| Type checking | pyright or mypy | FastMCP uses standard typing | diff --git a/fastmcp-migration/research-fastmcp-structuredContent-support.md b/fastmcp-migration/research-fastmcp-structuredContent-support.md new file mode 100644 index 0000000..e69dc36 --- /dev/null +++ b/fastmcp-migration/research-fastmcp-structuredContent-support.md @@ -0,0 +1,163 @@ +# Research: fastmcp `structuredContent` Support + +**Date**: 2026-03-09 +**Status**: Complete +**Prompt**: `.cursor/prompts/research-fastmcp-structuredContent-support.md` + +## Verdict + +**fastmcp v3.34.0 does NOT support `structuredContent` on the success path.** The `outputSchema` feature added in v3.34.0 is incomplete -- it only handles the tool listing (definition) side, not the tool response side. + +**Why**: `structuredContent` and `outputSchema` are **not in the 2025-03-26 MCP spec** that fastmcp was likely targeting. They were added in the **2025-06-18 revision** (major change #8: "Add support for structured tool output", PR [#371](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/371)). 
So fastmcp's omission is not a bug or oversight -- it's a newer spec feature that hasn't been fully adopted yet. The v3.34.0 `outputSchema` addition was a partial step toward 2025-06-18 compliance, but the response-path `structuredContent` handling was not included. + +The spec requirements (from the draft, which supersedes 2025-06-18) are: +- "Servers **MUST** provide structured results that conform to this schema" (when `outputSchema` is present) +- "Clients **SHOULD** validate structured results against this schema" +- "For backwards compatibility, a tool that returns structured content **SHOULD** also return the serialized JSON in a TextContent block" + +xmcp already supports this because it works directly with the `@modelcontextprotocol/sdk` TypeScript types, which track the draft spec and include `structuredContent` on `CallToolResult`. + +--- + +## Question 1: Is `structuredContent` supported in a newer version? + +**No.** v3.34.0 is the latest release (March 6, 2026). It added `outputSchema` to tool definitions via PR [#247](https://github.com/punkpeye/fastmcp/pull/247), but this PR only touched: + +- The `Tool` type (added `outputSchema?: Params`) +- The `listTools` handler (converts `outputSchema` to JSON Schema in listing response) + +It did **not** change the execute/response handling path at all (+91/-0 lines, only in type definitions and listing). + +## Question 2: Does `outputSchema` exist, and what does it do? + +**Yes, `outputSchema` exists but is half-implemented.** You can declare it on `addTool()`: + +```typescript +server.addTool({ + name: "get-weather", + parameters: z.object({ city: z.string() }), + outputSchema: z.object({ + humidity: z.number(), + temperature: z.number(), + }), + execute: async (args) => { + // NOTE: returns JSON.stringify, NOT structuredContent + return JSON.stringify({ humidity: 65, temperature: 72 }); + }, +}); +``` + +The `outputSchema` appears in `listTools` responses as JSON Schema. 
But the `execute` handler still returns a plain string -- there is no mechanism to return `structuredContent`. + +## Question 3: What's the exact blocking code path? + +The blocker is `ContentResultZodSchema` with `.strict()` at [FastMCP.ts:388-393](https://github.com/punkpeye/fastmcp/blob/v3.34.0/src/FastMCP.ts): + +```typescript +type ContentResult = { + content: Content[]; + isError?: boolean; +}; + +const ContentResultZodSchema = z + .object({ + content: ContentZodSchema.array(), + isError: z.boolean().optional(), + }) + .strict() satisfies z.ZodType; +``` + +When `execute` returns an object (not string/Content), it hits this validation at line ~2128: + +```typescript +} else { + result = ContentResultZodSchema.parse(maybeStringResult); +} +``` + +`.strict()` rejects any property not in the schema. Returning `{ content: [...], structuredContent: {...} }` throws a ZodError. + +There is **no separate code path** for tools with `outputSchema`. No middleware. No bypass. The underlying `@modelcontextprotocol/sdk` never sees `structuredContent` because fastmcp's validation strips it before passing through. + +The **error path** (line ~2131-2136) does use `structuredContent` via `UserError.extras`, but this bypasses the schema (it's in the catch block) and marks the response as `isError: true`. + +## Question 4: Has this been raised? + +- [Issue #207](https://github.com/punkpeye/fastmcp/issues/207) requested `outputSchema` and was closed by PR #247, but the issue only asked for the listing side +- No open issue specifically addresses `structuredContent` in the success response path +- The VS Code issue [#290063](https://github.com/microsoft/vscode/issues/290063) and MCP spec issue [#1624](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/1624) show the ecosystem is actively debating `structuredContent` semantics + +## Question 6: Why doesn't the spec require fastmcp to implement this? 
+ +**`structuredContent`/`outputSchema` was not in the MCP spec revision (2025-03-26) that fastmcp was targeting.** It was introduced in the **2025-06-18 revision** as major change #8 via [PR #371](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/371). The spec revision history: + +- **2025-03-26** -- No `structuredContent`, no `outputSchema`. Tool results only have `content` and `isError`. +- **2025-06-18** -- Adds `structuredContent` + `outputSchema` (PR #371). Also adds resource links in tool results, elicitation, and security improvements. +- **2025-11-25** (latest stable) -- Does not change `structuredContent` behavior; adds tasks, OAuth improvements, tool naming guidance. +- **Draft** -- Same `structuredContent` semantics as 2025-06-18. Adds the normative language: servers MUST conform to `outputSchema`, clients SHOULD validate, tools SHOULD also return serialized JSON in a TextContent block for backwards compatibility. + +The `@modelcontextprotocol/sdk` TypeScript package tracks the draft spec and already includes `structuredContent` on `CallToolResult`. This is why xmcp (which uses the SDK types directly) supports it today. fastmcp defines its own `ContentResultZodSchema` independently, which is still aligned with the 2025-03-26 shape. + +fastmcp is not non-compliant -- it simply hasn't caught up to the 2025-06-18 spec revision on the response side. The `outputSchema` addition in v3.34.0 shows intent to adopt the feature; the response handling likely just hasn't been prioritized yet. + +## Question 5: How does xmcp differ? 
+ +xmcp's `transformToolHandler` in `packages/xmcp/src/runtime/utils/transformers/tool.ts` handles `structuredContent` comprehensively: + +- **Lines 168-197**: Accepts `structuredContent` alone, `content` alone, or both together +- **Lines 220-244**: Validates `structuredContent` is a plain object (not array/primitive) +- **Line 246**: Passes the full response through to the MCP SDK without stripping fields +- Uses `CallToolResult` type from `@modelcontextprotocol/sdk/types` directly, which includes `structuredContent` + +The architecture is fundamentally different: xmcp validates and passes through, while fastmcp validates through a restrictive Zod schema that strips unknown fields. + +--- + +## Impact on the `get-upload-url` Use Case + +The banksy `get-upload-url` tool generates a signed Azure SAS URL for blob uploads. The multi-step image upload flow is: + +1. LLM calls `get-upload-url` with `{ muralId, filename }` -- returns `{ name, url }` +2. LLM uses the signed URL to upload the image binary (via another tool call) +3. LLM calls `update-widgets` to create a `TitledImageWidget` with `imageUrl` set to the blob name + +The signed URL is an Azure SAS URL -- long, opaque, with base64-encoded signature tokens (query params like `sig=`, `se=`, `sp=`). Exact character preservation is critical: a single wrong character yields a silent 403 from Azure. + +### `structuredContent` path (xmcp today) + +The `{ name, url }` response goes directly to the calling code as typed data. The client code extracts the URL programmatically and passes it to the next operation. The LLM never sees or handles the raw URL -- it is routed through code, not through the model's context window. Zero risk of corruption. + +### `JSON.stringify` text path (fastmcp workaround) + +The `{ name, url }` response is serialized as a text string in `content[0].text`. 
The LLM receives the full SAS URL in its context and must: + +- Parse the JSON from text (or recognize the structure) +- Carry the URL forward correctly into the next tool call argument +- Preserve every character of a ~200-character URL with base64 tokens + +**Reliability risk: moderate-to-low.** Modern LLMs are generally good at echoing URLs verbatim from tool results into subsequent tool calls. The primary risk is not hallucination of the URL itself, but rather: + +- **Context window noise**: The SAS URL adds ~200 tokens of opaque garbage to the LLM's context. It provides no useful information to the model -- it is purely machine-to-machine data. +- **Multi-step fragility**: If the conversation is long or the LLM's context is crowded, the URL must survive across multiple reasoning steps without being summarized, truncated, or dropped from working memory. +- **No validation**: Without `outputSchema`/`structuredContent`, there is no schema-level guarantee that the response contains the expected `{ name, url }` shape. A malformed API response would be silently passed as text, and the LLM would need to figure out the error from unstructured text. + +### Verdict for this use case + +For a tool where the response is purely **machine-to-machine data** (signed URLs, tokens, opaque identifiers) that the LLM has no reason to interpret -- only relay -- `structuredContent` is the right abstraction. Routing it through text content works in practice but is architecturally wrong: it forces data through a natural-language channel that exists for human-readable content. + +That said, this is unlikely to be a blocking reliability problem. LLMs handle URL relay well in practice. The bigger concern is the design principle: as banksy adds more tools that return programmatic data (not just upload URLs), the lack of `structuredContent` becomes a pattern problem rather than a one-off inconvenience. + +--- + +## Workarounds + +- **`JSON.stringify()` as text** -- Works today, simple. 
Clients must parse; no typed validation; loses `structuredContent` semantics. +- **`UserError` with `extras`** -- Uses `structuredContent` field. Marks response as `isError: true` (semantically wrong). +- **Stay on xmcp** -- Full support today. Blocks fastmcp migration. + +## Recommendation + +**Do not rely on fastmcp supporting `structuredContent` by migration time.** fastmcp hasn't caught up to the 2025-06-18 spec revision on the response side, and there's no open issue or PR targeting it. Options in order of preference: + +1. **Use `JSON.stringify` as the interim pattern** in banksy tools. Declare `outputSchema` for documentation purposes, return stringified JSON from `execute`. This is what fastmcp's own test does. When fastmcp eventually adopts the 2025-06-18 spec fully, migrate to `structuredContent` at that point. +2. **Keep xmcp for tools that need `structuredContent`** and migrate other tools to fastmcp first. Migrate the structured tools once fastmcp adds support. diff --git a/fastmcp-migration/research-mcp-server-frameworks.md b/fastmcp-migration/research-mcp-server-frameworks.md new file mode 100644 index 0000000..45fdba2 --- /dev/null +++ b/fastmcp-migration/research-mcp-server-frameworks.md @@ -0,0 +1,395 @@ +# MCP Server Framework Landscape — Research (Feb 2026) + +## Executive Summary + +The MCP server framework ecosystem has matured rapidly. Every major language now has at least one production-grade option. The landscape splits into two tiers: + +1. **Official SDKs** — maintained by Anthropic's `modelcontextprotocol` org. Low-level, flexible, "library" approach. +2. **Community frameworks** — opinionated layers on top of official SDKs (or standalone). Higher-level, "batteries-included" approach. + +Python and TypeScript are the most mature ecosystems by far. Go is solid but singular (one dominant library). Rust is the most fragmented with the most compile-time magic.
+ +--- + +## TypeScript + +### @modelcontextprotocol/sdk (Official) + +| Attribute | Value | +|---|---| +| **GitHub** | `modelcontextprotocol/typescript-sdk` | +| **Stars** | ~11k | +| **npm** | `@modelcontextprotocol/sdk` (v1.13.2+, 7k+ dependents) | +| **Tier** | Tier 1 (official) | +| **License** | MIT | + +**Design philosophy**: Library, not framework. Maximum flexibility and control. You wire everything yourself. + +**What it gives you**: +- Full MCP spec implementation (tools, resources, prompts, sampling) +- stdio + Streamable HTTP + SSE transports +- MCP client *and* server support +- TypeScript types for the full protocol + +**What it doesn't give you**: +- No file-based routing or auto-discovery +- No built-in auth +- No CLI scaffolding +- No hot reload / DX sugar +- No deployment helpers + +**Best for**: Library authors, framework builders, projects needing full protocol control, or when you need an MCP *client* (not just server). + +--- + +### xmcp + +| Attribute | Value | +|---|---| +| **Stars** | ~1.2k | +| **npm** | `xmcp` | +| **License** | MIT | + +**Design philosophy**: "Next.js for MCP." Convention-over-configuration, file-based routing, best DX. + +**What it gives you**: +- File-system routing — drop a file in `tools/`, it's a tool +- Three-export convention: `schema` (Zod), `metadata`, `default` handler +- Hot reload in dev +- Plugin system (auth0, better-auth, clerk, etc.) +- `npx create-xmcp-app` scaffolding +- Vercel / Cloudflare / Express deployment targets +- Middleware support +- Built on top of the official SDK + +**What it doesn't give you**: +- No built-in MCP client +- Smaller ecosystem/community than official SDK +- Plugin system is still growing + +**Opinionatedness**: High. Enforces file structure, export conventions, config file shape. + +**Best for**: Teams that want fast iteration and a familiar web-framework DX. Good for "I want an MCP server running in 5 minutes." + +**Note**: This is what **banksy** uses.
+ +--- + +### fastmcp (TypeScript — punkpeye) + +| Attribute | Value | +|---|---| +| **Stars** | ~1.4k | +| **npm** | `fastmcp` (v1.23.1) | +| **License** | MIT | + +**Design philosophy**: Session-centric, production-ready. Inspired by the Python FastMCP but for TypeScript. + +**What it gives you**: +- Simple tool/resource/prompt definition API +- Built-in authentication and session management +- Image and audio content handling +- Logging, error handling, progress notifications +- SSE with automatic pings +- CORS enabled by default +- Prompt argument auto-completion +- Sampling support +- Built on official SDK + Hono + Zod + +**What it doesn't give you**: +- No file-based routing (imperative registration) +- No CLI scaffolding +- No deployment helpers + +**Opinionatedness**: Medium. Provides conventions but doesn't enforce file structure. + +**Best for**: Production servers that need session management, auth, and streaming out of the box. + +--- + +### mcp-framework + +| Attribute | Value | +|---|---| +| **Stars** | ~880 | +| **npm** | `mcp-framework` (v0.2.18, ~78k weekly downloads) | +| **License** | MIT | + +**Design philosophy**: Rails-like scaffolding for MCP. CLI-first workflow. + +**What it gives you**: +- `mcp create my-server` / `mcp add tool <name>` CLI +- Directory-based auto-discovery +- Built-in auth: JWT, API keys, OAuth 2.1 +- Schema validation (Zod) +- stdio + SSE + HTTP Stream transports +- `mcp validate` for schema checking + +**What it doesn't give you**: +- Less flexible than raw SDK +- Smaller community +- More "framework lock-in" than the others + +**Opinionatedness**: High. Strong conventions for project structure and component shapes. + +**Best for**: Enterprise teams that want structure, conventions, and built-in auth/validation from day one.
+ +--- + +### Cloudflare Agents SDK + +| Attribute | Value | +|---|---| +| **Docs** | `developers.cloudflare.com/agents` | +| **Version** | v0.6.0 (Feb 2026) | +| **License** | Proprietary (Cloudflare platform) | + +**Design philosophy**: Stateful edge agents with MCP support as a feature (not the whole product). + +**What it gives you**: +- `McpAgent` class with Durable Object–backed state +- Built-in SQL database per agent +- WebSocket + SSE support +- Cloudflare Access / OAuth integration +- RPC transport (zero-overhead intra-Worker MCP calls) +- Global edge deployment with auto-scaling +- Hibernation and wake-up lifecycle + +**What it doesn't give you**: +- Cloudflare-only deployment (vendor lock-in) +- No stdio transport (meaningless on edge) +- More complex mental model (Durable Objects, Workers, bindings) + +**Opinionatedness**: Very high (it's a platform, not just a library). + +**Best for**: Teams already on Cloudflare who want globally distributed, stateful MCP servers. + +--- + +## Python + +### Official Python SDK (`mcp`) + +| Attribute | Value | +|---|---| +| **GitHub** | `modelcontextprotocol/python-sdk` | +| **PyPI** | `mcp` | +| **Tier** | Tier 1 (official) | +| **License** | MIT | + +**Design philosophy**: Two-layer SDK — low-level `Server` class for full control, high-level `FastMCP` class for quick starts. + +**Key detail**: FastMCP v1.0 was **merged into the official SDK** in late 2024. The official SDK's recommended entry point *is* the FastMCP API. This is unusual — most languages keep the official SDK minimal and leave frameworks to third parties. 
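The FastMCP API that the SDK adopted is decorator-first: wrap a plain function, and the tool's input schema is derived from its type hints while its description comes from the docstring. A toy, stdlib-only sketch of that mechanism (illustrative only; `ToyMCP` and `_JSON_TYPES` are invented names, not SDK code):

```python
import inspect
from typing import get_type_hints

# Illustrative subset of the Python-annotation -> JSON Schema type mapping.
_JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


class ToyMCP:
    """Stand-in for the decorator-registration pattern FastMCP popularized."""

    def __init__(self, name: str):
        self.name = name
        self.tools: dict[str, dict] = {}

    def tool(self, fn):
        """Register `fn` as a tool, deriving its schema from type hints."""
        hints = get_type_hints(fn)
        hints.pop("return", None)
        params = inspect.signature(fn).parameters
        self.tools[fn.__name__] = {
            "description": inspect.getdoc(fn) or "",
            "inputSchema": {
                "type": "object",
                "properties": {
                    arg: {"type": _JSON_TYPES.get(hint, "string")}
                    for arg, hint in hints.items()
                },
                # Parameters without defaults are required.
                "required": [
                    n for n, p in params.items()
                    if p.default is inspect.Parameter.empty
                ],
            },
            "fn": fn,
        }
        return fn


mcp = ToyMCP("demo")


@mcp.tool
def add(a: int, b: int = 0) -> int:
    """Add two integers."""
    return a + b
```

The real decorators (`@mcp.tool()`, `@mcp.resource()`, `@mcp.prompt()`) additionally handle Pydantic models and async handlers; the sketch only shows why a plain annotated function is enough to produce a listable tool.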
+ +**What it gives you**: +- `FastMCP` high-level API (decorators: `@mcp.tool()`, `@mcp.resource()`, `@mcp.prompt()`) +- Low-level `Server` API for advanced customization +- Automatic schema generation from type hints + docstrings +- Pydantic model support for complex inputs +- Lifespan management (startup/shutdown hooks) +- stdio + SSE + Streamable HTTP transports +- Async-first (`asyncio`) +- MCP client support + +**What it doesn't give you**: +- No CLI scaffolding +- No built-in auth (beyond what transports provide) +- No file-based routing + +--- + +### FastMCP (Standalone — jlowin) + +| Attribute | Value | +|---|---| +| **GitHub** | `jlowin/fastmcp` (~19.5k stars) | +| **PyPI** | `fastmcp` (v2.2.5) | +| **License** | MIT | + +**Design philosophy**: "The fast, Pythonic way to build MCP servers and clients." Batteries-included but still Pythonic. + +**Key relationship**: FastMCP v1 was donated to the official SDK. FastMCP v2+ is the standalone continuation with more features. Claims to power ~70% of all MCP servers across all languages. + +**What it gives you (beyond the official SDK)**: +- MCP client + server +- "Apps" — interactive UIs rendered in conversations +- More advanced composition patterns +- Richer middleware/plugin support +- Active standalone community +- 1M+ daily downloads + +**What it doesn't give you**: +- Two similar-but-different packages exist (`mcp` official vs `fastmcp` standalone) which can confuse dependency management +- No CLI scaffolding or project generators + +**Opinionatedness**: Medium. Decorators are opinionated; project structure is up to you. + +**Best for**: Python-first teams. The default choice for Python MCP servers. + +--- + +## Go + +### mcp-go (mark3labs) + +| Attribute | Value | +|---|---| +| **GitHub** | `mark3labs/mcp-go` (~4.3k stars) | +| **Docs** | `mcp-go.dev` | +| **License** | MIT | + +**Design philosophy**: "Fast, simple, and complete." The de facto standard Go MCP library. No real competitors. 
+ +**What it gives you**: +- Complete MCP spec implementation +- Typed tools with compile-time safety and auto-validation +- Middleware support +- Session management with per-session tool filtering +- Hooks for lifecycle management +- Icons for servers/tools/resources +- stdio + HTTP transports +- Minimal boilerplate + +**What it doesn't give you**: +- No CLI scaffolding +- No built-in auth (middleware-based approach) +- No file-based routing +- Smaller ecosystem than TS/Python (fewer plugins, examples) + +**Opinionatedness**: Low-medium. Provides the primitives, you compose them. Go-idiomatic. + +**Best for**: Go teams that want a single, well-maintained, complete library. No framework decision paralysis — there's really only one serious choice. + +--- + +## Rust + +### rmcp (Official) + +| Attribute | Value | +|---|---| +| **GitHub** | `modelcontextprotocol/rust-sdk` | +| **Crates.io** | `rmcp` (v0.16.0, 4.2M downloads) | +| **Tier** | Tier 1 (official) | +| **License** | MIT | + +**Design philosophy**: Feature-flag driven, composable. Lean core with opt-in capabilities. + +**What it gives you**: +- Procedural macros for tool generation (`#[tool]`) +- Extensive feature flags (server, client, transports, auth, schemars, tower integration) +- OAuth2 auth support +- Streamable HTTP server/client +- Async-first (Tokio) +- JSON Schema generation via `schemars` +- Tower service integration (middleware ecosystem) + +**What it doesn't give you**: +- Steep learning curve (feature flags, macros, async Rust) +- No CLI scaffolding +- No file-based routing +- Documentation is sparse compared to TS/Python + +**Opinionatedness**: Low. Very modular — you pick the features you need. + +**Best for**: Rust teams that want official support, maximum performance, and are comfortable with the Rust MCP learning curve. 
+ +--- + +### Turul MCP Framework + +| Attribute | Value | +|---|---| +| **GitHub** | `aussierobots/turul-mcp-framework` | +| **Crates.io** | `turul-mcp-server` (v0.2.1) | +| **License** | MIT | + +**Design philosophy**: Zero-configuration, batteries-included Rust MCP. The "framework" answer to rmcp's "library" approach. + +**What it gives you**: +- Four tool creation approaches (function macros → derive → builder → manual) +- Automatic method determination from types +- Compile-time schema generation +- Pluggable storage backends (InMemory, SQLite, PostgreSQL, DynamoDB) +- SSE streaming notifications +- UUID v7 session management +- Full MCP 2025-06-18 spec compliance +- Transport-agnostic architecture (serverless-ready) + +**What it doesn't give you**: +- Smaller community and fewer production deployments +- Less documentation than rmcp +- Newer, less battle-tested + +**Opinionatedness**: High for Rust. Automates away boilerplate via derive macros and convention. + +**Best for**: Rust teams that want a higher-level framework with storage backends and less ceremony. 
+
+---
+
+## Comparison Matrix
+
+| | Official TS SDK | xmcp | fastmcp (TS) | mcp-framework | FastMCP (Py) | mcp-go | rmcp (Rust) | Turul (Rust) |
+|---|---|---|---|---|---|---|---|---|
+| **Popularity** | ~11k stars | ~1.2k stars | ~1.4k stars | ~880 stars | ~19.5k stars | ~4.3k stars | ~4.2M dl | new |
+| **Maturity** | High | Medium | Medium | Medium | High | High | High | Low |
+| **Approach** | Library | Framework | Framework | Framework | Framework | Library+ | Library | Framework |
+| **File routing** | No | Yes | No | Yes (dir) | No | No | No | No |
+| **CLI scaffold** | No | Yes | No | Yes | No | No | No | No |
+| **Built-in auth** | No | Plugin | Yes | Yes | No | No | Feature flag | No |
+| **Session mgmt** | Manual | Manual | Yes | Manual | Manual | Yes | Manual | Yes |
+| **Hot reload** | No | Yes | No | No | No | No | No | No |
+| **Client support** | Yes | No | No | No | Yes | Yes | Yes | Yes |
+| **Storage backends** | No | No | No | No | No | No | No | Yes |
+| **Deploy helpers** | No | Vercel/CF | No | No | No | No | No | No |
+| **Opinionatedness** | Low | High | Medium | High | Medium | Low | Low | High |
+
+---
+
+## Key Takeaways
+
+### If you're in TypeScript (as banksy/xmcp is):
+
+1. **xmcp** occupies a unique niche — it's the only TS framework with file-based routing, hot reload, and a Next.js-style DX. No other TS framework matches this developer experience.
+
+2. **fastmcp (TS)** is the closest alternative if you want a framework without file-based routing but with better session management and auth built in. It's more of a "production server library" than a "DX framework."
+
+3. **The official SDK** remains the foundation all others build on. Directly using it makes sense if you're building your own framework (which is what xmcp does).
+
+4. **mcp-framework** is worth watching for its CLI tooling and OAuth 2.1 support but has the smallest community.
+
+### If you were starting fresh in Python:
+
+- **FastMCP (standalone)** is the dominant choice.
19.5k stars, 1M+ daily downloads, powers ~70% of MCP servers. The official Python SDK actually embeds FastMCP v1 as its high-level API. You'd be swimming against the current to not use it. + +### Go and Rust: + +- **Go**: `mcp-go` is the clear and only real choice. Well-maintained, complete, Go-idiomatic. +- **Rust**: `rmcp` (official) for maximum control, `turul` if you want a framework. Both are solid but Rust MCP is the least mature ecosystem overall. + +### The "FastMCP" brand confusion: + +There are **three different things** called FastMCP: +1. **FastMCP Python (jlowin)** — the standalone Python framework (19.5k stars) +2. **FastMCP in the official Python SDK** — v1.0 was donated/merged in +3. **fastmcp TypeScript (punkpeye)** — a completely separate TypeScript project (1.4k stars) + +These are unrelated codebases that share a name. The Python one came first and is far more popular. + +--- + +## Relevance to Banksy + +Banksy uses **xmcp**, which sits in the "opinionated TypeScript framework" tier. Among the frameworks surveyed, xmcp's closest analogs are: + +- **mcp-framework** (similar opinionatedness, different conventions) +- **FastMCP Python** (similar philosophy — decorators instead of file routing — but different language) + +The main things banksy doesn't get from xmcp that other frameworks provide: +- **Session management** (fastmcp TS has this built in) +- **Storage backends** (Turul Rust has this) +- **Edge deployment with stateful agents** (Cloudflare Agents SDK) + +None of these are current pain points — they'd matter if banksy needed persistent server-side state or multi-turn session tracking. + +See also: [xmcp vs FastMCP deep dive](research-xmcp-vs-fastmcp-deep-dive.md) for an in-depth comparison in the context of banksy. 
diff --git a/fastmcp-migration/research-xmcp-vs-fastmcp-deep-dive.md b/fastmcp-migration/research-xmcp-vs-fastmcp-deep-dive.md
new file mode 100644
index 0000000..2e30602
--- /dev/null
+++ b/fastmcp-migration/research-xmcp-vs-fastmcp-deep-dive.md
@@ -0,0 +1,366 @@
+# Deep Dive: xmcp vs FastMCP (Standalone Python) — In Context of Banksy
+
+Related: [MCP Server Framework Landscape](./research-mcp-server-frameworks.md)
+
+## How Banksy Works Today
+
+Before comparing frameworks, it's worth being precise about banksy's architecture because it's unusual. Banksy is not a simple MCP server — it's a **three-process orchestration system**:
+
+```
+LLM Client
+    ↓ MCP (HTTP)
+banksy-core (xmcp, port 3001)
+    ↓ MCP (HTTP)            ↓ MCP (HTTP)
+banksy-mural-api (5678)     banksy-public-api (5679)
+    ↓ REST                  ↓ REST
+MURAL Internal API          MURAL Public API
+```
+
+1. **banksy-mural-api** and **banksy-public-api** are thin wrappers using `@ivotoby/openapi-mcp-server` to convert OpenAPI specs into MCP tools. They handle REST-to-MCP translation.
+2. **banksy-core** is the user-facing xmcp server. It imports tool schemas/metadata *generated from* those API wrappers, manages per-user OAuth tokens in PostgreSQL, and proxies tool calls back to the API wrappers with the correct auth headers via `MuralToolCaller`.
+
+The xmcp framework provides the file-based routing, Zod schemas, and DX for banksy-core. But the heavy lifting — token management, multi-server coordination, code generation — is all custom banksy code.
+
+---
+
+## Comparing the Two Frameworks (Language Aside)
+
+### Tool Definition
+
+**xmcp**: File-convention based. Each tool is a file with three named exports.
+```typescript
+// tools/public/get-mural-by-id.ts
+export const schema = getMuralByIdSchema.shape;
+export const metadata: ToolMetadata = getMuralByIdMetadata;
+export default async function handler(args: InferSchema<typeof schema>) {
+  return callMuralTool('get-mural-by-id', args);
+}
+```
+
+**FastMCP**: Decorator based.
Each tool is a function with a decorator. +```python +@mcp.tool() +async def get_mural_by_id(mural_id: str) -> dict: + """Retrieve a mural by its ID.""" + return await mural_client.get(f"/murals/{mural_id}") +``` + +**Takeaway**: FastMCP is more concise — the schema is inferred from type hints and docstrings, no separate schema/metadata exports needed. xmcp is more explicit — you control exactly what the schema and metadata look like. For auto-generated tools (like banksy's), xmcp's explicitness is fine since the code is generated anyway. For hand-written tools, FastMCP's decorator approach is faster to iterate on. + +--- + +### Schema & Validation + +| | xmcp | FastMCP | +|---|---|---| +| Schema system | Zod (explicit) | Python type hints + Pydantic (inferred) | +| Validation | At tool boundary via Zod | Automatic from type annotations | +| Complex types | Manual Zod composition | Pydantic models with nested validation | +| Schema generation | Manual or code-gen | Automatic from function signatures + docstrings | +| Documentation | Metadata object | Docstrings become tool descriptions | + +FastMCP's approach is significantly less ceremony for the common case. You write a typed function, and the framework handles schema generation, validation, and documentation. xmcp gives you more control, but you pay for it in boilerplate (or code generation, as banksy does). + +--- + +### Dependency Injection & Context + +**xmcp**: No built-in DI. Banksy handles context manually — `callMuralTool()` reaches into a global config, fetches the current user from... somewhere (auth middleware), retrieves tokens from PostgreSQL, and constructs a `MuralToolCaller`. All hand-wired. + +**FastMCP**: First-class DI system. 
The `Context` object is automatically injected into any tool that declares it: +```python +@mcp.tool() +async def get_mural(mural_id: str, ctx: Context = CurrentContext()) -> dict: + user = await ctx.get_state("user") + await ctx.info(f"Fetching mural {mural_id} for {user.id}") + await ctx.report_progress(0.5, 1.0) + # ... +``` + +The DI system also supports custom dependencies, session-scoped state, and accessing the raw HTTP request/headers. Dependencies are automatically excluded from the tool's MCP schema. + +**Takeaway**: This is a meaningful gap. Banksy's hand-wired context propagation (AsyncLocalStorage for tokens, manual userId lookups) is exactly the kind of cross-cutting concern that FastMCP's DI was designed to handle. If banksy were in FastMCP, the token-refresh-and-inject pattern could be a single injectable dependency rather than the multi-layer `MuralToolCaller` → `getAndRefreshTokens()` → `PerRequestAuthProvider` → `AsyncLocalStorage` chain. + +--- + +### Server Composition & Proxying + +This is where the comparison gets most relevant to banksy's architecture. + +**xmcp**: No built-in composition. Banksy's three-server architecture is entirely custom. The code generation step (`xmcp-dev-cli generate`) connects to the API wrapper servers, discovers their tools, and generates TypeScript client code. At runtime, banksy-core is an MCP server that makes MCP client calls to the API wrappers. + +**FastMCP**: First-class composition primitives: + +| Pattern | What it does | +|---|---| +| `mount()` | Live-link a sub-server. Requests are delegated at runtime. Changes propagate instantly. | +| `import_server()` | One-time copy of tools/resources from another server. Static, fast. | +| `FastMCP.as_proxy()` | Create a transparent proxy to any MCP server (local or remote). | +| Namespace prefixes | Automatic name-prefixing to avoid collisions when composing. 
| + +A FastMCP equivalent of banksy's architecture could look like: + +```python +from fastmcp import FastMCP +from fastmcp.client import ProxyClient + +main = FastMCP("Banksy") + +# Mount the API wrappers as live sub-servers +main.mount( + FastMCP.as_proxy(ProxyClient("http://mural-api:5678/mcp")), + namespace="internal" +) +main.mount( + FastMCP.as_proxy(ProxyClient("http://public-api:5679/mcp")), + namespace="public" +) +``` + +No code generation. No generated client files. No promise-proxy pattern. The composition is declarative and the proxy handles discovery, forwarding, and session isolation. + +**Or even more directly** — since FastMCP has `from_openapi()`: + +```python +import httpx +from fastmcp import FastMCP + +main = FastMCP("Banksy") + +public_api_client = httpx.AsyncClient( + base_url="https://api.mural.co", + headers={"Authorization": f"Bearer {token}"} +) +public_api_spec = httpx.get("https://api.mural.co/openapi.json").json() + +api_server = FastMCP.from_openapi( + openapi_spec=public_api_spec, + client=public_api_client, + name="MURAL Public API" +) +main.mount(api_server, namespace="mural") +``` + +This collapses banksy's entire three-process architecture into a single process. The "API wrapper" layer disappears — FastMCP creates tools directly from the OpenAPI spec and handles the HTTP calls internally. + +**Takeaway**: FastMCP's composition model is banksy's architecture made into a framework primitive. What banksy builds with three processes, code generation, and custom proxy code, FastMCP provides as `mount()` + `from_openapi()`. + +--- + +### Tool Transformation & Curation + +This speaks directly to the "API wrapper first approach is limiting" observation. + +**The problem with pure API wrapping**: An auto-generated MCP tool for `POST /api/v1/murals/{muralId}/widgets` exposes every parameter of the REST endpoint. The tool name is derived from the URL. The description comes from the OpenAPI spec (written for human developers, not LLMs). 
The result is an MCP server that's a 1:1 mirror of the REST API — technically complete but not optimized for how an LLM thinks about tasks. + +**FastMCP's answer — Tool Transforms**: + +```python +from fastmcp import Tool, ArgTransform + +# Start with the auto-generated tool from OpenAPI +raw_tool = api_server.get_tool("post-widgets") + +# Create a curated version +curated = Tool.from_tool( + raw_tool, + name="add_sticky_note", + description="Add a sticky note to a mural with text content", + transform_args={ + "muralId": ArgTransform(hide=True), # injected from context + "type": ArgTransform(hide=True, default="sticky_note"), + "text": ArgTransform(description="The text to put on the sticky"), + "x": ArgTransform(description="X position", default=0), + "y": ArgTransform(description="Y position", default=0), + } +) +``` + +This lets you start from the auto-generated API tool and reshape it for LLM consumption — hiding parameters, renaming things, providing defaults, improving descriptions — without reimplementing the underlying HTTP call. The transform is a layer over the raw tool, not a replacement. + +xmcp has no equivalent. In banksy, the "curation" happens either in the generated code (which gets overwritten) or in the banksy-core tool handler (which is a thin proxy). There's no in-between layer for reshaping tool interfaces. + +--- + +## Net New Capabilities in FastMCP (Beyond the Comparison) + +These are capabilities FastMCP has that don't map to anything in xmcp or the current banksy architecture. They represent genuinely new patterns. + +### 1. Elicitation — Tools That Ask Questions + +```python +@mcp.tool() +async def create_mural(ctx: Context = CurrentContext()) -> str: + """Create a new mural with guided setup.""" + + title = await ctx.elicit("What should the mural be titled?", str) + if title.action != "accept": + return "Cancelled." 
+ + template = await ctx.elicit( + "Which template?", + Literal["brainstorm", "retrospective", "blank"] + ) + + size = await ctx.elicit("How large? (width x height)", tuple[int, int]) + + return await create_mural_impl(title.data, template.data, size.data) +``` + +Elicitation allows a tool to **pause execution and ask the user for input** mid-flight. The tool doesn't need all parameters upfront — it can progressively gather information through a conversational flow. The user can accept, decline, or cancel at each step. + +**Why this matters for banksy-like use cases**: Today, creating complex MURAL content (a structured workshop, a facilitated session) requires the LLM to gather all parameters in the prompt, then fire a single tool call with everything. Elicitation lets a tool guide the user through a multi-step creation flow, validating and adjusting at each step. It moves from "one-shot tool call" to "interactive wizard." + +### 2. Dynamic Tool Visibility + +```python +@mcp.tool(tags={"admin"}) +async def delete_workspace() -> str: ... + +@mcp.tool(tags={"facilitator"}) +async def start_voting_session() -> str: ... + +@mcp.tool(tags={"viewer"}) +async def get_mural() -> str: ... + +# At session setup, based on user role: +mcp.enable(tags={"viewer"}, only=True) # viewers see only viewer tools +mcp.enable(tags={"viewer", "facilitator"}) # facilitators see more +``` + +Tools can be shown or hidden per-session based on tags. Combined with session-scoped state, this means different users see different tool sets from the same server — without running separate server instances. + +**Why this matters**: Banksy currently has two separate tool directories (`internal` vs `public`) selected by config at startup time. FastMCP's visibility system would let a single server instance dynamically adjust which tools are available based on the authenticated user's role, plan tier, or permissions — at the session level, not the deployment level. + +### 3. 
Apps — Interactive UIs in Conversations + +FastMCP tools can return interactive HTML/JS UIs that render directly in the conversation as sandboxed iframes. Version 3.1 will add a Python-native UI framework for this. + +**Why this matters**: Imagine a "create workshop" tool that returns an interactive canvas preview, or a "review mural" tool that renders a thumbnail with clickable regions. This goes well beyond text responses. + +### 4. OpenAPI-to-MCP with Route Mapping + +```python +from fastmcp import FastMCP, RouteMap + +mcp = FastMCP.from_openapi( + openapi_spec=spec, + client=client, + route_map=[ + RouteMap(methods=["GET"], pattern="/murals/*", type="RESOURCE"), + RouteMap(methods=["POST", "PUT"], pattern="/murals/*", type="TOOL"), + RouteMap(tags=["deprecated"], type="EXCLUDE"), + RouteMap(pattern="/internal/*", type="EXCLUDE"), + ] +) +``` + +Not just "convert all endpoints to tools" — you can declaratively classify endpoints as resources (read-only, cacheable), tools (actions), or excluded. This maps to how MCP *should* model an API: GETs as resources, mutations as tools. + +Banksy's current approach treats everything as a tool regardless of HTTP method. + +### 5. Progress Reporting & Streaming + +```python +@mcp.tool() +async def export_mural(mural_id: str, ctx: Context = CurrentContext()) -> str: + await ctx.report_progress(0, 100) + data = await fetch_mural_data(mural_id) + await ctx.report_progress(30, 100) + rendered = await render_export(data) + await ctx.report_progress(80, 100) + url = await upload_export(rendered) + await ctx.report_progress(100, 100) + return f"Export ready: {url}" +``` + +Long-running tools can report progress back to the client in real-time. Combined with the `handleOperation`/`publishEventToUser` pattern from mural-api's worker system, this could provide end-to-end progress visibility for async operations. + +### 6. 
First-Class Testing + +```python +# test_tools.py +from fastmcp.utilities.tests import run_server_async + +async def test_get_mural(): + async with run_server_async(mcp) as url: + async with Client(url) as client: + result = await client.call_tool("get_mural", {"mural_id": "123"}) + assert result.content[0].text == "..." +``` + +FastMCP ships with `run_server_async()` (in-process, no subprocess), `run_server_in_process()` (separate process), `HeadlessOAuth` (OAuth without a browser), and `temporary_settings()` (config overrides). Testing MCP servers is a first-class concern. + +Banksy has no equivalent testing infrastructure for its MCP layer. + +--- + +## Could Banksy Be Replicated in FastMCP? + +**Short answer**: Yes, and it would be architecturally simpler. + +**What would be straightforward**: +- OpenAPI-to-MCP conversion: `FastMCP.from_openapi()` replaces both `banksy-mural-api` and `banksy-public-api` entirely. No separate processes needed. +- Tool definition: Decorators replace file-based routing. Less boilerplate per tool. +- Server composition: `mount()` replaces the generated client + promise-proxy pattern. +- Auth token injection: DI + session-scoped state replaces AsyncLocalStorage + custom PerRequestAuthProvider. + +**What would require custom work (same as today)**: +- Per-user OAuth flow (Google SSO → Mural token exchange → PostgreSQL storage → refresh logic). FastMCP has OAuth primitives, but the specific multi-provider flow is custom business logic regardless of framework. +- The code generation step would disappear (which is good), but tool curation would still need thought — `from_openapi()` gives you raw API tools, not LLM-optimized tools. FastMCP's transform system helps but doesn't eliminate the curation work. 
+ +**What you'd gain**: +- Single-process architecture instead of three +- No code generation step +- Built-in testing utilities +- Elicitation for interactive multi-step tools +- Dynamic tool visibility per user/role +- Progress reporting +- Route mapping (resources vs tools) +- Simpler DI for cross-cutting concerns + +**What you'd lose**: +- TypeScript (the rest of the MURAL ecosystem is TypeScript/Node) +- xmcp's hot reload DX +- File-based routing (subjective — some prefer it) +- Direct integration with xmcp plugins (auth0, better-auth, clerk) + +--- + +## Beyond Banksy: Where FastMCP Opens New Doors + +The "API wrapper first approach is limiting" observation is the key thread here. Banksy today is fundamentally a **REST API mirror** — every tool maps 1:1 to an API endpoint. The LLM gets the same interface a developer would get from reading Swagger docs. This works, but it constrains what the MCP server can be. + +FastMCP's design enables a different model: **tools as domain operations, not API calls**. + +| API-Wrapper Pattern (Today) | Domain-Operation Pattern (Possible) | +|---|---| +| `create-widget` takes muralId, type, x, y, width, height, text, style... | `add_sticky_note` takes text, position (optional) | +| `update-widget` takes muralId, widgetId, full widget payload | `move_sticky` takes note name/text, direction or target area | +| `create-mural` takes title, workspaceId, templateId, height, width... | `create_workshop` guides user through purpose → template → participants | +| Tools mirror the API surface | Tools mirror user intent | + +FastMCP's combination of tool transforms (reshape auto-generated tools), elicitation (progressive input gathering), composition (mix curated + raw tools), and visibility (show different tools to different users) makes the domain-operation pattern practical. 
You can start from the OpenAPI spec, transform the tools into higher-level operations, and compose them into a server that thinks in terms of "help the user run a workshop" rather than "call these 7 API endpoints in sequence."
+
+This isn't exclusive to FastMCP — you could build these patterns in xmcp too, with enough custom code. But FastMCP has them as framework primitives, which is the difference between "architecturally possible" and "practically encouraged."
+
+---
+
+## Summary
+
+| Dimension | xmcp (banksy today) | FastMCP (what it enables) |
+|---|---|---|
+| **Core pattern** | File-based tools wrapping API calls | Decorated functions with DI, transforms, composition |
+| **API integration** | Separate process + code gen | `from_openapi()` — single process, no gen |
+| **Tool curation** | Manual in handler or generated code | Transform layer over auto-generated tools |
+| **Multi-server** | Three processes, custom proxy | `mount()` / `import_server()` / `as_proxy()` |
+| **Auth/context** | AsyncLocalStorage + manual wiring | DI + session-scoped state |
+| **User interaction** | One-shot tool calls | Elicitation — multi-step conversations |
+| **Tool visibility** | Config-time (internal vs public dirs) | Runtime per-session tags |
+| **Testing** | No framework support | `run_server_async()`, HeadlessOAuth, etc. |
+| **Progress** | Not supported | `ctx.report_progress()` |
+| **Interactive UI** | Not supported | Apps (iframe UIs in conversation) |
+| **Language** | TypeScript | Python |
+| **Ecosystem fit** | Matches MURAL stack | Would be a new runtime |
+
+FastMCP is not just "xmcp but in Python." It's a more mature framework with genuinely different capabilities — composition, transforms, elicitation, visibility, DI — that address the exact limitations of the API-wrapper-first approach.
The language difference is real and non-trivial for a TypeScript shop, but the capability gap is worth understanding regardless of whether you'd adopt it directly, port its ideas to TypeScript, or use it for new standalone use cases where the MURAL TypeScript ecosystem isn't a constraint. diff --git a/fastmcp-migration/resource-server-migration-eval.md b/fastmcp-migration/resource-server-migration-eval.md new file mode 100644 index 0000000..670eca9 --- /dev/null +++ b/fastmcp-migration/resource-server-migration-eval.md @@ -0,0 +1,373 @@ +# Banksy Resource Server Migration: Evaluation and Path Forward + +## Table of Contents + +- [Executive Summary](#executive-summary) +- [Background](#background) +- [The Resource Server Model](#the-resource-server-model) + - [What Changes and What Doesn't](#what-changes-and-what-doesnt) + - [Why Mural Cannot Serve as the IdP](#why-mural-cannot-serve-as-the-idp) + - [Mode Convergence](#mode-convergence) +- [External IdP Selection](#external-idp-selection) + - [The User Coverage Constraint](#the-user-coverage-constraint) + - [Candidate Assessment](#candidate-assessment) + - [Recommended Path: Dedicated IdP with Mural as Custom Social Connection](#recommended-path-dedicated-idp-with-mural-as-custom-social-connection) + - [Token Capture for Layer 2](#token-capture-for-layer-2) + - [FastMCP Auth Class Selection](#fastmcp-auth-class-selection) +- [IDE Compatibility](#ide-compatibility) +- [Security Hardening (Pre-Migration)](#security-hardening-pre-migration) + - [Ticket Landscape](#ticket-landscape) + - [Key Vulnerabilities to Fix Now](#key-vulnerabilities-to-fix-now) + - [Mural-OAuth-Specific Hardening](#mural-oauth-specific-hardening) + - [Migration Ticket](#migration-ticket) +- [Recommended Sequencing](#recommended-sequencing) +- [Relationship to FastMCP Migration](#relationship-to-fastmcp-migration) +- [Appendices](#appendices) + - [Appendix A: Ticket-by-Ticket Assessment](#appendix-a-ticket-by-ticket-assessment) + - [Appendix B: 
IDE Client Support Details](#appendix-b-ide-client-support-details) + - [Appendix C: Mural-as-IdP Detailed Blocker Analysis](#appendix-c-mural-as-idp-detailed-blocker-analysis) + - [Appendix D: Mural-OAuth Security Concerns Beyond Audit](#appendix-d-mural-oauth-security-concerns-beyond-audit) + - [Appendix E: Mural Infrastructure Evolution — Per-Blocker Resolution](#appendix-e-mural-infrastructure-evolution--per-blocker-resolution) + +## Executive Summary + +Banksy must migrate from OAuth Authorization Server to Resource Server — a requirement driven by MCP specification compliance and the elimination of unnecessary security attack surface. The migration affects only Layer 1 (IDE → Banksy authentication); Layer 2 (Banksy → Mural API access) is unchanged. Mural cannot serve as the external IdP due to three independent blockers: HS256 tokens without JWKS, no OAuth discovery metadata, and the MCP token passthrough prohibition. The recommended path is a dedicated IdP (Auth0 or Descope) with Mural configured as a custom social connection, which covers all Mural user segments and preserves single-step UX if the IdP supports upstream token storage. Both auth modes (sso-proxy and mural-oauth) converge to the same target architecture under RS, making mode divergence a transitional artifact. Seven of eight audit tickets are backward-compatible security hardening that should ship now, independent of migration timeline. The primary remaining decision is IdP selection, pending a PoC that validates custom social connection flow and upstream token storage. + +--- + +## Background + +Banksy is the MCP server that connects AI-powered IDEs to Mural's collaboration platform. It authenticates IDE users, then executes tool calls against Mural's API on their behalf. Authentication operates across two stacked OAuth-like layers: + +- **Layer 1 (IDE → Banksy):** Establishes user identity. 
Today, Banksy operates as an OAuth Authorization Server (AS) — Better Auth's MCP plugin serves `/.well-known/oauth-authorization-server`, handles Dynamic Client Registration (DCR), and issues MCP tokens. +- **Layer 2 (Banksy → Mural):** Provides API access. Banksy stores Mural OAuth tokens (access + refresh) and uses them server-side to execute tool calls against Mural's API. + +Banksy supports three auth modes (configured via `AUTH_MODE`, one per deployment): + +- **sso-proxy:** Layer 1 uses Google OAuth via an SSO proxy. Layer 2 uses a session-activation code/nonce pattern where the browser performs Mural OAuth and Banksy claims the tokens. +- **mural-oauth:** Layer 1 redirects to Mural's consent page — Mural serves as both IdP and API token source, collapsing two layers into one user-facing flow. Layer 2 is embedded: the authorization code grant with Mural yields identity (via `/api/public/v1/users/me`) and API tokens stored for Mural API calls. +- **m2m:** Machine-to-machine, out of scope for this analysis. + +A security audit in early 2026 produced eight tickets (critical, high, medium) targeting sso-proxy and a design preamble recommending Banksy adopt an OAuth Resource Server posture under FastMCP. The mural-oauth mode was introduced after the audit; this analysis covers both modes. + +The audit's central recommendation — and the MCP specification's normative requirement — is the migration from Authorization Server to Resource Server. This document evaluates that migration: what it means, what it costs, what decisions remain, and what risks exist. + +--- + +## The Resource Server Model + +The audit recommends Banksy stop operating as an OAuth **Authorization Server** (AS) — issuing tokens — and instead become a **Resource Server** (RS) that validates tokens issued by an external Identity Provider (IdP). ("Resource server" is OAuth 2.0 terminology, unrelated to MCP Resources.) 
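+
+Concretely, Layer 1 validation on the RS side reduces to a signature check plus claim checks. A minimal sketch of the claim-check half — issuer and audience values are placeholders, not decided configuration, and signature verification against the IdP's JWKS (e.g. via a JWT library) is assumed to have happened before these checks run:
+
+```python
+import base64, json, time
+
+ISSUER = "https://idp.example.com/"          # placeholder: the external IdP's issuer
+AUDIENCE = "https://banksy.example.com/mcp"  # placeholder: Banksy's resource identifier
+
+def _decode_segment(seg: str) -> dict:
+    seg += "=" * (-len(seg) % 4)  # restore base64url padding stripped by JWT encoding
+    return json.loads(base64.urlsafe_b64decode(seg))
+
+def validate_claims(token: str) -> dict:
+    # NOTE: signature must already be verified against the IdP's JWKS;
+    # this only covers the issuer / audience / expiration checks.
+    _header, payload, _sig = token.split(".")
+    claims = _decode_segment(payload)
+    if claims.get("iss") != ISSUER:
+        raise ValueError("wrong issuer")
+    aud = claims.get("aud")
+    audiences = [aud] if isinstance(aud, str) else (aud or [])
+    if AUDIENCE not in audiences:
+        raise ValueError("token not audience-bound to this server (RFC 8707)")
+    if claims.get("exp", 0) <= time.time():
+        raise ValueError("token expired")
+    return claims
+```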
+ +### What Changes and What Doesn't + +The RS migration replaces Layer 1 only: + +- **Removed:** Better Auth's MCP plugin, `/.well-known/oauth-authorization-server`, `/api/auth/mcp/*`, DCR registration handling, token issuance. +- **Added:** `/.well-known/oauth-protected-resource` (RFC 9728) pointing IDEs to an external IdP. JWT validation (signature via JWKS, issuer, audience, expiration). Scope-based authorization. +- **Unchanged:** Layer 2. Banksy still stores and uses Mural API tokens. The IDE-presented JWT answers "is this a legitimate user?" but doesn't provide Mural tokens. + +The rationale is both security and compliance. Running an AS exposes confused-deputy attacks, DCR abuse, and authorization code interception; eliminating the AS eliminates those surfaces. The MCP specification (2025-11-25 revision) makes this normative: servers MUST implement Protected Resource Metadata (PRM, RFC 9728), clients MUST use it for AS discovery. No spec-defined path remains for MCP servers as authorization servers. Banksy's current AS model works only because IDEs still attempt legacy discovery as fallback — this will erode. + +Post-migration flow (both modes): + +1. **One-time setup (browser):** User authenticates with external IdP (Layer 1), completes Mural OAuth (Layer 2). Banksy stores Mural tokens. +2. **Every MCP request (IDE):** IDE presents an IdP-issued JWT. Banksy validates it, looks up Mural tokens in Postgres, executes the tool call, returns the result. + +### Why Mural Cannot Serve as the IdP + +The only IdP candidate that could collapse both layers is Mural itself. Three independent blockers, each individually fatal, prevent this: + +1. **HS256 tokens with no JWKS.** Mural OAuth tokens are JWTs signed with a symmetric secret (HS256). No JWKS endpoint, no asymmetric keys, no issuer/audience claims. Banksy cannot validate them without possessing Mural's signing secret — a security boundary violation. +2. 
**No OAuth discovery.** Mural serves no `/.well-known/oauth-authorization-server`, `/.well-known/openid-configuration`, or RFC 8414 metadata. No DCR. IDEs following PRM `authorization_servers` links would find nothing. +3. **MCP token passthrough prohibition.** The MCP spec states: "The MCP server MUST NOT pass through the token it received from the MCP client." RFC 8707 audience binding makes this structural: a token audience-bound to Banksy is rejected by Mural; a Mural-audience token fails Banksy's validation. The two-layer architecture (IdP JWT for L1, stored Mural tokens for L2) is the spec-compliant pattern. + +Blockers 1-2 could theoretically be resolved by Mural platform changes (see [Appendix E](#appendix-e-mural-infrastructure-evolution--per-blocker-resolution)). Blocker 3 is a fundamental MCP constraint. The two-layer separation is inherent, not a consequence of IdP choice. See [Appendix C](#appendix-c-mural-as-idp-detailed-blocker-analysis) for code-level evidence. + +### Mode Convergence + +The RS migration causes mural-oauth and sso-proxy to converge to the same target architecture — the most significant structural finding of this analysis. + +Today mural-oauth provides single-step UX: one Mural consent yields identity + API access. Under RS, Layer 1 requires an external IdP with JWKS-validatable tokens. Mural doesn't qualify (three blockers above apply regardless of mode). Post-migration, mural-oauth loses single-step UX — unless the external IdP supports upstream token storage (see [Token Capture for Layer 2](#token-capture-for-layer-2)). + +Without token storage: users authenticate with the external IdP (Layer 1), then separately complete Mural OAuth in the browser (Layer 2) — structurally identical to sso-proxy. The `mural-oauth.ts` token exchange/storage logic persists as the Layer 2 mechanism; Better Auth's MCP plugin is replaced by PRM + external IdP. 
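+
+For reference, a minimal `/.well-known/oauth-protected-resource` document of the kind Banksy would serve under either mode (per RFC 9728; the URLs are placeholders and the scope names are illustrative candidates, not decided values):
+
+```json
+{
+  "resource": "https://banksy.example.com/mcp",
+  "authorization_servers": ["https://idp.example.com/"],
+  "scopes_supported": ["mcp:tools", "mural:read", "mural:write"],
+  "bearer_methods_supported": ["header"]
+}
+```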
+ +| Aspect | Current mural-oauth | Post-migration (without token storage) | Post-migration (with token storage) | +|---|---|---|---| +| Layer 1 (IDE auth) | Better Auth MCP plugin (AS) | External IdP via PRM (RS) | External IdP via PRM (RS) | +| Layer 2 (Mural API) | Mural OAuth (same flow as L1) | Mural OAuth (separate browser step) | IdP retrieves Mural tokens during social login | +| User steps | 1 (Mural consent) | 2 (IdP auth + Mural connect) | 1 (Mural consent via IdP) | +| Token storage | Needed (`muralOauthToken`) | Still needed (Banksy stores) | IdP stores upstream tokens; Banksy retrieves | +| SPA callback | Needed for OAuth callback | Still needed for Layer 2 | Not needed (no separate browser step) | +| `/.well-known` | `oauth-authorization-server` | `oauth-protected-resource` | `oauth-protected-resource` | + +The convergence means a single target architecture regardless of starting mode. Same external IdP, same FastMCP auth class, same PRM config. The mural-oauth token exchange/storage becomes the universal Layer 2 mechanism — its use of Mural Public API and standard OAuth tokens makes it the more portable implementation. + +--- + +## External IdP Selection + +IdP selection is the primary remaining architectural decision. It determines JWT validation configuration, FastMCP auth class, and — critically — which Mural user segments can use Banksy at all. + +### The User Coverage Constraint + +The IdP assessment assumes users have accounts with the external provider. This holds for enterprise SSO and Google users but fails for self-serve Mural users (email/password, individual/team plans). Mural supports five auth methods: email/password (`api/src/api/session/signin.ts`), Google social (`data/src/data/models/idp/providers/google.ts`), Microsoft social (`data/src/data/models/idp/providers/microsoft.ts`), SAML SSO (`api/src/api/authenticate/saml2/`), OAuth2 SSO (`api/src/api/authenticate/oauth2/authorization/`). 
The mural-oauth mode reaches all segments via Mural's consent page. The RS migration would regress this. + +Google as IdP excludes email/password users, Microsoft social users, and non-Google enterprise SSO users — likely a majority of Mural's user base. A dedicated IdP without custom connections requires new signups decoupled from Mural identity, creating user-mapping problems. Mural-as-IdP covers all users but is blocked (see [Why Mural Cannot Serve as the IdP](#why-mural-cannot-serve-as-the-idp)). This is an access issue: under RS with Google or a standard dedicated IdP, an entire class of users loses access to Banksy. + +### Candidate Assessment + +**Google:** Already used for sso-proxy Layer 1. ID tokens are JWTs validatable via JWKS, but no custom audiences/scopes (audience = client ID, scopes limited to openid/email/profile). No DCR → requires `OAuthProxy`. Sufficient for "is this a legitimate Google user?" but no fine-grained MCP scope control. Severe user coverage regression. + +**Dedicated IdP** (Auth0, Azure AD/Entra ID, WorkOS, Descope): Full control over JWT shape, JWKS, custom audiences/scopes, token lifetimes. Auth0, WorkOS, Descope support DCR → `RemoteAuthProvider` (pure RS). Azure AD lacks DCR → `OAuthProxy`. Adds infrastructure but eliminates Google's format limitations and enables Banksy-specific scopes (`mcp:tools`, `mural:read`, `mural:write`). + +**Mural infrastructure evolution:** Mural could eliminate the need for an external IdP by adding JWKS, discovery, and optionally RFC 8693 (Token Exchange) — significant effort, uncertain timeline, outside Banksy team's control. No evidence of this work in mural-api. Viable as a long-term ideal but not a near-term option. See [Appendix E](#appendix-e-mural-infrastructure-evolution--per-blocker-resolution) for per-blocker resolution details. + +**Dual auth (fallback):** AS model for non-enterprise deployments, RS model for enterprise. 
Per-instance compliant (each serves one model), but Ticket 3 would apply only to enterprise deployments, leaving the AS attack surface active elsewhere. Sustainable only as a transitional state — the AS model becomes non-compliant as IDEs drop legacy fallbacks, and the split doubles the security surface area and contradicts the audit recommendation. + +The choice cascades into JWT validation config, scope design, user account linking, and operational burden. But the most fundamental implication is user coverage: IdP choice determines which Mural user segments can use Banksy at all. + +### Recommended Path: Dedicated IdP with Mural as Custom Social Connection + +A dedicated IdP (Auth0, Descope) configured with Mural as a custom upstream OAuth provider. The IdP provides protocol infrastructure (JWKS, discovery, DCR, RS256 JWTs); Mural handles actual authentication. Auth0 and Descope support custom social connections (manually configure upstream auth/token/userinfo URLs — no upstream discovery needed). + +Flow: +1. IDE discovers Banksy's PRM → follows `authorization_servers` to the dedicated IdP +2. IdP redirects to Mural (configurable to skip its own login page via "Home Realm Discovery") — user sees only Mural's consent page +3. User authenticates with Mural (any method) and authorizes Banksy +4. Mural redirects to the IdP with an authorization code +5. IdP exchanges the code for Mural tokens, fetches user info, issues an RS256 JWT +6. IDE presents the IdP JWT to Banksy → validated via JWKS + +Addresses Blockers 1-2 (the IdP issues proper JWTs with JWKS + discovery) and sidesteps Blocker 3 (the token is IdP-issued, not passed through). This is the only option that covers all Mural user segments, is spec-compliant, doesn't depend on the Mural platform team, and is implementable today. + +With upstream token storage: single-step UX. Without: two steps, but no one excluded. + +**Long-term ideal:** Mural infrastructure evolution (no vendor dependency) — but uncertain timeline, outside Banksy's control. 
Pursue as parallel conversation; don't block on it. + +**Next step:** PoC validating (a) custom social connection flow with Mural upstream, (b) upstream token storage for custom connections, (c) end-to-end with Cursor and VS Code. Provider evaluation (Auth0, Descope, others) with pricing: `banksy/.cursor/prompts/research-auth-provider-alternatives.md`. + +### Token Capture for Layer 2 + +During step 5 of the recommended flow, the IdP obtains Mural tokens. Auth0's Token Vault stores these upstream tokens; Banksy retrieves them server-to-server (federated connection access token exchange). If this works for custom social connections (needs PoC), it eliminates the separate "Mural connect" browser step — single-step UX preserved. + +The Mural token never comes from the MCP client. The IDE sends an IdP JWT (audience-bound to Banksy via RFC 8707); Banksy retrieves Mural tokens from the IdP's store via a server-to-server channel. Identical in pattern to the current approach (Banksy stores Mural tokens in Postgres after browser OAuth). The spec's passthrough prohibition targets client-to-upstream forwarding, not this pattern. + +Limitations: Auth0 Token Vault is Enterprise-only (unpublished pricing, requires sales). Custom Token Exchange in Early Access (2026). Needs PoC: refresh token storage + automatic refresh for custom connections. WorkOS doesn't support custom upstream OAuth (predefined providers only). Descope supports custom OAuth + DCR — evaluate alongside Auth0. Full provider evaluation: `banksy/.cursor/prompts/research-auth-provider-alternatives.md`. + +Without token storage, the dedicated IdP still solves user coverage. Layer 2 reverts to separate browser Mural OAuth — minor UX regression (two steps), not an access regression. + +### FastMCP Auth Class Selection + +The choice of external IdP determines which FastMCP auth class Banksy uses: + +**`RemoteAuthProvider`** requires DCR (RFC 7591) — IDEs auto-register with the IdP. 
Supported by WorkOS AuthKit, Descope, and Auth0 (if configured). Composes `JWTVerifier` + automatic PRM endpoints = pure RS with no AS surface. + +**`OAuthProxy`** bridges non-DCR IdPs (Google, Azure AD, GitHub) by presenting a DCR interface to IDEs while holding pre-registered upstream credentials. Reintroduces some AS surface (DCR registrations, proxied tokens) but far less than Better Auth, and token validation is still delegated to the IdP. + +Bare `JWTVerifier` does not serve PRM — IDEs would have no discovery metadata — so Banksy must use one of the two classes above. Both serve PRM and produce spec-compliant resource servers. DCR + `RemoteAuthProvider` is architecturally cleanest but carries the highest setup cost; Google + `OAuthProxy` is the lowest-friction path but offers no scope control. + +--- + +## IDE Compatibility + +The RS model is viable today for the three major IDEs. Cursor (v1.0+), VS Code/Copilot (v1.102+), and Claude Desktop all support Protected Resource Metadata discovery and OAuth 2.1 with PKCE. Requirements: serve PRM at `/.well-known/oauth-protected-resource`, use `RemoteAuthProvider` or `OAuthProxy` (not bare `JWTVerifier`). + +Zed and Continue.dev lack remote MCP OAuth entirely — they cannot use Banksy's current AS model either, so the migration causes no regression. Windsurf has no clear remote OAuth PRM support. + +Per-client details (Cursor redirect bug, SDK status, version specifics) are in [Appendix B](#appendix-b-ide-client-support-details). + +--- + +## Security Hardening (Pre-Migration) + +### Ticket Landscape + +The audit produced eight tickets. Seven have "fix in current flow" recommendations that are backward-compatible and safe to ship independently of the RS migration. The eighth (Ticket 3) is the migration itself — remove Better Auth's MCP plugin and the AS surface. 
+ +Six of eight tickets apply to mural-oauth (introduced after the audit) with varying severity: four apply directly with the same fix, one is substantially mitigated, and one has reduced applicability. Ticket 3 applies equally — same plugin, removed from both modes simultaneously. + +### Key Vulnerabilities to Fix Now + +**Open redirect (Ticket 2, Critical):** `url.startsWith('/')` accepts protocol-relative URLs like `//evil.com`. Present in both modes (sso-proxy and mural-oauth's `isValidCallbackUrl`). Fix: parse with `new URL(url, baseOrigin)`, require `u.origin === baseOrigin`. + +**Plaintext refresh tokens (Ticket 6, High):** Mural tokens stored unencrypted in Postgres (`muralSessionToken` and `muralOauthToken`). Persists post-migration — this fix is needed regardless of timeline. Fix: envelope encryption with Azure Key Vault (transparent encrypt-on-write, decrypt-on-read). Shared abstraction in `mural-tokens.ts` means one fix covers both tables. + +**OAuth login CSRF (Ticket 1, Critical):** sso-proxy's Google OAuth callback logs a warning on state validation failure but continues to create a session. Mural-oauth is substantially stronger (HMAC-SHA256 state with `crypto.timingSafeEqual` and 10-minute TTL) but state is not single-use. Fix for sso-proxy: reject on mismatch. Fix for mural-oauth: server-side nonce tracking for single-use enforcement. + +**Security headers (Ticket 4, High):** Auth pages lack CSP, HSTS, Referrer-Policy, anti-framing headers. Purely additive (Azure Front Door or middleware). Applies equally to both modes. + +**OAuth codes in URL history (Ticket 7, Medium):** SPA doesn't call `history.replaceState()` after reading the authorization code. One-line fix in both modes. + +**Sensitive auth logging (Ticket 8, Medium):** Replace truncated state/code values with correlation IDs. Mural-oauth: stop logging full `callbackURL`, stop passing `error_description` to client. 
+ +**Mural claim race condition (Ticket 5, High):** Applies to sso-proxy's `/auth/mural/claim` only. Mural-oauth's atomic `INSERT ... ON CONFLICT DO UPDATE` is sufficient. + +### Mural-OAuth-Specific Hardening + +Beyond the audit tickets, mural-oauth introduces concerns requiring additional hardening: a 10-minute state replay window (fix with server-side nonce tracking), SPA error parameter reflection (fix with generic error messages), and overly broad default scopes (audit for minimum privilege). Full details in [Appendix D](#appendix-d-mural-oauth-security-concerns-beyond-audit). + +### Migration Ticket + +**Ticket 3 (High) — Remove Legacy MCP OAuth Surface:** Remove Better Auth's MCP plugin, `/api/auth/mcp/*`, and `/.well-known/oauth-authorization-server`. Execute only after the RS replacement is deployed and tested. Applies equally to both modes. + +Full ticket-by-ticket detail (both modes, with mural-oauth extrapolations) is in [Appendix A](#appendix-a-ticket-by-ticket-assessment). + +--- + +## Recommended Sequencing + +1. **Now (both modes):** Tickets 1, 2, 4, 6, 7, 8 as backward-compatible hardening. Ticket 5: fix sso-proxy's claim endpoint (mural-oauth's atomic upsert sufficient). Mural-oauth extras: server-side nonce tracking (Ticket 1 completion), generic errors (not `error_description`), scope audit. +2. **Auth provider evaluation + PoC:** IdP must support custom upstream OAuth with Mural (otherwise email/password and non-Google users excluded). Evaluate Auth0, Descope, others (`banksy/.cursor/prompts/research-auth-provider-alternatives.md`) for fit, pricing, complexity. PoC: (a) custom social connection with Mural upstream, (b) upstream token storage for custom connections, (c) end-to-end with Cursor and VS Code. In parallel: Mural platform team conversation re JWKS, discovery, RFC 8693. +3. **IdP decision:** Select provider based on PoC. DCR-capable (Auth0, Descope) → `RemoteAuthProvider`; non-DCR → `OAuthProxy`. 
Both modes converge on same architecture. +4. **Migration:** Implement FastMCP RS with chosen provider. Mural-oauth token exchange/storage becomes universal Layer 2. With upstream token storage: single-step UX. Validate end-to-end (PRM → IdP auth → token validation → Mural connect → tool invocation) with three major IDEs. +5. **Ticket 3:** Remove Better Auth MCP plugin and `/api/auth/mcp/*` only after replacement is deployed and proven. + +--- + +## Relationship to FastMCP Migration + +The RS migration and the FastMCP migration are related but separable. Security hardening (the seven backward-compatible tickets) is independent work regardless of migration timeline. Ticket 3 is migration planning. + +Overlap with existing FastMCP auth strategy plan (`fastmcp_auth_strategy_f355d421.plan.md`): Risk 1 (Mural token format) is resolved — HS256 JWTs with no JWKS, ruling out Mural-as-IdP. Risks 2 and 4 (`get_access_token()` return value, token refresh lifecycle) remain open. The `OAuthProxy` vs `RemoteAuthProvider` choice is a new dimension that depends on IdP DCR capability. IDE compatibility is confirmed (Cursor, VS Code, Claude Desktop). The remaining decision is IdP selection. + +--- + +## Appendices + +### Appendix A: Ticket-by-Ticket Assessment + +#### sso-proxy Mode + +**Ticket 1 (Critical) — OAuth Login CSRF:** sso-proxy's Google OAuth callback logs a warning on state validation failure but continues to create a session. Fix: generate crypto-secure state server-side, store with TTL, reject on mismatch/expiration, enforce single-use. + +**Ticket 2 (Critical) — Open Redirect:** `url.startsWith('/')` accepts protocol-relative URLs like `//evil.com`. Fix: parse with `new URL(url, baseOrigin)`, require `u.origin === baseOrigin`. + +**Ticket 3 (High) — Remove Legacy MCP OAuth Surface:** The only ticket that breaks existing behavior — remove Better Auth's MCP plugin, `/api/auth/mcp/*`, and `/.well-known/oauth-authorization-server`. 
This is the migration itself, scoped as a ticket. IDE compatibility confirmed for Cursor, VS Code, Claude Desktop. Prerequisites: select external IdP, implement replacement via `RemoteAuthProvider` or `OAuthProxy`. Execute only after replacement is deployed and tested. + +**Ticket 4 (High) — Security Headers:** Auth pages lack CSP, HSTS, Referrer-Policy, anti-framing headers. Purely additive (Azure Front Door or middleware). + +**Ticket 5 (High) — Mural Claim Race Condition:** `/auth/mural/claim` reads, claims, saves, deletes in non-atomic steps — concurrent claims can both succeed. Fix: atomic compare-and-set with `claimedAt`/`status` column. + +**Ticket 6 (High) — Plaintext Refresh Tokens:** Mural tokens stored unencrypted in Postgres. Fix: envelope encryption with Azure Key Vault (transparent encrypt-on-write, decrypt-on-read). + +**Ticket 7 (Medium) — OAuth Codes in URL History:** SPA doesn't call `history.replaceState()` after reading the authorization code. One-line fix. + +**Ticket 8 (Medium) — Sensitive Auth Logging:** Auth flows log truncated state/code values. Fix: replace with correlation IDs. + +#### mural-oauth Mode Extrapolation + +Six of eight tickets apply to mural-oauth with varying severity. + +**Direct apply (same fix):** Ticket 2 (open redirect — identical `startsWith('/')` bug in `isValidCallbackUrl`), Ticket 4 (security headers — same infrastructure, same gap), Ticket 6 (plaintext tokens in `muralOauthToken` — persists post-migration; shared abstraction in `mural-tokens.ts` means one fix covers both tables), Ticket 7 (code not scrubbed from history in `oauth-callback.tsx`). + +**Partially addressed:** Ticket 1 (CSRF) — substantially mitigated via HMAC-SHA256 state with nonce, timestamp, 10-minute TTL, and `crypto.timingSafeEqual`. Materially stronger than sso-proxy's "log and continue." Gap: state is not single-use (replayable within TTL). Mitigated by Mural's single-use codes; defense-in-depth calls for server-side nonce tracking. 
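The single-use gap called out above can be closed with a small amount of server-side state. A sketch of HMAC-verified state plus nonce consumption (the state layout, function names, and in-memory store are illustrative; production would use Redis/DB with a TTL matching the state's max age):

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Illustrative: constant-time HMAC check, TTL check, then single-use nonce
// tracking so a valid state cannot be replayed within its lifetime.
const STATE_MAX_AGE_MS = 10 * 60 * 1000;
const seenNonces = new Set<string>(); // stand-in for a Redis/DB set with TTL

function sign(payload: string, secret: string): string {
  return createHmac("sha256", secret).update(payload).digest("hex");
}

// Assumed state format: "<nonce>.<issuedAtMs>.<hmacHex>"
function verifyState(state: string, secret: string, now = Date.now()): boolean {
  const [nonce, ts, mac] = state.split(".");
  if (!nonce || !ts || !mac) return false;
  const expected = Buffer.from(sign(`${nonce}.${ts}`, secret), "hex");
  const given = Buffer.from(mac, "hex");
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) {
    return false; // constant-time signature comparison
  }
  if (now - Number(ts) > STATE_MAX_AGE_MS) return false; // expired
  if (seenNonces.has(nonce)) return false;               // replay: reject
  seenNonces.add(nonce);                                 // consume the nonce
  return true;
}
```

The first two checks mirror what mural-oauth already does; the final two lines are the defense-in-depth addition.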
+ +**Reduced applicability:** Ticket 5 — no `/auth/mural/claim` endpoint; token storage uses atomic `INSERT ... ON CONFLICT DO UPDATE`. Ticket 8 — doesn't log tokens/codes but logs `callbackURL` and passes `error_description` to client; clean up both. + +**Identical:** Ticket 3 applies equally — same Better Auth MCP plugin, removed from both modes simultaneously. + +#### mural-oauth Priority Assessment + +**Most urgent:** Tickets 2 (open redirect) and 6 (plaintext tokens). **Next:** Ticket 1 (add single-use nonce). **Quick wins:** Ticket 7 (one-line `history.replaceState()`). **Cleanup:** Ticket 8 (logging). **No fix needed:** Ticket 5 (atomic upsert sufficient). + +--- + +### Appendix B: IDE Client Support Details + +The RS model is the only auth model in the current MCP spec. PR #338 (April 2025) separated MCP servers from authorization servers; the 2025-11-25 revision made it normative: servers MUST implement PRM (RFC 9728), clients MUST use it for AS discovery. No spec-defined path remains for MCP servers as authorization servers. + +**Cursor** (v1.0+, June 2025): Full PRM discovery, follows `authorization_servers` links, OAuth 2.1 + PKCE. Known bug: `resource_metadata` URL from `WWW-Authenticate` header lost after redirect — only affects non-standard metadata paths, not `/.well-known/oauth-protected-resource`. + +**VS Code / Copilot** (v1.102, July 2025): Full PRM, OIDC Discovery, RFC 8414. Microsoft documents Entra ID integration with RS model. + +**Claude Desktop:** OAuth 2.1 with PRM over streamable HTTP. + +**Zed:** Bug #43162 — lacks remote MCP OAuth entirely. Cannot use Banksy's current AS model either; no regression from RS migration. + +**Continue.dev:** Enhancement #6282 — lacks remote MCP OAuth entirely. Same assessment as Zed. + +**Windsurf:** No clear remote OAuth PRM support. + +**TypeScript MCP SDK** (v1.27.1): Implements `discoverOAuthProtectedResourceMetadata()` with a known redirect bug (Issue #1234, fix in PR #1350). 
+ +**Python MCP SDK:** Open PR #982 for AS/RS separation. + +**Existing end-to-end examples:** Microsoft/Entra ID + VS Code, mcp-auth.dev reference implementations, Quarkus tutorial — pattern works end-to-end when serving PRM at standard well-known path. + +--- + +### Appendix C: Mural-as-IdP Detailed Blocker Analysis + +#### Blocker 1: HS256 Tokens with No JWKS + +Mural OAuth tokens are JWTs signed HS256 with a symmetric secret (`jwt.sign(claims, config.jwt.secret)` in `api/src/core/session/tokens/index.ts`). Validation is hardcoded to `algorithms: ['HS256']` in `api/src/security/jwt/index.ts`. No JWKS endpoint exists. No asymmetric keys. No issuer or audience claims in the token payload. + +Banksy cannot validate these tokens without possessing Mural's `config.jwt.secret` — a security boundary violation that would enable token forgery. The tokens are functionally opaque to any external validator (resolves Risk 1 from the FastMCP auth strategy plan). + +#### Blocker 2: No OAuth Discovery + +Mural serves no `/.well-known/oauth-authorization-server`, `/.well-known/openid-configuration`, or RFC 8414 metadata. DCR is not supported — clients are pre-registered. IDEs following PRM `authorization_servers` links would find nothing. Neither `RemoteAuthProvider` nor `OAuthProxy` works without a discovery endpoint. + +Auth endpoints exist (`api/src/api/authenticate/oauth2/authorization/`) but no metadata document aggregates them. + +#### Blocker 3: MCP Token Passthrough Prohibition + +The MCP spec states: "The MCP server MUST NOT pass through the token it received from the MCP client." Using Mural-as-IdP does exactly this. RFC 8707 audience binding makes the prohibition structural: + +- A token with `resource=https://banksy.example.com` is audience-bound to Banksy — Mural rejects it. +- A Mural-audience token fails Banksy's validation. +- No audience value satisfies both. 
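The structural conflict in the bullets above can be shown with a claims-level check (signature verification against the issuer's JWKS is omitted here; the URLs are illustrative):

```typescript
// Minimal claims-level audience check in the spirit of RFC 8707 resource
// indicators. Sketch only: a real validator also verifies signature, issuer,
// and expiry skew; this isolates the audience-binding logic.
interface Claims {
  iss: string; // issuer
  aud: string; // resource the token is audience-bound to
  exp: number; // expiry, seconds since epoch
}

function acceptedBy(resource: string, claims: Claims, nowMs = Date.now()): boolean {
  return claims.aud === resource && claims.exp * 1000 > nowMs;
}

const BANKSY = "https://banksy.example.com";
const MURAL_API = "https://app.mural.co";

// A token minted for Banksy passes Banksy's check and fails Mural's; swap the
// aud and the opposite holds. No single audience value satisfies both.
const ideToken: Claims = {
  iss: "https://idp.example.com",
  aud: BANKSY,
  exp: Math.floor(Date.now() / 1000) + 300,
};
```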
+ +The two-layer architecture (IdP JWT for L1, stored Mural tokens for L2) is the spec-compliant pattern. This blocker persists even if Mural resolves Blockers 1-2. + +#### Mural-OAuth Mode Implications + +The existence of `mural-oauth` mode doesn't help bypass these blockers. `OAuthProxy` delegates to the upstream IdP's JWKS (Mural has none). Mural-oauth's single-step UX depends on Banksy acting as an AS intermediary, which the RS migration eliminates. + +--- + +### Appendix D: Mural-OAuth Security Concerns Beyond Audit + +Concerns introduced by the mural-oauth mode beyond the eight audit tickets: + +**State replay window.** HMAC-signed state is replayable for 10 minutes — no server-side nonce storage or consumption tracking. Mural's single-use authorization codes limit practical impact, but the window is wider than necessary. Fix: store used nonces in Redis/DB with TTL matching `STATE_MAX_AGE_MS`, reject seen nonces. + +**Better Auth secret as CSRF root of trust.** State HMAC is keyed with `ctx.context.secret`. Compromise of this secret enables forging valid state for arbitrary callbacks (login CSRF). Not a vulnerability per se — any HMAC scheme depends on secret integrity — but makes this secret a higher-value target than in sso-proxy mode. + +**SPA error parameter reflection.** `oauth-callback.tsx` renders `error` and `error_description` from URL params. React's text escaping prevents XSS when rendered as text, but unsanitized values risk phishing if ever rendered as HTML. Fix: generic error to user, log `error_description` server-side only. + +**Scope breadth.** Default scopes (`murals:read`, `murals:write`, `workspaces:read`, `rooms:read`, `identity:read`, `templates:read`) are broad. Overridable via `MURAL_OAUTH_SCOPES` but no runtime validation against tool requirements. Audit for minimum privilege. + +**Client credentials in environment.** `MURAL_OAUTH_CLIENT_ID` and `MURAL_OAUTH_CLIENT_SECRET` as env vars. 
Handled appropriately (secret used only for token exchange and HMAC, client ID truncated in logs). Should be rotated periodically and stored in Azure Key Vault for production. + +--- + +### Appendix E: Mural Infrastructure Evolution — Per-Blocker Resolution + +"Build" alternative: Mural's platform team adds RS model infrastructure, enabling Mural as the IdP directly. + +#### Blocker 1 Fix: Asymmetric Signing + JWKS + +Migrate HS256 → RS256/ES256. The current implementation is deeply embedded: + +- `jwt.sign(claims, config.jwt.secret)` in `api/src/core/session/tokens/index.ts` +- Validation hardcoded to `algorithms: ['HS256']` in `api/src/security/jwt/index.ts` +- Multiple token types use separate HS256 secrets in `api/config/defaults.json` + +Requires: RSA/EC key pair management, update all sign/verify paths, expose JWKS endpoint, handle key rotation, transition period accepting both algorithms. + +#### Blocker 2 Fix: OAuth Discovery + +Serve `/.well-known/oauth-authorization-server` or `/.well-known/openid-configuration` (RFC 8414/OIDC). Auth endpoints exist (`api/src/api/authenticate/oauth2/authorization/`) but no metadata document aggregates them today. + +#### Blocker 3: Persists + +Even with JWKS + discovery, the passthrough prohibition prevents using IDE-presented Mural tokens for API calls. Layer 2 is still needed. However, RFC 8693 (Token Exchange) would let Banksy exchange the Layer 1 token for separate Layer 2 API tokens server-to-server — preserving single-step UX without passthrough. Mural doesn't implement RFC 8693 today. + +#### DCR + +Not supported by Mural. Would require `OAuthProxy` (workable but adds some AS surface). + +#### Assessment + +Covers all users, no vendor cost. But depends on Mural platform team prioritization — no evidence of JWKS/RS256/discovery work in mural-api. Significant engineering scope, uncertain timeline, outside Banksy team's control. 
Viable as a long-term ideal; pursue as a parallel conversation without blocking near-term migration on it. diff --git a/fastmcp-migration/security-audit-analysis.md b/fastmcp-migration/security-audit-analysis.md new file mode 100644 index 0000000..536ff57 --- /dev/null +++ b/fastmcp-migration/security-audit-analysis.md @@ -0,0 +1,297 @@ +# Security Audit Analysis: Banksy Auth Architecture + +## Table of Contents + +- [Background](#background) +- [The Audit's Central Recommendation: Banksy as a Resource Server](#the-audits-central-recommendation-banksy-as-a-resource-server) + - [What This Changes and What It Doesn't](#what-this-changes-and-what-it-doesnt) + - [Open Questions](#open-questions) +- [Ticket-by-Ticket Assessment](#ticket-by-ticket-assessment) + - [Tickets Safe to Implement Now](#tickets-safe-to-implement-now) + - [The Migration Ticket](#the-migration-ticket) +- [Mural-OAuth Mode: Security and Migration Analysis](#mural-oauth-mode-security-and-migration-analysis) + - [How mural-oauth Differs from sso-proxy](#how-mural-oauth-differs-from-sso-proxy) + - [Security Posture: Ticket Extrapolation Summary](#security-posture-ticket-extrapolation-summary) + - [New Security Concerns](#new-security-concerns) + - [Resource Server Migration: Impact on mural-oauth](#resource-server-migration-impact-on-mural-oauth) + - [Practical Assessment](#practical-assessment) +- [Outstanding Research](#outstanding-research) + - [IDE Support for Protected Resource Metadata (Resolved)](#ide-support-for-protected-resource-metadata-resolved) + - [External IdP Selection](#external-idp-selection) + - [The Non-Enterprise User Gap](#the-non-enterprise-user-gap) + - [Resolution Option 1: Dedicated IdP with Mural as a Custom Social Connection](#resolution-option-1-dedicated-idp-with-mural-as-a-custom-social-connection) + - [Resolution Option 2: Mural Evolves Its OAuth Infrastructure](#resolution-option-2-mural-evolves-its-oauth-infrastructure) + - [Resolution Option 3: Dual Auth 
Architecture](#resolution-option-3-dual-auth-architecture) + - [Resolution Synthesis](#resolution-synthesis) +- [Relationship to the FastMCP Migration](#relationship-to-the-fastmcp-migration) +- [Summary of Recommended Sequencing](#summary-of-recommended-sequencing) + +## Background + +A security audit in early 2026 produced eight tickets (critical, high, medium) and a design preamble recommending Banksy adopt an OAuth Resource Server posture under FastMCP. This document analyzes those findings and the research completed to inform migration direction. + +Banksy supports auth modes `sso-proxy`, `mural-oauth`, and `m2m` (configured via `AUTH_MODE`, one per deployment). The audit targeted `sso-proxy` only — `mural-oauth` was introduced afterward. This document covers both: the original audit analysis and an extrapolation to mural-oauth. + +Banksy's auth is two stacked OAuth-like layers: + +- **Layer 1 (IDE → Banksy):** In sso-proxy, MCP OAuth + Google OAuth (via SSO proxy) establishes a session and issues MCP tokens. In mural-oauth, the MCP flow redirects to Mural's consent page — Mural serves as both IdP and API token source, collapsing two layers into one user-facing flow. +- **Layer 2 (Banksy → Mural):** In sso-proxy, a session-activation code/nonce pattern lets the browser perform Mural OAuth; Banksy claims tokens (code + server-only nonce) for server-side use. In mural-oauth, Layer 2 is embedded in the Layer 1 flow — the authorization code grant with Mural yields identity (via `/api/public/v1/users/me`) and API tokens (access + refresh) stored for Mural API calls. + +Each ticket's recommendations fall into two categories: "fix in current flow" (harden without changing the architecture) and "FastMCP design direction" (assumes the migration). The two carry different risk profiles and timelines. 
+ +--- + +## The Audit's Central Recommendation: Banksy as a Resource Server + +The audit recommends Banksy stop operating as an OAuth **Authorization Server** (AS) — issuing tokens — and instead become a **Resource Server** (RS) that validates tokens issued by an external IdP. ("Resource server" is OAuth 2.0 terminology, unrelated to MCP Resources.) + +Today Banksy is an AS: Better Auth's MCP plugin serves `/.well-known/oauth-authorization-server`, handles DCR, and issues MCP tokens. As an RS, Banksy would serve `/.well-known/oauth-protected-resource` (RFC 9728) pointing IDEs to an external IdP. The IDE authenticates with the IdP, gets a JWT, and presents it on every MCP request. Banksy validates signature (via JWKS), issuer, audience, and expiration, then authorizes tool calls based on claims and scopes. + +The rationale: running an AS exposes confused-deputy attacks, DCR abuse, and authorization code interception. Eliminating the AS eliminates those surfaces. Ticket 3 is the concrete expression — remove Better Auth's MCP plugin and `/api/auth/mcp/*`. + +### What This Changes and What It Doesn't + +The RS model only affects Layer 1 (IDE → Banksy authentication). Layer 2 is unchanged: Banksy still stores and uses Mural API tokens. The IDE-presented JWT answers "is this a legitimate user?" but doesn't provide Mural tokens. + +This two-layer separation is inherent, not a consequence of IdP choice. Using Mural as the IdP would eliminate Layer 2 but is not viable (see three blockers in External IdP Selection). This applies to mural-oauth as well — despite combining both layers today, the RS migration causes the two modes to converge (see Resource Server Migration: Impact on mural-oauth). + +Post-migration flow (both modes): + +1. **One-time setup (browser):** User authenticates with external IdP (Layer 1), completes Mural OAuth (Layer 2). Banksy stores Mural tokens. +2. **Every MCP request (IDE):** IDE presents an IdP-issued JWT. 
Banksy validates it, looks up Mural tokens in Postgres, executes the tool call, returns the result. + +### Open Questions + +Three questions the audit doesn't answer. The first and third are resolved; the second remains open. + +**Who is the external IdP? (Resolved — see External IdP Selection)** Google ID tokens are JWTs but lack custom audiences/scopes. Mural's tokens are functionally opaque to external validators (resolves Risk 1 from the FastMCP auth strategy plan). A dedicated IdP (Auth0, Azure AD) gives full control but adds infrastructure. Full assessment in External IdP Selection. + +**Does the IdP cover all Mural user segments? (Open — see Non-Enterprise User Gap)** Fails for Mural self-serve users (email/password). Google excludes non-Google users; a standalone dedicated IdP requires a new account. Resolution: dedicated IdP with Mural as a custom upstream OAuth provider — depends on PoC validation. Provider evaluation scoped separately (`banksy/.cursor/prompts/research-auth-provider-alternatives.md`). + +**Do IDEs support PRM? (Resolved — see IDE Support for PRM)** The 2025-11-25 MCP spec makes PRM mandatory. Cursor, VS Code (Copilot), and Claude Desktop all support it. Zed and Continue.dev lack remote MCP OAuth entirely but can't use Banksy's current AS model either — no regression. + +--- + +## Ticket-by-Ticket Assessment + +### Tickets Safe to Implement Now + +Seven of eight tickets have "fix in current flow" recommendations safe to ship independently of FastMCP. + +**Ticket 1 (Critical) — OAuth Login CSRF:** sso-proxy's Google OAuth callback logs a warning on state validation failure but continues to create a session. Fix: generate crypto-secure state server-side, store with TTL, reject on mismatch/expiration, enforce single-use. *Mural-oauth:* Substantially addressed — uses HMAC-SHA256 state with `crypto.timingSafeEqual` and 10-minute TTL, a major improvement. Gap: state is not single-use (replayable within TTL window). 
Mitigated by Mural's single-use authorization codes, but defense-in-depth calls for server-side nonce tracking. + +**Ticket 2 (Critical) — Open Redirect:** `url.startsWith('/')` accepts protocol-relative URLs like `//evil.com`. Fix: parse with `new URL(url, baseOrigin)`, require `u.origin === baseOrigin`. *Mural-oauth:* Same vulnerability in `isValidCallbackUrl` — the `startsWith('/')` early return bypasses the origin comparison branch. Same fix. + +**Ticket 4 (High) — Security Headers:** Auth pages lack CSP, HSTS, Referrer-Policy, anti-framing headers. Purely additive (Azure Front Door or middleware). *Mural-oauth:* Applies equally — same infrastructure, same gap. + +**Ticket 5 (High) — Mural Claim Race Condition:** `/auth/mural/claim` reads, claims, saves, deletes in non-atomic steps — concurrent claims can both succeed. Fix: atomic compare-and-set with `claimedAt`/`status` column. *Mural-oauth:* No `/auth/mural/claim` endpoint. Token storage uses `INSERT ... ON CONFLICT DO UPDATE` (atomic) — significantly safer. Lower risk, verify under concurrent load. + +**Ticket 6 (High) — Plaintext Refresh Tokens:** Mural tokens stored unencrypted in Postgres. Fix: envelope encryption with Azure Key Vault (transparent encrypt-on-write, decrypt-on-read). *Mural-oauth:* `muralOauthToken` stores tokens as plain text, same as `muralSessionToken`. Shared abstraction in `mural-tokens.ts` means one fix covers both tables. + +**Ticket 7 (Medium) — OAuth Codes in URL History:** SPA doesn't call `history.replaceState()` after reading the authorization code. One-line fix. *Mural-oauth:* Same gap in `oauth-callback.tsx`. + +**Ticket 8 (Medium) — Sensitive Auth Logging:** Auth flows log truncated state/code values. Fix: replace with correlation IDs. *Mural-oauth:* Less severe (doesn't log tokens or codes) but logs full `callbackURL` and passes `error_description` to client. Fix: hash or boolean for callback URLs, generic error to browser, `error_description` server-side only. 
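The Ticket 2 fix above can be sketched in Python (the migration target language); the function name and origins are illustrative, and the actual fix belongs in the TypeScript `isValidCallbackUrl`:

```python
from urllib.parse import urljoin, urlsplit

def is_safe_callback_url(url: str, base_origin: str) -> bool:
    """Accept only URLs that resolve to base_origin.

    Resolving against the base first is what catches protocol-relative
    inputs like '//evil.com', which a bare startsWith('/') check admits.
    """
    resolved = urlsplit(urljoin(base_origin, url))
    base = urlsplit(base_origin)
    return (resolved.scheme, resolved.netloc) == (base.scheme, base.netloc)
```

A relative path like `/dashboard` resolves inside the base origin and passes; `//evil.com` resolves to `https://evil.com` and is rejected.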
+ +### The Migration Ticket + +**Ticket 3 (High) — Remove Legacy MCP OAuth Surface:** The only ticket that breaks existing behavior — remove Better Auth's MCP plugin, `/api/auth/mcp/*`, and `/.well-known/oauth-authorization-server`. This is the migration itself, scoped as a ticket. IDE compatibility confirmed for Cursor, VS Code, Claude Desktop. Prerequisites: select external IdP, implement replacement via `RemoteAuthProvider` or `OAuthProxy`. Execute only after replacement is deployed and tested. *Mural-oauth:* Applies equally — same plugin, removed from both modes simultaneously. + +--- + +## Mural-OAuth Mode: Security and Migration Analysis + +### How mural-oauth Differs from sso-proxy + +Introduced after the audit. The MCP OAuth flow redirects to Mural's consent page; after authorization, an SPA exchanges the code for Mural OAuth tokens (access + refresh). The callback plugin fetches user info from Mural's public API, creates/finds a Better Auth user, stores tokens in Postgres, and completes the MCP flow. + +Key difference from sso-proxy: identity and API access come from the same source (Mural) via a single OAuth flow, vs. sso-proxy's two-step flow (Google sign-in + Mural session activation). Uses the Mural Public API (`/api/public/v1/`) rather than internal content endpoints. Tokens stored in `muralOauthToken` (vs. `muralSessionToken`), token type `oauth` (vs. `session`). + +Despite appearing to collapse the two layers, Better Auth's MCP plugin still serves as the AS (`/.well-known/oauth-authorization-server`), and the Mural OAuth flow is embedded within the MCP flow. This matters for the RS migration. + +### Security Posture: Ticket Extrapolation Summary + +Six of eight tickets apply to mural-oauth with varying severity. 
+ +**Direct apply (same fix):** Ticket 2 (open redirect — identical `startsWith('/')` bug), Ticket 4 (security headers), Ticket 6 (plaintext tokens in `muralOauthToken` — persists post-migration), Ticket 7 (code not scrubbed from history in `oauth-callback.tsx`). + +**Partially addressed:** Ticket 1 (CSRF) — substantially mitigated via HMAC-SHA256 state with nonce, timestamp, 10-minute TTL, and `crypto.timingSafeEqual`. Materially stronger than sso-proxy's "log and continue." Gap: state is not single-use (replayable within TTL). Mitigated by Mural's single-use codes; defense-in-depth calls for server-side nonce tracking. + +**Reduced applicability:** Ticket 5 — no `/auth/mural/claim` endpoint; token storage uses atomic `INSERT ... ON CONFLICT DO UPDATE`. Ticket 8 — doesn't log tokens/codes but logs `callbackURL` and passes `error_description` to client; clean up both. + +**Identical:** Ticket 3 applies equally — same Better Auth MCP plugin. + +### New Security Concerns + +Concerns introduced by mural-oauth beyond the eight audit tickets: + +**State replay window.** HMAC-signed state is replayable for 10 minutes — no server-side nonce storage or consumption tracking. Mural's single-use codes limit practical impact, but the window is wider than necessary. Fix: store used nonces in Redis/DB with TTL matching `STATE_MAX_AGE_MS`, reject seen nonces. + +**Better Auth secret as CSRF root of trust.** State HMAC is keyed with `ctx.context.secret`. Compromise enables forging valid state for arbitrary callbacks (login CSRF). Not a vulnerability per se — any HMAC scheme depends on secret integrity — but makes this secret a higher-value target than in sso-proxy mode. + +**SPA error parameter reflection.** `oauth-callback.tsx` renders `error` and `error_description` from URL params. React's text escaping prevents XSS when rendered as text, but unsanitized values risk phishing if ever rendered as HTML. Fix: generic error to user, log `error_description` server-side only. 
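The server-side nonce tracking called for above (the state replay fix) can be sketched as a small registry. This is an in-memory stand-in for the suggested Redis/DB store, with illustrative names:

```python
import time

class NonceRegistry:
    """Single-use nonce tracking with TTL.

    In production this would be Redis (SET key NX EX ttl); a plain dict
    illustrates the consume-once semantics.
    """

    def __init__(self, ttl_seconds: float) -> None:
        self._ttl = ttl_seconds
        self._seen: dict[str, float] = {}

    def consume(self, nonce: str) -> bool:
        """True the first time a nonce is presented within its TTL; False on replay."""
        now = time.monotonic()
        # Evict expired nonces so the registry stays bounded.
        self._seen = {n: t for n, t in self._seen.items() if now - t < self._ttl}
        if nonce in self._seen:
            return False
        self._seen[nonce] = now
        return True
```

On state validation, the server would call `consume(nonce)` after the HMAC check and reject the request when it returns `False`.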
+ +**Scope breadth.** Default scopes (`murals:read`, `murals:write`, `workspaces:read`, `rooms:read`, `identity:read`, `templates:read`) are broad. Overridable via `MURAL_OAUTH_SCOPES` but no runtime validation against tool requirements. Audit for minimum privilege. + +**Client credentials in environment.** `MURAL_OAUTH_CLIENT_ID` and `MURAL_OAUTH_CLIENT_SECRET` as env vars. Handled appropriately (secret used only for token exchange and HMAC, client ID truncated in logs). Should be rotated periodically and stored in Azure Key Vault for production. + +### Resource Server Migration: Impact on mural-oauth + +The RS migration causes mural-oauth and sso-proxy to converge architecturally — the most significant finding of this analysis. + +Today mural-oauth provides single-step UX: one Mural consent yields identity + API access. Under RS, Layer 1 requires an external IdP with JWKS-validatable tokens. Mural doesn't qualify (three blockers in External IdP Selection apply regardless of mode). + +Post-migration, mural-oauth loses single-step UX — unless the external IdP supports upstream token storage. With a dedicated IdP configured with Mural as a custom social connection (e.g., Auth0's Token Vault), the IdP stores Mural tokens during social login and Banksy retrieves them server-to-server. One step yields both IdP JWT (Layer 1) and Mural tokens (Layer 2). See Non-Enterprise User Gap for spec compliance and limitations. + +Without token storage: users authenticate with the external IdP (Layer 1), then separately complete Mural OAuth in the browser (Layer 2) — structurally identical to sso-proxy. The `mural-oauth.ts` token exchange/storage logic persists as the Layer 2 mechanism; Better Auth's MCP plugin is replaced by PRM + external IdP. 
+ +| Aspect | Current mural-oauth | Post-migration (without token storage) | Post-migration (with token storage) | +|---|---|---|---| +| Layer 1 (IDE auth) | Better Auth MCP plugin (AS) | External IdP via PRM (RS) | External IdP via PRM (RS) | +| Layer 2 (Mural API) | Mural OAuth (same flow as L1) | Mural OAuth (separate browser step) | IdP retrieves Mural tokens during social login | +| User steps | 1 (Mural consent) | 2 (IdP auth + Mural connect) | 1 (Mural consent via IdP) | +| Token storage | Needed (`muralOauthToken`) | Still needed (Banksy stores) | IdP stores upstream tokens; Banksy retrieves | +| SPA callback | Needed for OAuth callback | Still needed for Layer 2 | Not needed (no separate browser step) | +| `/.well-known` | `oauth-authorization-server` | `oauth-protected-resource` | `oauth-protected-resource` | + +The convergence means a single target architecture regardless of starting mode. Same external IdP, same FastMCP auth class, same PRM config. The mural-oauth token exchange/storage becomes the universal Layer 2 mechanism — its use of Mural Public API and standard OAuth tokens makes it the more portable implementation. + +### Practical Assessment + +**Mural-oauth fixes needed:** Tickets 2 (open redirect) and 6 (plaintext tokens) most urgent. Ticket 1: add single-use nonce. Ticket 7: one-line `history.replaceState()`. Ticket 8: logging cleanup. Ticket 5: no fix needed (atomic upsert sufficient). + +**New hardening:** Server-side nonce tracking for state, generic error messages (not `error_description`), scope audit for minimum privilege. + +**Migration:** Same target architecture for both modes. Subsumes Ticket 3. Mural-oauth token exchange/storage survives as Layer 2. Fix tickets first — vulnerabilities exist in production now, token encryption persists post-migration, and migration timeline is uncertain. 
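The atomic upsert credited above (Ticket 5: no fix needed) works because insert-or-update happens in a single statement, so two concurrent writers cannot interleave a read with a stale write. A stdlib sqlite3 demonstration with an illustrative schema (the real table is `muralOauthToken` in Postgres, where the same `ON CONFLICT` syntax applies):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mural_tokens ("
    "user_id TEXT PRIMARY KEY, access_token TEXT, refresh_token TEXT)"
)

def store_tokens(user_id: str, access: str, refresh: str) -> None:
    # One atomic statement: no read-modify-write window for a race.
    conn.execute(
        """
        INSERT INTO mural_tokens (user_id, access_token, refresh_token)
        VALUES (?, ?, ?)
        ON CONFLICT(user_id) DO UPDATE SET
            access_token = excluded.access_token,
            refresh_token = excluded.refresh_token
        """,
        (user_id, access, refresh),
    )
    conn.commit()
```

Two writes for the same user leave exactly one row holding the last writer's tokens.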
+ +--- + +## Outstanding Research + +### IDE Support for Protected Resource Metadata (Resolved) + +The RS model is the only auth model in the current MCP spec. PR #338 (April 2025) separated MCP servers from authorization servers; the 2025-11-25 revision made it normative: servers MUST implement PRM (RFC 9728), clients MUST use it for AS discovery. No spec-defined path remains for MCP servers as authorization servers. Banksy's current AS model works only because IDEs still attempt legacy discovery as fallback — this will erode. + +**IDE support:** All three major clients support PRM today: +- **Cursor** (v1.0+, June 2025): Full PRM discovery, follows `authorization_servers` links, OAuth 2.1 + PKCE. Known bug: `resource_metadata` URL from `WWW-Authenticate` header lost after redirect — only affects non-standard metadata paths, not `/.well-known/oauth-protected-resource`. +- **VS Code / Copilot** (v1.102, July 2025): Full PRM, OIDC Discovery, RFC 8414. Microsoft documents Entra ID integration with RS model. +- **Claude Desktop:** OAuth 2.1 with PRM over streamable HTTP. + +**Unsupported clients:** Zed (bug #43162) and Continue.dev (enhancement #6282) lack remote MCP OAuth entirely — can't use Banksy's current AS model either, no regression. Windsurf has no clear remote OAuth PRM support. + +**SDKs:** TypeScript MCP SDK (v1.27.1) implements `discoverOAuthProtectedResourceMetadata()` with a known redirect bug (Issue #1234, fix in PR #1350). Python SDK has open PR #982 for AS/RS separation. + +**Critical: FastMCP auth classes.** Bare `JWTVerifier` does not serve PRM — IDEs would have no discovery metadata. Must use `RemoteAuthProvider` (DCR-capable IdPs) or `OAuthProxy` (non-DCR IdPs). Both serve PRM and produce spec-compliant resource servers. See External IdP Selection. + +**Existing examples:** Microsoft/Entra ID + VS Code, mcp-auth.dev reference implementations, Quarkus tutorial — pattern works end-to-end when serving PRM at standard well-known path. 
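For orientation, a minimal RFC 9728 document of the kind a resource server publishes at `/.well-known/oauth-protected-resource` (all URLs and scopes here are illustrative, not Banksy's actual configuration):

```json
{
  "resource": "https://banksy.example.com",
  "authorization_servers": ["https://idp.example.com"],
  "bearer_methods_supported": ["header"],
  "scopes_supported": ["mcp:tools"]
}
```

IDEs fetch this, follow `authorization_servers` to the IdP's own metadata, and run OAuth there.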
+ +**Verdict:** RS model viable today for the three major IDEs. Requirements: serve PRM at `/.well-known/oauth-protected-resource`, use `RemoteAuthProvider` or `OAuthProxy` (not bare `JWTVerifier`). + +### External IdP Selection + +Primary remaining architectural decision. Determines JWT validation config and FastMCP auth class. + +**`RemoteAuthProvider`** requires DCR (RFC 7591) — IDEs auto-register with the IdP. Supported by WorkOS AuthKit, Descope, Auth0 (if configured). Composes `JWTVerifier` + automatic PRM endpoints = pure RS with no AS surface. + +**`OAuthProxy`** bridges non-DCR IdPs (Google, Azure AD, GitHub) by presenting a DCR interface to IDEs while holding pre-registered upstream credentials. Reintroduces some AS surface (DCR registrations, proxied tokens) but far less than Better Auth, and token validation is still delegated to the IdP. + +Layer 2 is unchanged regardless of IdP choice (see What This Changes). The only candidate that could collapse both layers is Mural-as-IdP, assessed below. + +**Google:** Already used for sso-proxy Layer 1. ID tokens are JWTs validatable via JWKS, but no custom audiences/scopes (audience = client ID, scopes limited to openid/email/profile). No DCR → requires `OAuthProxy`. Sufficient for "is this a legitimate Google user?" but no fine-grained MCP scope control. + +**Mural OAuth:** Investigated as a layer-collapsing candidate (resolves Risk 1 from FastMCP auth strategy plan). Three independent blockers, each individually fatal: + +*Blocker 1: HS256 tokens with no JWKS.* Mural OAuth tokens are JWTs signed HS256 with a symmetric secret (`jwt.sign(claims, config.jwt.secret)`). No JWKS endpoint, no asymmetric keys, no issuer/audience claims. Banksy can't validate without Mural's `config.jwt.secret` — a security boundary violation enabling token forgery. Functionally opaque to external validators. 
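To make Blocker 1 concrete: HS256 verification uses the same symmetric secret as signing, so any party that can validate can also forge. A stdlib sketch (secret and claims are hypothetical; Mural's actual signing lives in mural-api):

```python
import base64
import hashlib
import hmac
import json

def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_hs256(claims: dict, secret: bytes) -> str:
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    sig = _b64url(hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"

def verify_hs256(token: str, secret: bytes) -> bool:
    # Verification recomputes the signature, i.e. it *requires* the signing secret.
    header, payload, sig = token.split(".")
    expected = _b64url(hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest())
    return hmac.compare_digest(sig, expected)
```

Handing Banksy the secret so it could call `verify_hs256` would equally let it call `sign_hs256`, which is the boundary violation. RS256 splits the roles: sign with a private key, verify with the public JWKS key.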
+ +*Blocker 2: No OAuth discovery.* Mural serves no `/.well-known/oauth-authorization-server`, `/.well-known/openid-configuration`, or RFC 8414 metadata. No DCR — clients are pre-registered. IDEs following PRM `authorization_servers` links would find nothing. Neither `RemoteAuthProvider` nor `OAuthProxy` works without a discovery endpoint. + +*Blocker 3: MCP token passthrough prohibition.* Even with JWKS and discovery, the spec states: "The MCP server MUST NOT pass through the token it received from the MCP client." Mural-as-IdP does exactly this. RFC 8707 audience binding makes it structural: a token with `resource=https://banksy.example.com` is audience-bound to Banksy (Mural rejects it); a Mural-audience token fails Banksy's validation. No audience satisfies both. The two-layer architecture (IdP JWT for L1, stored Mural tokens for L2) is the spec-compliant pattern. + +**Verdict:** Mural-as-IdP is not viable. Blockers 1-2 require Mural infrastructure changes; Blocker 3 is a fundamental MCP constraint. The existence of `mural-oauth` mode doesn't help — `OAuthProxy` delegates to the upstream IdP's JWKS (Mural has none). Mural-oauth's single-step UX depends on Banksy acting as an AS intermediary, which the RS migration eliminates. + +**Dedicated IdP** (Auth0, Azure AD/Entra ID, WorkOS, Descope): Full control over JWT shape, JWKS, custom audiences/scopes, token lifetimes. Auth0, WorkOS, Descope support DCR → `RemoteAuthProvider` (pure RS). Azure AD lacks DCR → `OAuthProxy`. Adds infrastructure but eliminates Google's format limitations and enables Banksy-specific scopes (`mcp:tools`, `mural:read`, `mural:write`). + +The choice cascades into JWT validation config, scope design, user account linking, and operational burden. DCR + `RemoteAuthProvider` is architecturally cleanest (highest setup cost). Google + `OAuthProxy` is lowest friction (no scope control). But the more fundamental implication: IdP choice determines which Mural user segments can use Banksy at all. 
+ +### The Non-Enterprise User Gap + +The IdP assessment assumes users have accounts with the external provider. This holds for enterprise SSO and Google users but fails for self-serve Mural users (email/password, individual/team plans). + +Mural supports five auth methods: email/password (`api/src/api/session/signin.ts`), Google social (`data/src/data/models/idp/providers/google.ts`), Microsoft social (`data/src/data/models/idp/providers/microsoft.ts`), SAML SSO (`api/src/api/authenticate/saml2/`), OAuth2 SSO (`api/src/api/authenticate/oauth2/authorization/`). The mural-oauth mode reaches all segments via Mural's consent page. The RS migration would regress this. + +**Google as IdP:** Excludes email/password users, Microsoft social users, non-Google enterprise SSO users — likely a majority of Mural's user base. Severe regression. + +**Dedicated IdP without custom connections:** No Mural user has a pre-existing account. Requires new signup, decouples identity from Mural, creates user-mapping problems. + +**Mural-as-IdP:** Covers all users but blocked by the three technical issues above. + +This is an access issue. Under RS with Google or a standard dedicated IdP, an entire class of users loses access to Banksy. + +#### Resolution Option 1: Dedicated IdP with Mural as a Custom Social Connection + +Most promising: a dedicated IdP (Auth0, Descope) configured with Mural as a custom upstream OAuth provider. The IdP provides protocol infrastructure (JWKS, discovery, DCR, RS256 JWTs); Mural handles actual authentication. Auth0 and Descope support custom social connections (manually configure upstream auth/token/userinfo URLs — no upstream discovery needed). + +Flow: +1. IDE discovers Banksy's PRM → follows `authorization_servers` to dedicated IdP +2. IdP redirects to Mural (configurable to skip own login page via "Home Realm Discovery") — user sees only Mural's consent page +3. User authenticates with Mural (any method) and authorizes Banksy +4. 
Mural redirects to IdP with authorization code +5. IdP exchanges code for Mural tokens, fetches user info, issues RS256 JWT +6. IDE presents IdP JWT to Banksy → validates via JWKS + +Addresses Blockers 1-2 (IdP issues proper JWTs with JWKS + discovery), sidesteps Blocker 3 (token is IdP-issued, no passthrough). Covers all Mural user segments. + +**Token capture for Layer 2.** During step 5, the IdP obtains Mural tokens. Auth0's Token Vault stores these upstream tokens; Banksy retrieves them server-to-server (federated connection access token exchange). If this works for custom social connections (needs PoC), it eliminates the separate "Mural connect" browser step — single-step UX preserved. + +**Spec compliance.** The Mural token never comes from the MCP client. IDE sends IdP JWT (audience-bound to Banksy via RFC 8707); Banksy retrieves Mural tokens from IdP's store via server-to-server channel. Identical to current pattern (Banksy stores Mural tokens in Postgres after browser OAuth). The spec's passthrough prohibition targets client-to-upstream forwarding, not this pattern. + +**Limitations.** Auth0 Token Vault is Enterprise-only (unpublished pricing, requires sales). Custom Token Exchange in Early Access (2026). Needs PoC: refresh token storage + automatic refresh for custom connections. WorkOS doesn't support custom upstream OAuth (predefined providers only). Descope supports custom OAuth + DCR — evaluate alongside Auth0. Full provider evaluation: `banksy/.cursor/prompts/research-auth-provider-alternatives.md`. + +Without token storage, the dedicated IdP still solves user coverage. Layer 2 reverts to separate browser Mural OAuth — minor UX regression (two steps), not an access regression. + +#### Resolution Option 2: Mural Evolves Its OAuth Infrastructure + +"Build" alternative: Mural's platform team adds RS model infrastructure, enabling Mural as the IdP directly. + +*Blocker 1 fix: asymmetric signing + JWKS.* Migrate HS256 → RS256/ES256. 
Deeply embedded: `jwt.sign(claims, config.jwt.secret)` in `api/src/core/session/tokens/index.ts`, validation hardcoded to `algorithms: ['HS256']` in `api/src/security/jwt/index.ts`. Multiple token types use separate HS256 secrets in `api/config/defaults.json`. Requires: RSA/EC key pair management, update all sign/verify paths, expose JWKS endpoint, handle key rotation, transition period accepting both algorithms. + +*Blocker 2 fix: OAuth discovery.* Serve `/.well-known/oauth-authorization-server` or `/.well-known/openid-configuration` (RFC 8414/OIDC). Auth endpoints exist (`api/src/api/authenticate/oauth2/authorization/`) but no metadata document today. + +*Blocker 3: persists.* Even with JWKS + discovery, passthrough prohibition prevents using IDE-presented Mural tokens for API calls. Layer 2 still needed. However, RFC 8693 (Token Exchange) would let Banksy exchange the Layer 1 token for separate Layer 2 API tokens server-to-server — preserving single-step UX without passthrough. Mural doesn't implement RFC 8693 today. + +*DCR:* Not supported. Would require `OAuthProxy` (workable but adds some AS surface). + +Covers all users, no vendor cost. But depends on Mural platform team prioritization — no evidence of JWKS/RS256/discovery work in mural-api. Significant engineering scope, uncertain timeline, outside Banksy team's control. + +#### Resolution Option 3: Dual Auth Architecture + +Pragmatic fallback: AS model for non-enterprise deployments, RS model for enterprise. Per-instance compliant (each serves one model) but creates two codepaths, two security surfaces, two token validation mechanisms. Ticket 3 only applies to enterprise, leaving AS attack surface active elsewhere. + +Sustainable as a transitional state. Not viable long-term: (a) AS model becomes non-compliant as IDEs drop legacy fallbacks, (b) doubles security surface area, (c) audit recommends eliminating AS. 
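For reference, the RFC 8693 exchange described in Option 2 is an ordinary form-encoded POST from Banksy to a (hypothetical) Mural token endpoint; parameter names come from the RFC, all values are placeholders:

```python
from urllib.parse import urlencode

# RFC 8693 token-exchange request body. Banksy would POST this
# server-to-server; the endpoint does not exist in Mural today.
body = urlencode({
    "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
    "subject_token": "<layer-1-access-token>",
    "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
    "audience": "https://api.mural.example.com",  # requested Layer 2 audience
})
```

The response would carry a fresh, Mural-audience access token, so the IDE-presented token is never forwarded upstream.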
+ +#### Resolution Synthesis + +**Recommended:** Dedicated IdP with Mural as custom social connection. Only option that covers all users, is spec-compliant, doesn't depend on Mural platform team, and is implementable today. With upstream token storage: single-step UX. Without: two steps, but no one excluded. + +**Long-term ideal:** Mural infrastructure evolution (no vendor dependency) — but uncertain timeline, outside Banksy's control. Pursue as parallel conversation; don't block on it. + +**Fallback:** Dual auth as transitional state only, not target architecture. + +**Next step:** PoC validating (a) custom social connection flow with Mural upstream, (b) upstream token storage for custom connections, (c) end-to-end with Cursor and VS Code. Provider evaluation (Auth0, Descope, others) with pricing: `banksy/.cursor/prompts/research-auth-provider-alternatives.md`. + +--- + +## Relationship to the FastMCP Migration + +Related but separable. "Fix in current flow" items are independent hardening regardless of migration timeline. "FastMCP design direction" (Ticket 3) is migration planning. + +Overlap with existing FastMCP auth strategy plan (`fastmcp_auth_strategy_f355d421.plan.md`): Risk 1 (Mural token format) resolved — HS256 JWTs with no JWKS, ruling out Mural-as-IdP. Risks 2 and 4 (`get_access_token()` return value, token refresh lifecycle) remain open. New dimension: `OAuthProxy` vs `RemoteAuthProvider` choice depends on IdP DCR capability. IDE compatibility confirmed (Cursor, VS Code, Claude Desktop). Remaining decision: IdP selection. + +--- + +## Summary of Recommended Sequencing + +1. **Now (both modes):** Tickets 1, 2, 4, 6, 7, 8 as backward-compatible hardening. Ticket 5: fix sso-proxy's claim endpoint (mural-oauth's atomic upsert sufficient). Mural-oauth extras: server-side nonce tracking (Ticket 1 completion), generic errors (not `error_description`), scope audit. +2. 
**Auth provider evaluation + PoC:** IdP must support custom upstream OAuth with Mural (otherwise email/password and non-Google users excluded). Evaluate Auth0, Descope, others (`banksy/.cursor/prompts/research-auth-provider-alternatives.md`) for fit, pricing, complexity. PoC: (a) custom social connection with Mural upstream, (b) upstream token storage for custom connections, (c) end-to-end with Cursor and VS Code. In parallel: Mural platform team conversation re JWKS, discovery, RFC 8693. +3. **IdP decision:** Select provider based on PoC. DCR-capable (Auth0, Descope) → `RemoteAuthProvider`; non-DCR → `OAuthProxy`. Both modes converge on same architecture. +4. **Migration:** Implement FastMCP RS with chosen provider. Mural-oauth token exchange/storage becomes universal Layer 2. With upstream token storage: single-step UX. Validate end-to-end (PRM → IdP auth → token validation → Mural connect → tool invocation) with three major IDEs. +5. **Ticket 3:** Remove Better Auth MCP plugin and `/api/auth/mcp/*` only after replacement is deployed and proven. From b72a9124567863c94e2a7d13a2fa36f642cfc75e Mon Sep 17 00:00:00 2001 From: Willis Kirkham Date: Mon, 16 Mar 2026 17:19:10 -0700 Subject: [PATCH 2/4] Updates plan with AUTH_MODE naming and living document instructions Reverts agent-proposed BANKSY_MODE (internal/public/dev) to existing AUTH_MODE (sso-proxy/mural-oauth/dev) for migration simplicity. Adds "How to use this plan" section establishing the document as a living plan that should be revised in-place during implementation. 
--- .../00-migration-execution-strategy.md | 65 +++++++++++-------- 1 file changed, 38 insertions(+), 27 deletions(-) diff --git a/fastmcp-migration/execution-strategy-research/00-migration-execution-strategy.md b/fastmcp-migration/execution-strategy-research/00-migration-execution-strategy.md index d6daa3e..a2bf3a1 100644 --- a/fastmcp-migration/execution-strategy-research/00-migration-execution-strategy.md +++ b/fastmcp-migration/execution-strategy-research/00-migration-execution-strategy.md @@ -1,9 +1,18 @@ # Banksy xmcp-to-FastMCP Migration +## How to use this plan + +This is a **living document**. It reflects the intended final state of the migration and should be updated in-place as implementation reveals better approaches, new constraints, or scope changes. + +- When deviating from the plan during implementation, **update the plan first** before proceeding. The plan should always describe what we're actually building, not what we originally thought we'd build. +- Mark revisions inline with a brief **`Revised:`** annotation so readers can tell what changed and why (e.g. *"Revised: switched from X to Y because Z"*). Don't silently overwrite — the revision trail is useful context. +- The `.plan.md` is the working copy. The `.md` preview on the PR is a sharing snapshot and does not need to stay in sync during implementation. +- Research documents linked in the Deep Research Index are **not revised** — they capture point-in-time analysis. If a research finding turns out to be wrong, note it in the plan rather than editing the research doc. + ## Summary -Rewrite banksy from a 3-process TypeScript/xmcp architecture to a Python/FastMCP server. `BANKSY_MODE` (internal/public/dev) selects the auth provider and tool set at runtime — one Docker image, multiple deployments. Two `FastMCP.from_openapi()` calls replace both `banksy-mural-api` (internal API, 39 tools) and `banksy-public-api` (Public API, 87 tools) code-gen pipelines. 
Auth uses FastMCP's built-in OAuth with Google as the initial IdP for Layer 1 (IDE to banksy) plus custom Python for Layer 2 (banksy to Mural API) token management. Database is a fresh PostgreSQL schema (no data migration). A React SPA is preserved for browser-facing pages (home, Session Activation, error) and served from the same process via Starlette's `StaticFiles`. +Rewrite banksy from a 3-process TypeScript/xmcp architecture to a Python/FastMCP server. `AUTH_MODE` (sso-proxy/mural-oauth/dev) selects the auth provider and tool set at runtime — one Docker image, multiple deployments. Two `FastMCP.from_openapi()` calls replace both `banksy-mural-api` (internal API, 39 tools) and `banksy-public-api` (Public API, 87 tools) code-gen pipelines. Auth uses FastMCP's built-in OAuth with Google as the initial IdP for Layer 1 (IDE to banksy) plus custom Python for Layer 2 (banksy to Mural API) token management. Database is a fresh PostgreSQL schema (no data migration). A React SPA is preserved for browser-facing pages (home, Session Activation, error) and served from the same process via Starlette's `StaticFiles`. The repo uses a uv workspace structure under `pypackages/` — only `banksy-server` is created now. The workspace is ready to expand with `banksy-shared` (extracted shared code) and `banksy-harness` (agent orchestration) when those consumers are needed. Existing TS code in `packages/` stays as read-only reference until the final cleanup removes all TypeScript artifacts. 
@@ -19,9 +28,9 @@ graph TD Core -.->|"code-gen at build time"| PublicAPI end - subgraph after ["Target (FastMCP, 1 image, BANKSY_MODE per deploy)"] - ClientInt["LLM Client"] -->|"MCP HTTP"| InternalDeploy["banksy BANKSY_MODE=internal"] - ClientPub["LLM Client"] -->|"MCP HTTP"| PublicDeploy["banksy BANKSY_MODE=public"] + subgraph after ["Target (FastMCP, 1 image, AUTH_MODE per deploy)"] + ClientInt["LLM Client"] -->|"MCP HTTP"| InternalDeploy["banksy AUTH_MODE=sso-proxy"] + ClientPub["LLM Client"] -->|"MCP HTTP"| PublicDeploy["banksy AUTH_MODE=mural-oauth"] InternalDeploy -->|REST| MURAL2I["Mural API (internal)"] PublicDeploy -->|REST| MURAL2P["Mural API (public)"] Browser["Browser"] -->|"SPA + auth routes"| PublicDeploy @@ -53,7 +62,7 @@ graph LR | Phase | What It Delivers | Depends On | Parallelism | |-------|-----------------|------------|-------------| -| 1 Bootstrap | uv workspace skeleton (root + `banksy-server` under `pypackages/`), echo tool, health endpoint, `BANKSY_MODE` config, CI | Nothing | -- | +| 1 Bootstrap | uv workspace skeleton (root + `banksy-server` under `pypackages/`), echo tool, health endpoint, `AUTH_MODE` config, CI | Nothing | -- | | 2 OpenAPI Tools | `from_openapi()` integration, Mural API tools | 1 | Parallel with 4, 5 | | 3 Tool Curation | LLM-friendly names, descriptions, transforms, composites | 2 | -- | | 4 Database | PostgreSQL schema, Alembic migrations, token storage | 1 | Parallel with 2, 5 | @@ -76,12 +85,12 @@ Two hard constraints shape the FastMCP migration: ### Deployment Mode Selection (Option E) -Build one Docker image. At runtime, `BANKSY_MODE` selects the auth provider and tool set. Within each mode, tags provide finer-grained client-side filtering. +Build one Docker image. At runtime, `AUTH_MODE` selects the auth provider and tool set. Within each mode, tags provide finer-grained client-side filtering. 
``` -BANKSY_MODE=internal -> FastMCP(auth=SSOProxyAuth) + internal tools + internal tags -BANKSY_MODE=public -> FastMCP(auth=MuralOAuthAuth) + public tools + public tags -BANKSY_MODE=dev -> FastMCP(auth=None) + all tools + all tags +AUTH_MODE=sso-proxy -> FastMCP(auth=SSOProxyAuth) + internal tools + internal tags +AUTH_MODE=mural-oauth -> FastMCP(auth=MuralOAuthAuth) + public tools + public tags +AUTH_MODE=dev -> FastMCP(auth=None) + all tools + all tags ``` ### Startup Flow @@ -96,10 +105,10 @@ def create_server() -> FastMCP: register_common_routes(mcp) # /health, /version match settings.banksy_mode: - case "internal": + case "sso-proxy": register_internal_tools(mcp) register_session_activation_routes(mcp) - case "public": + case "mural-oauth": register_public_tools(mcp) register_mural_oauth_routes(mcp) case "dev": @@ -116,9 +125,9 @@ def create_server() -> FastMCP: ### Auth Provider per Mode -**Internal mode (`sso-proxy`):** Layer 1 uses `OAuthProxy` or `RemoteAuthProvider` with Google IdP via SSO proxy. Layer 2 stores session JWTs in `mural_tokens`. Tools call `banksy-mural-api` (internal REST) with session JWTs. +**sso-proxy mode:** Layer 1 uses `OAuthProxy` or `RemoteAuthProvider` with Google IdP via SSO proxy. Layer 2 stores session JWTs in `mural_tokens`. Tools call `banksy-mural-api` (internal REST) with session JWTs. -**Public mode (`mural-oauth`):** Layer 1 uses `OAuthProxy` wrapping Mural's OAuth authorization server. Layer 2 stores Mural OAuth access/refresh tokens in `mural_tokens`. Tools call mural-api's public API with OAuth access tokens. +**mural-oauth mode:** Layer 1 uses `OAuthProxy` wrapping Mural's OAuth authorization server. Layer 2 stores Mural OAuth access/refresh tokens in `mural_tokens`. Tools call mural-api's public API with OAuth access tokens. **Dev mode:** Layer 1 has no auth (`auth=None` or `StaticTokenVerifier`). Layer 2 tokens loaded from dev seed data. Both tool sets registered; backend URLs configurable. 
@@ -163,8 +172,8 @@ banksy/ │ ├── src/ │ │ └── banksy_server/ │ │ ├── __init__.py -│ │ ├── server.py # Entry point: reads BANKSY_MODE, wires auth + domains -│ │ ├── config.py # pydantic-settings with BANKSY_MODE, DB URLs, auth +│ │ ├── server.py # Entry point: reads AUTH_MODE, wires auth + domains +│ │ ├── config.py # pydantic-settings with AUTH_MODE, DB URLs, auth │ │ ├── mural_api.py # FastMCP.from_openapi() integration │ │ ├── spa.py # SpaStaticFiles class │ │ ├── auth/ # providers.py, sso_proxy.py, mural_oauth.py, token_manager.py @@ -339,12 +348,12 @@ Two separate `from_openapi()` sub-servers, one per API spec: - Filter to the operation IDs currently exposed by `banksy-public-api` - Uses standard OAuth tokens for all operations -Both use `RouteMap`: GET → RESOURCE, POST/PUT/DELETE → TOOL, deprecated/internal → EXCLUDE. Each mounts onto the server within its respective `BANKSY_MODE` — `mount()` organizes tools by namespace within a single mode, not across auth modes (see Server Topology). +Both use `RouteMap`: GET → RESOURCE, POST/PUT/DELETE → TOOL, deprecated/internal → EXCLUDE. Each mounts onto the server within its respective `AUTH_MODE` — `mount()` organizes tools by namespace within a single mode, not across auth modes (see Server Topology). -**Phasing**: Start with the Public API spec in Phase 2 (when `BANKSY_MODE=public` or `dev`). Add the internal API spec as a follow-on (when `BANKSY_MODE=internal` or `dev`). The plumbing is identical — `from_openapi()` is called with different specs and different httpx clients (different base URLs, different auth injection per mode). +**Phasing**: Start with the Public API spec in Phase 2 (when `AUTH_MODE=mural-oauth` or `dev`). Add the internal API spec as a follow-on (when `AUTH_MODE=sso-proxy` or `dev`). The plumbing is identical — `from_openapi()` is called with different specs and different httpx clients (different base URLs, different auth injection per mode). 
```python -# In BANKSY_MODE=public (or dev) +# In AUTH_MODE=mural-oauth (or dev) public_api = FastMCP.from_openapi( openapi_spec=public_spec, client=public_http_client, @@ -353,7 +362,7 @@ public_api = FastMCP.from_openapi( ) mcp.mount(public_api, namespace="mural") -# In BANKSY_MODE=internal (or dev) +# In AUTH_MODE=sso-proxy (or dev) internal_api = FastMCP.from_openapi( openapi_spec=internal_spec, client=internal_http_client, @@ -427,9 +436,9 @@ mcp.enable(tags={"murals"}, only=True) # Mural-focused deployment ### Deployment Modes (Resolved) -Mode merging is not recommended. `BANKSY_MODE` is preserved as a runtime configuration flag. Auth modes are capability constraints — internal and public tools call different APIs with incompatible token types. FastMCP's one-auth-per-server constraint means a single server cannot cleanly handle multiple auth strategies. MCP clients support multiple servers, so separate deployments per auth mode is transparent to users. +Mode merging is not recommended. `AUTH_MODE` is preserved as a runtime configuration flag. Auth modes are capability constraints — internal and public tools call different APIs with incompatible token types. FastMCP's one-auth-per-server constraint means a single server cannot cleanly handle multiple auth strategies. MCP clients support multiple servers, so separate deployments per auth mode is transparent to users. -The current two TS Dockerfiles (`Dockerfile` for sso-proxy, `Dockerfile.mural-oauth` for mural-oauth) are replaced by a single `Dockerfile.server` — the mode is runtime config (`BANKSY_MODE` env var), not build-time. See Server Topology for the full design. +The current two TS Dockerfiles (`Dockerfile` for sso-proxy, `Dockerfile.mural-oauth` for mural-oauth) are replaced by a single `Dockerfile.server` — the mode is runtime config (`AUTH_MODE` env var), not build-time. See Server Topology for the full design. 
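The `RouteMap` convention used by both sub-servers (GET becomes a resource, mutating methods become tools, deprecated/internal routes are excluded) can be sketched as a plain classifier. This models the policy only; it is not the FastMCP `RouteMap` API, and `classify_route` is a hypothetical helper:

```python
def classify_route(
    method: str, *, deprecated: bool = False, internal: bool = False
) -> str:
    """Return the MCP mapping for an OpenAPI operation (policy sketch)."""
    if deprecated or internal:
        return "EXCLUDE"  # never exposed to MCP clients
    if method.upper() == "GET":
        return "RESOURCE"  # reads map to MCP resources
    if method.upper() in {"POST", "PUT", "DELETE"}:
        return "TOOL"  # writes map to MCP tools
    return "EXCLUDE"  # anything else is left out by default
```

Defaulting unknown methods to `EXCLUDE` keeps the exposed surface allowlist-shaped: an operation only becomes a tool or resource when a rule explicitly says so.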
--- @@ -612,7 +621,7 @@ This `get_authenticated_user` helper belongs in the auth module and is reused ac | (new) `IDP_ISSUER` | Expected JWT issuer | | (new) `IDP_AUDIENCE` | Expected JWT audience | | (new) `IDP_AUTHORIZATION_SERVER` | IdP URL for PRM metadata | -| (new) `BANKSY_MODE` | `internal`, `public`, or `dev` — selects auth provider and tool set (see Server Topology) | +| (new) `AUTH_MODE` | `sso-proxy`, `mural-oauth`, or `dev` — selects auth provider and tool set (see Server Topology) | | (new) `ENABLED_TAGS` | Optional comma-separated tag filter for specialized deployments (e.g., `read`) | **Key TS reference**: @@ -765,7 +774,7 @@ pypackages/server/src/banksy_server/domains/ └── tools.py ``` -Each domain's `register_*_tools(mcp)` function takes a `FastMCP` instance and registers all tools for that domain, including tags and metadata. The domain owns its tool definitions, schemas, and any domain-specific helpers. `server.py` calls the appropriate registration functions based on `BANKSY_MODE` (see Server Topology). +Each domain's `register_*_tools(mcp)` function takes a `FastMCP` instance and registers all tools for that domain, including tags and metadata. The domain owns its tool definitions, schemas, and any domain-specific helpers. `server.py` calls the appropriate registration functions based on `AUTH_MODE` (see Server Topology). ### from_openapi() in Domain Context @@ -789,7 +798,7 @@ def register_public_tools(mcp: FastMCP) -> None: ### Routes by Concern -Non-MCP HTTP routes (`routes/`) are organized by concern, not by mode. Mode-specific routes are registered conditionally in `server.py` based on `BANKSY_MODE` — for example, Session Activation routes are only registered in `internal` and `dev` modes. +Non-MCP HTTP routes (`routes/`) are organized by concern, not by mode. Mode-specific routes are registered conditionally in `server.py` based on `AUTH_MODE` — for example, Session Activation routes are only registered in `sso-proxy` and `dev` modes. 
### Canvas-MCP Absorption @@ -908,7 +917,7 @@ pypackages/server/tests/ ├── test_token_refresh.py # Token refresh logic ├── test_auth_flow.py # OAuth flow (HeadlessOAuth) ├── test_session_activation.py # Session Activation routes -├── test_mode_selection.py # BANKSY_MODE startup paths +├── test_mode_selection.py # AUTH_MODE startup paths └── test_integration/ # End-to-end tests ``` @@ -926,7 +935,7 @@ The TS codebase has ~15 Vitest test files. These should be reviewed as reference ### Dockerfile.server -Workspace-aware multi-stage build using uv. One Docker image serves all modes — `BANKSY_MODE` is a runtime env var, not build-time. This replaces the two current TS Dockerfiles (`Dockerfile` for sso-proxy, `Dockerfile.mural-oauth` for mural-oauth). +Workspace-aware multi-stage build using uv. One Docker image serves all modes — `AUTH_MODE` is a runtime env var, not build-time. This replaces the two current TS Dockerfiles (`Dockerfile` for sso-proxy, `Dockerfile.mural-oauth` for mural-oauth). ```dockerfile # Stage 1: Build SPA @@ -1124,7 +1133,7 @@ banksy-shared = { workspace = true } - **Single workspace member → multi-member**: When a second Python service is needed (e.g., agent harness), add a directory under `pypackages/` with its own `pyproject.toml`. Extract shared code into `banksy-shared` at that time. The workspace glob auto-discovers new members. - **pre-commit → CI only**: If hooks cause friction during rapid iteration and the team is 1–2 developers, rely on CI alone. - **custom_route() → raw Starlette routing**: If HTTP routes grow complex, use `starlette.routing.Router` for grouping, `Mount` for sub-apps, or Starlette middleware wrappers for per-route concerns. FastAPI is a last resort — Starlette is already underneath FastMCP. 
-- **`BANKSY_MODE` per-deployment → mode merging**: If a future need requires multi-auth in a single process, revisit Option B (protocol-level routing) or Option D (middleware-based auth) from the [server topology analysis](../banksy-research/tool-visibility-server-topology-research.md). +- **`AUTH_MODE` per-deployment → mode merging**: If a future need requires multi-auth in a single process, revisit Option B (protocol-level routing) or Option D (middleware-based auth) from the [server topology analysis](../banksy-research/tool-visibility-server-topology-research.md). --- @@ -1144,6 +1153,8 @@ Items from the canvas-mcp alignment assessment and architecture research that ar | 12 | When to extract `banksy-shared` | Trigger: when a second consumer (agent harness) needs shared code (models, auth utils, Mural client) | Open (deferred) | | 13 | When to create `banksy-harness` | Trigger: when agent orchestration work begins | Open (deferred) | +**Future naming consideration:** The research documents proposed renaming `AUTH_MODE` to something like `BANKSY_MODE` with more semantic values (`internal`/`public`/`dev`). That has merit for clarity, but we keep the existing naming (`AUTH_MODE` with `sso-proxy`/`mural-oauth`/`dev`) for migration simplicity. Consider revisiting the rename once the migration stabilizes. + --- ## Deep Research Index From f2bdabafa9d1f02c76b941bed765a9d08df53a95 Mon Sep 17 00:00:00 2001 From: Willis Kirkham Date: Wed, 18 Mar 2026 14:59:33 -0700 Subject: [PATCH 3/4] Converts migration plan to standalone roadmap Strips Cursor plan frontmatter, renames to plain .md, updates "how to use" section to reference phase-specific .plan.md files for execution. 
--- .../00-migration-execution-strategy.md | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/fastmcp-migration/execution-strategy-research/00-migration-execution-strategy.md b/fastmcp-migration/execution-strategy-research/00-migration-execution-strategy.md index a2bf3a1..36ab7a0 100644 --- a/fastmcp-migration/execution-strategy-research/00-migration-execution-strategy.md +++ b/fastmcp-migration/execution-strategy-research/00-migration-execution-strategy.md @@ -1,14 +1,13 @@ - # Banksy xmcp-to-FastMCP Migration -## How to use this plan +## How to use this roadmap This is a **living document**. It reflects the intended final state of the migration and should be updated in-place as implementation reveals better approaches, new constraints, or scope changes. -- When deviating from the plan during implementation, **update the plan first** before proceeding. The plan should always describe what we're actually building, not what we originally thought we'd build. +- When deviating from the roadmap during implementation, **update the roadmap first** before proceeding. It should always describe what we're actually building, not what we originally thought we'd build. - Mark revisions inline with a brief **`Revised:`** annotation so readers can tell what changed and why (e.g. *"Revised: switched from X to Y because Z"*). Don't silently overwrite — the revision trail is useful context. -- The `.plan.md` is the working copy. The `.md` preview on the PR is a sharing snapshot and does not need to stay in sync during implementation. -- Research documents linked in the Deep Research Index are **not revised** — they capture point-in-time analysis. If a research finding turns out to be wrong, note it in the plan rather than editing the research doc. +- Implementation is driven by **phase-specific `.plan.md` files** created from this roadmap when starting each phase. The roadmap defines what to build; phase plans define how to execute each chunk. 
+- Research documents linked in the Deep Research Index are **not revised** — they capture point-in-time analysis. If a research finding turns out to be wrong, note it in the roadmap rather than editing the research doc. ## Summary From f012d2f56974f4c22a044dde002fe396ee965693 Mon Sep 17 00:00:00 2001 From: Willis Kirkham Date: Thu, 26 Mar 2026 15:29:54 -0700 Subject: [PATCH 4/4] Adds canvas migration guidance document for external agent --- fastmcp-migration/canvas-migration-guide.md | 510 ++++++++++++++++++++ 1 file changed, 510 insertions(+) create mode 100644 fastmcp-migration/canvas-migration-guide.md diff --git a/fastmcp-migration/canvas-migration-guide.md b/fastmcp-migration/canvas-migration-guide.md new file mode 100644 index 0000000..7869438 --- /dev/null +++ b/fastmcp-migration/canvas-migration-guide.md @@ -0,0 +1,510 @@ +# Canvas Migration Guide — Banksy Python Project + +This document is a reference for an AI agent migrating canvas-mcp tools into the banksy Python monorepo. It describes the project's structure, conventions, and design intent so the agent can make decisions consistent with the project's vision without extensive codebase exploration. + +This is **not** a migration plan or step-by-step playbook. It is a map of the destination. + +--- + +## 1. Project Structure + +Banksy is a polyglot monorepo. Only the Python side is relevant — ignore everything under `packages/` (TypeScript). 
+ +### Workspace layout + +``` +banksy/ # repo root +├── pyproject.toml # uv workspace root, shared tooling config +├── uv.lock # lockfile (committed) +├── pypackages/ +│ └── mcp/ # banksy-mcp — the only Python package +│ ├── pyproject.toml # package metadata, runtime deps, build config +│ ├── tests/ # top-level integration tests +│ └── src/ +│ └── banksy_mcp/ # importable package +│ ├── app.py # root FastMCP app, lifespan, surface mounts +│ ├── config.py # pydantic-settings config singleton +│ ├── server.py # CLI entrypoint (runs the app) +│ ├── conftest.py # shared test fixtures +│ ├── surfaces/ # MCP sub-servers ("surfaces") +│ │ ├── echo/ # simplest reference surface +│ │ ├── canvas/ # ← your destination +│ │ ├── public_api/ # OpenAPI-driven surface (factory pattern) +│ │ └── internal_api/ +│ ├── lib/ # shared utilities (use what exists, don't add) +│ ├── routes/ # custom HTTP routes (health check) +│ └── db/ # database pool, stores, migrations +├── packages/ # TypeScript — IGNORE entirely +└── ... +``` + +The uv workspace is declared in the root `pyproject.toml`: + +```toml +[tool.uv.workspace] +members = ["pypackages/*"] +``` + +There is currently one member: `pypackages/mcp` (`banksy-mcp`). Do not add new workspace members for canvas. + +### Where canvas code lives + +The vast majority of migrated code belongs in: + +``` +pypackages/mcp/src/banksy_mcp/surfaces/canvas/ +``` + +This directory is home base. When in doubt, put it here. + +--- + +## 2. Surface Anatomy + +A **surface** is a `FastMCP` instance that gets mounted onto the root Banksy app via `mcp.mount()`. Each surface is a self-contained sub-server with its own tools, resources, and optionally its own lifespan. 
+ +### File layout + +The minimum contract between a surface and the rest of banksy is: + +``` +surfaces// +├── app.py # defines the FastMCP instance and registers tools +├── __init__.py # re-exports the FastMCP instance via __all__ +└── tests/ + └── test_*.py # co-located async tests +``` + +This is the external interface — `__init__.py` exports the `FastMCP` instance, the root `app.py` imports and mounts it. Everything else inside the surface directory is yours to organize. + +The echo surface is tiny enough that `app.py` holds everything. Canvas will be larger. The internal layout of `surfaces/canvas/` can expand and diverge from other surfaces to support whatever structure makes sense for the migrated code. For example: + +``` +surfaces/canvas/ +├── __init__.py # re-exports canvas_mcp (required) +├── app.py # FastMCP instance + tool registration (required) +├── config.py # surface-local CanvasSettings +├── models.py # pydantic models, types +├── bridge/ # subpackage for bridge logic +│ ├── __init__.py +│ └── ... +├── transports/ # subpackage for transport layer +│ ├── __init__.py +│ └── ... +├── utils.py # canvas-internal helpers +└── tests/ + ├── test_tools.py + ├── test_bridge.py + └── fixtures/ + └── ... +``` + +The constraint is not the directory layout — it's the boundary. Keep canvas internals inside `surfaces/canvas/` and export only `canvas_mcp` from `__init__.py`. + +### Reference: echo surface (simplest pattern) + +`surfaces/echo/app.py`: + +```python +from fastmcp import FastMCP + +echo_mcp = FastMCP("EchoTools") + + +@echo_mcp.tool() +async def echo(message: str) -> str: + """Echo a message back. Useful for testing connectivity.""" + return message +``` + +`surfaces/echo/__init__.py`: + +```python +from banksy_mcp.surfaces.echo.app import echo_mcp + +__all__ = ["echo_mcp"] +``` + +Canvas follows the same pattern. See `surfaces/canvas/app.py` and `surfaces/canvas/__init__.py` for the current scaffold. 
+
+### Mounting
+
+There are two mounting patterns in use, depending on whether the surface needs resources that are set up during the server's lifespan.
+
+**Module-level mount** — for simple surfaces with no startup/teardown needs:
+
+```python
+# app.py (root)
+from banksy_mcp.surfaces.echo import echo_mcp
+
+mcp = FastMCP("Banksy", lifespan=app_lifespan)
+mcp.mount(echo_mcp)
+```
+
+Canvas currently uses this pattern with a feature flag gate:
+
+```python
+if settings.enable_canvas:
+    from banksy_mcp.surfaces.canvas import canvas_mcp
+    mcp.mount(canvas_mcp)
+```
+
+**Lifespan mount** — for surfaces that need resources created at server start and cleaned up at server stop (HTTP clients, DB connections, etc.):
+
+```python
+# app.py (root) — the app_lifespan definition
+@asynccontextmanager
+async def app_lifespan(server: FastMCP[Any]) -> AsyncIterator[dict[str, Any] | None]:
+    async with AsyncExitStack() as stack:
+        # ... create resources ...
+        if settings.mural_public_api_spec:
+            public_spec = load_spec(settings.mural_public_api_spec)
+            public_client = httpx.AsyncClient(...)
+            stack.push_async_callback(_close_quietly, "HTTP client", public_client.aclose)
+            server.mount(create_public_api_server(public_spec, public_client))
+        yield {"db": pool} if pool else {}
+```
+
+See `surfaces/public_api/app.py` for the factory pattern used here:
+
+```python
+def create_public_api_server(
+    spec: dict[str, Any], client: httpx.AsyncClient
+) -> FastMCP:
+    return FastMCP.from_openapi(openapi_spec=spec, client=client, name="PublicAPI")
+```
+
+**If the canvas migration introduces resources that need setup/teardown** (persistent HTTP clients, WebSocket connection pools, DB sessions), the canvas mount should move from module-level into the root `app_lifespan`, following the public_api pattern.
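The lifespan pattern above boils down to an async context manager that registers teardown callbacks on an `AsyncExitStack` and yields the shared state. A stdlib-only sketch of that shape (`FakeClient` and `canvas_lifespan` are illustrative stand-ins, not banksy code; FastMCP's actual lifespan additionally receives the server instance):

```python
import asyncio
from collections.abc import AsyncIterator
from contextlib import AsyncExitStack, asynccontextmanager


class FakeClient:
    """Stand-in for a real resource such as an httpx.AsyncClient."""

    def __init__(self) -> None:
        self.closed = False

    async def aclose(self) -> None:
        self.closed = True


@asynccontextmanager
async def canvas_lifespan() -> AsyncIterator[dict[str, FakeClient]]:
    async with AsyncExitStack() as stack:
        client = FakeClient()
        # Teardown is registered up front, so it runs even if later setup fails.
        stack.push_async_callback(client.aclose)
        yield {"client": client}


async def demo() -> bool:
    async with canvas_lifespan() as state:
        assert state["client"].closed is False  # live while the server runs
    return state["client"].closed  # closed once the lifespan exits


print(asyncio.run(demo()))  # prints: True
```

Registering each resource's teardown on the exit stack at creation time is what makes cleanup ordering automatic: the stack unwinds in reverse on exit, matching the `_close_quietly` callbacks in the root lifespan above.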
+ +Alternatively, if the resources are purely canvas-internal (no dependency on parent-level resources like the DB pool), the canvas surface can define its own lifespan on its `FastMCP` instance in `surfaces/canvas/app.py`. If the resources depend on parent-level state, they belong in the root lifespan. + +### Naming convention + +- FastMCP instance: `canvas_mcp = FastMCP("Canvas")` +- Module re-export: `__all__ = ["canvas_mcp"]` +- Tools: decorated with `@canvas_mcp.tool()`, async functions, docstrings serve as tool descriptions + +--- + +## 3. Toolchain and Conventions + +All tooling is configured in the root `pyproject.toml` and/or `pypackages/mcp/pyproject.toml`. No additional config files are needed. + +### Python version + +``` +requires-python = ">=3.14" +``` + +### Ruff (linter + formatter) + +Configured at workspace root: + +```toml +[tool.ruff] +target-version = "py314" +line-length = 88 +src = ["pypackages/*/src", "pypackages/*/tests"] + +[tool.ruff.lint] +select = ["E", "W", "F", "I", "B", "UP", "S", "ASYNC", "RUF"] +ignore = ["B008"] + +[tool.ruff.lint.per-file-ignores] +"**/tests/**" = ["S101", "S105", "S106"] + +[tool.ruff.format] +quote-style = "double" +docstring-code-format = true +``` + +Key points: double quotes, 88-char lines, `assert` is allowed in tests (`S101` ignored), imports are sorted by ruff (`I`). + +### Pyright (type checker) + +```toml +[tool.pyright] +pythonVersion = "3.14" +typeCheckingMode = "strict" +reportUnnecessaryTypeIgnoreComment = true +include = ["pypackages"] +extraPaths = ["pypackages/mcp/src"] +``` + +Strict mode is non-negotiable. All code must pass pyright strict. 
+ +### pytest + pytest-asyncio + +```toml +# root pyproject.toml +[tool.pytest.ini_options] +asyncio_mode = "auto" +addopts = "-v --tb=short" + +# pypackages/mcp/pyproject.toml +[tool.pytest.ini_options] +testpaths = ["tests", "src/banksy_mcp"] +asyncio_mode = "auto" +markers = ["integration: requires Docker (testcontainers)"] +``` + +`asyncio_mode = "auto"` means all `async def test_*` functions are automatically treated as async tests — no `@pytest.mark.asyncio` decorator needed. + +Test discovery includes both `tests/` (top-level) and `src/banksy_mcp/` (co-located surface tests). + +### Build backend + +```toml +[build-system] +requires = ["hatchling"] +build-backend = "hatchling.build" + +[tool.hatch.build.targets.wheel] +exclude = ["**/tests/**"] +``` + +Hatchling builds the wheel. Tests are excluded from the built package. + +### uv + +uv manages the workspace, dependencies, and lockfile. The standard workflow: + +```bash +uv sync # install all deps (runtime + dev) +uv run pytest # run tests through uv +uv run ruff check pypackages/ # lint +uv run pyright # type check +``` + +--- + +## 4. 
Testing Patterns + +### In-process transport + +Tests use `fastmcp.Client` with the surface's `FastMCP` instance as the transport — no network, no server process: + +```python +import json + +from fastmcp import Client +from mcp.types import TextContent + +from banksy_mcp.surfaces.canvas import canvas_mcp + + +async def test_canvas_health_returns_status() -> None: + async with Client(transport=canvas_mcp) as client: + result = await client.call_tool("canvas_health", {}) + content = result.content[0] + assert isinstance(content, TextContent) + parsed = json.loads(content.text) + assert parsed == {"surface": "canvas", "status": "ok"} +``` + +Key conventions: + +- **Import the surface**, not the root `mcp` — tests target the surface in isolation +- **`async def test_*`** — no `@pytest.mark.asyncio` needed (auto mode) +- **`Client(transport=canvas_mcp)`** — in-process, no HTTP +- **`result.content[0]`** — first content block is a `TextContent` +- Tools returning `str` → assert `content.text` directly +- Tools returning `dict` → `json.loads(content.text)` then assert the dict + +### Test file location + +``` +surfaces/canvas/tests/test_.py +``` + +See `surfaces/echo/tests/test_echo.py` for the simplest example. See `surfaces/canvas/tests/test_canvas.py` for the current canvas scaffold. + +### What to test + +- Each tool should have at least one test for the happy path +- Test the tool's return value structure and content +- Test edge cases (empty input, missing optional fields) +- For tools with side effects, mock external dependencies + +--- + +## 5. 
Config and Settings + +### How it works + +`config.py` defines a single `Settings` class using `pydantic-settings`: + +```python +from pydantic_settings import BaseSettings + + +class Settings(BaseSettings): + host: str = "0.0.0.0" # noqa: S104 + port: int = 8000 + + mural_public_api_spec: str | None = None + mural_internal_api_spec: str | None = None + mural_api_host: str = "http://localhost:8080" + + database_url: str | None = None + + enable_canvas: bool = False + + +settings = Settings() +``` + +A module-level singleton `settings` is created at import time. All app code imports and reads from this singleton: + +```python +from banksy_mcp.config import settings +``` + +### Conventions + +- **Feature flags**: `bool` with `False` default (e.g. `enable_canvas`) +- **Optional integrations**: `str | None` with `None` default (e.g. `database_url`, spec paths) +- **Required values**: typed with a sensible default (e.g. `host`, `port`) +- **Never** use `os.environ` directly — use pydantic-settings (either global or surface-local) +- Environment variables are mapped automatically by pydantic-settings (e.g. `ENABLE_CANVAS=true` sets `settings.enable_canvas`) + +### Global settings — when to add to `config.py` + +Settings that the root app needs to read belong in the global `Settings` class. The `enable_canvas` flag is the canonical example — `app.py` reads it to decide whether to mount the canvas surface. Other examples: a setting that gates lifespan resource creation in the root `app_lifespan`, or something another surface also needs. + +### Surface-local settings — preferred for canvas-only config + +If a setting is only consumed within `surfaces/canvas/`, define it in a surface-local settings class instead of adding to the global `config.py`. This keeps canvas self-contained and avoids polluting the global config with canvas-specific concerns. 
+ +`surfaces/canvas/config.py`: + +```python +from pydantic_settings import BaseSettings + + +class CanvasSettings(BaseSettings): + model_config = {"env_prefix": "CANVAS_"} + + websocket_url: str | None = None + max_connections: int = 10 + + +canvas_settings = CanvasSettings() +``` + +Then import within canvas code: + +```python +from banksy_mcp.surfaces.canvas.config import canvas_settings +``` + +Environment variables are automatically prefixed: `CANVAS_WEBSOCKET_URL`, `CANVAS_MAX_CONNECTIONS`. + +**Decision rule**: if only canvas reads it, put it in `surfaces/canvas/config.py`. If the root `app.py` or another surface reads it, put it in the global `config.py`. + +--- + +## 6. Dependencies + +### Runtime dependencies + +Add to `pypackages/mcp/pyproject.toml` under `[project.dependencies]`: + +```toml +[project] +dependencies = [ + "alembic>=1.14", + "asyncpg>=0.30", + "fastmcp>=3.1", + "httpx>=0.28", + "psycopg[binary]>=3.2", + "pydantic-settings>=2.0", + "pyyaml>=6.0", +] +``` + +### Dev / test dependencies + +Add to the root `pyproject.toml` under `[dependency-groups.dev]`: + +```toml +[dependency-groups] +dev = [ + "asyncpg-stubs>=0.31", + "httpx>=0.28.0", + "pyright>=1.1.0", + "pytest>=8.0.0", + "pytest-asyncio>=0.24.0", + "ruff>=0.8.0", + "pre-commit>=4.0.0", + "testcontainers[postgres]>=4.0", + "psycopg[binary]>=3.2", +] +``` + +### Workflow + +After editing either `pyproject.toml`: + +```bash +uv sync +``` + +This resolves dependencies and updates `uv.lock`. The lockfile is committed. + +### Rules + +- Do NOT add separate dependency groups or optional extras for canvas +- Do NOT create a canvas-specific `pyproject.toml` or `requirements.txt` +- Check if a dependency is already available before adding a new one + +--- + +## 7. What NOT To Do + +- **Do NOT create shared code outside canvas.** Do not add files to `banksy_mcp/lib/`, `banksy_mcp/db/`, or any other shared location for utilities that only canvas uses. Keep everything in `surfaces/canvas/`. 
If something looks reusable, still put it in the canvas directory. Future surfaces can evaluate extraction when they actually need it. + +- **Do NOT touch TypeScript.** The `packages/` directory, `package.json`, and anything npm-related are irrelevant. Ignore them completely. + +- **Do NOT change the `mcp.mount()` composition pattern.** Surfaces are `FastMCP` instances that get mounted. This is settled. However, DO move the canvas mount from module-level into the root `app_lifespan` if the migration introduces resources that need setup/teardown — this is expected and follows the `public_api` / `internal_api` precedent in `app.py`. + +- **Do NOT add new workspace members.** Canvas is not a separate package. It lives inside `banksy-mcp` as a surface subdirectory. + +- **Do NOT use `os.environ` directly.** All environment-driven config goes through pydantic-settings — either the global `Settings` in `config.py` or a surface-local `CanvasSettings` in `surfaces/canvas/config.py`. + +- **Do NOT add a CLI entrypoint for canvas.** No `__main__.py`, no separate `run()` function. The root `server.py` owns the process entrypoint. + +- **Do NOT restructure the existing surface directories.** Echo, public_api, internal_api are not your concern. Work within `surfaces/canvas/`. + +- **DO use what already exists in shared locations.** If `lib/spec_loader.py` or `db/pool.py` does what you need, import and use it. The rule is: use shared code that's already there, but don't add to it. + +--- + +## 8. File Placement Decision Tree + +### Defaults — put it in `surfaces/canvas/` + +| What you're adding | Where it goes | +|---|---| +| Tool function | `surfaces/canvas/app.py` or a submodule that `app.py` imports and registers | +| Test | `surfaces/canvas/tests/test_.py` | +| Utility / helper only canvas uses | `surfaces/canvas/.py` or a subpackage (e.g. 
`surfaces/canvas/bridge/`) | +| Pydantic model / type definition | `surfaces/canvas/.py` | +| Test fixture / data file | `surfaces/canvas/tests/` or `surfaces/canvas/tests/fixtures/` | + +The internal directory structure of `surfaces/canvas/` is flexible — organize subpackages, modules, and tests however the migrated code demands. The only fixed points are `__init__.py` (exports `canvas_mcp`) and `app.py` (defines the `FastMCP` instance). + +### Exceptions — when something legitimately lives elsewhere + +| What you're adding | Where it goes | Justification | +|---|---|---| +| Canvas-only config field | `surfaces/canvas/config.py` as a `CanvasSettings(BaseSettings)` with `env_prefix = "CANVAS_"` | Surface-local config keeps canvas self-contained | +| Config field read by root `app.py` or other surfaces | Global `config.py` | Root-level decisions (feature flags, lifespan resources) must be globally visible | +| Canvas-internal lifespan resource (no dependency on parent state) | `surfaces/canvas/app.py` — as a lifespan on the `canvas_mcp` instance | Surface owns its own lifecycle for internal resources | +| Lifespan resource that depends on parent state (DB pool, shared HTTP client) or needs cross-surface sharing | Root `app.py` lifespan — move the canvas mount into the lifespan block | Follows the `public_api`/`internal_api` pattern; parent lifespan owns shared resources | +| Custom HTTP route for the whole app | `routes/` | Existing convention for app-wide routes | +| Canvas-specific HTTP route | On `canvas_mcp` directly via `canvas_mcp.custom_route()` | Scoped to the surface | + +### The justification bar + +If something doesn't fit the defaults table, ask: "Is there an existing pattern in the codebase for this?" If yes, follow it. If no, the answer is almost certainly `surfaces/canvas/`. Creating new shared abstractions requires clear evidence that multiple surfaces need the same thing — and right now, canvas is the only consumer.
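The decision tree above can be condensed into a small lookup. This is a hypothetical model of the placement tables (the returned strings are the destinations they name), useful as a sanity check rather than project code:

```python
def place_file(
    kind: str,
    *,
    read_outside_canvas: bool = False,
    depends_on_parent_state: bool = False,
) -> str:
    """Return the destination for a new canvas file per the placement tables."""
    if kind == "config":
        # Surface-local unless the root app or another surface reads it.
        return "config.py" if read_outside_canvas else "surfaces/canvas/config.py"
    if kind == "lifespan_resource":
        # Parent-dependent resources move into the root lifespan.
        if depends_on_parent_state:
            return "app.py (root lifespan)"
        return "surfaces/canvas/app.py"
    if kind == "tool":
        return "surfaces/canvas/app.py"
    if kind == "test":
        return "surfaces/canvas/tests/"
    # Default per the justification bar: it almost certainly belongs in canvas.
    return "surfaces/canvas/"
```

Note how few inputs the function needs: beyond the kind of file, the only questions that change the answer are "who reads it?" and "what state does it depend on?".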