feat: API key schema isolation — database-level tenant separation #855
Open
salvormallow wants to merge 1 commit into vectorize-io:main from
Adds ApiKeySchemaTenantExtension: maps API keys to isolated PostgreSQL schemas, providing database-level memory isolation between tenants.

Threat model: prompt injection against AI agents. Agents execute tool calls based on conversation content. A prompt injection can trick an agent into querying another tenant's banks. Schema isolation scopes all SQL to the authenticated schema; banks from other schemas don't exist.

Configuration:
HINDSIGHT_API_TENANT_EXTENSION=...bank_scoped_tenant:ApiKeySchemaTenantExtension
HINDSIGHT_API_TENANT_KEY_MAP=key_a:schema_a;key_b:schema_b

Follows the SupabaseTenantExtension pattern. Opt-in, zero breaking changes. Includes 20 tests.
Author
Dashboard caveat
When Root cause: Workaround: Set
Longer-term: The dashboard should support multi-tenant awareness: a tenant selector that switches which API key is used for dataplane calls. I'm working on a follow-up PR for this.
Summary
Adds ApiKeySchemaTenantExtension, a built-in tenant extension that maps API keys to isolated PostgreSQL schemas, providing database-level memory isolation between tenants. Follows the same pattern as SupabaseTenantExtension but uses static API key mapping instead of JWT auth.

Threat model: prompt injection against AI agents
AI agents execute tool calls, including Hindsight recall, retain, and reflect, based on conversation content. A prompt injection delivered via chat message, email, or web search result can trick an agent into querying another tenant's memory banks.

Example attack:
    hindsight recall --bank tenant-b-bank --query "private data"

Why application-layer bank filtering isn't enough:
RequestContext.allowed_bank_ids exists on the model but is not enforced by the engine. An OperationValidatorExtension could check it, but:
- when allowed_bank_ids is None (the default), all access is granted
- allowed_bank_ids is never set for background tasks

Why schema isolation works:
The API key determines the PostgreSQL schema at authentication time, before any bank lookup or query executes. The SQL itself is scoped via fully-qualified table names. Even a fully compromised agent can only access banks within its assigned schema. Banks from other schemas don't exist in its view of the database.
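
To illustrate what "scoped via fully-qualified table names" can look like, here is a minimal sketch. It is not the engine's actual query builder: the helper names, the `memories` table, and its columns are hypothetical. The only point is that the schema is fixed by authentication, while the agent-supplied bank ID travels as a bound parameter.

```python
def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def qualified(schema: str, table: str) -> str:
    """Build a schema-qualified table reference such as "schema_a"."memories"."""
    return f"{quote_ident(schema)}.{quote_ident(table)}"


def recall_query(schema: str) -> str:
    """Return SQL that can only see rows inside the authenticated schema.

    The table and column names are hypothetical. The bank ID is passed
    separately as the bound parameter $1, so a prompt-injected bank name
    is looked up inside this schema and simply matches nothing.
    """
    return f"SELECT id, content FROM {qualified(schema, 'memories')} WHERE bank_id = $1"


if __name__ == "__main__":
    print(recall_query("schema_a"))
```

With this shape, the attack shown earlier changes only the value bound to `$1`. The table being read is still the one in the caller's own schema.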
How it works
SupabaseTenantExtension)

Configuration
Design decisions
Opt-in, zero breaking changes. If HINDSIGHT_API_TENANT_EXTENSION is not set, Hindsight uses DefaultTenantExtension, identical to current behavior. Existing deployments are unaffected.

One key = one schema. Each API key maps to exactly one PostgreSQL schema. A single key cannot access multiple schemas. This is intentional: one key = one blast radius. The TenantContext returns a single schema_name, and the engine scopes all queries to it. Cross-schema queries are not possible without direct Postgres access.

Admin access. There is no "superuser key" that spans all schemas. Operators who need cross-tenant visibility should query Postgres directly or use separate keys per schema. This is a conscious trade-off: admin convenience vs. the guarantee that no single compromised key grants access to all tenants.
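
The one-key-to-one-schema rule can be sketched as a parse step plus a lookup. This is an illustration only, assuming the `key:schema;key:schema` format shown for HINDSIGHT_API_TENANT_KEY_MAP. The function names and error handling are hypothetical and are not taken from the extension's source.

```python
import hmac


def parse_key_map(raw: str) -> dict[str, str]:
    """Parse 'key_a:schema_a;key_b:schema_b' into {api_key: schema_name}."""
    mapping: dict[str, str] = {}
    for entry in raw.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing ';'
        key, sep, schema = entry.partition(":")
        key, schema = key.strip(), schema.strip()
        if not sep or not key or not schema:
            raise ValueError(f"malformed key map entry: {entry!r}")
        if key in mapping:
            raise ValueError("an API key may map to only one schema")
        mapping[key] = schema
    return mapping


def schema_for_key(mapping: dict[str, str], presented: str) -> str:
    """Return the single schema assigned to the presented API key."""
    match = None
    for key, schema in mapping.items():
        # compare_digest avoids leaking key contents through comparison timing
        if hmac.compare_digest(key.encode(), presented.encode()):
            match = schema
    if match is None:
        raise PermissionError("unknown API key")
    return match


if __name__ == "__main__":
    tenants = parse_key_map("key_a:schema_a;key_b:schema_b")
    print(schema_for_key(tenants, "key_b"))
```

Because the lookup returns one schema name and nothing else, there is no input a caller can vary to reach a second schema.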
MCP auth disabled = default schema only. When mcp_auth_disabled=true, MCP requests fall back to the default schema (from HINDSIGHT_API_DATABASE_SCHEMA), not a tenant schema.

Schema name validation. Schema names must be valid Postgres identifiers (letters, digits, underscores). Hyphens, spaces, and names starting with digits are rejected at startup.
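
The validation rule described above can be expressed as a single regular expression. The sketch below is an assumption about how such a startup check might look, not the extension's actual code, and the function name is hypothetical.

```python
import re

# Letters, digits, and underscores only; the first character may not be a digit.
_SCHEMA_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def validate_schema_name(name: str) -> str:
    """Return the name unchanged if it is acceptable, otherwise raise ValueError."""
    if not _SCHEMA_NAME.fullmatch(name):
        raise ValueError(f"invalid schema name: {name!r}")
    return name


if __name__ == "__main__":
    for candidate in ("schema_a", "tenant-b", "1tenant", "tenant b"):
        try:
            validate_schema_name(candidate)
            print(candidate, "accepted")
        except ValueError as exc:
            print(candidate, "rejected:", exc)
```

Running a check like this over every configured schema at startup means a bad name stops the process before any request is served, rather than surfacing later as a SQL error.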
Why not allowed_bank_ids + OperationValidatorExtension? See threat model above. Application-layer checks are defense-in-depth, not a security boundary. Schema isolation moves the enforcement into the database where it can't be bypassed by missed code paths.

Files changed
hindsight-api-slim/.../builtin/bank_scoped_tenant.py: ApiKeySchemaTenantExtension (~170 lines)
hindsight-api-slim/tests/test_bank_scoped.py

Test plan
- run_migration works on first authenticated request
- public schema reads existing data correctly