feat: API key schema isolation — database-level tenant separation by salvormallow · Pull Request #855 · vectorize-io/hindsight

salvormallow · 2026-04-03T04:48:08Z

Summary

Adds ApiKeySchemaTenantExtension — a built-in tenant extension that maps API keys to isolated PostgreSQL schemas, providing database-level memory isolation between tenants. Follows the same pattern as SupabaseTenantExtension but uses static API key mapping instead of JWT auth.

Threat model: prompt injection against AI agents

AI agents execute tool calls — including Hindsight recall, retain, and reflect — based on conversation content. A prompt injection delivered via chat message, email, or web search result can trick an agent into querying another tenant's memory banks.

Example attack:

1. Attacker sends a message to Agent A containing a crafted prompt injection
2. The injection tricks Agent A into calling hindsight recall --bank tenant-b-bank --query "private data"
3. Without schema isolation, this succeeds: both banks are in the same schema, and the single API key grants access to everything
4. Agent A returns Tenant B's private data in its response

Why application-layer bank filtering isn't enough:

RequestContext.allowed_bank_ids exists on the model but is not enforced by the engine. An OperationValidatorExtension could check it, but:

Requires configuring two extensions correctly (tenant + validator) — configuring only one gives a false sense of security
Fail-open: if allowed_bank_ids is None (the default), all access is granted
Internal operations skip tenant auth, so allowed_bank_ids is never set for background tasks
A single missed code path in the engine bypasses the check entirely
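
To make the fail-open point concrete, here is a minimal sketch of what an application-layer check tends to look like. The RequestContext class below is a hypothetical stand-in (only the allowed_bank_ids field name comes from this PR), and is_bank_allowed is an illustrative helper, not a function in the Hindsight codebase.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RequestContext:
    # Hypothetical stand-in for the real model; only the field name is from the PR.
    allowed_bank_ids: Optional[List[str]] = None


def is_bank_allowed(ctx: RequestContext, bank_id: str) -> bool:
    """Illustrative application-layer check with fail-open semantics."""
    # A missing allow-list is treated as "no restriction", so any caller that
    # never populates the field (e.g. a background task) passes the check.
    if ctx.allowed_bank_ids is None:
        return True
    return bank_id in ctx.allowed_bank_ids


# Context built without tenant auth: access to another tenant's bank is granted.
unscoped = RequestContext()
# Context with an explicit allow-list: the same request is denied.
scoped = RequestContext(allowed_bank_ids=["tenant-a-bank"])
```

The check only protects requests where every layer remembered to set the allow-list, which is the weakness the list above describes.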

Why schema isolation works:

The API key determines the PostgreSQL schema at authentication time, before any bank lookup or query executes. The SQL itself is scoped via fully-qualified table names. Even a fully compromised agent can only access banks within its assigned schema. Banks from other schemas don't exist in its view of the database.

Attacker → prompt injection → Agent A → hindsight recall --bank tenant_b_bank
                                              ↓
                                API key resolves to schema "tenant_a"
                                              ↓
                                tenant_b_bank doesn't exist in this schema
                                              ↓
                                empty results → attack fails
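
The flow above can be sketched as schema-qualified SQL. This is an illustration under stated assumptions, not the engine's actual query builder: the table name banks, the column bank_id, and both helper functions are hypothetical, and the $1 placeholder style depends on the database driver in use.

```python
def quote_ident(ident: str) -> str:
    """Quote a SQL identifier, doubling any embedded double quotes."""
    return '"' + ident.replace('"', '""') + '"'


def recall_query(schema: str) -> str:
    """Build a lookup whose table reference is fully qualified by schema.

    The schema comes from the authenticated API key, never from the request
    body, so the bank name supplied by the agent cannot change which schema
    is searched.
    """
    table = f"{quote_ident(schema)}.{quote_ident('banks')}"  # hypothetical table
    return f"SELECT * FROM {table} WHERE bank_id = $1"


# Agent A's key resolves to tenant_a; asking for tenant_b_bank still runs
# against tenant_a's tables and matches no rows there.
sql = recall_query("tenant_a")
```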

How it works

Operator configures key-to-schema mapping via environment variable
Each request authenticates by API key → resolves to a dedicated PostgreSQL schema
All database operations are scoped to that schema
Schemas are auto-created with full table migrations on first access (same as SupabaseTenantExtension)
No separate validator extension needed — isolation is in the database
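
The authentication step in the list above can be sketched roughly as follows. The function name resolve_schema, the AuthenticationError class, and the error messages are assumptions for illustration; the actual extension returns a TenantContext and may be structured differently.

```python
import hmac
from typing import Dict, Optional


class AuthenticationError(Exception):
    """Hypothetical error type raised when a request cannot be authenticated."""


def resolve_schema(api_key: Optional[str], key_map: Dict[str, str]) -> str:
    """Map an API key to its single PostgreSQL schema, failing closed."""
    if not api_key:
        raise AuthenticationError("Missing API key")
    for known_key, schema in key_map.items():
        # Constant-time comparison avoids leaking key contents through timing.
        if hmac.compare_digest(api_key.encode(), known_key.encode()):
            return schema
    # Unknown keys never fall back to a default schema.
    raise AuthenticationError("Invalid API key")


key_map = {"team_a_key": "team_a", "team_b_key": "team_b"}
schema = resolve_schema("team_a_key", key_map)
```

Unlike the allowed_bank_ids check, there is no branch here that grants access when information is missing.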

Configuration

HINDSIGHT_API_TENANT_EXTENSION=hindsight_api.extensions.builtin.bank_scoped_tenant:ApiKeySchemaTenantExtension
HINDSIGHT_API_TENANT_KEY_MAP=team_a_key:team_a;team_b_key:team_b

# Optional: prefix all schema names
HINDSIGHT_API_TENANT_SCHEMA_PREFIX=hs    # team_a becomes hs_team_a

# Optional: disable auth for MCP endpoints (falls back to default schema)
HINDSIGHT_API_TENANT_MCP_AUTH_DISABLED=true
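
For readers unfamiliar with the key map format, the sketch below shows one way the key:schema;key:schema string and the optional prefix could be parsed. parse_key_map is a hypothetical helper, and it assumes API keys do not themselves contain a colon; the extension's real parser may handle edge cases differently.

```python
from typing import Dict


def parse_key_map(raw: str, prefix: str = "") -> Dict[str, str]:
    """Parse 'key:schema;key:schema' into a dict, applying an optional prefix."""
    mapping: Dict[str, str] = {}
    for entry in raw.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing semicolon
        key, sep, schema = entry.partition(":")
        key, schema = key.strip(), schema.strip()
        if not sep or not key or not schema:
            raise ValueError(f"Malformed key map entry: {entry!r}")
        # With prefix "hs", schema "team_a" becomes "hs_team_a".
        mapping[key] = f"{prefix}_{schema}" if prefix else schema
    return mapping


key_map = parse_key_map("team_a_key:team_a;team_b_key:team_b", prefix="hs")
```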

Design decisions

Opt-in, zero breaking changes. If HINDSIGHT_API_TENANT_EXTENSION is not set, Hindsight uses DefaultTenantExtension — identical to current behavior. Existing deployments are unaffected.

One key = one schema. Each API key maps to exactly one PostgreSQL schema. A single key cannot access multiple schemas. This is intentional: one key = one blast radius. The TenantContext returns a single schema_name, and the engine scopes all queries to it. Cross-schema queries are not possible without direct Postgres access.

Admin access. There is no "superuser key" that spans all schemas. Operators who need cross-tenant visibility should query Postgres directly or use separate keys per schema. This is a conscious trade-off: admin convenience vs. the guarantee that no single compromised key grants access to all tenants.

MCP auth disabled = default schema only. When mcp_auth_disabled=true (set via HINDSIGHT_API_TENANT_MCP_AUTH_DISABLED=true), MCP requests fall back to the default schema (from HINDSIGHT_API_DATABASE_SCHEMA), not a tenant schema.

Schema name validation. Schema names must be valid Postgres identifiers (letters, digits, underscores). Hyphens, spaces, and names starting with digits are rejected at startup.
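
A minimal sketch of the validation rule described above, assuming a simple regular-expression check. validate_schema_name is a hypothetical name, and the real extension may apply additional constraints (for example, PostgreSQL's identifier length limit).

```python
import re

# Letters, digits, and underscores only; must not start with a digit.
_SCHEMA_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def validate_schema_name(name: str) -> str:
    """Return the name unchanged if valid, otherwise raise ValueError."""
    if not _SCHEMA_NAME_RE.fullmatch(name):
        raise ValueError(f"Invalid schema name: {name!r}")
    return name


validated = validate_schema_name("hs_team_a")
```

Rejecting bad names at startup keeps a typo in the key map from surfacing later as a runtime SQL error.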

Why not allowed_bank_ids + OperationValidatorExtension? See threat model above. Application-layer checks are defense-in-depth, not a security boundary. Schema isolation moves the enforcement into the database where it can't be bypassed by missed code paths.

Files changed

File	Description
`hindsight-api-slim/.../builtin/bank_scoped_tenant.py`	`ApiKeySchemaTenantExtension` (~170 lines)
`hindsight-api-slim/tests/test_bank_scoped.py`	20 unit tests + prompt injection defense tests

Test plan

Adds ApiKeySchemaTenantExtension: maps API keys to isolated PostgreSQL schemas, providing database-level memory isolation between tenants.

Threat model: prompt injection against AI agents. Agents execute tool calls based on conversation content. A prompt injection can trick an agent into querying another tenant's banks. Schema isolation scopes all SQL to the authenticated schema; banks from other schemas don't exist.

Configuration:

HINDSIGHT_API_TENANT_EXTENSION=...bank_scoped_tenant:ApiKeySchemaTenantExtension
HINDSIGHT_API_TENANT_KEY_MAP=key_a:schema_a;key_b:schema_b

Follows the SupabaseTenantExtension pattern. Opt-in, zero breaking changes. Includes 20 tests.

salvormallow · 2026-04-03T08:44:17Z

Dashboard caveat

When bank_scoped_tenant is active, the control plane dashboard shows no banks because it calls the dataplane API without an API key.

Root cause: hindsight-client.ts reads HINDSIGHT_CP_DATAPLANE_API_KEY at startup. Without it, every SDK call returns Authentication failed: Missing API key. The dashboard's /api/banks route catches this and returns {"error":"Failed to fetch banks from API"}.

Workaround: Set HINDSIGHT_CP_DATAPLANE_API_KEY to one of the tenant API keys from your HINDSIGHT_API_TENANT_KEY_MAP. The dashboard will show that tenant's banks. Example:

HINDSIGHT_CP_DATAPLANE_API_KEY=<one-of-your-tenant-keys>

Longer-term: The dashboard should support multi-tenant awareness — a tenant selector that switches which API key is used for dataplane calls. I'm working on a follow-up PR for this.

salvormallow wants to merge 1 commit into vectorize-io:main from salvormallow:feat/bank-scoped-access-control
