feat(auth,prompts,inference): multi-tenancy MVP for MaaS deployments #5614

Draft
franciscojavierarceo wants to merge 3 commits into ogx-ai:main from franciscojavierarceo:worktree-multi-tenancy-mvp
@franciscojavierarceo franciscojavierarceo commented Apr 24, 2026

Summary

Implements multi-tenancy support (Phase 1 + Phase 2) for MaaS, llm-d, and vLLM deployments.

Phase 1: Identity & fairness

  • attribute_headers on UpstreamHeaderAuthConfig — maps multiple HTTP headers to attribute categories (e.g., X-MaaS-Group -> teams, X-MaaS-Subscription -> namespaces). Values merge with the existing attributes_header field. Enables MaaS Authorino integration where identity is spread across multiple upstream headers.
  • fairness_header_attribute on vLLM config — injects x-gateway-inference-fairness-id on outgoing API calls from the authenticated user's attributes. Used by llm-d EPP Flow Control for per-tenant fair scheduling. Implemented as a _get_extra_request_headers() hook on OpenAIMixin so the pattern is reusable by other providers.
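The Phase 1 flow can be sketched end to end: incoming headers are mapped into attribute categories, and one attribute is later echoed as the fairness ID on the outgoing call. This is a minimal illustration, not the code in this PR. `merge_attribute_headers`, `HeaderHookMixin`, and `FairnessAdapter` are hypothetical names, and comma-separated header values, passing the attributes as an argument to the hook, and using the first attribute value as the fairness ID are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


def merge_attribute_headers(
    request_headers: dict[str, str],
    attribute_headers: dict[str, str],
    base_attributes: Optional[dict[str, list[str]]] = None,
) -> dict[str, list[str]]:
    """Map header values into attribute categories and merge with existing attributes."""
    merged = {k: list(v) for k, v in (base_attributes or {}).items()}
    # HTTP header names are case-insensitive, so compare in lowercase.
    lowered = {k.lower(): v for k, v in request_headers.items()}
    for header_name, category in attribute_headers.items():
        raw = lowered.get(header_name.lower())
        if not raw:
            continue
        # Assumption: one header may carry several comma-separated values.
        for value in (part.strip() for part in raw.split(",")):
            if value and value not in merged.setdefault(category, []):
                merged[category].append(value)
    return merged


class HeaderHookMixin:
    """Hypothetical stand-in for OpenAIMixin: providers override the hook."""

    def _get_extra_request_headers(self, user_attributes: dict[str, list[str]]) -> dict[str, str]:
        return {}


@dataclass
class FairnessAdapter(HeaderHookMixin):
    """Hypothetical provider that injects the fairness header when configured."""

    fairness_header_attribute: Optional[str] = None

    def _get_extra_request_headers(self, user_attributes: dict[str, list[str]]) -> dict[str, str]:
        if not self.fairness_header_attribute:
            return {}
        values = user_attributes.get(self.fairness_header_attribute) or []
        if not values:
            return {}
        # Assumption: the first value of the category is used as the fairness ID.
        return {"x-gateway-inference-fairness-id": values[0]}
```

With `{"X-MaaS-Group": "teams", "X-MaaS-Subscription": "namespaces"}` as the mapping and `fairness_header_attribute="namespaces"`, a request carrying `X-MaaS-Subscription: gold` would yield an outgoing `x-gateway-inference-fairness-id: gold` header under these assumptions.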

Phase 2: Data isolation (KVStore -> AuthorizedSqlStore)

Migrated all remaining stateful resources from plain KVStore (no access control) to AuthorizedSqlStore (row-level ABAC via owner_principal + access_attributes):

| Resource | Before | After | Breaking change |
| --- | --- | --- | --- |
| Prompts | KVStore | AuthorizedSqlStore | Yes - recreate prompts |
| Connectors | KVStore | AuthorizedSqlStore | Yes - re-register connectors |
| Batches | KVStore (provider config) | AuthorizedSqlStore | Yes - in-progress batches lost |
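To make "row-level ABAC via owner_principal + access_attributes" concrete, here is a rough sketch of a per-row visibility check. It does not reproduce the policy that AuthorizedSqlStore actually applies. `Row`, `can_access`, and `visible_rows` are hypothetical, and the rule used here (the owner always sees the row; anyone else needs at least one matching value in every category the row lists; rows without attributes are owner-only) is an assumption chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Row:
    """Hypothetical stored record carrying the two access-control columns."""

    id: str
    owner_principal: str
    access_attributes: dict[str, list[str]] = field(default_factory=dict)


def can_access(row: Row, principal: str, user_attributes: dict[str, list[str]]) -> bool:
    """Return True if the caller may see this row under the assumed rule."""
    if row.owner_principal == principal:
        return True
    if not row.access_attributes:
        # Assumption: a row with no attributes is visible to its owner only.
        return False
    # Every category on the row must overlap with the caller's values.
    return all(
        set(required) & set(user_attributes.get(category, []))
        for category, required in row.access_attributes.items()
    )


def visible_rows(rows: list[Row], principal: str, user_attributes: dict[str, list[str]]) -> list[str]:
    """Filter rows the way a tenant-scoped list call would."""
    return [row.id for row in rows if can_access(row, principal, user_attributes)]
```

A plain KVStore has no equivalent of this filter, which is why a migrated resource must be recreated: the old records carry no owner or attribute columns to evaluate.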

Not migrated (with rationale)

| Resource | Reason |
| --- | --- |
| Vector store metadata | Routing table already enforces ABAC at the API boundary; KV metadata is defense-in-depth only |
| Distribution registry | Admin-only infrastructure, not user-facing data |
| Quota tracking | Per-client rate limiting, should NOT be tenant-scoped |
| Agent state persistence | Dead code (persistence_store initialized but never read/written) |

Design decisions

  1. _get_extra_request_headers() hook — added to OpenAIMixin rather than vLLM-specific injection. Any OpenAI-compatible provider can override it.
  2. set_default_version crash safety — new default set before clearing old defaults, so a crash leaves two defaults (recoverable) rather than zero (data loss).
  3. Distribution template updated: template.py default stores now use SqlStoreReference for prompts and connectors, ensuring all codegen'd distribution configs are consistent.
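The ordering argument in design decision 2 can be illustrated with a small in-memory model. This is a sketch under assumptions, not the implementation in core/prompts/prompts.py: the dictionary layout, the `crash_between_steps` flag, and `SimulatedCrash` are hypothetical and exist only to show the intermediate state.

```python
class SimulatedCrash(RuntimeError):
    """Hypothetical exception standing in for a process crash."""


def set_default_version(
    versions: dict[int, dict[str, bool]],
    new_default: int,
    crash_between_steps: bool = False,
) -> None:
    """Mark new_default as the default, writing the new flag before clearing old ones."""
    if new_default not in versions:
        raise KeyError(f"unknown version {new_default}")
    # Step 1: set the new default first.
    versions[new_default]["is_default"] = True
    if crash_between_steps:
        # A crash here leaves two defaults, never zero.
        raise SimulatedCrash("crashed after setting the new default")
    # Step 2: clear the flag on every other version.
    for version, record in versions.items():
        if version != new_default:
            record["is_default"] = False


def current_defaults(versions: dict[int, dict[str, bool]]) -> list[int]:
    """Return all versions currently flagged as default, in ascending order."""
    return sorted(v for v, record in versions.items() if record["is_default"])
```

If the steps ran in the opposite order, the same crash point would leave no default at all. With this order, retrying the call is enough to converge back to a single default.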

Files changed (27 files)

| Area | Files |
| --- | --- |
| Auth | core/datatypes.py, core/server/auth_providers.py |
| Prompts | core/prompts/prompts.py, core/storage/datatypes.py, core/stack.py |
| Connectors | core/connectors/connectors.py |
| Batches | providers/inline/batches/reference/{__init__,batches,config}.py |
| Inference | providers/remote/inference/vllm/{config,vllm}.py, providers/utils/inference/openai_mixin.py |
| Templates | distributions/template.py |
| Distro configs | 9 YAML files across ci-tests, starter, nvidia, oci, open-benchmark, watsonx |
| Docs | docs/docs/providers/{batches/inline_reference,inference/remote_vllm}.mdx |
| Tests | 9 test files |

Test plan

  • 389 unit tests pass across all changed modules
  • All 40 pre-commit hooks pass
  • Integration tests (replay): uv run --no-sync ./scripts/integration-tests.sh --stack-config server:ci-tests --setup gpt --suite responses
  • Verify attribute_headers with MaaS Authorino headers
  • Verify fairness_header_attribute sends header to llm-d EPP

Generated with Claude Code

franciscojavierarceo and others added 2 commits April 23, 2026 23:02
…oyments

Add multi-header identity mapping for upstream gateway auth (attribute_headers),
migrate prompts from KVStore to AuthorizedSqlStore for tenant-scoped access
control, and add llm-d fairness header propagation through a per-request
header hook in OpenAIMixin.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
…rness header tests

Reorder set_default_version to set the new default before clearing old ones,
preventing a crash from leaving zero defaults. Add unit tests for the vLLM
fairness header injection via _get_extra_request_headers covering all code paths.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

mergify Bot commented Apr 24, 2026

This pull request has merge conflicts that must be resolved before it can be merged. @franciscojavierarceo please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify Bot added the needs-rebase label Apr 24, 2026
… to AuthorizedSqlStore

Migrate connectors service and batches provider from plain KVStore to
AuthorizedSqlStore for Phase 2 multi-tenancy. Both now have row-level
access control via owner_principal and access_attributes columns.

Breaking change: existing KV-stored connectors and batch state must be
recreated. Distribution configs updated to reference sql_default backend.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>