RFC: OAuth 2.0 Support for OpenSearch
Status: Proposed
What are you proposing?
Add OAuth 2.0 support to OpenSearch via an authentication proxy, enabling secure machine-to-machine access, scoped API tokens, and third-party integrations. The proxy validates OAuth tokens (JWT), maps scopes to OpenSearch security roles, and forwards requests to the engine and Dashboards — with zero changes to existing components. This is the foundational layer that unlocks AI agent access, CI/CD automation, collaboration tool integration, and multi-tenant SaaS patterns for OpenSearch.
What users have asked for this feature?
OpenSearch has foundational auth primitives but lacks the developer experience layer that competitors offer for programmatic and machine-to-machine access:
OpenSearch already supports OIDC authentication and has a Service Account primitive (originally built for the extensions project). API Keys with direct permission scoping are in active development targeting 3.7. The gap is not in authentication primitives — it's in the developer experience layer: OAuth app registration, scoped token issuance via standard flows, unified auth across engine and Dashboards, and token governance at enterprise scale.
Relationship to Existing Capabilities
This RFC builds on — not replaces — existing OpenSearch security features:
API Keys (PR #5443): opaque tokens with cluster_permissions and index_permissions attached directly to the token. 🔄 In progress for 3.7. Engine-only (doesn't cover Dashboards); admin-only issuance in V1; no OIDC federation, no governance UI, no Cedar policies.
The OAuth proxy is the layer that connects these primitives to the outside world — it federates external OIDC tokens, provides unified auth across engine and Dashboards, and adds the governance/management layer that enterprises need. With API Keys landing in 3.7, the proxy becomes thinner: instead of mapping to pre-created backend users, it can programmatically issue API Keys via /_plugins/_security/api/apitokens with the exact permissions needed.
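For illustration, a hypothetical issuance call (the exact API Keys request schema is still under review in PR #5443, so the field names below are assumptions based on its description):
```sh
curl -XPOST "https://localhost:9200/_plugins/_security/api/apitokens" \
  -u admin:admin \
  -H "Content-Type: application/json" \
  -d '{
    "name": "ai-agent-logs-reader",
    "cluster_permissions": [],
    "index_permissions": [{
      "index_patterns": ["logs-*"],
      "allowed_actions": ["read"]
    }]
  }'
```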
Community signals:
Stack Overflow: multiple questions about "OpenSearch API token" and "OpenSearch service account" with no good answers
The rise of AI coding agents (Claude Code, Cursor, GitHub Copilot) has created urgent demand for secure, scoped, machine-to-machine auth to data stores — OpenSearch has no answer today
What problems are you trying to solve?
When building an AI agent that queries logs, a developer wants to grant the agent read-only access to specific indices with an expiring token, so they don't have to share admin credentials that could be leaked or misused.
When deploying dashboards across environments, a DevOps engineer wants to authenticate CI/CD pipelines with scoped service accounts, so they don't have to store admin passwords in CI secrets with full cluster access.
When integrating OpenSearch alerts with Slack, a platform engineer wants to use standard OAuth to connect services, so they don't have to build fragile webhook workarounds with embedded credentials.
When building a multi-tenant SaaS application, a backend developer wants to issue per-customer scoped tokens, so they can isolate tenant data access and revoke individual customers without affecting others.
When querying OpenSearch from an IDE or CLI, a developer wants to authenticate once via browser-based OAuth flow and get an auto-expiring token, so they don't have to copy-paste credentials that never expire.
When forwarding logs via Fluent Bit or OpenTelemetry, an infrastructure engineer wants to give each pipeline a write-only token scoped to its target index, so they limit blast radius if a pipeline credential is compromised.
When managing multiple OpenSearch clusters, a platform team wants to use one identity provider with different scoped tokens per cluster, so they don't have to manage separate credentials for each environment.
What is the developer experience going to be?
REST API
The OAuth proxy introduces the following endpoints:
Token management:
POST /oauth/token — Issue a new token (client credentials flow)
DELETE /oauth/token/{token_id} — Revoke a token
GET /oauth/tokens — List active tokens
GET /oauth/token/{token_id} — Get token details
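Token issuance example (client credentials flow; the response shape shown is an assumption based on standard OAuth 2.0 token responses):
```sh
curl -XPOST "https://opensearch.example.com:8443/oauth/token" \
  -d "grant_type=client_credentials" \
  -d "client_id=ci-pipeline" \
  -d "client_secret=..." \
  -d "scope=read:logs-*"
# => {"access_token": "eyJhbGciOi...", "token_type": "Bearer", "expires_in": 3600}
```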
Using the token:
```sh
# Search via OAuth proxy — token scoped to read:logs-* only
curl -H "Authorization: Bearer eyJhbGciOi..." \
  -H "Content-Type: application/json" \
  "https://opensearch.example.com:8443/logs-*/_search" \
  -d '{"query": {"match": {"level": "error"}}}'
```
CLI tool:
```sh
# Browser-based login (authorization code flow)
$ opensearch-auth login --provider keycloak
🌐 Opening browser for authentication...
✅ Authenticated as developer@example.com
   Token stored in ~/.opensearch/token (expires in 8h)
   Scopes: read:logs-*, read:metrics-*

# Create a service account token (client credentials)
$ opensearch-auth create-token --scopes "read:logs-*" --expires 24h
✅ Token created: tok_abc123 (expires 2024-01-16T10:00:00Z)

# Revoke a token
$ opensearch-auth revoke-token --token-id tok_abc123
✅ Token revoked

# Check status
$ opensearch-auth status
✅ Authenticated as developer@example.com
   Provider: keycloak | Expires: 6h remaining
   Scopes: read:logs-*, read:metrics-*
```
Impact to existing APIs: None. All existing OpenSearch and Dashboards APIs continue to work unchanged. The proxy is an additive component — clients that don't use OAuth bypass it entirely.
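Configuration (YAML): a minimal sketch of the proxy configuration. The keys and structure shown here are illustrative assumptions, not a finalized schema:
```yaml
# oauth-proxy.yaml (hypothetical schema)
listen: ":8443"
upstreams:
  engine: "https://opensearch:9200"
  dashboards: "http://dashboards:5601"
oidc:
  issuer: "https://keycloak.example.com/realms/opensearch"
  # JWKS endpoint is auto-discovered from the issuer metadata
scope_mappings:
  - scope: "read:logs-*"
    backend_role: "logs_reader"
  - scope: "write:dashboards"
    backend_role: "dashboard_publisher"
```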
Are there any security considerations?
Yes — this feature is fundamentally about security:
Token security — JWT tokens are signed by the OIDC provider and validated by the proxy via JWKS (see the sketch after this list). Tokens carry an expiry and scopes, and can be individually revoked.
Scope enforcement — OAuth scopes are mapped to OpenSearch security roles. The proxy never grants more access than the mapped role allows.
No bypass — clients going through the proxy cannot access the engine directly (network policy enforced). Clients not using OAuth continue to authenticate directly with existing methods.
Audit trail — every proxied request is logged with: client_id, user_id (if delegated), scopes, action, target index, timestamp.
Integration with security plugin — the proxy maps tokens to existing security plugin users/roles. FGAC (field-level, document-level security) and workspace ACL continue to apply as additive restrictions.
Token storage — with API Keys (PR #5443), tokens live in the engine's .opensearch_security_api_tokens system index and the proxy is stateless; in the pre-3.7 fallback mode, proxy-issued tokens are stored in an OpenSearch system index (.oauth-tokens) with encryption at rest.
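The validation-and-mapping step, sketched in Python with PyJWT for brevity (the proxy itself is planned in Go); the URLs and the mapping table are assumptions:
```python
import jwt
from jwt import PyJWKClient

# JWKS URL would be discovered from the OIDC provider's metadata; hardcoded here for brevity
jwks = PyJWKClient("https://keycloak.example.com/realms/opensearch/protocol/openid-connect/certs")

SCOPE_TO_ROLE = {"read:logs-*": "logs_reader"}  # assumed scope-to-role config

def validate_and_map(raw_token: str) -> list[str]:
    # Verify the signature against the provider's published keys, plus expiry and issuer
    key = jwks.get_signing_key_from_jwt(raw_token)
    claims = jwt.decode(
        raw_token, key.key, algorithms=["RS256"],
        issuer="https://keycloak.example.com/realms/opensearch",
        options={"verify_aud": False},  # audience handling is an open design detail
    )
    # Map each granted scope to an OpenSearch backend role; unknown scopes grant nothing
    scopes = claims.get("scope", "").split()
    roles = [SCOPE_TO_ROLE[s] for s in scopes if s in SCOPE_TO_ROLE]
    if not roles:
        raise PermissionError("no recognized scopes on token")
    return roles
```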
How the authorization layers compose:
```mermaid
graph TB
    subgraph "Authorization Layers (each narrows access)"
        A["OAuth Scope<br/>Token can access logs-*"] --> B["FGAC (Security Plugin)<br/>User can read logs-*, field masking on PII"]
        B --> C["Workspace ACL<br/>User sees 'observability' workspace only"]
        C --> D["✅ Final: Read logs-*, PII masked,<br/>observability workspace only"]
    end
    style A fill:#e1f5fe
    style B fill:#fff3e0
    style C fill:#f3e5f5
    style D fill:#e8f5e9
```
Are there any breaking changes to the API?
No. Zero breaking changes.
All existing OpenSearch REST APIs remain unchanged
All existing Dashboards APIs remain unchanged
All existing authentication methods (basic auth, SAML, OIDC) continue to work
The OAuth proxy is a new, optional component — it does not modify or replace any existing functionality
Clients that don't use OAuth are completely unaffected
What is the user experience going to be?
Use Case 1: AI Agents (Claude Code, Cursor, Cody, Custom Agents)
AI coding agents need to query OpenSearch for log analysis, search, and observability. OAuth provides scoped, auditable, revocable access.
Example: LangChain agent with OAuth
```python
from langchain.agents import Tool
from opensearchpy import OpenSearch
import requests

# Get scoped OAuth token
token_response = requests.post("https://keycloak.example.com/token", data={
    "grant_type": "client_credentials",
    "client_id": "langchain-agent",
    "client_secret": "...",
    "scope": "read:logs-*"
})
token = token_response.json()["access_token"]

# Connect to OpenSearch via OAuth proxy
client = OpenSearch(
    hosts=[{"host": "opensearch.example.com", "port": 8443}],
    headers={"Authorization": f"Bearer {token}"},
    use_ssl=True
)

# Agent can search logs but CANNOT delete indices or access other data
def search_logs(query: str) -> str:
    results = client.search(index="logs-*", body={
        "query": {"query_string": {"query": query}},
        "size": 10
    })
    return str(results["hits"]["hits"])

tools = [Tool(name="search_logs", func=search_logs, description="Search OpenSearch logs")]
```
Example: Claude Code / Cursor MCP config
```json
{
  "mcpServers": {
    "opensearch": {
      "command": "opensearch-mcp-server",
      "env": {
        "OPENSEARCH_URL": "https://opensearch.example.com:8443",
        "OPENSEARCH_OAUTH_TOKEN": "eyJhbGciOi..."
      }
    }
  }
}
```
Sequence diagram:
```mermaid
sequenceDiagram
    participant Dev as Developer
    participant Agent as AI Agent<br/>(Claude Code)
    participant MCP as OpenSearch<br/>MCP Server
    participant Proxy as OAuth Proxy
    participant OIDC as OIDC Provider<br/>(Keycloak)
    participant OS as OpenSearch<br/>Engine
    Dev->>Agent: "Find error patterns in logs"
    Agent->>MCP: search(index="logs-*", query="level:error")
    Note over MCP,OIDC: First request — get token
    MCP->>OIDC: POST /token (client_credentials, scope=read:logs-*)
    OIDC-->>MCP: access_token (JWT, expires 1h)
    MCP->>Proxy: GET /logs-*/_search<br/>Authorization: Bearer <token>
    Proxy->>OIDC: Fetch JWKS (cached)
    Proxy->>Proxy: Validate JWT + extract scopes
    Proxy->>Proxy: Map scope "read:logs-*" → role "logs_reader"
    Proxy->>OS: GET /logs-*/_search<br/>(as internal user "logs_reader")
    OS-->>Proxy: Search results (247 hits)
    Proxy-->>MCP: Search results
    MCP-->>Agent: Formatted results
    Agent-->>Dev: "Found 247 errors. Top services:<br/>payment-api (102), auth-service (89)..."
```
Use Case 3: Enterprise CI/CD Pipelines
Sequence diagram:
```mermaid
sequenceDiagram
    participant Dev as Developer
    participant GH as GitHub
    participant CI as GitHub Actions
    participant Proxy as OAuth Proxy
    participant OIDC as OIDC Provider
    participant Stage as Staging OpenSearch
    participant Prod as Production OpenSearch
    Dev->>GH: git push (dashboards/*.ndjson)
    GH->>CI: Trigger workflow
    CI->>OIDC: POST /token (scope=write:dashboards)
    OIDC-->>CI: access_token
    CI->>Proxy: POST /api/saved_objects/_import (staging)
    Proxy->>Stage: Import dashboard
    Stage-->>CI: 200 OK ✅
    CI->>Proxy: POST /api/saved_objects/_import (prod)
    Proxy->>Prod: Import dashboard
    Prod-->>CI: 200 OK ✅
    CI->>Dev: Slack: "Dashboard deployed to prod"
```
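Use Case 5: Multi-Tenant SaaS
Example: per-tenant scoped tokens. A sketch of how a SaaS backend could issue a token scoped to a single customer's indices (shown with Python/FastAPI; the endpoint names, scope format, and index naming are illustrative assumptions):
```python
import requests
from fastapi import FastAPI

app = FastAPI()

def tenant_token(tenant_id: str) -> str:
    # Each tenant gets a token scoped to its own index pattern only
    resp = requests.post("https://keycloak.example.com/token", data={
        "grant_type": "client_credentials",
        "client_id": "saas-backend",
        "client_secret": "...",
        "scope": f"read:tenant-{tenant_id}-*",  # hypothetical scope format
    })
    return resp.json()["access_token"]

@app.get("/search/{tenant_id}")
def search(tenant_id: str, q: str):
    # Even if this token leaks, it can only read this tenant's indices,
    # and it can be revoked without affecting other customers
    r = requests.get(
        f"https://opensearch.example.com:8443/tenant-{tenant_id}-logs/_search",
        headers={"Authorization": f"Bearer {tenant_token(tenant_id)}"},
        params={"q": q},
    )
    return r.json()
```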
Use Case 6: Observability Pipelines
Example: Fluent Bit with scoped write token
```ini
[OUTPUT]
    Name   opensearch
    Match  *
    Host   opensearch.example.com
    Port   8443
    TLS    On
    Header Authorization Bearer eyJhbGciOi...
    Index  app-logs
# Token scope: write:app-logs-* — cannot read, cannot access other indices
```
Example: OpenTelemetry Collector
```yaml
exporters:
  opensearch:
    http:
      endpoint: "https://opensearch.example.com:8443"
      headers:
        Authorization: "Bearer ${OPENSEARCH_OAUTH_TOKEN}"
    traces_index: "otel-traces"
    logs_index: "otel-logs"
    # Token scope: write:otel-* — isolated from application indices
```
Before vs after:
Before: Every pipeline uses admin basic auth → full access to everything
After: Fluent Bit → write:app-logs-* (can only write app logs)
OTel → write:otel-* (can only write traces/metrics)
Logstash → write:logstash-* (can only write its indices)
Each pipeline is isolated. Compromised Fluent Bit can't read OTel data.
Are there breaking changes to the User Experience?
No. The OAuth proxy is entirely opt-in:
| Scenario | Impact |
|---|---|
| Existing users hitting OpenSearch directly | ❌ No change |
| Existing OSD users logging in via SAML/OIDC | ❌ No change |
| Existing basic auth scripts | ❌ No change |
| New agent/CI/CD wanting OAuth | ✅ Point at proxy endpoint |
Why should it be built? Any reason not to?
Why build it:
Dashboards auth gap — The engine security plugin (including the upcoming API Keys) only covers the OpenSearch engine. OpenSearch Dashboards has its own API surface — saved objects, workspace management, UI settings, visualization export/import — that sits outside the engine's security scope. An AI agent or CI/CD pipeline that needs to import dashboards, manage workspaces, or interact with Dashboards APIs cannot authenticate with engine-level API Keys alone. The proxy provides a single authenticated entry point for the full platform. This gap widens as Dashboards evolves into a richer application layer.
Token governance at enterprise scale — Engine API Keys (PR #5443) provide the primitive to create and delete tokens, but V1 is admin-only issuance with no governance layer. Enterprises need: delegated issuance (team leads creating tokens for their scope), consent/approval workflows, centralized visibility across all tokens org-wide, automated rotation, bulk revocation per compromised client, and a full audit trail. This is critical for regulated industries (finance, healthcare, government) where "who has access to what and when was it granted" must be auditable.
OIDC federation — Enterprises with existing Keycloak/Auth0/Okta deployments want to use their existing identity infrastructure to access OpenSearch without creating separate API Keys. The proxy bridges external OIDC JWTs to engine-native credentials.
AI agent enablement — the AI agent ecosystem (Claude Code, Cursor, MCP, LangChain) is exploding. These tools need secure, scoped, machine-to-machine auth. Building OAuth makes OpenSearch AI-agent-ready with standard protocols.
Competitive parity — Grafana, Datadog, Splunk, and Elastic all have the full stack: API keys + OAuth apps + governance UI. OpenSearch has the engine primitive coming (API Keys) but not the developer experience layer.
Foundation for everything else — Slack integration, GitHub integration, CI/CD automation, multi-tenant SaaS — all converge on OAuth. Building this once unlocks all of them.
Reasons not to build:
Basic auth workaround exists — users can technically use basic auth for programmatic access, though it's insecure and unscoped.
Proxy adds complexity — another component to deploy and monitor. Mitigated by making it optional and lightweight.
Scope-to-role mapping is coarse — without native engine support, the proxy maps tokens to pre-defined backend users. Fine-grained per-token permissions require future engine integration.
Impact if not built: OpenSearch continues to fall behind competitors in enterprise and AI agent adoption. Users choose Grafana or Elastic for programmatic workflows. The gap widens as AI agent usage grows.
What will it take to execute?
Architecture
```mermaid
graph TB
    subgraph Clients
        A[AI Agent<br/>Claude Code / Cursor]
        B[CI/CD Pipeline<br/>GitHub Actions / Terraform]
        C[Slack Bot]
        D[CLI / IDE Plugin]
    end
    subgraph "OAuth Proxy (Go)"
        E[JWT Validation<br/>JWKS Auto-Discovery]
        F[Scope → Role Mapping]
        G[Cedar Policy Engine<br/>Optional]
        H[API Key Lifecycle<br/>via Engine REST API]
        I[Prometheus /metrics]
    end
    subgraph "OpenSearch Engine"
        J[Engine API<br/>Search / Index / Admin]
        L[Security Plugin<br/>FGAC / Workspace ACL]
        P[API Keys<br/>PR #5443 — 3.7]
    end
    subgraph "OpenSearch Dashboards"
        K[Dashboards API<br/>Saved Objects / Workspaces / UI]
    end
    subgraph "OIDC Providers"
        M[Keycloak]
        N[Auth0 / Okta]
        O[Dex]
    end
    A -->|Bearer Token| E
    B -->|Bearer Token| E
    C -->|Bearer Token| E
    D -->|Bearer Token| E
    E --> F
    F --> G
    F -->|Mapped Credentials<br/>or API Key| J
    F -->|Mapped Credentials| K
    J --> L
    L --> P
    E -.->|JWKS Fetch| M
    E -.->|JWKS Fetch| N
    E -.->|JWKS Fetch| O
    H -->|POST /_plugins/_security/api/apitokens| P
```
Key change from original design: The proxy delegates token issuance to the engine's native API Keys (PR #5443) rather than maintaining its own token store. This makes the proxy stateless and thinner. The proxy's unique value is: (1) sitting in front of both engine and Dashboards, and (2) bridging external OIDC tokens to engine-native API Keys.
Implementation stack
| Layer | Choice | Why |
|---|---|---|
| Language | Go | Fast, small binary, standard for proxies (Envoy, Traefik, CoreDNS) |
| OAuth/JWT | golang-jwt/jwt + OIDC discovery | Standard JWT validation, auto-fetch JWKS |
| Cedar | cedar-go | Local fine-grained policy evaluation (Apache 2.0) |
| Config | YAML | Scope-to-role mappings, upstream endpoints |
| Deployment | Docker container / sidecar | Runs alongside OpenSearch |
| Distribution | GitHub repo + Docker Hub | Standard open source |
Phased delivery
Phase 0: API Keys (prerequisite): engine-native API Keys with direct permission scoping (PR #5443). Lands in OpenSearch 3.7 (parallel, owned by the security plugin team)
Weeks 1-2: OAuth proxy MVP (JWT validation, OIDC federation, forwarding to engine + Dashboards)
Weeks 3-4: CLI tool + API Key lifecycle integration (create/revoke via engine API)
Weeks 5-6: Testing, docs, Docker, open source release
Weeks 7-10: OSD plugin (token management UI, consent screen, governance dashboard)
Weeks 11-14: Cedar integration, policy evaluation
Total: ~3.5 months (AI-first) vs ~9 months (traditional) — Phase 0 runs in parallel.
Assumptions and constraints
Proxy approach chosen over engine modification — faster to ship, no engine changes, independent release cycle. The proxy is complementary to the engine's API Keys (PR #5443) — it consumes API Keys as the engine-level primitive and adds OIDC federation, Dashboards coverage, and governance on top.
API Keys as the engine primitive — with API Keys landing in 3.7, the proxy delegates token creation to /_plugins/_security/api/apitokens rather than maintaining its own token store. This makes the proxy stateless. Before API Keys land, the proxy falls back to mapping OAuth scopes to pre-created backend users/roles.
Dashboards API coverage — the engine security plugin (including API Keys) only covers the OpenSearch engine. Dashboards has its own API surface (saved objects, workspaces, UI settings) that is not covered by engine-level auth. The proxy sits in front of both, providing unified auth for the full platform.
OIDC provider required — users must run an OIDC-compliant provider (Keycloak, Auth0, Okta, Dex). The proxy does not include a built-in identity provider.
Extra network hop — the proxy adds latency (~1-5ms per request). Acceptable for most use cases; high-throughput search workloads may want to bypass the proxy for internal traffic.
Any remaining open questions?
Built-in token issuer vs external-only? — Resolved: with API Keys (PR #5443) landing in the engine, the proxy delegates token issuance to the engine's /_plugins/_security/api/apitokens endpoint. No built-in issuer needed.
Scope naming convention — What should the standard scope format be? Options: read:index-pattern, action:resource, role:role-name. Need community input.
Token storage backend — Resolved: API Keys are stored in the engine's .opensearch_security_api_tokens system index. The proxy is stateless — no separate token store needed.
WebSocket support — Dashboards uses WebSockets for real-time features. How does the proxy handle WebSocket upgrade with OAuth tokens?
Rate limiting — Should the proxy include built-in rate limiting per token/client, or defer to external tools (Envoy, API Gateway)?
API Keys API surface alignment — Does the API Keys REST API support filtering by created_by? The proxy needs this to manage tokens per OAuth client. Does it support custom metadata fields for audit trail (e.g., client_id, oauth_provider)?
Cedar policy management UX — How do users author and test Cedar policies? CLI-only, or a visual editor in Dashboards?
Multi-cluster token federation — Can a token issued for one cluster be used across multiple clusters? What's the trust model?
Dashboards auth integration — How does the proxy authenticate requests to Dashboards APIs that expect session cookies? Does it need to establish a Dashboards session on behalf of the OAuth client, or can Dashboards be extended to accept bearer tokens directly?
C. Design principles
Zero breaking changes — existing auth methods continue to work. OAuth is opt-in.
No engine modifications — implemented as a proxy, not a security plugin change.
Works with any OIDC provider — Keycloak, Auth0, Okta, Dex, or any compliant provider.
Open source — Apache 2.0 licensed, community-driven.
D. Phase 2: OSD Plugin details
| Component | Description |
|---|---|
| Token management | Create, revoke, list tokens in Dashboards UI |
| Consent screen | "Grant Agent X access to logs-*?" approval flow |
| Scope/role mapping admin | Visual editor for scope-to-role configuration |
| Config storage | Stored in OpenSearch system index, proxy reads dynamically |
E. Phase 3: Cedar Policy Engine details
Replace static scope-to-role mapping with fine-grained Cedar policies, evaluated locally (Apache 2.0, no external service dependency):
```cedar
// Agent can read logs indices
permit(
  principal == Agent::"my-ai-agent",
  action in [Action::"search", Action::"get"],
  resource in Index::"logs-*"
);

// CI/CD can manage dashboards but not delete indices
permit(
  principal in Group::"cicd-service-accounts",
  action in [Action::"create", Action::"update"],
  resource in ResourceType::"dashboard"
);

// Deny all agents from accessing PII indices
forbid(
  principal in Group::"agents",
  action,
  resource in Index::"pii-*"
);
```