feat: Enterprise Identity — Keycloak SSO, org structure, per-project RBAC, channel pairing #565
Summary
GoClaw's multi-tenant foundation is solid (migration 000027 scoped 40+ tables), but user identity remains fragmented. Users are opaque string IDs — agents don't know who they're talking to, can't gate tools by permission, and can't route work to the right person.
This proposal adds an Enterprise Identity layer that turns GoClaw from a multi-tenant gateway into an org-aware collaboration platform:
- Keycloak SSO — federated auth (Google Workspace, Microsoft 365, GitHub), zero password management
- Org structure — departments, matrix memberships, reporting hierarchy
- Per-project RBAC — different roles + fine-grained permissions per project
- Channel pairing — verify Telegram/Slack/Discord/Zalo users via email OTP, link to org identity
- Agent context enrichment — agents see who they're talking to (role, expertise, department, permissions) and gate tools accordingly
The Problem Today
- No verified identity: sender_id is a platform-specific number. Agent can't distinguish a CTO from a random group member.
- No permission gating: Every user has access to every tool. A viewer can trigger deploy or exec — the agent has no way to check.
- No org context for delegation: When an agent needs a reviewer, it can't query "who in this project has can_approve?" or "who in Frontend is available?"
- No SSO: Web dashboard uses static API tokens. No OAuth, no MFA, no session management.
Proposed Architecture
KEYCLOAK (self-hosted, 1 Realm = 1 Tenant)
  IdPs: Google, Microsoft 365, GitHub, password
  Realm roles -> org_users.tenant_role
  Groups -> departments mapping
        |
        | JWT (RS256)
        v
GOCLAW GATEWAY
  JWT Middleware -> org_users (thin cache) -> RunContext.UserProfile
  departments <-> department_members <-> org_users
  projects <-> project_members <-> org_users
  Channel Pairing: email OTP -> verified_user_id link
Key design decisions:
- Keycloak as identity provider — GoClaw validates JWTs, never touches passwords/OAuth tokens
- org_users = thin cache — keycloak_id (UUID) + email + profile (JSONB). Refreshed on login. Zero Keycloak API calls on message hot path
- Graceful degradation — unpaired users keep working exactly as today. No breaking changes.
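A minimal sketch of the "thin cache" decision above, assuming the JWT has already been verified upstream. The `Claims` struct, the `OrgUserCache` type, and its method names are hypothetical and stand in for the real org_users store; the point is only that the cache is written on login and read locally on the message hot path.

```go
package main

import "sync"

// OrgUser holds cached identity fields of the kind the proposal lists for org_users.
type OrgUser struct {
	KeycloakID  string // JWT "sub" claim (UUID issued by Keycloak)
	Email       string
	DisplayName string
}

// Claims is a hypothetical subset of claims taken from an already-verified JWT.
type Claims struct {
	Subject string
	Email   string
	Name    string
}

// OrgUserCache is an in-memory stand-in for the org_users table (assumption).
type OrgUserCache struct {
	mu   sync.RWMutex
	byID map[string]OrgUser
}

// NewOrgUserCache returns an empty cache.
func NewOrgUserCache() *OrgUserCache {
	return &OrgUserCache{byID: make(map[string]OrgUser)}
}

// RefreshOnLogin upserts the cached row from the claims of a fresh login.
func (c *OrgUserCache) RefreshOnLogin(cl Claims) OrgUser {
	u := OrgUser{KeycloakID: cl.Subject, Email: cl.Email, DisplayName: cl.Name}
	c.mu.Lock()
	c.byID[u.KeycloakID] = u
	c.mu.Unlock()
	return u
}

// Lookup is the hot-path read: it touches only the local cache,
// so no Keycloak API call is made per message.
func (c *OrgUserCache) Lookup(keycloakID string) (OrgUser, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	u, ok := c.byID[keycloakID]
	return u, ok
}
```

A miss from `Lookup` corresponds to the graceful-degradation path: the caller treats the sender as unpaired and continues as it does today.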
Data Model (5 new tables, 3 altered)
-- Thin cache of Keycloak users
org_users (id UUID PK, tenant_id FK, email, display_name, avatar_url,
auth_provider, profile JSONB, status, last_login_at)
-- Matrix org: user in multiple departments with different roles
departments (id, tenant_id, name, slug, parent_id FK self, head_user_id FK)
department_members (department_id FK, user_id FK, role, title, UNIQUE(dept, user))
-- Per-project roles + fine-grained permissions
project_members (project_id FK, user_id FK, role, permissions JSONB, UNIQUE(proj, user))
-- Email OTP for channel pairing (3 max attempts + rate limit)
pairing_verifications (user_id FK, email, code, channel_type, sender_id,
attempts, expires_at)
-- Extend existing tables
ALTER TABLE tenant_users ADD keycloak_id UUID FK org_users;
ALTER TABLE channel_contacts ADD email, verified_user_id FK org_users;
ALTER TABLE paired_devices ADD email, verified_user_id FK org_users;
How It Changes Agent Behavior
Before (today):
User: "deploy to staging"
Agent: *executes deploy* (no permission check, anyone can do this)
After:
User: "deploy to staging"
Agent: [reads RunContext.UserProfile]
-> Hoang, Backend Lead, project role: lead, permissions: {can_deploy: true}
-> "Deploying to staging as Hoang (lead)..."
Viewer: "deploy to staging"
Agent: -> permissions: {can_deploy: false}
-> "You need can_deploy permission. Ask your project lead."
Smart delegation:
User: "who can review this PR?"
Agent: [queries project_members WHERE can_approve = true]
-> "Minh (QA Lead, available) and Trang (Senior Dev, busy).
Recommend assigning to Minh."
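The delegation lookup above could be sketched roughly as follows. The `Member` struct and its `Available` flag are assumptions for illustration (availability data only arrives in Phase 5); the filter itself is the `can_approve` check from the example.

```go
package main

import "sort"

// Member is a hypothetical in-memory view of a project_members row
// joined with the user's profile.
type Member struct {
	Name        string
	Title       string
	Available   bool            // assumed availability signal
	Permissions map[string]bool // decoded from the permissions JSONB column
}

// Reviewers returns the members who hold can_approve,
// with available members ordered before busy ones.
func Reviewers(members []Member) []Member {
	var out []Member
	for _, m := range members {
		if m.Permissions["can_approve"] {
			out = append(out, m)
		}
	}
	// Stable sort keeps the original order within each availability group.
	sort.SliceStable(out, func(i, j int) bool {
		return out[i].Available && !out[j].Available
	})
	return out
}
```

With Minh available and Trang busy, both holding `can_approve`, the first element of the result is Minh, which matches the recommendation in the example.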
Channel Pairing Flow
Telegram: /pair
Bot: Enter your organization email:
User: hoang@company.com
Bot: Code sent! Enter the 6-digit code:
User: 847291
Bot: Paired! Welcome, Hoang Du (Backend Lead, Engineering).
After pairing, every message from this Telegram account is enriched with the user's full org profile.
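A rough sketch of the verification step in this flow, using the limits named in the data model (6-digit code, 3 attempts, expiry). The `PairingVerification` struct mirrors a subset of the pairing_verifications columns; the function names, error values, and TTL parameter are assumptions, and sending the email and rate limiting are left out.

```go
package main

import (
	"crypto/rand"
	"crypto/subtle"
	"errors"
	"fmt"
	"math/big"
	"time"
)

const maxAttempts = 3 // "3 max attempts" from the data model

var (
	ErrExpired         = errors.New("pairing code expired")
	ErrTooManyAttempts = errors.New("too many attempts")
	ErrWrongCode       = errors.New("wrong code")
)

// PairingVerification mirrors a subset of the pairing_verifications columns.
type PairingVerification struct {
	Email       string
	Code        string
	ChannelType string
	SenderID    string
	Attempts    int
	ExpiresAt   time.Time
}

// NewVerification creates a pending verification with a random 6-digit code.
func NewVerification(email, channelType, senderID string, ttlSeconds int) (*PairingVerification, error) {
	n, err := rand.Int(rand.Reader, big.NewInt(1000000))
	if err != nil {
		return nil, err
	}
	return &PairingVerification{
		Email:       email,
		Code:        fmt.Sprintf("%06d", n.Int64()),
		ChannelType: channelType,
		SenderID:    senderID,
		ExpiresAt:   time.Now().Add(time.Duration(ttlSeconds) * time.Second),
	}, nil
}

// VerifyAt checks a submitted code at a given time and counts the attempt.
func (v *PairingVerification) VerifyAt(code string, now time.Time) error {
	if now.After(v.ExpiresAt) {
		return ErrExpired
	}
	if v.Attempts >= maxAttempts {
		return ErrTooManyAttempts
	}
	v.Attempts++
	// Constant-time comparison avoids leaking how many digits matched.
	if subtle.ConstantTimeCompare([]byte(code), []byte(v.Code)) != 1 {
		return ErrWrongCode
	}
	return nil
}

// Verify checks a submitted code against the current time.
func (v *PairingVerification) Verify(code string) error {
	return v.VerifyAt(code, time.Now())
}
```

On a nil result from `Verify`, the caller would set verified_user_id on the matching channel_contacts or paired_devices row; that write is not shown here.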
Tool Policy Integration
Adds a Step 8 to the existing 7-step PolicyEngine pipeline that checks UserProfile.Permissions for each tool. Tools declare what they need via RequiredPermission(); a tool whose required permission the user lacks is stripped from the available set. Unpaired users get safe read-only tools only.
Because the new step is added after the existing seven, the current policy pipeline needs no refactoring.
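One way the permission step could look, as a sketch under stated assumptions: only `RequiredPermission()` and `UserProfile.Permissions` come from the proposal, while the `Tool` interface shape, the `ReadOnly()` marker, `StaticTool`, and `FilterByPermission` are hypothetical names introduced for illustration.

```go
package main

// Tool is a hypothetical interface; the proposal only names RequiredPermission().
type Tool interface {
	Name() string
	RequiredPermission() string // "" means the tool needs no permission
	ReadOnly() bool             // assumed marker for "safe read-only" tools
}

// UserProfile carries the permissions resolved for the current project.
type UserProfile struct {
	DisplayName string
	Permissions map[string]bool
}

// StaticTool is a trivial Tool implementation used for the example.
type StaticTool struct {
	ToolName string
	Perm     string
	RO       bool
}

func (t StaticTool) Name() string               { return t.ToolName }
func (t StaticTool) RequiredPermission() string { return t.Perm }
func (t StaticTool) ReadOnly() bool             { return t.RO }

// FilterByPermission strips tools the caller may not use.
// A nil profile means the sender is unpaired.
func FilterByPermission(tools []Tool, profile *UserProfile) []Tool {
	allowed := make([]Tool, 0, len(tools))
	for _, t := range tools {
		perm := t.RequiredPermission()
		if profile == nil {
			// Unpaired users keep only read-only tools with no permission requirement.
			if t.ReadOnly() && perm == "" {
				allowed = append(allowed, t)
			}
			continue
		}
		if perm == "" || profile.Permissions[perm] {
			allowed = append(allowed, t)
		}
	}
	return allowed
}
```

For a tool list of `search` (read-only, no permission), `deploy` (`can_deploy`), and `exec` (`can_exec`), a lead holding `can_deploy` keeps `search` and `deploy`, while a viewer or an unpaired sender keeps only `search`.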
Implementation Phases
| Phase | Scope | Value |
|---|---|---|
| 1 | Keycloak + org_users + JWT middleware | Web dashboard gets real SSO |
| 2 | Channel pairing via email OTP | Channel users have verified identity |
| 3 | Departments + matrix memberships | Agent knows org chart |
| 4 | Project members + permission-based tool gating | Core RBAC value |
| 5 | Smart delegation (expertise, availability, workload) | Agent routes work intelligently |
Phases 3 and 4 are independent and can be built in parallel.
Why Upstream?
This feature makes GoClaw viable for enterprise/team deployments where:
- Multiple people use the same agent across channels
- Different people need different tool access (intern vs CTO)
- Agents need org context to delegate and route work
- SSO is a hard requirement (Google Workspace, AD/LDAP via Keycloak)
Without this, every enterprise deployment needs a custom identity layer. With this, it's built-in.
What Already Exists (reuse, not rebuild)
| Existing | Reuse |
|---|---|
| tenant_users + 5-level RBAC | Extend with keycloak_id FK |
| channel_contacts + ContactCollector | Add email, verified_user_id |
| paired_devices + PairingStore | Add verified_user_id, email |
| PolicyEngine 7-step pipeline | Add Step 8 |
| RunContext + context propagation | Add WithUserProfile |
| ExtraSystemPrompt injection | Append Current User block |
| projects table (PR #551) | Add project_members junction |
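As an illustration of the "Append Current User block" row, the block injected through ExtraSystemPrompt might be rendered like this. The exact text layout, the `UserProfile` field names, and the `CurrentUserBlock` function are assumptions; the proposal only states that role, department, and permissions are exposed to the agent.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// UserProfile is a hypothetical shape for the enriched identity.
type UserProfile struct {
	DisplayName string
	Title       string
	Department  string
	ProjectRole string
	Permissions map[string]bool
}

// CurrentUserBlock renders the text appended to the system prompt.
// It returns "" for unpaired users so the prompt stays unchanged.
func CurrentUserBlock(p *UserProfile) string {
	if p == nil {
		return ""
	}
	// Collect granted permissions and sort them for deterministic output.
	granted := make([]string, 0, len(p.Permissions))
	for name, ok := range p.Permissions {
		if ok {
			granted = append(granted, name)
		}
	}
	sort.Strings(granted)
	perms := "none"
	if len(granted) > 0 {
		perms = strings.Join(granted, ", ")
	}

	var b strings.Builder
	b.WriteString("Current User\n")
	fmt.Fprintf(&b, "Name: %s\n", p.DisplayName)
	fmt.Fprintf(&b, "Title: %s (%s)\n", p.Title, p.Department)
	fmt.Fprintf(&b, "Project role: %s\n", p.ProjectRole)
	fmt.Fprintf(&b, "Permissions: %s\n", perms)
	return b.String()
}
```

Returning an empty string for a nil profile keeps the prompt identical to today's for unpaired senders.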
Backward Compatibility
- Zero breaking changes: Unpaired users work exactly as today
- Keycloak is optional: If not configured, all existing auth continues unchanged
- Phased adoption: Each phase ships value independently
- SQLite/Lite builds: org_users store returns nil -> graceful degradation
I'd Like to Contribute This
I've been running GoClaw in production (4 Telegram bots, 275 agents, 15 cron jobs, K8s deployment) and have already contributed:
- PR #551: project-as-a-channel (per-project MCP isolation)
- PR #552: MCP stdio init retry (Fixes #385)
- PR #553: Session Save() UPSERT (Fixes #379)
- PR #554: BuiltinToolStore.Upsert() (Fixes #336)
I have a complete design spec (architecture, data model, auth flows, UI specs for 5 new pages, failure mode analysis, 7 eng review decisions) and am ready to implement in phases with upstream PRs. Happy to start with Phase 1 (Keycloak + org_users + JWT middleware) as a self-contained PR for review.
Full design spec: Available on request or as a PR to docs/.
Related
- Depends on: #551 (project-as-a-channel, for the project_members FK)
- Extends: Migration 000027 (tenant foundation)
- Extends: PolicyEngine in internal/tools/policy.go