@jrepp (Owner) commented Nov 20, 2025

User request: "look at all local branches for unmerged commits, create PRs if they are found by first merging origin/main and submitting the commit data"

This branch contains 5 unmerged commits. Conflicts were resolved automatically using an aggressive strategy.

Co-Authored-By: Claude <noreply@anthropic.com>

jrepp and others added 5 commits October 18, 2025 17:19
User request: "let's work on backend modeling for the patterns, we want to create a flat set of backends that have unique names and are known to the admin and shared to everyone in the control plane, it will be necessary for the pattern runners to have access to the backend configuration on startup so they can map their slot implementation details to specific backend configurations"

Created comprehensive RFC for centralized backend configuration management:

## Key Design Decisions

**Flat Backend Registry**:
- All backends have globally unique names (e.g., kafka-prod, postgres-primary)
- Shared across all namespaces and patterns
- Eliminates config duplication and enables central management

**Admin-Managed with Raft**:
- Backends stored in admin FSM state (replicated via Raft)
- New admin commands: REGISTER_BACKEND, UPDATE_BACKEND, DELETE_BACKEND
- Synced to local storage (SQLite/PostgreSQL) on each admin node

**Pattern Slot Binding**:
- Patterns declare slot_bindings: {registry: "postgres-primary", messaging: "kafka-prod"}
- Pattern runners fetch backend configs from admin at startup
- SlotBinder utility creates type-specific slot implementations

**Type-Specific Configs**:
- BackendType enum: KAFKA, NATS, POSTGRES, REDIS, SQLITE, S3, MEMSTORE
- Structured configs per type (KafkaConfig, PostgresConfig, etc.)
- Credentials, connection pooling, timeouts all captured
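
The flat registry and type enum described above can be sketched roughly as follows. This is a minimal in-memory illustration, not the RFC's implementation: `Backend`, `BackendRegistry`, and the free-form `config` dict are assumed names standing in for the Raft-replicated admin state and the structured per-type configs (KafkaConfig, PostgresConfig, etc.).

```python
from dataclasses import dataclass, field
from enum import Enum


class BackendType(Enum):
    # Members taken from the BackendType list above.
    KAFKA = "kafka"
    NATS = "nats"
    POSTGRES = "postgres"
    REDIS = "redis"
    SQLITE = "sqlite"
    S3 = "s3"
    MEMSTORE = "memstore"


@dataclass
class Backend:
    name: str  # globally unique, e.g. "kafka-prod"
    type: BackendType
    # Assumption: a plain dict stands in for the structured per-type config.
    config: dict = field(default_factory=dict)


class BackendRegistry:
    """Hypothetical in-memory stand-in for the admin FSM's backend map."""

    def __init__(self) -> None:
        self._backends: dict[str, Backend] = {}

    def register(self, backend: Backend) -> None:
        # Names are global, so a duplicate is an error rather than an overwrite.
        if backend.name in self._backends:
            raise ValueError(f"backend {backend.name!r} is already registered")
        self._backends[backend.name] = backend

    def update(self, backend: Backend) -> None:
        if backend.name not in self._backends:
            raise KeyError(f"backend {backend.name!r} is not registered")
        self._backends[backend.name] = backend

    def delete(self, name: str) -> None:
        del self._backends[name]

    def get(self, name: str) -> Backend:
        return self._backends[name]
```

The three mutating methods mirror the REGISTER_BACKEND, UPDATE_BACKEND, and DELETE_BACKEND commands; replication and storage sync are left out of the sketch.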

## Example Flow

1. Operator registers backend:
   `prism-admin backend register kafka-prod --brokers kafka:9092`

2. Pattern references backend:
   ```yaml
   namespace: order-processing
   pattern: multicast-registry
   slot_bindings:
     registry: postgres-primary
     messaging: kafka-prod
   ```

3. Pattern runner binds slots:
   - Fetches kafka-prod config from admin
   - Creates KafkaMessagingSlot with connection details
   - Connects to Kafka and starts processing
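
Step 3 can be illustrated with a small resolver. This is a sketch under stated assumptions: `bind_slots` is a hypothetical helper (the RFC names a SlotBinder utility but this PR does not show its API), and the backend configs are plain dicts rather than fetched from the admin.

```python
def bind_slots(slot_bindings: dict[str, str], backends: dict[str, dict]) -> dict[str, dict]:
    """Map each slot name to the config of the backend it is bound to.

    Fails fast, listing every unknown backend name, so a pattern runner
    would refuse to start with a dangling binding.
    """
    missing = sorted({name for name in slot_bindings.values() if name not in backends})
    if missing:
        raise KeyError(f"unknown backends: {missing}")
    return {slot: backends[name] for slot, name in slot_bindings.items()}


# Bindings from the YAML example above; the config contents are illustrative.
slot_bindings = {"registry": "postgres-primary", "messaging": "kafka-prod"}
backends = {
    "postgres-primary": {"type": "POSTGRES", "host": "postgres"},
    "kafka-prod": {"type": "KAFKA", "brokers": ["kafka:9092"]},
}

bound = bind_slots(slot_bindings, backends)
```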

## Benefits

- **DRY**: One backend config used by multiple patterns
- **Centralized ops**: Change Kafka URL once, all patterns update
- **Separation of concerns**: Pattern authors don't need connection details
- **Type safety**: Structured configs with validation
- **Observability**: Admin knows which patterns use which backends

## Implementation Plan

6-phase rollout over 4 weeks:
1. Protobuf definitions
2. Admin FSM integration
3. Admin API implementation
4. Pattern runner integration
5. Testing
6. Documentation

## Open Questions

- Secret management (proposed: integrate Vault/K8s Secrets)
- Backend versioning and hot-reload (proposed: require restart initially)
- Multi-region backends (proposed: separate entries per region)
- Health monitoring (proposed: Phase 2 feature)

Builds on RFC-014 (layered patterns), RFC-017 (multicast registry slots),
RFC-035 (pattern launcher), RFC-038 (admin raft), and MEMO-004 (backend guide).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…ration

User request: "pull in slot configuration of pattern implementations - we also want to define a type of config similar to backend which is a frontend, a frontend is a type of interface binding on the proxy, the default is the grpc pattern interface, this happens by default but can be disabled, additional interfaces can be added based on a front end definition - we should use the openapi semantics so that if we define a rest based front end interface that maps for example the mcp rest interface we can then bind it to specific pattern interfaces with some route config that can be consumed by the proxy to map rest interfaces to patterns on the backend, a concrete example is to expose a registry pattern as a confluent schema registry api"

Major additions to RFC-039:

1. Frontend Interface Binding Model (parallel to Backend):
   - Frontend resource with globally unique names
   - FrontendType enum: REST, GraphQL, gRPC-Web, SSE, WebSocket
   - Type-specific configs (RestConfig, GraphQLConfig, etc.)
   - RouteMapping for OpenAPI-style REST → gRPC mapping
   - ParamMapping: path/query/header/body → protobuf field mapping
   - ResponseMapping: protobuf → HTTP response transformation

2. Admin State Integration:
   - FrontendEntry in AdminState (Raft-replicated)
   - Frontend management commands (Register, Update, Delete)
   - Frontend management RPCs in ControlPlane service
   - Storage sync to persist frontends

3. Concrete Example: Registry Pattern as Confluent Schema Registry API:
   - Complete route mappings for Confluent REST API
   - POST /subjects/{subject}/versions → RegisterSchema gRPC
   - GET /subjects/{subject}/versions/{version} → GetSchema gRPC
   - POST /compatibility/... → CheckCompatibility gRPC
   - DELETE /subjects/{subject}/versions/{version} → DeleteSchema gRPC
   - Full sequence diagram showing HTTP → gRPC translation
   - Python client example using Confluent SDK with Prism backend
   - Benefits: API compatibility, backend flexibility, protocol translation

4. Pattern Slot Schema Integration (MEMO-006):
   - Slot definitions with required/optional interfaces
   - Runtime validation: backend must implement required interfaces
   - Backend capability metadata (keyvalue_basic, pubsub_basic, etc.)
   - SlotBinder validates interface requirements at pattern startup
   - 45 thin interfaces across 10 data models (per MEMO-006)

5. Namespace Configuration Extensions:
   - FrontendBinding message for namespace opt-in
   - Default gRPC interface (can be disabled)
   - Multiple frontends per namespace
   - Namespace-specific overrides

6. Expanded Implementation Plan (8 phases, 5 weeks):
   - Phase 1: Protobuf definitions for both backend and frontend
   - Phase 2: Admin FSM integration for both registries
   - Phase 3: Admin API implementation for both
   - Phase 4: Pattern runner slot binding with schema validation
   - Phase 5: Proxy frontend integration (REST adapter, route matching)
   - Phase 6: Confluent Schema Registry concrete example
   - Phase 7: Comprehensive testing (backend + frontend)
   - Phase 8: Documentation for operators
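
The slot schema validation in item 4 amounts to a set comparison between a slot's required interfaces and a backend's capability metadata. A minimal sketch, assuming hypothetical names `SlotSchema` and `validate_binding`; `keyvalue_basic` and `pubsub_basic` come from the list above, while `keyvalue_scan` is made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SlotSchema:
    name: str
    required: set[str]
    optional: set[str] = field(default_factory=set)


def validate_binding(slot: SlotSchema, capabilities: set[str]) -> set[str]:
    """Return the optional interfaces the backend offers.

    Raises ValueError if any required interface is missing, which is the
    check a binder would run at pattern startup.
    """
    missing = slot.required - capabilities
    if missing:
        raise ValueError(
            f"slot {slot.name!r}: backend lacks required interfaces {sorted(missing)}"
        )
    return slot.optional & capabilities


registry_slot = SlotSchema("registry", {"keyvalue_basic"}, {"keyvalue_scan"})
available = validate_binding(registry_slot, {"keyvalue_basic", "keyvalue_scan"})
```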

Key design principles:
- Parallel architecture: frontends mirror backends (same admin management)
- OpenAPI semantics for route mapping (not full codegen)
- Protocol translation at proxy layer (HTTP → gRPC)
- Centralized admin management for both registries
- Default gRPC + optional additional interfaces
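
The OpenAPI-style route mapping can be sketched as a template matcher that turns an HTTP method and path into a gRPC method name plus extracted path parameters. The route table reuses the Confluent mappings listed in item 3 (the compatibility route is omitted because its path is elided there); `compile_template` and `match_route` are hypothetical names, not the proxy's API.

```python
import re
from typing import Optional

# (HTTP method, path template, gRPC method) taken from the mappings above.
ROUTES = [
    ("POST", "/subjects/{subject}/versions", "RegisterSchema"),
    ("GET", "/subjects/{subject}/versions/{version}", "GetSchema"),
    ("DELETE", "/subjects/{subject}/versions/{version}", "DeleteSchema"),
]


def compile_template(template: str) -> re.Pattern:
    """Turn '/a/{x}/b' into a regex with one named group per path parameter."""
    # re.split with a capturing group alternates literal text and parameter names.
    parts = re.split(r"\{(\w+)\}", template)
    regex = "".join(
        re.escape(part) if i % 2 == 0 else f"(?P<{part}>[^/]+)"
        for i, part in enumerate(parts)
    )
    return re.compile(regex)


COMPILED = [(method, compile_template(tpl), rpc) for method, tpl, rpc in ROUTES]


def match_route(method: str, path: str) -> Optional[tuple[str, dict[str, str]]]:
    """Return (gRPC method, path params) for the first matching route, else None."""
    for route_method, pattern, rpc in COMPILED:
        if route_method != method:
            continue
        m = pattern.fullmatch(path)
        if m:
            return rpc, m.groupdict()
    return None
```

Path parameters come back as strings; mapping them onto typed protobuf fields would be the job of the ParamMapping described in item 1.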

References added:
- RFC-020 (HTTP adapter pattern)
- RFC-032 (Confluent API compatibility)
- MEMO-006 (interface decomposition, slot schemas)
- Confluent Schema Registry API documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
User request: "the PR is waiting for status check but it's not available for this change, for document only changes can we run the document verification, lint, build and update the CI status?"

Problem: CI workflow has paths-ignore for docs-cms/** and *.md files, so documentation-only PRs don't trigger any status checks, leaving PRs without validation.

Solution: Created dedicated docs-pr.yml workflow that:
- Triggers on PRs with documentation changes only
- Runs uv run tooling/validate_docs.py (validates frontmatter, links, MDX)
- Runs uv run tooling/build_docs.py (builds Docusaurus site)
- Provides docs-status check for PR merge requirements
- Uses concurrency groups to cancel stale runs

Benefits:
- Documentation PRs now get status checks
- Validates MDX compilation before merge
- Catches broken links and invalid frontmatter
- Prevents GitHub Pages build failures
- Independent from main CI workflow (doesn't run code tests for doc changes)

Workflow triggers on:
- docs-cms/** (ADRs, RFCs, MEMOs)
- docusaurus/** (Docusaurus config)
- **/*.md (all markdown files)
- tooling/validate_docs.py, tooling/build_docs.py
- .github/workflows/docs-pr.yml (self-test)
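
To reason about which changed files would trigger the workflow, the path list above can be approximated in a few lines. This is only a sketch: the glob translation below is a simplified approximation of GitHub Actions path filters (it handles `*`, `**`, and `**/` only), and `triggers_docs_workflow` is a hypothetical helper, not part of the repository tooling.

```python
import re

# Path filters copied from the trigger list above.
DOC_PATHS = [
    "docs-cms/**",
    "docusaurus/**",
    "**/*.md",
    "tooling/validate_docs.py",
    "tooling/build_docs.py",
    ".github/workflows/docs-pr.yml",
]


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a path glob into a regex (approximation, see note above)."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")  # zero or more leading directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")  # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")  # anything within one path segment
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


DOC_REGEXES = [glob_to_regex(p) for p in DOC_PATHS]


def triggers_docs_workflow(changed_files: list[str]) -> bool:
    """True if at least one changed file matches a documentation path filter."""
    return any(rx.fullmatch(path) for path in changed_files for rx in DOC_REGEXES)
```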

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
User request: "create a new feature branch to create a rfc that defines client sdks the sdks will target having full integration test coverage and a shared directory structure. there will be 3 sdks rust, python and go to start - we will use best practices for async client apis and use the grpc interfaces directly - we want to expose the pattern interfaces as directly usable apis. to start a client we should use oauth as the default auth method - clients will need to support namespace configuration within the limited set of configuration options available for each pattern - start with producer, consumer and key-value patterns"

Created comprehensive RFC-040 defining client SDK architecture for Rust, Python, and Go:

**Architecture**:
- Pattern-centric APIs: Producer, Consumer, KeyValue as first-class APIs
- Async-first design: tokio (Rust), asyncio (Python), goroutines (Go)
- Direct gRPC communication with Prism proxy for maximum performance
- OAuth2 client credentials flow as default authentication
- Namespace-aware configuration with per-pattern options

**Pattern APIs**:
- Producer: publish(), publish_batch(), flush()
- Consumer: subscribe() with streaming API, ack(), nack()
- KeyValue: get(), set(), delete(), exists()
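
As a rough idea of how the KeyValue surface might look in the Python SDK, here is an asyncio sketch. The method names come from the list above; the signatures, return types, and the `InMemoryKeyValue` stand-in are assumptions. A real client would issue gRPC calls to the Prism proxy instead of touching a dict.

```python
import asyncio
from typing import Optional, Protocol


class KeyValue(Protocol):
    """Method names from the KeyValue bullet above; signatures are assumed."""

    async def get(self, key: str) -> Optional[bytes]: ...
    async def set(self, key: str, value: bytes) -> None: ...
    async def delete(self, key: str) -> bool: ...
    async def exists(self, key: str) -> bool: ...


class InMemoryKeyValue:
    """Hypothetical stand-in that satisfies the protocol without a network."""

    def __init__(self) -> None:
        self._data: dict[str, bytes] = {}

    async def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    async def set(self, key: str, value: bytes) -> None:
        self._data[key] = value

    async def delete(self, key: str) -> bool:
        # Report whether the key was present before removal.
        return self._data.pop(key, None) is not None

    async def exists(self, key: str) -> bool:
        return key in self._data


async def demo() -> list:
    kv: KeyValue = InMemoryKeyValue()
    await kv.set("user:1", b"alice")
    value = await kv.get("user:1")
    existed = await kv.exists("user:1")
    deleted = await kv.delete("user:1")
    still_there = await kv.exists("user:1")
    return [value, existed, deleted, still_there]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Typing against a protocol keeps application code independent of the transport, which also makes it easy to swap in a fake like this one for unit tests.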

**Testing Strategy**:
- Full integration test coverage using testcontainers
- Real Prism proxy + backends (Redis, Kafka, NATS) for tests
- Target coverage: Producer/Consumer 85%, OAuth2 90%
- Performance benchmarks: >10k msg/sec producer, <1ms KeyValue p99

**Shared Directory Structure**:
- Consistent layout across all three languages
- src/patterns/ for pattern implementations
- src/auth/ for OAuth2 client
- tests/{unit,integration,e2e}/ for test suites
- examples/ for usage examples

**Configuration**:
- Unified YAML format across all SDKs
- OAuth2 with token caching and automatic refresh
- Per-namespace pattern-specific options
- Built-in observability (Prometheus, OpenTelemetry, structured logging)
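
Token caching with automatic refresh could work along these lines. This is a sketch, not the RFC's design: `TokenCache`, the `fetch` callable (a placeholder for the actual OAuth2 client credentials request), and the 30-second refresh margin are all assumptions.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Token:
    access_token: str
    expires_at: float  # absolute time, in the same units as the clock


class TokenCache:
    """Caches one token and refetches it shortly before it expires."""

    def __init__(
        self,
        fetch: Callable[[], Token],
        clock: Callable[[], float] = time.monotonic,
        skew: float = 30.0,
    ) -> None:
        self._fetch = fetch  # placeholder for the token endpoint call
        self._clock = clock  # injectable so tests can control time
        self._skew = skew    # refresh this many seconds before expiry
        self._token: Optional[Token] = None

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._token.expires_at - self._skew:
            self._token = self._fetch()
        return self._token.access_token
```

Refreshing slightly early avoids sending a token that expires while the request is in flight. An async SDK would additionally need to guard the refresh so that concurrent callers do not all hit the token endpoint at once.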

**Implementation Roadmap**:
- Phase 1 (Week 1-2): Protobuf code gen, Client factory, OAuth2
- Phase 2 (Week 3-4): Producer, Consumer, KeyValue implementations
- Phase 3 (Week 5-6): Integration tests with testcontainers
- Phase 4 (Week 7-8): Observability, documentation, benchmarks

Updated changelog with RFC-040 summary and key features.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
User request: "update the pr creation command to keep the pr description short, readable and focused on quickly answering why, how and what for reviewers"

Updated .claude/commands/submit-pr.md to enforce concise PR descriptions:

**New Structure** (replaces verbose multi-section format):
- Why: 1-2 sentences on problem/value
- How: 2-4 bullets on implementation approach
- What Changed: 2-4 bullets on measurable impact
- Testing: Simple checklist
- Target: 10-15 lines total (max 20)

**Key Changes**:
- Remove file paths (reviewers see diffs)
- Remove implementation details (code review is for that)
- Remove excessive checklists (Breaking Changes, Dependencies)
- Focus on architecture/approach, not line-by-line changes
- Quantify impact when possible

**Writing Guidelines**:
- Why: Problem/value, not process
- How: Architecture only, not filenames
- What: Impact, not file changes

**Example Good PR** (12 lines):
```
## Why
RFC-040 requires client SDKs to reduce integration friction.

## How
- Define pattern-centric APIs
- OAuth2 auth flow
- Testcontainers for integration tests

## What Changed
- Add RFC-040 with 3-language SDK spec
- Define shared directory structure
- Specify 85% coverage targets

## Testing
- [x] Documentation validation passes
```

This replaces the previous 6-section format that encouraged verbose descriptions with file paths and implementation details.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings November 20, 2025 22:09
Copilot AI (Contributor) left a comment

Pull Request Overview

This PR adds RFC-040 defining client SDK architecture for Rust, Python, and Go, plus RFC-039 for backend/frontend configuration registries. The work provides standardized client libraries with pattern-centric APIs, OAuth2 authentication, and integration testing, alongside a unified system for managing backend connections and REST API frontends.

  • RFC-040: Multi-language client SDK specification with Producer/Consumer/KeyValue patterns
  • RFC-039: Backend/frontend configuration registry for centralized management
  • Updated documentation workflow and PR submission guidelines

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| docs-cms/rfcs/RFC-040-client-sdk-architecture.md | Defines SDK architecture for 3 languages with OAuth2, testcontainers, and pattern APIs |
| docs-cms/rfcs/RFC-039-backend-configuration-registry.md | Specifies backend/frontend registry for centralized config management |
| docusaurus/docs/changelog.md | Adds changelog entry for RFC-040 with key features and roadmap |
| .github/workflows/docs-pr.yml | Adds CI workflow for documentation validation |
| .claude/commands/submit-pr.md | Updates PR guidelines to Why/How/What format with brevity requirements |


mergify bot added the documentation and infrastructure labels Nov 20, 2025

mergify bot commented Nov 20, 2025

This PR has merge conflicts with the base branch. Please resolve them.

mergify bot commented Dec 4, 2025

This PR has been inactive for 14 days. Please update it or close it if it's no longer needed.

mergify bot added and removed the stale label Dec 4, 2025

Labels: documentation, has-conflicts, infrastructure, size/xs
