210 changes: 210 additions & 0 deletions docs/proposals/kubernetes-registry.md
@@ -0,0 +1,210 @@
# Kubernetes Registry Implementation for ToolHive Operator

## Problem Statement

The ToolHive operator currently supports managing individual MCP servers through `MCPServer` CRDs, but lacks a centralized registry mechanism within Kubernetes. This creates challenges in discoverability, catalog management, upstream compatibility, and operational complexity.

## Goals

- **Native Kubernetes Registry**: Implement registry functionality using Custom Resource Definitions
- **Upstream Format Support**: Leverage existing upstream conversion capabilities for ecosystem compatibility
- **Multi-Registry Support**: Support both local registry entries and external registry synchronization
- **Registry Hierarchy**: Support the multi-registry hierarchy defined in the upstream model
- **Application Integration**: Provide REST API for programmatic access to registry data
- **GitOps Compatibility**: Enable declarative registry management through CRD-based operations

## Architecture Overview

The Kubernetes registry implementation extends the operator with the `MCPRegistry` CRD and supporting controllers that work with the existing MCPServer CRD to provide a complete registry-to-deployment workflow.

## CRD Design Overview

### MCPRegistry CRD

The `MCPRegistry` CRD represents a registry source and synchronization configuration with these key components:

- **Source Configuration**: Support for ConfigMap, URL, Git, and Registry API sources
- **Format Specification**: Handle both ToolHive and upstream registry formats
- **Sync Policy**: Automatic and manual synchronization with configurable intervals
- **Filtering**: Include/exclude servers based on names, tags, tiers, and transports
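
To illustrate the filtering component, here is a minimal sketch of include/exclude matching. The `Server` and `Filter` shapes, the use of glob patterns for names, and the rule that exclusions win over inclusions are all assumptions made for illustration; the authoritative schema is in the CRD specification.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Server:
    # Hypothetical, minimal view of a registry entry.
    name: str
    tier: str
    transport: str
    tags: frozenset = frozenset()


@dataclass(frozen=True)
class Filter:
    # Hypothetical filter shape: glob patterns for names, exact values elsewhere.
    # An empty tuple means "no restriction" for that criterion.
    include_names: tuple = ()
    exclude_names: tuple = ()
    include_tags: tuple = ()
    exclude_tags: tuple = ()
    tiers: tuple = ()
    transports: tuple = ()


def matches(server: Server, f: Filter) -> bool:
    """Return True if the server passes the filter; exclusions win over inclusions."""
    if any(fnmatchcase(server.name, p) for p in f.exclude_names):
        return False
    if server.tags & set(f.exclude_tags):
        return False
    if f.include_names and not any(fnmatchcase(server.name, p) for p in f.include_names):
        return False
    if f.include_tags and not (server.tags & set(f.include_tags)):
        return False
    if f.tiers and server.tier not in f.tiers:
        return False
    if f.transports and server.transport not in f.transports:
        return False
    return True


servers = [
    Server("filesystem-server", "Official", "stdio", frozenset({"files"})),
    Server("legacy-fetch", "Community", "sse", frozenset({"web", "deprecated"})),
    Server("github-server", "Official", "streamable-http", frozenset({"vcs"})),
]
policy = Filter(include_names=("*-server",), exclude_tags=("deprecated",), tiers=("Official",))
selected = [s.name for s in servers if matches(s, policy)]
```

Evaluating exclusions first keeps the result predictable when a server matches both an include and an exclude rule.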

### Job CRDs (Phase 3)

Declarative operation CRDs for GitOps compatibility:
- `MCPRegistryImportJob`: Declarative import operations
- `MCPRegistryExportJob`: Declarative export operations

Contributor

What would be the use-case for export?

Collaborator Author

Data backup? Anyway, I agree it can be dropped for now: if the original data source is not mutable from the registry itself (we only import and sync), then it's probably useless.

- `MCPRegistrySyncJob`: Declarative synchronization operations

**Detailed specifications**: See [kubernetes-registry/crd-specifications.md](kubernetes-registry/crd-specifications.md)

## Key Features and Capabilities

### 1. Registry Management

The ToolHive operator provides comprehensive registry management through specialized components:

#### Registry Controller
- **Synchronization**: Automatic and manual synchronization with external registry sources
- **Format Conversion**: Bidirectional conversion between ToolHive and upstream registry formats
- **Filtering**: Include/exclude servers based on configurable criteria (names, tags, tiers, transports)
- **Status Tracking**: Monitor sync status, error conditions, and statistics
- **Server Labeling**: Automatically apply registry relationship labels to discovered servers

#### Registry API Service
- **REST API**: HTTP endpoints for programmatic registry and server discovery
- **Authentication**: Integration with Kubernetes RBAC and service account tokens
- **Filtering**: Query servers by registry, category, transport type, and custom labels
- **Format Support**: Return data in both ToolHive and upstream registry formats

Contributor

This might be a silly question, but how much difference is there now between the ToolHive and upstream registry formats? I thought our goal was to use the upstream registry format with vendor extensions. Does "toolhive" here mean supporting the extensions? (Maybe this is a question for @rdimitrov.)

Collaborator Author

This was not the main goal of the PR, but I tried to match the previous proposal tracked as upstream-mcp-registry-format-support.md.
My understanding was that thv would keep its proprietary format and use the extensions mechanisms to export using the upstream format (and vice versa), but if this is not the case it would even simplify the PR by removing the format conversions.

Member

When we talk about the official MCP registry, it consists of:

  • an OpenAPI schema for the registry API server (once it goes live this would be publicly available for everyone to use)
  • and a JSON schema that covers how you describe an MCP server (the so-called server.json)

On the ToolHive side:

  • We are in the process of moving our registry from our format (i.e. ImageMetadata) to follow the upstream format (aka server.json).
  • Note that the structure of the actual registry.json file will slightly change too, but this is expected as this part is really specific to ToolHive (there's no community effort around adopting a file representation of a registry catalogue, at least not yet). Here's a preview of the new format - link.
  • The above will set the foundation that would allow us to then add support for the registry API (as well as any other compliant registries). In your proposal I think this maps to a remote registry source.
  • Note that ToolHive's API will probably be a superset of this too, so other registry clients besides ToolHive can use it.


### 2. Registry Sources

The implementation supports multiple registry source types, each of which accepts data in either the ToolHive or the upstream format.
All sources support configurable synchronization policies including automatic sync intervals, retry behavior, and update strategies.
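
As a rough illustration of how a sync interval and retry behavior could interact, the sketch below parses an interval string such as `"1h"` and computes the next sync time. The single-unit interval grammar, the 30-second base retry delay, and the exponential backoff capped at the interval are assumptions, not the proposal's specified behavior.

```python
import re
from datetime import datetime, timedelta, timezone

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours"}


def parse_interval(text: str) -> timedelta:
    """Parse a simple single-unit interval such as "30s", "15m" or "1h"."""
    match = re.fullmatch(r"(\d+)([smh])", text)
    if match is None:
        raise ValueError(f"unsupported interval: {text!r}")
    return timedelta(**{_UNITS[match.group(2)]: int(match.group(1))})


def next_sync(last_attempt, interval, consecutive_failures=0,
              base_retry=timedelta(seconds=30)):
    """Return when the next sync should run.

    After a success, wait the full interval. After failures, retry with
    exponential backoff (base_retry * 2**(failures - 1)), capped at the interval.
    """
    if consecutive_failures == 0:
        return last_attempt + interval
    backoff = base_retry * (2 ** (consecutive_failures - 1))
    return last_attempt + min(backoff, interval)


# Example: the last attempt failed for the second time in a row.
last = datetime(2025, 1, 1, tzinfo=timezone.utc)
due = next_sync(last, parse_interval("1h"), consecutive_failures=2)
```

Capping the backoff at the configured interval means a persistently failing source is never retried less often than a healthy one is synced.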

Registry sources can be organized in hierarchies as defined in the [MCP Registry Ecosystem Diagram](https://github.com/modelcontextprotocol/registry/blob/main/docs/ecosystem-diagram.excalidraw.svg), enabling upstream registries to aggregate from multiple sources.

Because the ToolHive registry schema is retained alongside this aggregation approach, provenance data is not lost: it is extracted from the upstream registry's extension fields during format conversion.

#### ConfigMap Source
- Store registry data directly in Kubernetes ConfigMaps
- Ideal for small, manually managed registries
- Immediate updates when ConfigMap changes

#### URL Source
- Fetch registry data from HTTP/HTTPS endpoints
- Support for authentication via Secret references
- Custom headers for API integration
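
A minimal sketch of how a URL source fetch might be assembled, assuming the credential read from the referenced Secret is a bearer token. The function names and the bearer scheme are hypothetical; only the standard-library `urllib.request` calls are real.

```python
import urllib.request


def build_request(url, token=None, extra_headers=None):
    """Build a GET request with optional custom headers and a bearer token."""
    headers = {"Accept": "application/json"}
    headers.update(extra_headers or {})
    if token:
        # Assumption: the Secret holds a bearer token.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch(url, token=None, timeout=10.0):
    """Fetch raw registry data; not called here to avoid network access."""
    with urllib.request.urlopen(build_request(url, token), timeout=timeout) as resp:
        return resp.read()


req = build_request(
    "https://registry.modelcontextprotocol.io/servers.json",
    token="example-token",  # placeholder value, not a real credential
)
```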

#### Git Source
- Clone registry data from Git repositories
- Branch and path specification
- Authentication via SSH keys or tokens
- Version tracking and change detection
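
One way to implement change detection, sketched here under the assumption that the controller stores a digest of the last synchronized content, is to hash a canonical serialization of the registry data. For a Git source the commit SHA could serve the same purpose; the function names below are hypothetical.

```python
import hashlib
import json
from typing import Optional


def content_digest(registry: dict) -> str:
    """Hash a canonical JSON form so key order and whitespace do not matter."""
    canonical = json.dumps(registry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_sync(registry: dict, last_digest: Optional[str]) -> bool:
    """Return True when the content differs from what was last synchronized."""
    return content_digest(registry) != last_digest


previous = {"servers": {"filesystem-server": {"transport": "stdio"}}}
reordered = {"servers": {"filesystem-server": {"transport": "stdio"}}}
changed = {"servers": {"filesystem-server": {"transport": "sse"}}}
stored = content_digest(previous)
```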

#### Registry Source
- Reference another registry's REST API endpoint as a data source
- Enables registry hierarchies and aggregation patterns across clusters
- Supports filtering and transformation of upstream registry data
- Works with any registry implementation that exposes the standard API

Member

I suppose the standard API is the OpenAPI spec of the official MCP registry or is it something else?

Collaborator Author

> Works with any registry implementation that exposes the standard API

We can discuss the details, but it should work according to the specified format field, so that it can consume either the upstream or the ToolHive API.

- Useful for creating curated subsets or company-specific views of upstream registries
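
The aggregation pattern can be sketched as a merge over an ordered list of sources. The precedence rule (earlier sources win) and the `origin` field recording where an entry came from are assumptions for illustration, not behavior defined by the proposal.

```python
def aggregate(sources):
    """Merge servers from ordered (registry_name, servers) pairs.

    Earlier sources take precedence; each kept entry records its origin.
    """
    merged = {}
    for registry_name, servers in sources:
        for server_name, spec in servers.items():
            if server_name not in merged:
                merged[server_name] = {**spec, "origin": registry_name}
    return merged


# A company-curated registry layered on top of an upstream one.
catalog = aggregate([
    ("company-curated", {"filesystem-server": {"tier": "Official"}}),
    ("upstream-community", {
        "filesystem-server": {"tier": "Community"},
        "fetch-server": {"tier": "Community"},
    }),
])
```

Here the curated definition of `filesystem-server` shadows the upstream one, while `fetch-server` is inherited from upstream unchanged.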

### 3. Server-Registry Relationships

#### Automatic Labeling
When servers are deployed from a registry, the controller automatically applies standardized labels during resource creation:

```yaml
labels:
  toolhive.stacklok.io/registry-name: upstream-community
  toolhive.stacklok.io/registry-namespace: toolhive-system
  toolhive.stacklok.io/server-name: filesystem-server
  toolhive.stacklok.io/tier: Official
  toolhive.stacklok.io/category: filesystem
```

These labels enable filtering, grouping, and querying servers by their registry source.
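
As a sketch of how these labels might be produced and queried, the helper below builds the label set, checks each value against the Kubernetes label-value rules (at most 63 characters; alphanumerics, `-`, `_`, `.`; alphanumeric at both ends), and renders an equality-based label selector. The function names are hypothetical.

```python
import re

LABEL_PREFIX = "toolhive.stacklok.io"
# Kubernetes label value syntax: empty, or alphanumeric at both ends
# with '-', '_', '.' allowed in between.
_LABEL_VALUE = re.compile(r"(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?")


def registry_labels(registry_name, registry_namespace, server_name, tier, category):
    """Build the registry relationship labels and validate their values."""
    labels = {
        f"{LABEL_PREFIX}/registry-name": registry_name,
        f"{LABEL_PREFIX}/registry-namespace": registry_namespace,
        f"{LABEL_PREFIX}/server-name": server_name,
        f"{LABEL_PREFIX}/tier": tier,
        f"{LABEL_PREFIX}/category": category,
    }
    for key, value in labels.items():
        if len(value) > 63 or _LABEL_VALUE.fullmatch(value) is None:
            raise ValueError(f"invalid label value for {key}: {value!r}")
    return labels


def selector(labels):
    """Render an equality-based label selector string (key=value,key=value)."""
    return ",".join(f"{k}={v}" for k, v in sorted(labels.items()))


labels = registry_labels(
    "upstream-community", "toolhive-system", "filesystem-server", "Official", "filesystem"
)
by_registry = selector({
    k: labels[k]
    for k in (f"{LABEL_PREFIX}/registry-name", f"{LABEL_PREFIX}/registry-namespace")
})
```

A selector string in this form can be passed to `kubectl get` with the `-l` flag to list the servers that belong to one registry.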

#### Pre-deployed Server Association
Existing MCPServer resources can be associated with registries by applying the standard labels, enabling unified management across manually deployed and registry-synchronized servers.

## Quick Start Example

```yaml
apiVersion: toolhive.stacklok.io/v1alpha1
kind: MCPRegistry
metadata:
  name: upstream-community
  namespace: toolhive-system
spec:
  displayName: "MCP Community Registry"
  format: upstream
  source:
    type: url
    url:
      url: "https://registry.modelcontextprotocol.io/servers.json"
  syncPolicy:
    enabled: true
    interval: "1h"
```

**Comprehensive examples**: See [kubernetes-registry/usage-examples.md](kubernetes-registry/usage-examples.md)

## Implementation Overview

The implementation follows a phased approach:

1. **Phase 1**: Core Registry CRD and basic synchronization
2. **Phase 2**: External sources, REST API for applications
3. **Phase 3**: CRD-based operations, automatic labeling
4. **Phase 4**: Production features and filtering
5. **Phase 5**: Advanced integration (optional)

**Detailed implementation plan**: See [kubernetes-registry/implementation-plan.md](kubernetes-registry/implementation-plan.md)

## CLI Integration

New registry management commands:
- `thv registry list/add/sync/remove` - Registry lifecycle management
- `thv registry import/export` - Data migration operations
- `thv search/show` - Enhanced server discovery across registries

**Complete CLI reference**: See [kubernetes-registry/usage-examples.md](kubernetes-registry/usage-examples.md)

## Security and Operations

### Security Model
- **RBAC Integration**: Granular permissions for registry operations
- **Source Validation**: URL restrictions and content validation
- **Authentication**: Secure handling of external source credentials
- **Audit Logging**: Comprehensive operation tracking
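
Source validation could, for example, take the form of a scheme and host allowlist applied before any fetch. The sketch below is one possible shape; the function name, the allowlist parameter, and the rule rejecting credentials embedded in the URL are assumptions.

```python
from urllib.parse import urlsplit


def validate_source_url(url, allowed_hosts, allow_http=False):
    """Return the URL if it passes the restrictions, else raise ValueError."""
    parts = urlsplit(url)
    schemes = {"https", "http"} if allow_http else {"https"}
    if parts.scheme not in schemes:
        raise ValueError(f"scheme not allowed: {parts.scheme!r}")
    host = (parts.hostname or "").lower()
    if host not in allowed_hosts:
        raise ValueError(f"host not allowed: {host!r}")
    if parts.username or parts.password:
        # Credentials should come from a Secret reference, not the URL itself.
        raise ValueError("credentials must not be embedded in the URL")
    return url


ALLOWED = {"registry.modelcontextprotocol.io"}
checked = validate_source_url(
    "https://registry.modelcontextprotocol.io/servers.json", ALLOWED
)
```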

### Success Metrics
- **Adoption**: Registry resource creation and server association rates
- **Performance**: <30s sync time, >99% success rate, <100MB memory usage
- **Usability**: Reduced manual configuration complexity
- **Ecosystem**: Upstream registry coverage and format conversion accuracy

**Complete details**: See [kubernetes-registry/implementation-plan.md](kubernetes-registry/implementation-plan.md)

## Future Enhancements

1. **Catalog System** (see [kubernetes-registry/catalog-design.md](kubernetes-registry/catalog-design.md))
   - MCPCatalog CRD for curated server collections
   - Approval workflows and validation pipelines
   - OCI artifact distribution for catalog sharing
   - Role-based catalog access and governance

2. **Advanced Registry Features**
   - Registry federation and cross-cluster synchronization
   - Webhook-based real-time registry updates
   - Registry analytics and usage metrics
   - Content verification and signature validation

3. **Template System** (see [../mcp-server-template-system.md](../mcp-server-template-system.md))
   - Comprehensive template parameter system
   - Template versioning and inheritance
   - Integration with Helm and Kustomize
   - Interactive template wizards and validation

4. **Integration Expansions**
   - GitOps workflow integration with ArgoCD/Flux
   - CI/CD pipeline integration
   - Service mesh integration for advanced networking
   - Multi-cluster registry synchronization

5. **Community Features**
   - Community ratings and reviews for registry entries
   - Automated server discovery from popular repositories
   - Registry contribution workflows and governance

## Conclusion

The Kubernetes Registry implementation provides a cloud-native approach to MCP server management that:

- **Leverages Kubernetes APIs** for native resource management and RBAC
- **Integrates with existing tooling** through standard kubectl and custom CLI commands
- **Supports ecosystem growth** through upstream format compatibility and conversion
- **Enables GitOps workflows** through declarative resource definitions
- **Scales operationally** with automated synchronization and registry-based deployment

This implementation transforms ToolHive into a comprehensive Kubernetes-native platform for MCP server lifecycle management, maintaining backward compatibility while enabling ecosystem integration through upstream format support.