Commit f6cad56

Activate data source synchronization (#77)
* data sync at server start time
* with file data source and initial sync
* injected log
* periodical sync
* removed configmap provider
* with sync coordinator package
* Added data folder to git ignored list
* update
* minor comments from copilot
* with app builder
* removed unused K8sRegistryDataProvider
* removed duplications b/w FileRegistryDataProvider and FileStorageManager
* added test coverage
* linter issues
* restrict file permissions
* allow to launch from any folder

Signed-off-by: Daniele Martinoli <dmartino@redhat.com>
1 parent ed6471b commit f6cad56

43 files changed: +3955 -1013 lines changed

.gitignore

Lines changed: 4 additions & 1 deletion
@@ -38,4 +38,7 @@ go.work.sum
 # Build output
 /bin/
 /dist/
-/coverage/
+/coverage/
+
+# data folder created by local examples
+data/

CLAUDE.md

Lines changed: 2 additions & 3 deletions
@@ -37,9 +37,8 @@ The codebase follows clean architecture with three layers:
    - Provider abstraction with factory pattern

 3. **Provider Layer** (backends for registry data)
-   - `FileRegistryDataProvider` - Local file backend
-   - `K8sRegistryDataProvider` - Kubernetes ConfigMap backend
-   - `K8sDeploymentProvider` - Queries deployed instances
+   - `FileRegistryDataProvider` - Local file backend (reads synced data)
+   - `K8sDeploymentProvider` - Queries deployed MCP server instances

 ### Key Patterns for AI Development

README.md

Lines changed: 150 additions & 44 deletions
@@ -26,18 +26,19 @@ The ToolHive Registry API (`thv-registry-api`) implements the official [Model Co
 ## Features

 - **Standards-compliant**: Implements the official MCP Registry API specification
-- **Multiple backends**: Supports Kubernetes ConfigMaps and file-based registry data
+- **Multiple data sources**: Git repositories, Kubernetes ConfigMaps, API endpoints, and local files
+- **Automatic synchronization**: Background sync with configurable intervals and retry logic
 - **Container-ready**: Designed for deployment in Kubernetes clusters
 - **Flexible deployment**: Works standalone or as part of ToolHive infrastructure
-- **Production-ready**: Built-in health checks, graceful shutdown, and basic observability
+- **Production-ready**: Built-in health checks, graceful shutdown, and sync status persistence

 ## Quick Start

 ### Prerequisites

 - Go 1.23 or later
 - [Task](https://taskfile.dev) for build automation
-- Access to a Kubernetes cluster (for ConfigMap backend)
+- Access to a Kubernetes cluster (for ConfigMap data source)

 ### Building the binary

@@ -48,22 +49,33 @@ task build

 ### Running the Server

-**From a Kubernetes ConfigMap:**
+All configuration is done via YAML configuration files. See the [examples/](examples/) directory for sample configurations.
+
+**Quick start with Git source:**
 ```bash
-thv-registry-api serve \
-  --from-configmap my-registry-cm \
-  --registry-name my-registry
+thv-registry-api serve --config examples/config-git.yaml
 ```

-**From a local file:**
+**With Kubernetes ConfigMap:**
 ```bash
-thv-registry-api serve \
-  --from-file /path/to/registry.json \
-  --registry-name my-registry
+thv-registry-api serve --config examples/config-configmap.yaml
+```
+
+**With local file:**
+```bash
+thv-registry-api serve --config examples/config-file.yaml
 ```

 The server starts on port 8080 by default. Use `--address :PORT` to customize.

+**What happens when the server starts:**
+1. Loads configuration from the specified YAML file
+2. Immediately fetches registry data from the configured source
+3. Starts background sync coordinator for automatic updates
+4. Serves MCP Registry API endpoints on the configured address
+
+For detailed configuration options and examples, see the [examples/README.md](examples/README.md).
+
 ## API Endpoints

 The server implements the standard MCP Registry API:
@@ -78,32 +90,75 @@ See the [MCP Registry API specification](https://github.com/modelcontextprotocol

 ## Configuration

-### Command-line Flags
+All configuration is done via YAML files. The server requires a `--config` flag pointing to a YAML configuration file.
+
+### Configuration File Structure
+
+```yaml
+# Registry name/identifier (optional, defaults to "default")
+registryName: my-registry
+
+# Data source configuration (required)
+source:
+  # Source type: git, configmap, api, or file
+  type: git
+
+  # Data format: toolhive (native) or upstream (MCP registry format)
+  format: toolhive
+
+  # Source-specific configuration
+  git:
+    repository: https://github.com/stacklok/toolhive.git
+    branch: main
+    path: pkg/registry/data/registry.json
+
+# Automatic sync policy (required)
+syncPolicy:
+  # Sync interval (e.g., "30m", "1h", "24h")
+  interval: "30m"
+
+# Optional: Server filtering
+filter:
+  names:
+    include: ["official/*"]
+    exclude: ["*/deprecated"]
+  tags:
+    include: ["production"]
+    exclude: ["experimental"]
+```

-The `thv-registry-api serve` command supports the following flags:
+### Command-line Flags

 | Flag | Description | Required | Default |
 |------|-------------|----------|---------|
+| `--config` | Path to YAML configuration file | Yes | - |
 | `--address` | Server listen address | No | `:8080` |
-| `--from-configmap` | ConfigMap name containing registry data | Yes* | - |
-| `--from-file` | Path to registry.json file | Yes* | - |
-| `--registry-name` | Registry identifier | Yes | - |

-*One of `--from-configmap` or `--from-file` must be specified (mutually exclusive)
+### Data Sources
+
+The server supports four data source types:
+
+1. **Git Repository** - Clone and sync from Git repositories
+   - Supports branch, tag, or commit pinning
+   - Ideal for version-controlled registries
+   - Example: [config-git.yaml](examples/config-git.yaml)
+
+2. **Kubernetes ConfigMap** - Read from ConfigMaps in the cluster
+   - Requires Kubernetes API access and RBAC permissions
+   - Ideal for Kubernetes-native deployments
+   - Example: [config-configmap.yaml](examples/config-configmap.yaml)

-### Backend Options
+3. **API Endpoint** - Sync from upstream MCP Registry APIs
+   - Supports federation and aggregation scenarios
+   - Format conversion from upstream to ToolHive format
+   - Example: [config-api.yaml](examples/config-api.yaml)

-#### ConfigMap Backend
-Fetches registry data from a Kubernetes ConfigMap. Requires:
-- Kubernetes API access (in-cluster or via kubeconfig)
-- ConfigMap with a `registry.json` key containing MCP registry data
-- Appropriate RBAC permissions to read ConfigMaps
+4. **Local File** - Read from filesystem
+   - Ideal for local development and testing
+   - Supports mounted volumes in containers
+   - Example: [config-file.yaml](examples/config-file.yaml)

-#### File Backend
-Reads registry data from a local file. Useful for:
-- Mounting ConfigMaps as volumes in Kubernetes
-- Local development and testing
-- Static registry deployments
+For complete configuration examples and advanced options, see [examples/README.md](examples/README.md).

 ## Development

@@ -133,25 +188,47 @@ task build-image

 ```
 cmd/thv-registry-api/
-├── api/v1/              # REST API handlers and routes
+├── api/                 # REST API implementation
+│   └── v1/              # API v1 handlers and routes
 ├── app/                 # CLI commands and application setup
-├── internal/service/    # Business logic and data providers
-│   ├── file_provider.go # File-based registry backend
-│   ├── k8s_provider.go  # Kubernetes ConfigMap backend
-│   ├── provider.go      # Provider interfaces
+├── internal/service/    # Legacy service layer (being refactored)
+│   ├── file_provider.go # File-based registry provider
+│   ├── k8s_provider.go  # Kubernetes provider
 │   └── service.go       # Core service implementation
 └── main.go              # Application entry point
+
+pkg/
+├── config/              # Configuration loading and validation
+├── sources/             # Data source handlers
+│   ├── git.go           # Git repository source
+│   ├── configmap.go     # Kubernetes ConfigMap source
+│   ├── api.go           # API endpoint source
+│   ├── file.go          # File system source
+│   ├── factory.go       # Source handler factory
+│   └── storage_manager.go # Storage abstraction
+├── sync/                # Sync manager and coordination
+│   └── manager.go       # Background sync logic
+└── status/              # Sync status tracking
+    └── persistence.go   # Status file persistence
+
+examples/                # Example configurations
 ```

 ### Architecture

-The server follows a clean architecture pattern:
+The server follows a clean architecture pattern with the following layers:

-1. **API Layer** (`api/v1`): HTTP handlers implementing the MCP Registry API
-2. **Service Layer** (`internal/service`): Business logic for registry operations
-3. **Provider Layer**: Pluggable backends for registry data sources
-   - `FileRegistryDataProvider`: Reads from local files
-   - `K8sRegistryDataProvider`: Fetches from Kubernetes ConfigMaps
+1. **API Layer** (`cmd/thv-registry-api/api`): HTTP handlers implementing the MCP Registry API
+2. **Service Layer** (`cmd/thv-registry-api/internal/service`): Legacy business logic (being refactored)
+3. **Configuration Layer** (`pkg/config`): YAML configuration loading and validation
+4. **Source Handler Layer** (`pkg/sources`): Pluggable data source implementations
+   - `GitSourceHandler`: Clones Git repositories and extracts registry files
+   - `ConfigMapSourceHandler`: Reads from Kubernetes ConfigMaps
+   - `APISourceHandler`: Fetches from upstream MCP Registry APIs
+   - `FileSourceHandler`: Reads from local filesystem
+5. **Sync Manager** (`pkg/sync`): Coordinates automatic registry synchronization
+6. **Storage Layer** (`pkg/sources`): Persists registry data to local storage
+7. **Status Tracking** (`pkg/status`): Tracks and persists sync status

 ### Testing

@@ -184,10 +261,33 @@ spec:
         image: ghcr.io/stacklok/toolhive/thv-registry-api:latest
         args:
         - serve
-        - --from-configmap=my-registry
-        - --registry-name=my-registry
+        - --config=/etc/registry/config.yaml
         ports:
         - containerPort: 8080
+        volumeMounts:
+        - name: config
+          mountPath: /etc/registry
+      volumes:
+      - name: config
+        configMap:
+          name: registry-api-config
+---
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: registry-api-config
+data:
+  config.yaml: |
+    registryName: my-registry
+    source:
+      type: configmap
+      format: toolhive
+      configmap:
+        namespace: toolhive-system
+        name: my-registry-data
+        key: registry.json
+    syncPolicy:
+      interval: "15m"
 ```

 ### Docker
@@ -196,10 +296,16 @@ spec:
 # Build the image
 task build-image

-# Run with file backend
-docker run -v /path/to/registry.json:/data/registry.json \
+# Run with Git source
+docker run -v $(pwd)/examples:/config \
+  ghcr.io/stacklok/toolhive/thv-registry-api:latest \
+  serve --config /config/config-git.yaml
+
+# Run with file source (mount local registry file)
+docker run -v $(pwd)/examples:/config \
+  -v /path/to/registry.json:/data/registry.json \
   ghcr.io/stacklok/toolhive/thv-registry-api:latest \
-  serve --from-file /data/registry.json --registry-name my-registry
+  serve --config /config/config-file.yaml
 ```

 ## Integration with ToolHive
