Thv registry deploy from controller #1931
Conversation
Extends the registry API server to support reading registry data from local files in addition to Kubernetes ConfigMaps. This enables deployment scenarios where ConfigMaps are mounted as volumes rather than accessed via the Kubernetes API. The server now accepts either:
- `--configmap` flag: reads from a Kubernetes ConfigMap via the API
- `--file` flag: reads from a mounted file (for volume-mounted ConfigMaps)
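A minimal sketch of how the two mutually exclusive flags described above might be handled. The flag names come from the PR description; the `serve` wiring and loader details are simplified and the actual implementation in the PR may differ:

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

func main() {
	configMap := flag.String("configmap", "", "Kubernetes ConfigMap to read registry data from (via the API)")
	file := flag.String("file", "", "path to a mounted registry file (e.g. a volume-mounted ConfigMap)")
	flag.Parse()

	// Exactly one of the two sources must be provided.
	if (*configMap == "") == (*file == "") {
		fmt.Fprintln(os.Stderr, "exactly one of --configmap or --file must be set")
		os.Exit(1)
	}

	if *file != "" {
		// Volume-mounted ConfigMap: read the registry data straight from disk.
		data, err := os.ReadFile(*file)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		_ = data // hand the data off to the registry API server
		return
	}
	// Otherwise fetch the ConfigMap through the Kubernetes API (omitted here).
}
```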
Thanks for this draft version! Added some comments but I generally like the idea
desc: Build the ToolHive runtime and Operator image locally and deploy it to the K8s cluster
desc: |
  Build the ToolHive runtime and Operator image locally and deploy it to the K8s cluster.
  Set DEPLOY_REGISTRY_API=true to also build and deploy the registry API image.
just thinking aloud: to enable the registry controller we use ENABLE_EXPERIMENTAL_FEATURES=true; shouldn't we have a single flag for these experimental features?
Yes, I'm going to just use ENABLE_EXPERIMENTAL_FEATURES=true for -deploy-local, because there we also need to do a conditional build. For non-local installations from the registry we already have a value one can set during the helm invocation.
this is a leftover of a previously closed issue: ENABLE_EXPERIMENTAL_FEATURES is not propagated to the *-deploy-* tasks. The only way to deploy the registry controller so far is by executing ENABLE_EXPERIMENTAL_FEATURES=true task operator-run
yes, this will now be fixed for deploy-local
// GetStorageReference returns a reference to where the data is stored
GetStorageReference(mcpRegistry *mcpv1alpha1.MCPRegistry) *mcpv1alpha1.StorageReference
The original purpose of the StorageManager interface in the sources package was to implement data storage functions for the data imported from the configured data source. Shouldn't we have a dedicated package (registry_api? deployment?...) for the functions related to deploying the Registry API?
in general, I'd move the functions that deploy the registry API to a dedicated package
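For illustration, one possible shape for such a dedicated package. The package and interface names are assumptions rather than anything from the PR; only `mcpv1alpha1.MCPRegistry` is taken from the diff, and the import path for the API types may need adjusting to the actual module layout:

```go
// Package registryapi is a hypothetical home for Registry API deployment
// logic, kept separate from the sources.StorageManager interface.
package registryapi

import (
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"

	// Adjust to the actual module path of the operator's API types.
	mcpv1alpha1 "github.com/stacklok/toolhive/cmd/thv-operator/api/v1alpha1"
)

// DeploymentManager groups the functions used to deploy the Registry API.
type DeploymentManager interface {
	// BuildDeployment returns the desired Deployment serving the registry data.
	BuildDeployment(mcpRegistry *mcpv1alpha1.MCPRegistry) *appsv1.Deployment
	// BuildService returns the desired Service exposing the registry API.
	BuildService(mcpRegistry *mcpv1alpha1.MCPRegistry) *corev1.Service
}
```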
func (r *MCPRegistryReconciler) isAPIReady(ctx context.Context, mcpRegistry *mcpv1alpha1.MCPRegistry) bool {
	ctxLogger := log.FromContext(ctx).WithValues("mcpregistry", mcpRegistry.Name)

	deploymentName := fmt.Sprintf("%s-api", mcpRegistry.Name)
maybe this and similar name generators should go into dedicated functions
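Something like the following helper could centralize the name generation (the function name is illustrative):

```go
// registryAPIDeploymentName centralizes the "%s-api" pattern used above so
// every caller derives the Deployment name the same way.
func registryAPIDeploymentName(registryName string) string {
	return fmt.Sprintf("%s-api", registryName)
}
```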
	// Get current hash from source using the existing handler
	hash, err := sourceHandler.CurrentHash(context.Background(), mcpRegistry)
	if err != nil {
		// If we can't get the hash, return a time-based hash
I don't think `fmt.Sprintf("error-%d", time.Now().Unix())` is a "time-based hash". Either update the comment or the code?
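If the code rather than the comment is what gets changed, a fallback that actually is a time-based hash could look roughly like this (sketch only; the helper name is hypothetical):

```go
import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"time"
)

// timeBasedFallbackHash hashes the current timestamp so the fallback value
// really is a hash rather than a formatted error string.
func timeBasedFallbackHash() string {
	sum := sha256.Sum256([]byte(fmt.Sprintf("error-%d", time.Now().Unix())))
	return hex.EncodeToString(sum[:])
}
```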
	ctxLogger := log.FromContext(ctx).WithValues("mcpregistry", mcpRegistry.Name)

	// Build the desired service configuration
	service := buildRegistryAPIService(mcpRegistry)
maybe I can't follow the flows in the review, but it seems that the ensureDeployment method first checks for the existence of the deployment before invoking buildRegistryAPIDeployment, while here we do it the other way around: first we invoke buildRegistryAPIService, then check for the existence of the service.
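For comparison, a sketch of the check-then-build ordering applied to the Service path; it assumes the reconciler embeds a controller-runtime client, and identifiers other than `buildRegistryAPIService` and the reconciler type are assumptions (uses `k8s.io/api/core/v1` plus apimachinery's `api/errors` and `types` packages):

```go
func (r *MCPRegistryReconciler) ensureService(ctx context.Context, mcpRegistry *mcpv1alpha1.MCPRegistry) error {
	// Check whether the Service already exists before building the desired one,
	// mirroring the ordering in ensureDeployment.
	existing := &corev1.Service{}
	err := r.Get(ctx, types.NamespacedName{
		Name:      fmt.Sprintf("%s-api", mcpRegistry.Name),
		Namespace: mcpRegistry.Namespace,
	}, existing)
	if apierrors.IsNotFound(err) {
		service := buildRegistryAPIService(mcpRegistry)
		return r.Create(ctx, service)
	}
	return err
}
```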
},
containerName: "registry-api",
expectError: false,
expectedArgs: []string{"serve", "--file=/data/registry/registry.json"},
this `/data/registry/registry.json` should be a new constant, or a function using the two existing constants: the mount path and ConfigMapStorageDataKey
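A sketch of the helper suggested here (requires `path/filepath`); `registryDataMountPath` and the function name are hypothetical, while `ConfigMapStorageDataKey` is the existing constant mentioned in the comment:

```go
// registryDataMountPath is where the ConfigMap volume is mounted in the
// registry API container (illustrative constant name).
const registryDataMountPath = "/data/registry"

// registryDataFilePath derives the --file argument from the mount path and
// the ConfigMap data key instead of hard-coding the full path in tests.
func registryDataFilePath() string {
	return filepath.Join(registryDataMountPath, ConfigMapStorageDataKey)
}
```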
fetchResult, err := s.fetchAndProcessRegistryData(ctx, mcpRegistry)
if err != nil {
	return ctrl.Result{RequeueAfter: time.Minute * 5}, nil
	return ctrl.Result{RequeueAfter: time.Minute * 5}, err
Returning both a ctrl.Result and an error would trigger a WARNING while running the controller, because the reconciliation loop expects either a requeue request or an error. You should then see a warning like:
9:44AM INFO Warning: Reconciler returned both a non-zero result and a non-nil error. The result will always be ignored if the error is non-nil and the non-nil error causes requeuing with exponential backoff. For more details, see: https://pkg.go.dev/sigs.k8s.io/controller-runtime/pkg/reconcile#Reconciler ...
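Based on that contract, a sketch of the two non-conflicting return shapes; `fetchAndProcessRegistryData` is from the diff above, the rest is illustrative:

```go
fetchResult, err := s.fetchAndProcessRegistryData(ctx, mcpRegistry)
if err != nil {
	// Option A: return the error and let controller-runtime requeue with
	// exponential backoff (the Result is ignored when err != nil).
	return ctrl.Result{}, err

	// Option B: swallow the error and requeue after a fixed delay instead.
	// return ctrl.Result{RequeueAfter: time.Minute * 5}, nil
}
_ = fetchResult
```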
I think this shouldn't have gone into this PR. I'll back off this change. I think I was testing failure conditions and updates yesterday and managed to run `git -am`.
# -- Service account configuration for the registry API
serviceAccount:
  # -- Specifies whether a service account should be created
  create: true
what's the alternative, if the SA is not created?
I just reused the pattern from deploy/charts/operator/templates/serviceaccount.yaml. I will get rid of this condition and just rely on the experimental feature one.
Opening as a draft until PR #1909 gets merged. My testing went well; there are two things that we might address as part of the review or as follow-ups:
Fixes: #1746