feat: add registry role for disconnected deployment #866
fabiendupont wants to merge 1 commit into seapath:main
Force-pushed from bb2edc6 to 6f951dc
In the current implementation, every node installs a registry locally and pulls and pushes the cephadm image. However, this is neither truly disconnected, since pulling requires internet access, nor resource-efficient, since a single registry is enough.

This commit introduces a registry role that deploys docker.io/registry:v2 and allows importing images from the internet (pull) or from an exported tarball (load). The seapath_setup_disconnected.yaml playbook installs the registry on the Ansible control node as a singleton.

TLS is enabled by default: the registry auto-generates a self-signed CA and server certificate when no user-provided certs are given. The CA is distributed to all cluster nodes so they trust the registry over HTTPS. The registry listens on port 443 to avoid specifying the port in image names.

The *_physical_machine roles are updated to use that registry as a mirror, which doesn't require changing the image names, both for Docker and Podman. They install the registry CA certificate in certs.d and set insecure = false when TLS is enabled.

The cephadm role is updated to remove image management, which is now handled by the registry role, so cephadm is focused on Ceph cluster management.

Contributes to seapath#442

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
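As a rough illustration of the mirror setup described in the commit message, the sketch below renders a containers-registries.conf (v2) fragment for Podman and the conventional certs.d path for the registry CA. It is not taken from the PR: the helper names `render_registries_conf` and `ca_cert_path`, the host `registry.example.com`, and the choice of `quay.io` as the mirrored upstream are all assumptions made for the example.

```python
def render_registries_conf(mirror_host, upstreams, tls=True):
    """Render a containers-registries.conf (v2) fragment that declares one
    mirror for each upstream registry.

    mirror_host and upstreams are hypothetical inputs, not values from the PR.
    """
    blocks = []
    for upstream in upstreams:
        blocks.append(
            "[[registry]]\n"
            f'prefix = "{upstream}"\n'
            f'location = "{upstream}"\n'
            "\n"
            "[[registry.mirror]]\n"
            f'location = "{mirror_host}"\n'
            # insecure = false means the mirror must present a trusted TLS cert
            f"insecure = {'false' if tls else 'true'}\n"
        )
    return "\n".join(blocks)


def ca_cert_path(mirror_host, engine="podman"):
    """Return the per-registry CA path under the engine's certs.d directory.

    No port suffix is needed in the directory name when the registry
    listens on the default HTTPS port 443.
    """
    base = {
        "podman": "/etc/containers/certs.d",
        "docker": "/etc/docker/certs.d",
    }[engine]
    return f"{base}/{mirror_host}/ca.crt"


if __name__ == "__main__":
    # Assumed example values; replace with the real control-node hostname.
    print(render_registries_conf("registry.example.com", ["quay.io"]))
    print(ca_cert_path("registry.example.com", engine="docker"))
```

Because the mirror is declared per upstream prefix, an image reference such as quay.io/ceph/ceph would stay unchanged on the nodes, which is the property the commit message relies on. How the roles in this PR actually template these files may differ.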
Force-pushed from 6f951dc to 5799123
Thanks for the PR, this is an interesting and well-structured proposal 👍 A few points I’d like to clarify and discuss.

1️⃣ Fully disconnected is already possible in the current setup

In the current implementation, it is possible to be fully disconnected, provided that images are made available at OS installation time. For example, with
I assume a similar approach is feasible for:
So strictly speaking, the setup is not inherently “internet-dependent” if the images are preloaded properly.

2️⃣ The real issue: cephadm’s pull behavior

The actual difficulty is not the base OS installation, but the behavior of
Even if images are already present locally:
see https://marc.info/?l=ceph-users&m=164399318917018

To be truly disconnected, we therefore need:
Before deciding on registry topology, I would really like to confirm something:
If such an option exists (or could exist), we could:
Right now, the registry requirement seems to stem from cephadm enforcing the pull validation step. If you have more information on whether this behavior is configurable or patchable, that would be very helpful.

3️⃣ Registry location: node-local vs controller-based

Regarding the architectural choice:
Both are technically valid trade-offs:
From my perspective, either:
But I think we should make that decision explicitly rather than implicitly switching models.

Summary
Looking forward to your feedback, especially regarding cephadm’s pull enforcement.