-
Notifications
You must be signed in to change notification settings - Fork 522
Nodeservice etc host injection #1881
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
An extension to the DNS configuration that defines a list of services and namespaces which are appended to the existing cluster image registry entry stored in the SERVICES environment variable for the dns-resolver daemonset running in the openshift-dns namespace. This allows node services to resolve cluster services and connect to them. Potential use cases are for augmenting monitoring leveraging a cluster hosted service for any other means within a node scoped process.
cdfae8f to
fbdf513
Compare
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
@sdodson: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
| - "@Miciah" | ||
| - "@candita" | ||
| - "@alebedev87" | ||
| - "@JoelSpeed" | ||
| - "@everettraven" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Miciah @candita @alebedev87 @JoelSpeed @everettraven PTAL
I'll fix up the markdown lint issues but otherwise I think this covers the basics for the proposed implementation.
| // namespace is the namespace of the service. | ||
| // This field is required and must not be empty. | ||
| // +kubebuilder:validation:Required | ||
| // +kubebuilder:validation:MinLength=1 | ||
| Namespace string `json:"namespace,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have an API PR to give feedback on, or would you prefer detailed API feedback here?
This will need a max length of 63, and a validation to ensure compatibility with DNS1123 label formatting and the appropriate description for that validation included in the godoc (we can copy/paste from other APIs for all of this)
| // name is the name of the service. | ||
| // This field is required and must not be empty. | ||
| // +kubebuilder:validation:Required | ||
| // +kubebuilder:validation:MinLength=1 | ||
| Name string `json:"name,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will need to be validated as DNS1035 label and the documentation updated to describe that validation
Can copy/paste this from other fields using the same validation
| // +listType=map | ||
| // +listMapKey=namespace | ||
| // +listMapKey=name | ||
| NodeServices []NodeService `json:"nodeServices,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will need both a minItems and maxItems, I think we previously discussed limiting this to something small (32?)
| 3. The operator populates the node-resolver DaemonSet `SERVICES` environment variable with the list of service names | ||
| - Format: Comma-separated and/or space-separated list of service names in `name.namespace.svc` format | ||
| - Example: `SERVICES="image-registry.openshift-image-registry.svc, datastore.coredump.svc"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this already exist? If not, do we need the ambiguity of comma OR space separation? Cleaner to pick one no?
| 4. The DNS node resolver pods read the `SERVICES` environment variable on startup | ||
| 5. The `update-node-resolver.sh` script runs on each node and: | ||
| - Parses the list of service names from the `SERVICES` environment variable (comma/space-separated) | ||
| - For each service, performs DNS lookups using `dig` to resolve the ClusterIP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For services, isn't the ClusterIP allocated on the container network at the point where the service is first handled? Wondering if the nodes actually need to dig vs just being told here's the IP on the container network (moving the lookup to the operator not the daemonset)
Or are we looking at host network level routing to the services here?
| - Flexible format: supports comma-separated, space-separated, or mixed delimiters | ||
|
|
||
| **Limitations:** | ||
| - **Service list changes require pod restarts**: When services are added to or removed from the `nodeServices` list in the DNS CR, the operator must update the DaemonSet `SERVICES` environment variable, triggering a rolling restart of all node resolver pods |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How often are these changes likely to happen?
| 4. The DNS node resolver pods read the `SERVICES` environment variable on startup | ||
| 5. The `update-node-resolver.sh` script runs on each node and: | ||
| - Parses the list of service names from the `SERVICES` environment variable (comma/space-separated) | ||
| - For each service, performs DNS lookups using `dig` to resolve the ClusterIP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given this is about providing an option prior to DNS services being completely operational, does this work? What DNS is going to be available at this point?
| - Efficient: ~1.4KB for 20 services, minimal impact on environment variable limits | ||
| - Flexible format: supports comma-separated, space-separated, or mixed delimiters | ||
|
|
||
| **Limitations:** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section appears to duplicate a number of the advantages?
| - Different nodes temporarily track different service lists during the rollout | ||
|
|
||
| **For service IP changes** (ClusterIP of a tracked service changes): | ||
| - The `update-node-resolver.sh` script detects the IP change during its next 60-second polling cycle |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once the DNS infrastructure is up and running, does it make sense to continue to have these host entries as a fallback? CoreDNS has its own caching doesn't it, so once it has fetched each of these, it should be able to serve the other host components?
| - The DNS node resolver runs with elevated privileges (required to modify `/etc/hosts`) | ||
| - Only services with valid ClusterIPs will be added (no external IPs or LoadBalancer IPs) | ||
| - Service namespace and name validation prevents injection attacks | ||
| - The resolver will not add entries that conflict with existing system entries (e.g., localhost) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At what point are these entries validated/pruned?
Assisted by: Cursor (claude-sonnet-4.5)