if err != nil {
	if apierrors.IsNotFound(err) {
		log.V(1).
			Info("DaemonSet not found, will set INSTANA_PERSIST_HOST_UNIQUE_ID", "daemonset", dsName)
		return true, reconcileContinue()
	}
	// On other errors, log and default to not setting it to be safe
	log.Error(
		err,
		"failed to check existing DaemonSet, will not set INSTANA_PERSIST_HOST_UNIQUE_ID",
		"daemonset",
		dsName,
	)
	return false, reconcileContinue()
}
When r.client.Get fails for reasons other than NotFound (for example, transient API server or network errors), this code logs the error and returns false with reconcileContinue(), so reconciliation proceeds with a desired DaemonSet that omits INSTANA_PERSIST_HOST_UNIQUE_ID. In that failure mode, an existing DaemonSet that already has the env var can be patched to remove it, which violates the intended "keep if already present" behavior and can change host identity persistence unexpectedly. This path should return a reconcile failure or requeue instead of continuing.
I pushed a potential patch
Can you please check the test case results? Looks like the e2e test is not working yet:

We were waiting on the agent release that ran yesterday to be able to run this release. Tested the changes locally in the OCP cluster:
With the new volume mount, the file is present on the pod:
After restarting the pod, the ID is persisted on the node.
Retriggering the pipeline
06d517d to
f3ac56c
Signed-off-by: Milica Cvrkota <milica.cvrkota@ibm.com>
Return reconcile failure when reading an existing daemonset fails with non-NotFound errors, and propagate that failure in zoned daemonset builder selection to prevent unintended env var removal. Add tests for helper failure behavior, zoned builder failure propagation, and apply short-circuit when daemonset reads fail.
Signed-off-by: Konrad Ohms <konrad.ohms@de.ibm.com>
Signed-off-by: Milica Cvrkota <milica.cvrkota@ibm.com>
Summary
This PR enables agent ID persistence across pod restarts by automatically setting the INSTANA_PERSIST_HOST_UNIQUE_ID environment variable on new deployments and mounting /var/lib/instana for storage. The changes are upgrade-safe and preserve existing configurations.
Changes
Environment Variable Management:
- INSTANA_PERSIST_HOST_UNIQUE_ID=true
- agent.pod.env take precedence
Volume Configuration:
- /var/lib/instana volume mount with DirectoryOrCreate type
- /var/lib/instana (not entire /var/lib) to minimize CVE exposure
- /var/lib/instana/instana-agent-id
Implementation:
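As an illustration only (not the code in this PR), the precedence rule listed under Environment Variable Management could be sketched as below. The `envVar` type and `withDefaultEnv` function are hypothetical names introduced for this example. The sketch assumes that "take precedence" means a user-supplied entry with the same name suppresses the operator's default.

```go
package main

import "fmt"

// envVar is a hypothetical, simplified stand-in for a container env entry.
type envVar struct {
	Name  string
	Value string
}

// String renders the entry in NAME=value form.
func (e envVar) String() string {
	return fmt.Sprintf("%s=%s", e.Name, e.Value)
}

// withDefaultEnv returns user plus def, unless user already defines a
// variable with the same name, in which case the user's value wins and
// the slice is returned unchanged.
func withDefaultEnv(user []envVar, def envVar) []envVar {
	for _, e := range user {
		if e.Name == def.Name {
			return user
		}
	}
	// Copy before appending so the caller's slice is not modified.
	return append(append([]envVar{}, user...), def)
}

func main() {
	def := envVar{Name: "INSTANA_PERSIST_HOST_UNIQUE_ID", Value: "true"}

	// No user override: the default is added.
	fmt.Println(withDefaultEnv(nil, def))

	// User override (as if set via agent.pod.env): the default is skipped.
	override := []envVar{{Name: "INSTANA_PERSIST_HOST_UNIQUE_ID", Value: "false"}}
	fmt.Println(withDefaultEnv(override, def))
}
```

Checking by name rather than by name and value means that even an explicit user value of `false` is respected.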
Testing
References
Checklist
Note: Remember to run a helm chart release after the operator release to make the changes available through helm.