Add status.conditions to DiscoveredCluster #1988

Merged
openshift-merge-bot[bot] merged 12 commits into stolostron:main from dislbenn:add-status-conditions
May 1, 2026

Conversation

@dislbenn
Contributor

@dislbenn dislbenn commented Apr 29, 2026

Description

Implements a proper Kubernetes status subresource with conditions to provide detailed observability into DiscoveredCluster state. Replaces the custom condition types with the standard metav1.Condition for better Kubernetes ecosystem compatibility.

Related Issue

N/A - Enhancement to provide better status observability

Changes Made

Status Structure

Replaced custom DiscoveredClusterCondition with standard metav1.Condition:

Before:

type DiscoveredClusterCondition struct {
    Type DiscoveredClusterConditionType
    Status metav1.ConditionStatus
    LastUpdateTime metav1.Time
    LastTransitionTime metav1.Time
}

After:

Conditions []metav1.Condition  // Standard K8s condition type

Condition Types

  1. Available - Tracks cluster availability based on OCM telemetry

    • Status=True + Reason=RecentTelemetry: Cluster is Active
    • Status=False + Reason=StaleTelemetry: Cluster is Stale
    • Message: Includes last telemetry timestamp for troubleshooting
  2. Managed - Tracks import status

    • Status=True + Reason=ImportedAsManagedCluster: Cluster imported
    • Status=False + Reason=NotImported: Not imported

Controller Updates

  • Added updateStatus() method to populate conditions during reconcile
  • Added buildStatusConditions() to construct conditions from cluster state
  • Conditions updated on every reconcile cycle
  • Only updates status if conditions changed (avoids unnecessary writes)

Example Status

status:
  conditions:
  - type: Available
    status: "False"
    reason: StaleTelemetry
    message: "Cluster is stale. Last telemetry: 2026-04-29 05:26:57 UTC"
    lastTransitionTime: "2026-04-29T19:00:00Z"
    observedGeneration: 1
    
  - type: Managed
    status: "False"
    reason: NotImported
    message: "Cluster has not been imported"
    lastTransitionTime: "2026-04-29T19:00:00Z"
    observedGeneration: 1

Benefits

  • Better observability - Clear explanation of why cluster is Active/Stale
  • Troubleshooting context - Timestamp shows when last seen
  • Multiple states - Tracks availability AND managed status separately
  • K8s best practice - Uses standard metav1.Condition
  • Automation friendly - kubectl wait --for=condition=Available
  • kubectl describe - Rich status information visible to users

Checklist

  • I have tested the changes locally and they are functioning as expected.
  • I have updated the documentation (if necessary) to reflect the changes.
  • I have added/updated relevant unit tests (if applicable).
  • I have ensured that my code follows the project's coding standards.
  • I have checked for any potential security issues and addressed them.
  • I have added necessary comments to the code, especially in complex or unclear sections.
  • I have rebased my branch on top of the latest main/master branch.

Additional Notes

This change maintains backward compatibility - spec.status still exists and is populated from OCM. The status.conditions provide additional context and details.

Users can now run:

kubectl wait --for=condition=Available discoveredcluster/my-cluster
kubectl describe discoveredcluster/my-cluster  # Shows detailed status

Reviewers

/cc @ngraham20 @cameronmwall

Definition of Done

  • Code is reviewed.
  • Code is tested.
  • Documentation is updated.
  • All checks and tests pass.
  • Approved by at least one reviewer.
  • Merged into the main/master branch.

Summary by CodeRabbit

  • New Features

    • Enhanced DiscoveredCluster status conditions: added message, reason, and observedGeneration; removed lastUpdateTime; conditions now keyed by type (map semantics) for stable merging; improved availability/managed condition semantics and observability.
  • Chores

    • Bumped Operator SDK builder metadata to v1.42.2 and expanded install RBAC to allow reading OpenShift apiserver configuration.
  • Tests

    • Added unit tests covering status/condition construction and equality logic.

@coderabbitai

coderabbitai Bot commented Apr 29, 2026

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: e79390d8-7bdc-4f01-8fe4-287cbd25e573

📥 Commits

Reviewing files that changed from the base of the PR and between 9ce1f8c and 4a4e240.

📒 Files selected for processing (7)
  • .tekton/discovery-operator-main-unit-test.yaml
  • .tekton/discovery-operator-mce-217-pull-request.yaml
  • .tekton/discovery-operator-mce-217-push.yaml
  • bundle/manifests/discovery.clusterserviceversion.yaml
  • bundle/manifests/discovery.open-cluster-management.io_discoveredclusters.yaml
  • controllers/discoveredcluster_controller.go
  • controllers/discoveredcluster_controller_test.go
📝 Walkthrough

Refactors DiscoveredCluster conditions from an enum to string-based types, removes LastUpdateTime, adds Message/Reason/ObservedGeneration, updates CRD schemas and generated deepcopy logic, bumps bundle metadata, and adds controller logic/tests to compute and reconcile status conditions each reconcile loop.

Changes

| Cohort / File(s) | Summary |
|---|---|
| API Type Definition: api/v1/discoveredcluster_types.go | Switched DiscoveredClusterCondition.Type to string; removed LastUpdateTime; added Message, Reason, ObservedGeneration; removed DiscoveredClusterConditionType; added ConditionAvailable, ConditionManaged, and reason constants; kubebuilder merge/patch annotations added to Status.Conditions. |
| Generated Code: api/v1/zz_generated.deepcopy.go | Adjusted deepcopy behavior to account for removed LastUpdateTime; enhanced Filter deepcopy to deep-copy slices (ClusterTypes, InfrastructureProviders, Regions). |
| CRD Schema: bundle/manifests/..._discoveredclusters.yaml, config/crd/bases/..._discoveredclusters.yaml | CRD updated: removed spec.owner → added spec.provenance, spec.supportLevel, spec.usage; status.conditions items remove lastUpdateTime, add message, reason, observedGeneration; conditions list changed to map keyed by type and made type required. |
| Bundle Metadata: bundle.Dockerfile, bundle/metadata/annotations.yaml, bundle/manifests/discovery.clusterserviceversion.yaml | Bumped operator-sdk builder label/annotation from operator-sdk-v1.41.1 to operator-sdk-v1.42.2; updated CSV createdAt timestamp; extended CSV RBAC to include read/list/watch for OpenShift config.openshift.io apiservers. |
| Reconciliation Logic: controllers/discoveredcluster_controller.go | Added status update flow: fetch fresh resource, build desired Status.Conditions (Available/Managed) from spec (Status, ActivityTimestamp, IsManagedCluster), compare by Type/Status/Reason/Message/ObservedGeneration, and update status subresource when changed. |
| Tests: controllers/discoveredcluster_controller_test.go | New tests for buildStatusConditions and conditionEqual: validate Available/Managed semantics, reasons/messages, and ObservedGeneration usage; ensure equality ignores LastTransitionTime differences. |

Sequence Diagram(s)

sequenceDiagram
  participant Reconciler as Reconciler
  participant API as Kubernetes API
  participant Status as Status Subresource
  rect rgba(200,230,201,0.5)
  Reconciler->>API: GET DiscoveredCluster
  API-->>Reconciler: DiscoveredCluster object
  Reconciler->>Reconciler: buildStatusConditions(spec, meta)
  Reconciler->>API: GET latest DiscoveredCluster (fresh copy)
  API-->>Reconciler: fresh object
  Reconciler->>Status: compare & PATCH status (if changed)
  Status-->>Reconciler: 200 OK / conflict error
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐇 I nibble at enums, strings now take flight,

Message and Reason make status feel right,
Reconciler hops, checks each condition,
ObservedGeneration marks the transition,
A carrot of clarity—hoppity delight!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: adding status.conditions to DiscoveredCluster, which is the central objective of this PR. |
| Description check | ✅ Passed | The PR description is comprehensive and well-structured, covering purpose, changes, examples, benefits, checklist completion, and additional notes that align with the template requirements. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@openshift-ci openshift-ci Bot added the approved label (PR approval has been given) Apr 29, 2026
Implemented proper Kubernetes status subresource with conditions to track
cluster state. Replaces custom condition type with standard metav1.Condition
for better K8s ecosystem compatibility.

Changes:
- Replaced custom DiscoveredClusterCondition with metav1.Condition
- Added condition types: Available, Managed
- Added condition reasons: RecentTelemetry, StaleTelemetry, ImportedAsManagedCluster, NotImported
- Updated DiscoveredClusterStatus to use []metav1.Condition
- Added updateStatus() to populate conditions in reconcile loop
- Available condition tracks Active/Stale state from OCM with timestamp details
- Managed condition tracks whether cluster is imported as ManagedCluster
- Regenerated deepcopy and CRDs

Benefits:
- Explains WHY cluster is Active/Stale (no recent telemetry)
- Shows WHEN last seen (timestamp in message)
- Tracks managed state separately from availability
- kubectl describe shows full troubleshooting context
- Compatible with kubectl wait --for=condition

Signed-off-by: dislbenn <dbennett@redhat.com>
Added comprehensive test coverage for the new status condition functions:
- Test_Reconciler_buildStatusConditions: Tests condition generation for
  Active/Stale and Managed/NotManaged states
- Test_conditionEqual: Tests condition comparison logic

Updated bundle/manifests/discovery.open-cluster-management.io_discoveredclusters.yaml
to include the new status.conditions schema.

All tests passing.

Signed-off-by: dislbenn <dbennett@redhat.com>
Signed-off-by: dislbenn <dbennett@redhat.com>
@dislbenn force-pushed the add-status-conditions branch from e48bb6e to 251d89b April 29, 2026 19:27
When a DiscoveredCluster is deleted during reconciliation, the
updateStatus function should not treat it as an error. This can happen
in test scenarios where resources are cleaned up while reconcile is
in progress.

Changes:
- Added NotFound error handling when fetching fresh resource copy
- Added NotFound error handling when updating status subresource

Fixes test failure in: Test_DiscoveredCluster_Reconciler_Reconcile/should_create_auto_import_Secret_object

Signed-off-by: dislbenn <dbennett@redhat.com>
metav1.Condition has required fields (type, status, reason, message,
lastTransitionTime) baked into the upstream Kubernetes API definition.
To allow all condition fields to be optional, we need a custom type.

Changes:
- Defined custom DiscoveredClusterCondition struct with +optional markers
- Replaced []metav1.Condition with []DiscoveredClusterCondition in status
- Updated controller buildStatusConditions to return custom type
- Updated conditionEqual helper function signature
- Updated all tests to use custom condition type
- Regenerated CRDs and bundle manifests

Result: No required fields in conditions schema
Signed-off-by: dislbenn <dbennett@redhat.com>
Kubernetes requires that fields used in x-kubernetes-list-map-keys must
either be required or have a default value. Since 'type' is the key field
for merging conditions, it must be required.

Changes:
- Removed +optional marker from DiscoveredClusterCondition.Type
- Removed omitempty from Type json tag
- Regenerated CRDs with required: [type]

This fixes the KinD test failure:
'this property is in x-kubernetes-list-map-keys, so it must have a
default or be a required property'

Signed-off-by: dislbenn <dbennett@redhat.com>

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@controllers/discoveredcluster_controller.go`:
- Around line 137-150: The updateStatus function currently computes
newConditions from the stale reconcile snapshot dc; re-read the resource into
fresh and use that fresh object when calling r.buildStatusConditions and when
setting ObservedGeneration/status fields so the status reflects the latest
spec/generation; update all references in updateStatus from dc to fresh (e.g.,
call r.buildStatusConditions(ctx, fresh) and use
fresh.Generation/ObservedGeneration) to avoid writing stale reasons/messages.
- Around line 149-177: The new conditions currently replace
fresh.Status.Conditions wholesale (built by buildStatusConditions) and reset
lastTransitionTime for every condition; change the logic so that after obtaining
newConditions you iterate over newConditions and for each entry find the
matching condition by Type in fresh.Status.Conditions and, if the Status value
is identical, copy the existing condition's LastTransitionTime into the new
condition (only update LastTransitionTime when the Status value actually
changes). You can implement this either inside buildStatusConditions (pass the
old slice to reuse timestamps) or immediately after calling
buildStatusConditions before assigning fresh.Status.Conditions; refer to
buildStatusConditions, fresh.Status.Conditions and the equality check
(conditionEqual) to locate the relevant code paths.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: e1374216-9d2c-4a87-a927-9b39fc993765

📥 Commits

Reviewing files that changed from the base of the PR and between 3e07e53 and 13eb608.

📒 Files selected for processing (9)
  • api/v1/discoveredcluster_types.go
  • api/v1/zz_generated.deepcopy.go
  • bundle.Dockerfile
  • bundle/manifests/discovery.clusterserviceversion.yaml
  • bundle/manifests/discovery.open-cluster-management.io_discoveredclusters.yaml
  • bundle/metadata/annotations.yaml
  • config/crd/bases/discovery.open-cluster-management.io_discoveredclusters.yaml
  • controllers/discoveredcluster_controller.go
  • controllers/discoveredcluster_controller_test.go

Comment thread controllers/discoveredcluster_controller.go
Comment thread controllers/discoveredcluster_controller.go Outdated

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@api/v1/discoveredcluster_types.go`:
- Around line 114-132: The condition fields Status, LastTransitionTime, Message
and Reason are incorrectly marked optional; remove the "+optional" markers and
the `omitempty` json tags for Status, LastTransitionTime, Message and Reason in
the condition struct (the fields named Status, LastTransitionTime, Message,
Reason on the condition type in discoveredcluster_types.go) so they are required
per KEP-1623; keep ObservedGeneration as appropriate, then regenerate
CRDs/bundle (run the project's codegen/crd generation) to reflect the required
fields.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 45f36a22-ba3b-45e0-942c-aa254a18bef1

📥 Commits

Reviewing files that changed from the base of the PR and between 13eb608 and 9ce1f8c.

📒 Files selected for processing (4)
  • api/v1/discoveredcluster_types.go
  • bundle/manifests/discovery.clusterserviceversion.yaml
  • bundle/manifests/discovery.open-cluster-management.io_discoveredclusters.yaml
  • config/crd/bases/discovery.open-cluster-management.io_discoveredclusters.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
  • bundle/manifests/discovery.clusterserviceversion.yaml

Comment thread api/v1/discoveredcluster_types.go
dislbenn and others added 2 commits April 30, 2026 09:41
Regenerated bundle to include Provenance, SupportLevel, and Usage fields
added in stolostron#1987.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: dislbenn <dbennett@redhat.com>
- Use fresh resource copy in buildStatusConditions instead of stale dc
- Preserve LastTransitionTime when condition Status unchanged
- Only update LastTransitionTime when Status actually changes

Addresses CodeRabbit review comments on PR stolostron#1988.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: dislbenn <dbennett@redhat.com>
Signed-off-by: dislbenn <dbennett@redhat.com>
dislbenn and others added 2 commits April 30, 2026 11:25
Previously, status.conditions were only populated on ROSA clusters because
the controller's event filter limited reconciliation to ROSA and MCE-HCP types.
This left OCP, ARO, OSD, and other cluster types without status conditions.

Changes:
- Removed ShouldReconcile predicate and WithEventFilter from SetupWithManager
- Controller now reconciles all DiscoveredCluster types for status updates
- Auto-import logic remains safely restricted to ROSA/MCE-HCP via:
  * Webhook validation (rejects importAsManagedCluster on unsupported types)
  * Reconcile guard (if ImportAsManagedCluster check)
  * Switch statement (default case skips non-ROSA/MCE-HCP)
- Removed Test_Reconciler_ShouldReconcile (function deleted)
- Removed unused event/predicate imports
- Updated function docstrings from "..." placeholders

Result: All cluster types now receive Available/Managed status conditions

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: dislbenn <dbennett@redhat.com>
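
The resulting control flow can be pictured with the simplified sketch below: status is computed for every cluster type, while the import path stays behind both the ImportAsManagedCluster guard and a type switch. The struct, the `reconcile` signature, and the type strings "ROSA" and "MCE-HCP" are assumptions for illustration and are not taken from the controller.

```go
package main

// DiscoveredCluster is a hypothetical, trimmed-down view of the resource.
type DiscoveredCluster struct {
	Type                   string // assumed values, e.g. "ROSA", "MCE-HCP", "OCP"
	ImportAsManagedCluster bool
}

// reconcile reports which steps would run for the given cluster.
func reconcile(dc DiscoveredCluster) (statusUpdated, importStarted bool) {
	// Status conditions are now computed for every cluster type.
	statusUpdated = true

	// Reconcile guard: nothing more to do unless import was requested.
	if !dc.ImportAsManagedCluster {
		return statusUpdated, false
	}

	// Auto-import stays limited to the supported types.
	switch dc.Type {
	case "ROSA", "MCE-HCP":
		importStarted = true
	default:
		// Unsupported types are skipped; the webhook should already have
		// rejected importAsManagedCluster for them.
	}
	return statusUpdated, importStarted
}
```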
Changed if-else chain to tagged switch for clearer auth method handling.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: dislbenn <dbennett@redhat.com>
@cameronmwall
Contributor

/lgtm

@openshift-ci

openshift-ci Bot commented May 1, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cameronmwall, dislbenn

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [cameronmwall,dislbenn]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot Bot merged commit 0118479 into stolostron:main May 1, 2026
12 checks passed

Labels

approved (PR approval has been given), dco-signoff: yes, lgtm

2 participants