19 changes: 9 additions & 10 deletions pkg/cvo/availableupdates.go
@@ -49,15 +49,7 @@ func (optr *Operator) syncAvailableUpdates(ctx context.Context, config *configv1

channel := config.Spec.Channel
desiredArch := optr.getDesiredArchitecture(config.Spec.DesiredUpdate)
- currentArch := runtime.GOARCH
-
- if optr.release.Architecture == configv1.ClusterVersionArchitectureMulti {
- currentArch = "multi"
Member:

We need this downcased form for Cincinnati, right? Here's Multi returning nothing, while multi returns 4.20.0:

    $ curl -s 'https://api.openshift.com/api/upgrades_info/graph?channel=stable-4.20&arch=Multi' | jq .nodes
    null
    $ curl -s 'https://api.openshift.com/api/upgrades_info/graph?channel=stable-4.20&arch=multi' | jq -c '[.nodes[] | .version]'
    ["4.20.0"]

@DavidHurta (Contributor, Author), Oct 31, 2025:

Yes, however, the currentArch variable is only used for the following comparison:

    if desiredArch == string(configv1.ClusterVersionArchitectureMulti) && currentArch != desiredArch {
        return current, []configv1.Release{current}, nil, nil
    }

We use desiredArch for the query parameter; it is downcased before the query parameters are created, at:

    releaseArch := desiredArch
    if desiredArch == string(configv1.ClusterVersionArchitectureMulti) {
        releaseArch = "multi"
    }
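The downcasing the author points at can be sketched as a self-contained helper. This is a minimal sketch, not the operator's actual code: `cincinnatiArch` and `graphQuery` are illustrative names, and the signature is simplified to plain strings.

```go
package main

import (
	"fmt"
	"net/url"
)

// cincinnatiArch maps the ClusterVersion architecture value to the
// downcased form the Cincinnati /graph endpoint expects: "Multi"
// becomes "multi", while GOARCH values like "amd64" pass through.
func cincinnatiArch(desiredArch string) string {
	if desiredArch == "Multi" {
		return "multi"
	}
	return desiredArch
}

// graphQuery builds the query string for an update-graph request.
func graphQuery(channel, desiredArch string) string {
	q := url.Values{}
	q.Set("channel", channel)
	q.Set("arch", cincinnatiArch(desiredArch))
	return q.Encode() // url.Values.Encode sorts keys alphabetically
}

func main() {
	fmt.Println(graphQuery("stable-4.20", "Multi")) // arch=multi&channel=stable-4.20
	fmt.Println(graphQuery("stable-4.20", "amd64"))
}
```

With this mapping, the capitalized API value never reaches the update service, which is why the `arch=Multi` curl above returns nothing while `arch=multi` returns nodes.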

Member:

Trying to check in multi-arch CI: 4.21 nightly multi CI -> 4.21.0-0.nightly-multi-2025-10-30-034235 -> but both e2e-ovn-serial-aws-multi-a-a-1of2 and e2e-ovn-serial-aws-multi-a-a-2of2 are failing to bootstrap.

Moving back to 4.20.0-0.nightly-multi: 4.20.0-0.nightly-multi-2025-10-30-035942 -> e2e-ovn-serial-aws-multi-a-a-1of2 -> Artifacts -> ... -> e2e artifacts:

    $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/logs/periodic-ci-openshift-multiarch-master-nightly-4.20-ocp-e2e-ovn-serial-aws-multi-a-a-1of2/1983747935866720256/artifacts/ocp-e2e-ovn-serial-aws-multi-a-a/openshift-e2e-test/artifacts/e2e.log | grep 'upgrade recommend'
    started: 0/4/45 "[Serial][sig-cli] oc adm upgrade recommend When the update service has conditional recommendations runs successfully with conditional recommendations to the --version target [Suite:openshift/conformance/serial]"
    passed: (26.9s) 2025-10-30T05:58:16 "[Serial][sig-cli] oc adm upgrade recommend When the update service has conditional recommendations runs successfully with conditional recommendations to the --version target [Suite:openshift/conformance/serial]"
    started: 0/24/45 "[Serial][sig-cli] oc adm upgrade recommend When the update service has no recommendations runs successfully [Suite:openshift/conformance/serial]"
    passed: (22s) 2025-10-30T06:22:53 "[Serial][sig-cli] oc adm upgrade recommend When the update service has no recommendations runs successfully [Suite:openshift/conformance/serial]"

which means this test logic is happy about the CVO showing the recommended update to the sha256:cccc... release. Checking gather-extra pod logs:

    $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/logs/periodic-ci-openshift-multiarch-master-nightly-4.20-ocp-e2e-ovn-serial-aws-multi-a-a-1of2/1983747935866720256/artifacts/ocp-e2e-ovn-serial-aws-multi-a-a/gather-extra/artifacts/pods/openshift-cluster-version_cluster-version-operator-f89bdbd46-sln5x_cluster-version-operator.log | grep '1030 05:5[78].* \(cincinnati\|availableupdates\)'
    I1030 05:57:50.548607       1 availableupdates.go:83] Retrieving available updates again, because more than 2m35.488910548s has elapsed since last attempt at 2025-10-30T05:54:37Z
    I1030 05:57:56.533407       1 availableupdates.go:77] Retrieving available updates again, because the channel has changed from "" to "test-channel"
    I1030 05:57:56.537343       1 cincinnati.go:114] Using a root CA pool with 0 root CA subjects to request updates from https://api.openshift.com/api/upgrades_info/v1/graph?arch=multi&channel=test-channel&id=021db8b4-7df4-4870-a035-24fc93f70d05&version=4.20.0-0.nightly-multi-2025-10-29-210751
    I1030 05:57:56.800389       1 availableupdates.go:398] Update service https://api.openshift.com/api/upgrades_info/v1/graph could not return available updates: VersionNotFound: currently reconciling cluster version 4.20.0-0.nightly-multi-2025-10-29-210751 not found in the "test-channel" channel
    I1030 05:57:56.811990       1 availableupdates.go:98] Available updates were recently retrieved, with less than 2m35.488910548s elapsed since 2025-10-30T05:57:56Z, will try later.
    I1030 05:57:57.042846       1 availableupdates.go:103] Retrieving available updates again, because the update service has changed from "" to "http://172.30.166.91:8000/graph" from ClusterVersion spec.upstream
    I1030 05:57:57.045682       1 cincinnati.go:114] Using a root CA pool with 0 root CA subjects to request updates from http://172.30.166.91:8000/graph?arch=multi&channel=test-channel&id=021db8b4-7df4-4870-a035-24fc93f70d05&version=4.20.0-0.nightly-multi-2025-10-29-210751
    I1030 05:57:57.101065       1 availableupdates.go:98] Available updates were recently retrieved, with less than 2m35.488910548s elapsed since 2025-10-30T05:57:57Z, will try later.
    I1030 05:58:06.331303       1 availableupdates.go:98] Available updates were recently retrieved, with less than 2m35.488910548s elapsed since 2025-10-30T05:57:57Z, will try later.
    I1030 05:58:16.883450       1 availableupdates.go:103] Retrieving available updates again, because the update service has changed from "http://172.30.166.91:8000/graph" to "https://api.openshift.com/api/upgrades_info/v1/graph" from the operator's default update service
    I1030 05:58:16.886723       1 cincinnati.go:114] Using a root CA pool with 0 root CA subjects to request updates from https://api.openshift.com/api/upgrades_info/v1/graph?arch=multi&channel=test-channel&id=021db8b4-7df4-4870-a035-24fc93f70d05&version=4.20.0-0.nightly-multi-2025-10-29-210751
    I1030 05:58:17.152154       1 availableupdates.go:398] Update service https://api.openshift.com/api/upgrades_info/v1/graph could not return available updates: VersionNotFound: currently reconciling cluster version 4.20.0-0.nightly-multi-2025-10-29-210751 not found in the "test-channel" channel
    I1030 05:58:17.164073       1 availableupdates.go:98] Available updates were recently retrieved, with less than 2m35.488910548s elapsed since 2025-10-30T05:58:17Z, will try later.

Well, I can see arch=multi in there, and it's not reporting VersionNotFound when pulling from the e2e test's 172.30.166.91. It would be nice if this single->multi transition branch were logged. Ah, here is cincinnati.go doing the Multi -> multi downcasing, so that's one bit I'd missed earlier.

Member:

ah, the multi CI is passing because those tests aren't setting spec.desired.architecture: Multi. We should grow some CI that exercises that.

- }
-
- if desiredArch == "" {
- desiredArch = currentArch
- }
+ currentArch := optr.getCurrentArchitecture()

// updates are only checked at most once per minimumUpdateCheckInterval or if the generation changes
optrAvailableUpdates := optr.getAvailableUpdates()
@@ -330,7 +322,14 @@ func (optr *Operator) getDesiredArchitecture(update *configv1.Update) string {
if update != nil {
Member:

Should this be if update != nil && len(update.Architecture) > 0? Otherwise, we could return an empty string here, and we'd rather fall back to getCurrentArchitecture, because your syncAvailableUpdates change is dropping the old:

    if desiredArch == "" {
        desiredArch = currentArch
    }

return string(update.Architecture)
}
- return ""
+ return optr.getCurrentArchitecture()
}

+ func (optr *Operator) getCurrentArchitecture() string {
+ 	if optr.release.Architecture == configv1.ClusterVersionArchitectureMulti {
+ 		return string(configv1.ClusterVersionArchitectureMulti)
+ 	}
+ 	return runtime.GOARCH
+ }

func calculateAvailableUpdatesStatus(ctx context.Context, clusterID string, transport *http.Transport, userAgent, updateService, desiredArch,
72 changes: 66 additions & 6 deletions pkg/cvo/availableupdates_test.go
@@ -111,22 +111,27 @@ func osusWithSingleConditionalEdge() (*httptest.Server, clusterconditions.Condit
return osus, mockPromql, updates, from
}

- func newOperator(url, version string, promqlMock clusterconditions.Condition) (*availableUpdates, *Operator) {
- currentRelease := configv1.Release{Version: version, Image: "payload/" + version}
+ func newOperator(url, version string, promqlMock clusterconditions.Condition, arch string) (*availableUpdates, *Operator) {
+ var currentReleaseArch configv1.ClusterVersionArchitecture
+ if arch == string(configv1.ClusterVersionArchitectureMulti) {
+ currentReleaseArch = configv1.ClusterVersionArchitectureMulti
+ }
+ currentRelease := configv1.Release{Version: version, Image: "payload/" + version, Architecture: currentReleaseArch}

registry := clusterconditions.NewConditionRegistry()
registry.Register("Always", &always.Always{})
registry.Register("PromQL", promqlMock)
operator := &Operator{
updateService: url,
- architecture: "amd64",
+ architecture: arch,
proxyLister: notFoundProxyLister{},
cmConfigManagedLister: notFoundConfigMapLister{},
conditionRegistry: registry,
queue: workqueue.NewTypedRateLimitingQueue[any](workqueue.DefaultTypedControllerRateLimiter[any]()),
release: currentRelease,
}
availableUpdates := &availableUpdates{
- Architecture: runtime.GOARCH,
+ Architecture: arch,
Current: configv1.Release{Version: version, Image: "payload/" + version},
}
return availableUpdates, operator
@@ -149,7 +154,7 @@ var availableUpdatesCmpOpts = []cmp.Option{
func TestSyncAvailableUpdates(t *testing.T) {
fakeOsus, mockPromql, expectedConditionalUpdates, version := osusWithSingleConditionalEdge()
defer fakeOsus.Close()
- expectedAvailableUpdates, optr := newOperator(fakeOsus.URL, version, mockPromql)
+ expectedAvailableUpdates, optr := newOperator(fakeOsus.URL, version, mockPromql, runtime.GOARCH)
expectedAvailableUpdates.UpdateService = fakeOsus.URL
expectedAvailableUpdates.ConditionalUpdates = expectedConditionalUpdates
expectedAvailableUpdates.Channel = cvFixture.Spec.Channel
@@ -231,7 +236,7 @@ func TestSyncAvailableUpdates_ConditionalUpdateRecommendedConditions(t *testing.
t.Run(tc.name, func(t *testing.T) {
fakeOsus, mockPromql, conditionalUpdates, version := osusWithSingleConditionalEdge()
defer fakeOsus.Close()
- availableUpdates, optr := newOperator(fakeOsus.URL, version, mockPromql)
+ availableUpdates, optr := newOperator(fakeOsus.URL, version, mockPromql, runtime.GOARCH)
optr.availableUpdates = availableUpdates
optr.availableUpdates.ConditionalUpdates = conditionalUpdates
expectedConditions := []metav1.Condition{{}}
@@ -435,3 +440,58 @@ func TestEvaluateConditionalUpdate(t *testing.T) {
})
}
}

func TestSyncAvailableUpdatesMultiArchAfterMigration(t *testing.T) {
fakeOsus, mockPromql, expectedConditionalUpdates, version := osusWithSingleConditionalEdge()
defer fakeOsus.Close()

expectedAvailableUpdates, optr := newOperator(fakeOsus.URL, version, mockPromql, "Multi")

cv := cvFixture.DeepCopy()
cv.Spec.DesiredUpdate = &configv1.Update{
Architecture: configv1.ClusterVersionArchitectureMulti,
Version: version,
Image: optr.release.Image,
Force: false,
}

expectedAvailableUpdates.UpdateService = fakeOsus.URL
expectedAvailableUpdates.ConditionalUpdates = expectedConditionalUpdates
expectedAvailableUpdates.Channel = cv.Spec.Channel
expectedAvailableUpdates.Condition = configv1.ClusterOperatorStatusCondition{
Type: configv1.RetrievedUpdates,
Status: configv1.ConditionTrue,
}

err := optr.syncAvailableUpdates(context.Background(), cv)

if err != nil {
t.Fatalf("syncAvailableUpdates() unexpected error: %v", err)
}
if diff := cmp.Diff(expectedAvailableUpdates, optr.availableUpdates, availableUpdatesCmpOpts...); diff != "" {
t.Fatalf("available updates differ from expected:\n%s", diff)
}
}

func TestSyncAvailableUpdatesMultiArchAfterMigrationDesiredUpdateNil(t *testing.T) {
Member:

nit: looks like this could be collapsed with TestSyncAvailableUpdatesMultiArchAfterMigration into a single TestSyncAvailableUpdatesMultiArch function that iterates over (test-case name, desired, expected) tuples, to reduce duplication and make it easy to add a test case for desired set but architecture unset.

fakeOsus, mockPromql, expectedConditionalUpdates, version := osusWithSingleConditionalEdge()
defer fakeOsus.Close()

expectedAvailableUpdates, optr := newOperator(fakeOsus.URL, version, mockPromql, "Multi")
expectedAvailableUpdates.UpdateService = fakeOsus.URL
expectedAvailableUpdates.ConditionalUpdates = expectedConditionalUpdates
expectedAvailableUpdates.Channel = cvFixture.Spec.Channel
expectedAvailableUpdates.Condition = configv1.ClusterOperatorStatusCondition{
Type: configv1.RetrievedUpdates,
Status: configv1.ConditionTrue,
}

err := optr.syncAvailableUpdates(context.Background(), cvFixture)

if err != nil {
t.Fatalf("syncAvailableUpdates() unexpected error: %v", err)
}
if diff := cmp.Diff(expectedAvailableUpdates, optr.availableUpdates, availableUpdatesCmpOpts...); diff != "" {
t.Fatalf("available updates differ from expected:\n%s", diff)
}
}