@psalajova psalajova commented Jan 14, 2026

This PR implements support for the 3-level GSM secret naming structure (collection__group__field) in ci-operator, as per the design document.

Changes

3-level hierarchy

Secrets now use collection__group__field naming instead of collection__name:
Before:

credentials:
  - name: aws-creds
    collection: test-platform-infra
    mount_path: /tmp/aws

After:

credentials:
  - collection: test-platform-infra
    group: cluster-init
    field: aws-creds
    mount_path: /tmp/aws
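
The naming change can be sketched as a small helper. This is a minimal illustration, assuming the three parts are joined with a double underscore exactly as the collection__group__field pattern reads; `gsmSecretName` is a hypothetical name, not a function from this PR, and any escaping of forbidden characters is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// gsmSecretName is a hypothetical helper that joins the three levels of the
// hierarchy with "__". The real code may additionally escape characters that
// are not allowed in secret names; that step is not shown here.
func gsmSecretName(collection, group, field string) string {
	return strings.Join([]string{collection, group, field}, "__")
}

func main() {
	// Values taken from the "After" example above.
	fmt.Println(gsmSecretName("test-platform-infra", "cluster-init", "aws-creds"))
}
```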

Supported credential resolution modes

There are three ways to reference credentials:

  1. Bundle reference (via gsm-config.yaml):
credentials:
  - bundle: aws-bundle
    mount_path: /tmp/aws
  2. Auto-discovery (lists all fields for collection+group):
credentials:
  - collection: test-platform-infra
    group: cluster-init
    mount_path: /tmp/secrets
  3. Explicit field:
credentials:
  - collection: test-platform-infra
    group: cluster-init
    field: aws-creds
    mount_path: /tmp/aws
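
The three modes above can be told apart by which fields are set. The sketch below is an assumption about that logic, not the PR's implementation: the field and method names match those listed for CredentialReference in this PR, but the predicate bodies and the `mode` helper are hypothetical.

```go
package main

import "fmt"

// CredentialReference carries the fields named in this PR. The struct here is
// a simplified stand-in without YAML/JSON tags.
type CredentialReference struct {
	Bundle     string
	Collection string
	Group      string
	Field      string
	MountPath  string
}

// The predicate bodies below are assumptions about how the modes are detected.
func (c CredentialReference) IsBundleReference() bool { return c.Bundle != "" }

func (c CredentialReference) IsAutoDiscovery() bool {
	return c.Bundle == "" && c.Collection != "" && c.Group != "" && c.Field == ""
}

func (c CredentialReference) IsExplicitField() bool {
	return c.Bundle == "" && c.Collection != "" && c.Group != "" && c.Field != ""
}

// mode is a hypothetical helper that names the resolution mode of a reference.
func mode(c CredentialReference) string {
	switch {
	case c.IsBundleReference():
		return "bundle"
	case c.IsExplicitField():
		return "explicit field"
	case c.IsAutoDiscovery():
		return "auto-discovery"
	default:
		return "invalid"
	}
}

func main() {
	refs := []CredentialReference{
		{Bundle: "aws-bundle", MountPath: "/tmp/aws"},
		{Collection: "test-platform-infra", Group: "cluster-init", MountPath: "/tmp/secrets"},
		{Collection: "test-platform-infra", Group: "cluster-init", Field: "aws-creds", MountPath: "/tmp/aws"},
	}
	for _, r := range refs {
		fmt.Printf("%s -> %s\n", r.MountPath, mode(r))
	}
}
```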

Implementation details

  • Wired gsmConfig through main.go → defaults → multi_stage chain
  • Credentials are resolved once during createSPCs() before SecretProviderClass creation
  • Added validation to prevent file collisions when different groups share the same mount path
  • Censoring SPCs now track credentials by full secret name (collection__group__field)
  • Updated all tests to use 3-level structure
  • A separate PR is needed to ensure ci-operator always has gsm-config.yaml available, since bundle resolution depends on it.

Jira: https://issues.redhat.com/browse/DPTP-4656

@openshift-ci-robot

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To manually trigger all second-stage jobs, use the /pipeline required command.

This repository is configured in: automatic mode

coderabbitai bot commented Jan 14, 2026

Walkthrough

This change integrates Google Secret Manager (GSM) support into ci-tools infrastructure. It migrates GSM types from the secretbootstrap package to the api package, extends the CredentialReference API with bundle/group/field fields, adds GSM configuration to multi-stage test steps, implements credential reference resolution through GSM bundles, and provides index secret creation/updating capabilities for GSM sync workflows.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Command-line Tool GSM Integration | cmd/ci-operator/main.go | Added GSM config path, credentials file flags, and config loading when Secrets Store CSI driver enabled; passes GSM config to renderer. |
| CI Secret Bootstrap Migration | cmd/ci-secret-bootstrap/main.go, cmd/ci-secret-bootstrap/main_test.go | Migrated all GSM-related type references from secretbootstrap.* to api.*; updated function signatures for GSM config, bundle filtering, and validation; aliased vault API import to disambiguate. |
| CI Secret Generator GSM Sync | cmd/ci-secret-generator/main.go, cmd/ci-secret-generator/main_test.go | Added GSM sync flag to updateSecrets signature; introduced index secret construction and update calls; added comprehensive test coverage for index secret creation with various scenarios and error handling. |
| API Package Type Migrations | pkg/api/gsm.go, pkg/api/gsm_test.go | Renamed Bundle to GSMBundle, updated field types, package declaration changed from secretbootstrap to api. |
| CredentialReference API Extension | pkg/api/types.go | Added Bundle, Group, Field fields to CredentialReference; changed Collection to use omitempty; added IsAutoDiscovery(), IsExplicitField(), IsBundleReference() methods. |
| Deepcopy Code Generation | pkg/api/zz_generated.deepcopy.go | Added autogenerated DeepCopyInto() and DeepCopy() methods for DockerConfigSpec, FieldEntry, GSMBundle, GSMConfig, GSMSecretRef, RegistryAuthData, TargetSpec. |
| Defaults Flow GSM Wiring | pkg/defaults/defaults.go, pkg/defaults/defaults_test.go | Extended FromConfig and fromConfig signatures to include GSM config and credentials file; added GSM client initialization when CSI driver enabled; threaded GSM configuration through multi-stage step creation. |
| GSM Secrets Management | pkg/gsm-secrets/secrets.go, pkg/gsm-secrets/types.go, pkg/gsm-secrets/execution.go | Added ConstructIndexSecretContent() function and UpdaterSASecretName constant; updated index secret creation to use new content builder. |
| Secrets Client Interface & Mocks | pkg/secrets/client.go, pkg/secrets/client_mock.go | Added UpdateIndexSecret() method to Client interface; generated comprehensive GoMock implementations for all three client interfaces with full recorder support. |
| GSM and Vault Client Implementations | pkg/secrets/gsm.go, pkg/secrets/vault.go | Implemented UpdateIndexSecret() on both gsmSyncDecorator and vault clients; updated GSM secret naming to use generalized scheme; added TestPlatformCollection constant; updated SetFieldOnItem to propagate to GSM. |
| Multi-stage CSI Utilities & Bundle Resolution | pkg/steps/multi_stage/csi_utils.go, pkg/steps/multi_stage/csi_utils_test.go | Replaced credential-name-based paths with GSM-secret-name-based paths; renamed replaceForbiddenSymbolsInCredentialName to restoreForbiddenSymbolsInSecretName; updated validation error messages; refactored SPC naming to use credential fields. |
| GSM Bundle Resolver | pkg/steps/multi_stage/gsm_bundle_resolver.go | New file introducing bundle-based credential resolution with three pathways: bundle expansion, auto-discovery via GSM, and explicit field specification; includes mount path collision validation. |
| Multi-stage Step Execution | pkg/steps/multi_stage/multi_stage.go, pkg/steps/multi_stage/multi_stage_test.go, pkg/steps/multi_stage/run_test.go | Added GSMConfiguration struct; extended MultiStageTestStep and newMultiStageTestStep signatures with GSM parameter; added GSM client lifecycle management and validation in Run(). |
| Multi-stage Init & Tests | pkg/steps/multi_stage/init.go, pkg/steps/multi_stage/init_test.go, pkg/steps/multi_stage/gen_test.go | Pre-resolves credential references via GSM; updated SPC creation to use resolved credentials; deduplicates by full GSM secret name; updated test fixtures to use new credential representation with Group/Field. |
| Web Registry Reference Documentation | pkg/webreg/zz_generated.ci_operator_reference.go | Updated credential entry templates from collection-based keys to bundle-based schema with explicit field and group specifications. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 14, 2026
@openshift-ci openshift-ci bot requested review from Prucek and hector-vido January 14, 2026 17:13
openshift-ci bot commented Jan 14, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: psalajova

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 14, 2026
Comment on lines 112 to 115
gsmClient *secretmanager.Client
gsmCredentialsFile string
gsmConfig *api.GSMConfig
gsmProjectConfig gsm.Config
Member

Is it necessary to send all these params to the multistage step? Couldn't it just use the gsmClient that would be created in defaults maybe?

Contributor Author

Yeah, you have a point, I don't like this many new parameters either... I'll look into it, maybe I can simplify it.

Contributor Author

I've moved them all into a common struct
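
For readers following the thread, the grouping might look roughly like the sketch below. The field names Config, CredentialsFile, ProjectConfig and Client appear elsewhere in this PR; the placeholder types, the constructor signature and the `checkGSM` helper are assumptions made only so the example compiles on its own.

```go
package main

import "fmt"

// Placeholder types standing in for *api.GSMConfig, gsm.Config and
// *secretmanager.Client; they exist only to keep the sketch self-contained.
type gsmConfig struct{}
type projectConfig struct{}
type secretClient struct{}

// GSMConfiguration bundles the four values that were previously passed to the
// multi-stage step as separate parameters.
type GSMConfiguration struct {
	Config          *gsmConfig
	CredentialsFile string
	ProjectConfig   projectConfig
	Client          *secretClient
}

type multiStageTestStep struct {
	enableSecretsStoreCSIDriver bool
	gsm                         *GSMConfiguration
}

// newMultiStageTestStep takes a single GSM parameter instead of four.
// The real constructor has many more arguments; they are omitted here.
func newMultiStageTestStep(enableCSI bool, gsm *GSMConfiguration) *multiStageTestStep {
	return &multiStageTestStep{enableSecretsStoreCSIDriver: enableCSI, gsm: gsm}
}

// checkGSM is a hypothetical helper mirroring the nil checks done before SPC creation.
func (s *multiStageTestStep) checkGSM() error {
	if s.enableSecretsStoreCSIDriver && (s.gsm == nil || s.gsm.Client == nil) {
		return fmt.Errorf("GSM client was not initialized")
	}
	return nil
}

func main() {
	step := newMultiStageTestStep(true, &GSMConfiguration{Client: &secretClient{}})
	fmt.Println(step.checkGSM())
}
```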

@psalajova psalajova force-pushed the ci-operator-include-groups-and-mapping-file branch from b93030e to d3b02d3 Compare January 19, 2026 16:05
@psalajova psalajova force-pushed the ci-operator-include-groups-and-mapping-file branch from d3b02d3 to d796b98 Compare January 19, 2026 16:16
@psalajova psalajova changed the title [WIP] ci-operator: Implement 3-level naming hierarchy changes to CSI mechanism ci-operator: Implement 3-level naming hierarchy changes to CSI mechanism Jan 19, 2026
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 19, 2026
@psalajova
Contributor Author

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 20, 2026
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@pkg/defaults/defaults.go`:
- Around line 171-197: When enableSecretsStoreCSIDriver is true, do not swallow
errors from gsm.GetConfigFromEnv or secretmanager.NewClient; instead return the
initialization error immediately so GSM is not left nil. In the block that
constructs gsmConfiguration (symbols: gsmConfiguration,
enableSecretsStoreCSIDriver, gsm.GetConfigFromEnv, secretmanager.NewClient),
replace the logrus.WithError(...).Error calls with returning a wrapped error
(e.g. fmt.Errorf("gsm init: %w", err)) from the surrounding function; ensure the
function signature and callers are updated to propagate this error instead of
assuming success. Also keep assigning gsmConfiguration.ProjectConfig and
gsmConfiguration.Client only on success.

In `@pkg/steps/multi_stage/multi_stage.go`:
- Around line 225-231: Remove the per-step deferred close of the shared GSM
client: inside the run method where you check s.enableSecretsStoreCSIDriver and
s.gsm (and currently call defer s.gsm.Client.Close()), delete that defer so the
shared s.gsm.Client is not closed by each step (this prevents breaking later
steps and race conditions); ensure the GSM client lifecycle is managed centrally
(closed once at job shutdown) and leave the rest of the logic (the nil checks
and call to s.createSPCs) unchanged.
🧹 Nitpick comments (3)
cmd/ci-secret-generator/main_test.go (1)

581-586: Consider adding a dedicated field for disabled clusters instead of matching test name.

Using tc.name string matching to determine test behavior is fragile. If the test name changes, this conditional silently stops working correctly.

♻️ Suggested improvement

Add a disabledClusters field to the test case struct:

 testCases := []struct {
     name               string
     config             secretgenerator.Config
     GSMsyncEnabled     bool
     expectedIndexCalls int
     verifyIndexPayload func(t *testing.T, itemName string, payload []byte)
+    disabledClusters   sets.Set[string]
 }{

Then use tc.disabledClusters directly instead of the name-based conditional.

pkg/steps/multi_stage/gsm_bundle_resolver.go (1)

46-54: Non-deterministic group ordering in error message.

The groupList slice is populated by iterating over a map, which has non-deterministic order. This can cause the error message to vary between runs, making debugging and test assertions harder.

♻️ Suggested fix
+import "sort"
+
 for key, groups := range mountPathGroups {
     if len(groups) > 1 {
         var groupList []string
         for group := range groups {
             groupList = append(groupList, group)
         }
+        sort.Strings(groupList)
         return fmt.Errorf("multiple groups (%v) found for collection=%s, mount_path=%s - different groups in the same collection must use different mount paths to avoid file name collisions",
             groupList, key.collection, key.mountPath)
     }
 }
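
A self-contained sketch of the same idea follows, assuming a simplified credential shape. The type and function names (`credential`, `mountKey`, `validateNoGroupCollisions`) are hypothetical and the error text is shortened; only the sort-before-format step is the point.

```go
package main

import (
	"fmt"
	"sort"
)

// credential and mountKey are simplified, hypothetical stand-ins for the
// resolver's internal types.
type credential struct{ collection, group, mountPath string }
type mountKey struct{ collection, mountPath string }

// validateNoGroupCollisions rejects credentials where two different groups of
// the same collection share a mount path. Group names are sorted before being
// formatted so the error message is stable across runs.
func validateNoGroupCollisions(creds []credential) error {
	mountPathGroups := map[mountKey]map[string]struct{}{}
	for _, c := range creds {
		k := mountKey{c.collection, c.mountPath}
		if mountPathGroups[k] == nil {
			mountPathGroups[k] = map[string]struct{}{}
		}
		mountPathGroups[k][c.group] = struct{}{}
	}
	for key, groups := range mountPathGroups {
		if len(groups) > 1 {
			groupList := make([]string, 0, len(groups))
			for g := range groups {
				groupList = append(groupList, g)
			}
			sort.Strings(groupList) // map iteration order is random; sort for determinism
			return fmt.Errorf("multiple groups (%v) found for collection=%s, mount_path=%s",
				groupList, key.collection, key.mountPath)
		}
	}
	return nil
}

func main() {
	err := validateNoGroupCollisions([]credential{
		{"test-platform-infra", "cluster-init", "/tmp/secrets"},
		{"test-platform-infra", "build-farm", "/tmp/secrets"},
	})
	fmt.Println(err)
}
```

Note that when several mount paths collide, which one is reported first still depends on map iteration order; sorting only stabilizes the group list inside a single message.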
pkg/gsm-secrets/secrets.go (1)

87-89: Consider creating a defensive copy of the input slice.

The function appends to and sorts the input secretsList in place, which mutates the caller's slice. While current usage passes an empty slice, this could cause subtle bugs if callers expect their input to remain unchanged.

♻️ Suggested defensive copy
 func ConstructIndexSecretContent(secretsList []string) []byte {
-	secretsList = append(secretsList, UpdaterSASecretName)
-	sort.Strings(secretsList)
+	result := make([]string, len(secretsList), len(secretsList)+1)
+	copy(result, secretsList)
+	result = append(result, UpdaterSASecretName)
+	sort.Strings(result)

 	var formattedSecrets []string
-	for _, secret := range secretsList {
+	for _, secret := range result {
 		formattedSecrets = append(formattedSecrets, fmt.Sprintf("- %s", secret))
 	}

 	return []byte(strings.Join(formattedSecrets, "\n"))
 }
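
To see why the copy matters, the runnable sketch below calls a copy-based version and then inspects the caller's slice. The lowercase function name and the constant's value are placeholders; the real value of UpdaterSASecretName is not shown in this PR.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// updaterSASecretName uses a placeholder value for illustration only.
const updaterSASecretName = "updater-service-account"

// constructIndexSecretContent works on a private copy, so neither the append
// nor the sort can touch the caller's backing array.
func constructIndexSecretContent(secretsList []string) []byte {
	result := make([]string, len(secretsList), len(secretsList)+1)
	copy(result, secretsList)
	result = append(result, updaterSASecretName)
	sort.Strings(result)

	formatted := make([]string, 0, len(result))
	for _, secret := range result {
		formatted = append(formatted, fmt.Sprintf("- %s", secret))
	}
	return []byte(strings.Join(formatted, "\n"))
}

func main() {
	input := []string{"zeta", "alpha"}
	fmt.Println(string(constructIndexSecretContent(input)))
	// The caller's slice keeps its original order.
	fmt.Println(input)
}
```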

Comment on lines +171 to +197
// Initialize GSM configuration if CSI driver is enabled
var gsmConfiguration *multi_stage.GSMConfiguration
if enableSecretsStoreCSIDriver {
	gsmConfiguration = &multi_stage.GSMConfiguration{
		Config:          gsmConfig,
		CredentialsFile: gsmCredentialsFile,
	}

	// Get GSM project config from environment
	gsmProjectConfig, err := gsm.GetConfigFromEnv()
	if err != nil {
		logrus.WithError(err).Error("Failed to get GSM project config from environment")
	} else {
		gsmConfiguration.ProjectConfig = gsmProjectConfig
	}

	// Initialize GSM client with credentials
	var opts []option.ClientOption
	if gsmCredentialsFile != "" {
		opts = append(opts, option.WithCredentialsFile(gsmCredentialsFile))
	}
	gsmClient, err := secretmanager.NewClient(ctx, opts...)
	if err != nil {
		logrus.WithError(err).Error("Failed to initialize GSM client")
	} else {
		gsmConfiguration.Client = gsmClient
	}
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the function containing these lines
cd /tmp/repo 2>/dev/null || cd . && \
cat -n pkg/defaults/defaults.go | sed -n '150,210p'

Repository: openshift/ci-tools

Length of output: 2799


🏁 Script executed:

# Let's also check the function signature to understand return types
cat -n pkg/defaults/defaults.go | sed -n '1,180p' | grep -A 30 "^[[:space:]]*[0-9]\+\s*func"

Repository: openshift/ci-tools

Length of output: 2516


🏁 Script executed:

# Get more context about how this function is structured
wc -l pkg/defaults/defaults.go && \
head -200 pkg/defaults/defaults.go | tail -100

Repository: openshift/ci-tools

Length of output: 4109


🏁 Script executed:

# Find all usages of gsmConfiguration after line 197
sed -n '197,$p' pkg/defaults/defaults.go | head -200 | grep -n "gsmConfiguration"

Repository: openshift/ci-tools

Length of output: 331


🏁 Script executed:

# Get more comprehensive view of gsmConfiguration usage
rg "gsmConfiguration" pkg/defaults/defaults.go -n -A 2 -B 2

Repository: openshift/ci-tools

Length of output: 1687


🏁 Script executed:

# Check if there are any validation checks for gsmConfiguration fields
rg "ProjectConfig|\.Client" pkg/defaults/defaults.go -n -B 2 -A 2 | head -60

Repository: openshift/ci-tools

Length of output: 1270


🏁 Script executed:

# Check the MultiStageTestStep function to see how it uses gsmConfiguration
rg "func MultiStageTestStep" pkg/ -A 30 | head -60

Repository: openshift/ci-tools

Length of output: 2208


🏁 Script executed:

# Look for how ProjectConfig and Client are used in multi_stage package
fd "multi_stage" --type f | head -10

Repository: openshift/ci-tools

Length of output: 733


🏁 Script executed:

# Find the multi_stage package and check GSMConfiguration usage
find . -path "*/multi_stage*" -type f -name "*.go" | head -5

Repository: openshift/ci-tools

Length of output: 238


🏁 Script executed:

# Check how GSMConfiguration fields are used in multi_stage package
rg "gsmConfiguration|\.ProjectConfig|\.Client" pkg/steps/multi_stage/ -n -B 2 -A 2 | head -100

Repository: openshift/ci-tools

Length of output: 1395


🏁 Script executed:

# Look at the GSMConfiguration struct definition
rg "type GSMConfiguration" pkg/steps/multi_stage/ -A 10

Repository: openshift/ci-tools

Length of output: 875


🏁 Script executed:

# Check if ProjectConfig or Client are accessed without nil checks
rg "ProjectConfig|gsm\.Client" pkg/steps/multi_stage/ -n -B 2 -A 2

Repository: openshift/ci-tools

Length of output: 4182


Return error when GSM initialization fails instead of logging and continuing.

When CSI driver is enabled and GSM client or project config initialization fails, continuing with nil values defers error detection to runtime. The code already validates these fields later (multi_stage.go line 226 checks s.gsm.Client == nil), so fail fast during initialization rather than pushing failures downstream.

Proposed fix
-		// Get GSM project config from environment
-		gsmProjectConfig, err := gsm.GetConfigFromEnv()
-		if err != nil {
-			logrus.WithError(err).Error("Failed to get GSM project config from environment")
-		} else {
-			gsmConfiguration.ProjectConfig = gsmProjectConfig
-		}
+		// Get GSM project config from environment
+		gsmProjectConfig, err := gsm.GetConfigFromEnv()
+		if err != nil {
+			return nil, nil, fmt.Errorf("failed to get GSM project config from environment: %w", err)
+		}
+		gsmConfiguration.ProjectConfig = gsmProjectConfig
 
 		// Initialize GSM client with credentials
 		var opts []option.ClientOption
 		if gsmCredentialsFile != "" {
 			opts = append(opts, option.WithCredentialsFile(gsmCredentialsFile))
 		}
-		gsmClient, err := secretmanager.NewClient(ctx, opts...)
-		if err != nil {
-			logrus.WithError(err).Error("Failed to initialize GSM client")
-		} else {
-			gsmConfiguration.Client = gsmClient
-		}
+		gsmClient, err := secretmanager.NewClient(ctx, opts...)
+		if err != nil {
+			return nil, nil, fmt.Errorf("failed to initialize GSM client: %w", err)
+		}
+		gsmConfiguration.Client = gsmClient

Comment on lines 225 to 231
if s.enableSecretsStoreCSIDriver {
	if s.gsm == nil || s.gsm.Client == nil {
		return fmt.Errorf("GSM client was not initialized - credentials file may be missing")
	}
	defer s.gsm.Client.Close()
	logrus.Info("Using initialized GSM client")
	if err := s.createSPCs(ctx); err != nil {
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, understand the repository structure and locate relevant files
git ls-files | grep -E "(multi_stage|gsm|secret)" | head -20

Repository: openshift/ci-tools

Length of output: 960


🏁 Script executed:

# Search for GSM client initialization and usage
rg -i "gsm.*client|client.*gsm" --type go -B 2 -A 2 | head -50

Repository: openshift/ci-tools

Length of output: 4350


🏁 Script executed:

# Look at the specific file and surrounding context
cat -n pkg/steps/multi_stage/multi_stage.go | sed -n '200,250p'

Repository: openshift/ci-tools

Length of output: 2134


🏁 Script executed:

# Search for GSM client initialization in defaults
rg "gsm.*=|\.gsm\s*=" --type go -B 3 -A 3

Repository: openshift/ci-tools

Length of output: 16123


🏁 Script executed:

# Check for Close() calls on GSM client
rg "\.gsm.*Close|gsm.*Client.*Close" --type go -B 2 -A 2

Repository: openshift/ci-tools

Length of output: 770


🏁 Script executed:

# Look at multiStageTestStep struct to understand its lifecycle
rg "type multiStageTestStep struct" --type go -A 20

Repository: openshift/ci-tools

Length of output: 1746


🏁 Script executed:

# Find how multiStageTestStep is instantiated and used
rg "multiStageTestStep\{|&multiStageTestStep|newMultiStageTestStep" --type go -B 3 -A 5

Repository: openshift/ci-tools

Length of output: 7277


🏁 Script executed:

# Search for where job shutdown/cleanup happens
rg "Close\(\)|defer.*Close|cleanup" pkg/steps/multi_stage/ --type go -B 2 -A 2

Repository: openshift/ci-tools

Length of output: 1069


🏁 Script executed:

# Check if there's a cleanup/teardown pattern at job level
rg -i "teardown|shutdown|cleanup|finali" pkg/steps/multi_stage/ --type go | head -20

Repository: openshift/ci-tools

Length of output: 437


🏁 Script executed:

# Search for how GSM client should be managed at a higher level
rg "gsmConfiguration|GSMConfiguration" --type go -B 2 -A 2 | head -80

Repository: openshift/ci-tools

Length of output: 4425


🏁 Script executed:

# Find where step.Run() is called and how many times
rg "\.Run\(ctx\)" --type go -B 3 -A 1 | head -60

Repository: openshift/ci-tools

Length of output: 4180


🏁 Script executed:

# Look for Step interface and its Run method
rg "type Step interface" --type go -A 20

Repository: openshift/ci-tools

Length of output: 1493


🏁 Script executed:

# Search for job execution and cleanup patterns
rg -i "defer.*close|jobfinish|jobcleanup" --type go -B 2 -A 2 | head -40

Repository: openshift/ci-tools

Length of output: 2815


🏁 Script executed:

# Search for where gsmConfiguration is used and if there's cleanup
rg "gsmConfiguration" --type go -B 1 -A 3 | grep -E "(defer|Close|cleanup|return)" | head -20

Repository: openshift/ci-tools

Length of output: 111


🏁 Script executed:

# Check defaults.go around the GSM client creation for any defer Close
cat -n pkg/defaults/defaults.go | sed -n '1,50p'

Repository: openshift/ci-tools

Length of output: 2232


🏁 Script executed:

# Look at the end of the Execute function to see cleanup patterns
rg "func Execute" pkg/defaults/defaults.go -A 200 | tail -50

Repository: openshift/ci-tools

Length of output: 44


🏁 Script executed:

# Check if multiple test steps can be created for a single job
rg "for.*Tests\[|for.*rawSteps" pkg/defaults/defaults.go -B 2 -A 5

Repository: openshift/ci-tools

Length of output: 565


🏁 Script executed:

# Get more context around where gsmConfiguration is created and used
cat -n pkg/defaults/defaults.go | sed -n '550,700p'

Repository: openshift/ci-tools

Length of output: 7753


🏁 Script executed:

# Look for Execute function definition and return statement
rg "func Execute" pkg/defaults/defaults.go -A 300 | head -200

Repository: openshift/ci-tools

Length of output: 44


🏁 Script executed:

# Check if there's any cleanup of gsmClient at the end
rg "defer.*gsmClient|gsmClient.*Close" pkg/defaults/defaults.go -B 5 -A 5

Repository: openshift/ci-tools

Length of output: 44


🏁 Script executed:

# Examine the full context of gsmConfiguration initialization and lifecycle
cat -n pkg/defaults/defaults.go | sed -n '600,680p'

Repository: openshift/ci-tools

Length of output: 4572


🏁 Script executed:

# Find the Execute function signature and full implementation
rg "^func Execute" pkg/defaults/defaults.go -A 350 | head -250

Repository: openshift/ci-tools

Length of output: 44


🏁 Script executed:

# Look for where steps are returned and if any cleanup is done
rg "return.*steps" pkg/defaults/defaults.go -B 5 | head -40

Repository: openshift/ci-tools

Length of output: 391


🏁 Script executed:

# Get the full context around gsmConfiguration creation (around line 850-950)
cat -n pkg/defaults/defaults.go | sed -n '850,950p'

Repository: openshift/ci-tools

Length of output: 5145


🏁 Script executed:

# Find the Execute function definition
rg "^func Execute\(" pkg/defaults/defaults.go -A 500 | head -300

Repository: openshift/ci-tools

Length of output: 44


🏁 Script executed:

# Search specifically for where gsmClient is created and if it's deferred
cat -n pkg/defaults/defaults.go | sed -n '1,100p'

Repository: openshift/ci-tools

Length of output: 4497


🏁 Script executed:

# Look for the function signature and beginning of Execute
rg "^func Execute\(" pkg/defaults/defaults.go -B 5 -A 50

Repository: openshift/ci-tools

Length of output: 44


🏁 Script executed:

# Find all function definitions in defaults.go
rg "^func " pkg/defaults/defaults.go | head -20

Repository: openshift/ci-tools

Length of output: 1768


🏁 Script executed:

# Look at the full FromConfig function to see return statement
rg "^func FromConfig" pkg/defaults/defaults.go -A 600 | tail -100

Repository: openshift/ci-tools

Length of output: 3250


🏁 Script executed:

# Search around line 470-530 where gsmConfiguration should be setup
cat -n pkg/defaults/defaults.go | sed -n '460,550p'

Repository: openshift/ci-tools

Length of output: 4432


🏁 Script executed:

# Look at the middle part of FromConfig (lines 100-300)
cat -n pkg/defaults/defaults.go | sed -n '100,300p'

Repository: openshift/ci-tools

Length of output: 11042


🏁 Script executed:

# Get the exact area where gsmConfiguration is initialized
rg "gsmConfiguration.*=" pkg/defaults/defaults.go -B 3 -A 15

Repository: openshift/ci-tools

Length of output: 1496


🏁 Script executed:

# Look for where steps are executed and if there's cleanup
rg "\.Run\(" pkg/ --type go -B 5 -A 5 | head -80

Repository: openshift/ci-tools

Length of output: 4270


🏁 Script executed:

# Search for step execution or runner
rg "for.*steps\[|for.*step :=" pkg/ --type go -B 2 -A 8 | head -100

Repository: openshift/ci-tools

Length of output: 6092


Remove the defer close from the shared GSM client.

The GSM client is created once in defaults and shared across all multi-stage steps via gsmConfiguration. Closing it in each step's run() method breaks subsequent steps and introduces race conditions in concurrent execution. The client should be closed once at job shutdown, not per-step.
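
The ownership rule described here can be illustrated with a toy client. Everything in the sketch below (`sharedClient`, `step`, `runJob`) is hypothetical and stands in for the real GSM client and step runner; it only shows the client being closed once by its owner rather than by each step.

```go
package main

import (
	"errors"
	"fmt"
)

// sharedClient is a toy stand-in for the GSM client.
type sharedClient struct{ closed bool }

func (c *sharedClient) Close() error {
	c.closed = true
	return nil
}

// Access fails once the client has been closed, as a real client would.
func (c *sharedClient) Access(name string) (string, error) {
	if c.closed {
		return "", errors.New("client is closed")
	}
	return "value-of-" + name, nil
}

type step struct {
	name   string
	client *sharedClient
}

// run uses the shared client but never closes it; the step does not own it.
func (s step) run() error {
	_, err := s.client.Access(s.name)
	return err
}

// runJob owns the client: it closes it exactly once, after all steps are done.
func runJob(steps []step, client *sharedClient) error {
	defer client.Close()
	for _, s := range steps {
		if err := s.run(); err != nil {
			return fmt.Errorf("step %s failed: %w", s.name, err)
		}
	}
	return nil
}

func main() {
	client := &sharedClient{}
	steps := []step{{"pre", client}, {"test", client}, {"post", client}}
	fmt.Println(runJob(steps, client), client.closed)
}
```

Had each step deferred Close itself, the second step's Access call would fail because the first step already closed the client.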

🐛 Proposed fix
-		defer s.gsm.Client.Close()
-		logrus.Info("Using initialized GSM client")
+		logrus.Info("Using initialized GSM client")

openshift-ci bot commented Jan 20, 2026

@psalajova: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/breaking-changes | 05cd174 | link | false | /test breaking-changes |
| ci/prow/images | 05cd174 | link | true | /test images |

