
Support storage backend integration test suite#284

Open
shanyl9 wants to merge 2 commits into main from support_storage_backend_integration_test_suite

Conversation


@shanyl9 shanyl9 commented Feb 19, 2026

Sorry, we do not accept changes directly against this repository. Please see
CONTRIBUTING.md for information on where and how to contribute instead.

Summary by CodeRabbit

  • New Features

    • Added support for Rapid7 storage backend integration as an alternative to local storage configuration in testing workflows.
  • Chores

    • Updated Go toolchain to version 1.25.0.
    • Upgraded dependencies across cloud SDKs, Kubernetes components, container tooling, and observability packages.


coderabbitai bot commented Feb 19, 2026

📝 Walkthrough

Walkthrough

This pull request integrates Rapid7 storage backend support into the integration test suite. Changes include workflow conditionals to choose between local and storage backends, new helper functions to fetch application profiles from either source, and test updates to support dual-path data retrieval based on backend availability.

Changes

Workflow Configuration — .github/workflows/manual-integration-tests.yml
Added workflow inputs for storage backend selection (use_storage_backend, customer_guid, access_key). Introduces conditional execution paths: installs Kubescape via Helm for local storage, or the Rapid7 release for the storage backend. Propagates the BACKEND_ACCOUNT_ID and BACKEND_ACCESS_KEY environment variables to test steps when the storage backend is enabled.

Storage Backend Helpers — tests/integration-test-suite/helpers.go
Introduced the MaxSniffingTimeLabel constant to avoid platform-specific imports. Added storage backend runtime utilities: getClusterName, getStorageUrl, NewStorageClient. Implemented storage-backed fetch functions: fetchApplicationProfileFromStorageBackend and fetchNetworkNeighborProfileFromStorageBackend. Renamed cluster-based fetchers to explicit variants (fetchApplicationProfileFromCluster, fetchNetworkNeighborProfileFromCluster). Extended verifyApplicationProfileCompleted and verifyNetworkNeighborProfileCompleted to support dual data sources via the isRapid7 flag.

Test Infrastructure — tests/integration-test-suite/helm.go, tests/integration-test-suite/integration_test.go
Added a CheckRapidHelmRelease() function to detect Rapid7 Helm release presence. Extended IntegrationTestSuite with isRapid7, accountID, and accessKey fields. SetupSuite now populates the Rapid7 flag and reads storage backend credentials from the environment, failing fast if credentials are missing.

Integration Test Cases — tests/integration-test-suite/case_*.go (12 files)
Removed containerwatcher imports across all test files. Replaced containerwatcher.MaxSniffingTimeLabel with the MaxSniffingTimeLabel constant. Updated ApplicationProfile type usage to *v1beta1.ApplicationProfile. Added conditional fetch logic based on isRapid7: fetches from the storage backend when enabled, otherwise from the cluster. Extended verification function calls to include accountID, accessKey, and isRapid7 parameters. Increased some test wait durations (e.g., the main container learning period from 3 to 4 minutes).

Dependencies and Scripts — tests/integration-test-suite/go.mod, tests/integration-test-suite/run-non-failover-tests.sh, tests/integration-test-suite/run-failover-tests.sh
Upgraded the Go toolchain from 1.24.5 to 1.25.0 and modernized the dependency graph across cloud SDKs, Kubernetes components, observability stacks (OpenTelemetry, Prometheus), and container tooling. Enhanced the test runner scripts with bash validation, GOPATH setup, and gotestsum discovery logic with fallback paths and installation guidance.
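The gotestsum discovery logic mentioned above can be sketched roughly as follows; the fallback locations come from the summary, while the exact script structure and messages are assumptions:

```shell
#!/usr/bin/env bash
# Sketch of tool discovery: prefer PATH, then fall back to
# $GOPATH/bin and $HOME/go/bin, with installation guidance on failure.
find_tool() {
  local tool="$1"
  if command -v "$tool" >/dev/null 2>&1; then
    command -v "$tool"
    return 0
  fi
  local candidate
  for candidate in "${GOPATH:-$HOME/go}/bin/$tool" "$HOME/go/bin/$tool"; do
    if [ -x "$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}

if GOTESTSUM_PATH="$(find_tool gotestsum)"; then
  echo "using gotestsum at: $GOTESTSUM_PATH"
else
  echo "gotestsum not found; install with: go install gotest.tools/gotestsum@latest" >&2
fi
```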

Sequence Diagram(s)

sequenceDiagram
    participant Test as Integration Test
    participant Helm as Helm
    participant WF as Workflow
    participant Cluster as Kubernetes Cluster
    participant Storage as Storage Backend

    WF->>WF: Check use_storage_backend flag
    alt Storage Backend Enabled
        WF->>Helm: Install Rapid7 release
        Helm->>Cluster: Deploy Rapid7 components
        WF->>Test: Set BACKEND_ACCOUNT_ID, BACKEND_ACCESS_KEY
        Test->>Test: SetupSuite: isRapid7=true
        Test->>Storage: NewStorageClient(credentials)
        Test->>Storage: Fetch ApplicationProfile via accountID/accessKey
        Storage-->>Test: ApplicationProfile
    else Cluster Storage
        WF->>Helm: Install Kubescape (standard)
        Helm->>Cluster: Deploy Kubescape
        Test->>Test: SetupSuite: isRapid7=false
        Test->>Cluster: Fetch ApplicationProfile via ksObjectConnection
        Cluster-->>Test: ApplicationProfile
    end
    Test->>Test: verifyApplicationProfileCompleted(accountID, accessKey, isRapid7)
    Test->>Test: Execute test assertions
sequenceDiagram
    participant Helper as Test Helper
    participant isRapid7 as isRapid7 Flag
    participant Backend as Storage Backend
    participant Cluster as Kubernetes Cluster

    Helper->>isRapid7: Check backend type
    alt Rapid7 Enabled
        Helper->>Backend: fetchApplicationProfileFromStorageBackend(accountID, accessKey)
        Helper->>Backend: List ApplicationProfiles by workload-name/kind
        Backend-->>Helper: Matching profile
        Helper->>Backend: Retrieve specific profile
        Backend-->>Helper: ApplicationProfile
    else Cluster-based
        Helper->>Cluster: fetchApplicationProfileFromCluster(ksObjectConnection)
        Cluster-->>Helper: ApplicationProfile from API
    end
    Helper-->>Helper: Return profile to test

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 Hops through clusters, swift and true,
Now storage backends beckon—paths anew!
Rapid7 vaults await our test commands,
Dual-source maestro, with steady hands.
Backend or cluster, both paths align,
Integration magic, perfectly fine! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning. Docstring coverage is 11.11%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Description Check — ✅ Passed. Check skipped: CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed. The title accurately describes the main objective of the PR, which adds storage backend integration support to the test suite across multiple test files and configuration changes.





@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

🧹 Nitpick comments (11)
tests/integration-test-suite/run-non-failover-tests.sh (1)

25-44: $GOTESTSUM_PATH is detected but never used in the invocation command.

The detection block builds and validates $GOTESTSUM_PATH, but line 65 invokes gotestsum by name rather than "$GOTESTSUM_PATH". It works because PATH is updated in branches 2 and 3, but the variable is effectively dead after the existence check. Use "$GOTESTSUM_PATH" in the invocation for explicitness and to avoid a subtle breakage if PATH manipulation ever changes.

♻️ Suggested fix
-    gotestsum --format standard-verbose --junitfile "junit-$test.xml" -- -count=1 -timeout 30m -v -run "IntegrationTestSuite/$test$" ./... -- $EXTRA_TEST_ARGS 2>&1 | tee "log-$test.txt" &
+"$GOTESTSUM_PATH" --format standard-verbose --junitfile "junit-$test.xml" -- -count=1 -timeout 30m -v -run "IntegrationTestSuite/$test$" ./... -- $EXTRA_TEST_ARGS 2>&1 | tee "log-$test.txt" &
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration-test-suite/run-non-failover-tests.sh` around lines 25 - 44,
The script detects and validates GOTESTSUM_PATH but later invokes gotestsum by
name; update the invocation to use the validated "$GOTESTSUM_PATH" (quoted)
wherever the script runs gotestsum so the resolved binary is used explicitly;
keep the existing PATH exports in the branches that set GOPATH/bin or
$HOME/go/bin, but change calls like gotestsum ... to "$GOTESTSUM_PATH" ... to
avoid relying on PATH and to make the variable non-dead.
tests/integration-test-suite/run-failover-tests.sh (1)

25-44: Same $GOTESTSUM_PATH detection-vs-invocation inconsistency as in run-non-failover-tests.sh.

Line 55 invokes gotestsum by name instead of "$GOTESTSUM_PATH". The same fix applies here.

♻️ Suggested fix
-    gotestsum --format standard-verbose --junitfile "junit-$test.xml" -- -count=1 -timeout 30m -v -run "IntegrationTestSuite/$test$" ./... 2>&1 | tee "log-$test.txt"
+    "$GOTESTSUM_PATH" --format standard-verbose --junitfile "junit-$test.xml" -- -count=1 -timeout 30m -v -run "IntegrationTestSuite/$test$" ./... 2>&1 | tee "log-$test.txt"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration-test-suite/run-failover-tests.sh` around lines 25 - 44, The
script detects gotestsum and stores its path in GOTESTSUM_PATH but later invokes
gotestsum by name; change any plain invocations of gotestsum to use the
discovered "$GOTESTSUM_PATH" variable instead (preserve quoting) so the script
always runs the detected binary; update places that call gotestsum (search for
bare "gotestsum" invocations) to use "$GOTESTSUM_PATH" and ensure PATH exports
remain unchanged.
tests/integration-test-suite/go.mod (2)

3-3: Consider pinning to the latest go 1.25.x patch for security fixes.

go1.25.6 was released 2026-01-15 with security fixes, while the module declares go 1.25.0. Upgrading to go 1.25.6 (or whatever the latest patch is) picks up all accumulated fixes.

♻️ Suggested fix
-go 1.25.0
+go 1.25.6
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration-test-suite/go.mod` at line 3, The module's Go version is
pinned to "go 1.25.0" in the go.mod go directive; update that directive to the
latest 1.25.x patch (e.g., "go 1.25.6") so the build picks up recent security
fixes — locate the go directive in the go.mod file and replace "go 1.25.0" with
the current 1.25.patch version.

3-3: Heads-up: Go 1.25 changes default GOMAXPROCS in Kubernetes pods.

Go 1.25 introduces cgroup-aware GOMAXPROCS on Linux: if the CPU bandwidth limit of the cgroup is lower than the number of logical CPUs, GOMAXPROCS defaults to the lower limit — in container runtimes like Kubernetes this corresponds to the pod's CPU limit. This new behavior only takes effect if your program uses Go version 1.25 or higher in your go.mod.

Integration tests running in constrained pods (e.g., with low CPU limits) may now run with less parallelism (a lower GOMAXPROCS) than before, potentially slowing parallel test execution. Verify that the CI runner pods have sufficient CPU limits, or set GODEBUG=containermaxprocs=0 to opt out.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration-test-suite/go.mod` at line 3, The go.mod currently
specifies the Go version "go 1.25.0", which enables cgroup-aware GOMAXPROCS in
CI pods; to avoid reduced parallelism in integration tests, either change the go
directive back to a pre-1.25 version (remove or replace the "go 1.25.0" line) or
keep "go 1.25.0" and update CI to set GODEBUG=containermaxprocs=0 (export
GODEBUG in your test job environment) or increase the pod CPU limits; locate the
Go version directive ("go 1.25.0") in the go.mod and apply one of these fixes.
.github/workflows/manual-integration-tests.yml (3)

112-122: Quote command substitutions to prevent word splitting (SC2046).

Static analysis flags unquoted $(echo ...) substitutions in the Helm --set arguments. While unlikely to cause issues with well-formed image references, quoting them is defensive and consistent with shell best practices.

Proposed fix
           helm upgrade --install rapid7 charts/kubescape-operator \
             -n kubescape --create-namespace \
-            --set nodeAgent.image.repository=$(echo "${{ github.event.inputs.node_agent_image }}" | cut -d: -f1) \
-            --set nodeAgent.image.tag=$(echo "${{ github.event.inputs.node_agent_image }}" | cut -d: -f2-) \
-            --set storage.image.repository=$(echo "${{ github.event.inputs.storage_image }}" | cut -d: -f1) \
-            --set storage.image.tag=$(echo "${{ github.event.inputs.storage_image }}" | cut -d: -f2-) \
+            --set nodeAgent.image.repository="$(echo "${{ github.event.inputs.node_agent_image }}" | cut -d: -f1)" \
+            --set nodeAgent.image.tag="$(echo "${{ github.event.inputs.node_agent_image }}" | cut -d: -f2-)" \
+            --set storage.image.repository="$(echo "${{ github.event.inputs.storage_image }}" | cut -d: -f1)" \
+            --set storage.image.tag="$(echo "${{ github.event.inputs.storage_image }}" | cut -d: -f2-)" \
             --set nodeAgent.config.learningPeriod=30s \
             --set nodeAgent.config.updatePeriod=1m \
             --set storage.cleanupInterval=1m \
-            $EXTRA_SET_ARGS \
+            "$EXTRA_SET_ARGS" \
             --wait

Apply the same fix to the failover job block (lines 275–285).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/manual-integration-tests.yml around lines 112 - 122, The
helm upgrade command uses unquoted command substitutions for image repo/tag
values (the $(echo "${{ github.event.inputs.node_agent_image }}" | cut -d: -f1),
$(echo ... -f2-), and the equivalent storage substitutions); quote those
substitutions so the --set arguments become --set
nodeAgent.image.repository="$(...)", --set nodeAgent.image.tag="$(...)", --set
storage.image.repository="$(...)", and --set storage.image.tag="$(...)" to
prevent word-splitting, and apply the same quoting changes to the failover job
block where the same substitutions are used.

133-137: Redundant environment variable setup.

Lines 133–137 write BACKEND_ACCOUNT_ID and BACKEND_ACCESS_KEY to $GITHUB_ENV (persisting them for all subsequent steps), and then lines 140–142 also set the same variables via the step-level env: block. One or the other is sufficient. Since the env: block on lines 140–142 covers the test step directly, the $GITHUB_ENV step can be removed to avoid confusion.

The same duplication exists in the failover job (lines 296–305).

Also applies to: 140-142

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/manual-integration-tests.yml around lines 133 - 137,
Remove the redundant step that echoes BACKEND_ACCOUNT_ID and BACKEND_ACCESS_KEY
into $GITHUB_ENV (the step named "Set storage backend environment variables")
and any duplicate echo steps in the failover job; instead rely on the existing
step-level env: block that already sets BACKEND_ACCOUNT_ID and
BACKEND_ACCESS_KEY for the test step. Locate and delete the echo
"BACKEND_ACCOUNT_ID=..." and echo "BACKEND_ACCESS_KEY=..." runs (and their
enclosing if: condition) in both the main job and the failover job, leaving the
env: entries (the step-level environment variables used by the test step)
intact. Ensure no other steps rely on the removed $GITHUB_ENV export—if any do,
move those steps to use the step-level env or re-add only the necessary
persistence.
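The step-level env: form the comment recommends could look like the fragment below; the step name, script path, and surrounding context are assumptions, only the variable and input names come from the workflow described above:

```yaml
# Hypothetical fragment: credentials scoped to the test step only,
# with no separate $GITHUB_ENV persistence step.
- name: Run integration tests
  if: ${{ github.event.inputs.use_storage_backend == 'true' }}
  env:
    BACKEND_ACCOUNT_ID: ${{ github.event.inputs.customer_guid }}
    BACKEND_ACCESS_KEY: ${{ github.event.inputs.access_key }}
  run: ./tests/integration-test-suite/run-non-failover-tests.sh
```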

95-123: Extract duplicated Helm install logic into a reusable composite action or shared script.

The workflow contains 4 near-identical Helm install blocks: one Kubescape and one Rapid7 block in each job (integration-tests and failover-tests). All four blocks perform identical git operations, dependency building, and Helm set argument parsing. Only the release name differs (kubescape vs rapid7).

Additionally, both Rapid7 and Kubescape installations use identical --set values for nodeAgent and storage configuration. The backend credentials (customer_guid, access_key) are available as workflow inputs and set as environment variables (BACKEND_ACCOUNT_ID, BACKEND_ACCESS_KEY), but they are not passed to the Helm install command itself—verify whether this is intentional or if Helm requires these as explicit --set arguments for proper Rapid7 backend configuration.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/manual-integration-tests.yml around lines 95 - 123, The
Helm install logic is duplicated across the integration-tests and failover-tests
jobs (the block that builds EXTRA_SET_ARGS, clones helm-charts, runs helm
dependency build, and executes helm upgrade --install with the --set nodeAgent.*
and storage.* flags); extract this into a reusable composite action or shared
script that accepts parameters for release name (kubescape vs rapid7),
branch_helm_chart, node_agent_image, storage_image, extra_helm_set_args, and any
other per-job toggles, and replace each duplicated block with a single
invocation of that action/script that constructs EXTRA_SET_ARGS and runs helm
upgrade --install; also verify and, if required, add the BACKEND_ACCOUNT_ID and
BACKEND_ACCESS_KEY inputs as explicit --set arguments (e.g., --set
backend.customer_guid=... and --set backend.access_key=...) so the Helm chart
receives backend credentials when deploying rapid7.
tests/integration-test-suite/helpers.go (4)

316-329: Parameter lists are growing unwieldy — consider a struct or receiver method.

verifyApplicationProfileCompleted now takes 9 parameters; verifyNetworkNeighborProfileCompleted takes 11. This harms readability at every call site and makes future extensions painful. Since most of these values (ksObjectConnection, testNamespace, accountID, accessKey, isRapid7) are already fields on IntegrationTestSuite, these could be methods on the suite instead of free functions:

func (s *IntegrationTestSuite) verifyApplicationProfileCompleted(expectedCompleteness, relatedKind, relatedName string) {
    // access s.ksObjectConnection, s.isRapid7, etc. directly
}

Also applies to: 361-374

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration-test-suite/helpers.go` around lines 316 - 329, The free
function verifyApplicationProfileCompleted (and similarly
verifyNetworkNeighborProfileCompleted) takes too many parameters; convert them
into receiver methods on IntegrationTestSuite so they use existing suite fields
(e.g. s.ksObjectConnection, s.testNamespace, s.accountID, s.accessKey,
s.isRapid7) instead of passing those values; update the signature to func (s
*IntegrationTestSuite) verifyApplicationProfileCompleted(expectedCompleteness,
relatedKind, relatedName string) and adjust the body to reference s.* fields
(including calls to
fetchApplicationProfileFromStorageBackend/fetchApplicationProfileFromCluster)
and then update all call sites to use the method form, doing the same refactor
for verifyNetworkNeighborProfileCompleted.

151-151: Hardcoded page size of 100 may miss profiles.

ListApplicationProfiles and ListNetworkNeighborhoods pass 100 as the limit with an empty continuation token. If the namespace accumulates > 100 profiles (unlikely in a clean test namespace, but possible if cleanup fails), the target profile won't be found and the test will report a misleading "not found" error.

Consider using a label selector or field selector in the list call to filter by workload-name/kind directly, avoiding pagination issues entirely.

Also applies to: 207-207

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration-test-suite/helpers.go` at line 151, The tests call
ListApplicationProfiles and ListNetworkNeighborhoods with a hardcoded limit of
100 and an empty continuation token, which can miss entries; update the calls to
either (A) pass a selector/filter (e.g., workload-name/kind label or field
selector) into the list call so it only returns matching profiles, or (B)
implement proper pagination by looping on the continuation token until nil/empty
or the target profile is found; adjust the calls that currently look like
ListApplicationProfiles(ctx, testNamespace, 100, "") and
ListNetworkNeighborhoods(...) to use the selector/filtered-params or a
continuation-token loop so tests never falsely report "not found."
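The continuation-token loop from option (B) can be sketched against a stubbed list API; the page/token shape here is an assumption, and the real ListApplicationProfiles signature may differ:

```go
package main

import "fmt"

// Hypothetical paginated store: each token maps to a page of items plus
// the next continuation token (empty token means no further pages).
type page struct {
	items []string
	next  string
}

var pages = map[string]page{
	"":   {items: []string{"profile-a", "profile-b"}, next: "t1"},
	"t1": {items: []string{"profile-c"}, next: ""},
}

// list stands in for ListApplicationProfiles(ctx, namespace, limit, token).
func list(limit int, token string) ([]string, string) {
	p := pages[token]
	return p.items, p.next
}

// findProfile loops on the continuation token until the target is found
// or the listing is exhausted, instead of trusting a single page of 100.
func findProfile(name string) (string, bool) {
	token := ""
	for {
		items, next := list(100, token)
		for _, it := range items {
			if it == name {
				return it, true
			}
		}
		if next == "" {
			return "", false
		}
		token = next
	}
}

func main() {
	got, ok := findProfile("profile-c")
	fmt.Println(got, ok) // profile-c true
}
```

A label or field selector on the list call, where the backend supports it, remains the simpler fix since it avoids pagination entirely.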

43-54: getClusterName shells out to kubectl — fragile in CI.

This function executes kubectl config current-context as a subprocess. If kubectl isn't on PATH or the kubeconfig doesn't have a current-context set, it will Fatal and kill the process (same concern as NewStorageClient). Consider using the Go clientcmd API already imported in this file to read the current context programmatically:

Suggested fix
 func getClusterName() string {
-	cmd := exec.Command("kubectl", "config", "current-context")
-	clusterNameBytes, err := cmd.Output()
+	kubeconfig := os.Getenv("KUBECONFIG")
+	if kubeconfig == "" {
+		kubeconfig = os.ExpandEnv("$HOME/.kube/config")
+	}
+	config, err := clientcmd.NewNonInteractiveDeferredLoadingClientConfig(
+		&clientcmd.ClientConfigLoadingRules{ExplicitPath: kubeconfig},
+		&clientcmd.ConfigOverrides{},
+	).RawConfig()
 	if err != nil {
 		logger.L().Fatal("failed to get current cluster name", helpers.Error(err))
 	}
-	clusterName := string(clusterNameBytes)
-	// Trim newline (matching helm.go behavior)
-	clusterName = strings.TrimSpace(clusterName)
+	clusterName := config.CurrentContext
 	logger.L().Info("Current cluster name", helpers.String("clusterName", clusterName))
 	return clusterName
 }

This also removes the os/exec import dependency.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration-test-suite/helpers.go` around lines 43 - 54, Replace the
shelling out in getClusterName with the Kubernetes clientcmd API: load the
kubeconfig with clientcmd (e.g., via clientcmd.NewDefaultClientConfig or
NewNonInteractiveDeferredLoadingClientConfig and RawConfig()), read the
CurrentContext value from the returned config, handle errors from loading the
config and the case of an empty CurrentContext using the same logger calls, trim
whitespace as before, and remove the now-unused os/exec import; refer to
getClusterName and clientcmd.RawConfig()/CurrentContext to locate where to
change the code.

142-173: Cache StorageClient in suite setup to eliminate redundant service-discovery lookups and connections.

Both fetchApplicationProfileFromStorageBackend and fetchNetworkNeighborProfileFromStorageBackend independently call getStorageUrl() and NewStorageClient() on every invocation. getStorageUrl() creates a new service discovery client and makes an API call each time, while NewStorageClient() initializes a new client and establishes a connection.

This redundancy is evident in the test suite, where verifyApplicationProfileCompleted and verifyNetworkNeighborProfileCompleted are called back-to-back in 9 tests (e.g., lines 98–99 in case_sidecar_profile_create_test.go, lines 57–58 in case_simple_profile_create_test.go). Each pair results in duplicate service discovery lookups and client connections for identical credentials.

Add a storageClient field to IntegrationTestSuite and initialize it once in SetupSuite() when isRapid7 is true, similar to how ksObjectConnection is managed. Then pass it to these functions instead of recreating it on each call.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration-test-suite/helpers.go` around lines 142 - 173, Add a
storageClient field to IntegrationTestSuite and initialize it once in
SetupSuite() when isRapid7 is true (same pattern as ksObjectConnection) so tests
reuse a single StorageClient instead of recreating it; update
fetchApplicationProfileFromStorageBackend and
fetchNetworkNeighborProfileFromStorageBackend to accept a StorageClient
parameter (or use s.storageClient) and remove calls to getStorageUrl() and
NewStorageClient() inside those functions so they use the cached client; ensure
SetupSuite() calls getStorageUrl() and NewStorageClient() only once and assigns
the result to the new storageClient field for subsequent test helper use.
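The caching pattern suggested here can be sketched with sync.Once; the type and function names below are stand-ins for the real suite and storage client, not the actual API:

```go
package main

import (
	"fmt"
	"sync"
)

// Hypothetical stand-ins for the real types in the test suite.
type StorageClient struct{ url string }

func newStorageClient(url string) *StorageClient { return &StorageClient{url: url} }

type IntegrationTestSuite struct {
	isRapid7      bool
	storageClient *StorageClient
	once          sync.Once
}

// getStorageClient runs service discovery and connects exactly once,
// then reuses the same client for every subsequent fetch helper call.
func (s *IntegrationTestSuite) getStorageClient() *StorageClient {
	s.once.Do(func() {
		// In the real suite this is getStorageUrl() + NewStorageClient().
		s.storageClient = newStorageClient("https://storage.example")
	})
	return s.storageClient
}

func main() {
	s := &IntegrationTestSuite{isRapid7: true}
	a, b := s.getStorageClient(), s.getStorageClient()
	fmt.Println(a == b) // true: same instance reused
}
```

Initializing the field eagerly in SetupSuite(), as the comment suggests, is equivalent and avoids the sync.Once; the Once form is only needed if initialization must stay lazy.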
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/manual-integration-tests.yml:
- Around line 26-38: The workflow currently exposes sensitive values via
workflow_dispatch inputs (customer_guid and access_key); remove these plaintext
inputs and instead read credentials from GitHub Secrets (e.g.,
secrets.BACKEND_CUSTOMER_GUID and secrets.BACKEND_ACCESS_KEY) in the workflow,
keep only non-sensitive flags like use_storage_backend, and update any steps
that referenced customer_guid or access_key to use the secrets via the secrets
context (e.g., replace uses of the customer_guid/access_key inputs with
secrets.BACKEND_CUSTOMER_GUID and secrets.BACKEND_ACCESS_KEY).

In
`@tests/integration-test-suite/case_scale_deployment_partial_completion_test.go`:
- Line 69: The intermediate test fetches call fetchApplicationProfileFromCluster
unconditionally but in Rapid7 mode profiles live in the backend; wrap those
intermediate cluster fetches in a conditional so they only run when s.isRapid7
is false (e.g. if !s.isRapid7 { applicationProfile, err :=
fetchApplicationProfileFromCluster(...) } ), or skip the intermediate check when
s.isRapid7 is true, keeping the final calls to verifyApplicationProfileCompleted
and verifyNetworkNeighborProfileCompleted (which already accept isRapid7)
unchanged.

In `@tests/integration-test-suite/helpers.go`:
- Around line 79-89: The helper NewStorageClient currently calls
logger.L().Fatal on errors (logger.L().Fatal in the NewStorageClient function),
which aborts the process and prevents test cleanup; change NewStorageClient to
return (*backendv1.StorageClient, error) instead of aborting: on
backendv1.NewStorageClient or client.Connect errors, return nil and the wrapped
error (with context) rather than calling logger.L().Fatal; update all callers
(tests and helpers that call NewStorageClient) to handle the returned error and
fail the test via t.Fatalf or propagate the error so deferred cleanup/TearDown
logic can run.
- Around line 64-77: The code in getStorageUrl hardcodes the service discovery
host when calling
v3.NewServiceDiscoveryClientV3("api.stage-us-east-1.r7.armo-cadr.com"); change
this to read the host from an environment variable (e.g.,
SERVICE_DISCOVERY_HOST) with the current staging host as the default, and pass
that variable value into v3.NewServiceDiscoveryClientV3; ensure any errors
remain handled the same way and that tests use the env var to target non-staging
environments.

In `@tests/integration-test-suite/integration_test.go`:
- Line 27: Fix the typo in the inline comment that currently reads "These are
only used when storage backend is enbled (Rapid7)" by changing "enbled" to
"enabled" so the comment reads "These are only used when storage backend is
enabled (Rapid7)"; locate the comment string in integration_test.go and update
it accordingly.

In `@tests/integration-test-suite/run-failover-tests.sh`:
- Line 47: The failover test list duplicates a test defined in the non-failover
script; update one of the lists to remove the duplicate: decide whether
TestSimpleProfileNodeAgentFailover belongs in FAILOVER_TESTS (in
run-failover-tests.sh) or NON_FAILOVER_TESTS (in run-non-failover-tests.sh),
then remove it from the other script so each test appears only once; modify the
string assigned to FAILOVER_TESTS or NON_FAILOVER_TESTS accordingly (e.g.,
delete or replace "TestSimpleProfileNodeAgentFailover" from FAILOVER_TESTS) and
ensure the remaining test names reflect the intended grouping.

In `@tests/integration-test-suite/run-non-failover-tests.sh`:
- Line 47: The NON_FAILOVER_TESTS variable currently lists a failover test
(TestSimpleProfileNodeAgentFailover) likely due to copy/paste; update the value
of NON_FAILOVER_TESTS to include the actual non-failover test names (for example
TestJobProfileCreate, TestCronJobProfileCreate,
TestCrashLoopAndStableProfileIncomplete) and remove
TestSimpleProfileNodeAgentFailover so the non-failover script runs the correct
suite; locate the assignment to NON_FAILOVER_TESTS in run-non-failover-tests.sh
and replace the test list accordingly.

clientcmd.NewDefaultClientConfig or NewNonInteractiveDeferredLoadingClientConfig
and RawConfig()), read the CurrentContext value from the returned config, handle
errors from loading the config and the case of an empty CurrentContext using the
same logger calls, trim whitespace as before, and remove the now-unused os/exec
import; refer to getClusterName and clientcmd.RawConfig()/CurrentContext to
locate where to change the code.
- Around line 142-173: Add a storageClient field to IntegrationTestSuite and
initialize it once in SetupSuite() when isRapid7 is true (same pattern as
ksObjectConnection) so tests reuse a single StorageClient instead of recreating
it; update fetchApplicationProfileFromStorageBackend and
fetchNetworkNeighborProfileFromStorageBackend to accept a StorageClient
parameter (or use s.storageClient) and remove calls to getStorageUrl() and
NewStorageClient() inside those functions so they use the cached client; ensure
SetupSuite() calls getStorageUrl() and NewStorageClient() only once and assigns
the result to the new storageClient field for subsequent test helper use.

In `@tests/integration-test-suite/run-failover-tests.sh`:
- Around line 25-44: The script detects gotestsum and stores its path in
GOTESTSUM_PATH but later invokes gotestsum by name; change any plain invocations
of gotestsum to use the discovered "$GOTESTSUM_PATH" variable instead (preserve
quoting) so the script always runs the detected binary; update places that call
gotestsum (search for bare "gotestsum" invocations) to use "$GOTESTSUM_PATH" and
ensure PATH exports remain unchanged.

In `@tests/integration-test-suite/run-non-failover-tests.sh`:
- Around line 25-44: The script detects and validates GOTESTSUM_PATH but later
invokes gotestsum by name; update the invocation to use the validated
"$GOTESTSUM_PATH" (quoted) wherever the script runs gotestsum so the resolved
binary is used explicitly; keep the existing PATH exports in the branches that
set GOPATH/bin or $HOME/go/bin, but change calls like gotestsum ... to
"$GOTESTSUM_PATH" ... to avoid relying on PATH and to make the variable
non-dead.

Comment on lines +26 to +38
      use_storage_backend:
        description: 'Use storage backend (Rapid7) instead of local storage'
        required: false
        type: boolean
        default: false
      customer_guid:
        description: 'Customer GUID (Account ID) for storage backend'
        required: false
        default: ''
      access_key:
        description: 'Access key for storage backend'
        required: false
        default: ''

⚠️ Potential issue | 🟠 Major

Sensitive credentials exposed as workflow dispatch inputs.

customer_guid and access_key are defined as plaintext workflow_dispatch inputs. These values will be visible in the GitHub Actions UI (workflow run parameters) and potentially in logs. Credentials should be stored as GitHub repository or environment secrets and referenced via ${{ secrets.BACKEND_ACCESS_KEY }}, not passed through the dispatch form.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/manual-integration-tests.yml around lines 26 - 38, The
workflow currently exposes sensitive values via workflow_dispatch inputs
(customer_guid and access_key); remove these plaintext inputs and instead read
credentials from GitHub Secrets (e.g., secrets.BACKEND_CUSTOMER_GUID and
secrets.BACKEND_ACCESS_KEY) in the workflow, keep only non-sensitive flags like
use_storage_backend, and update any steps that referenced customer_guid or
access_key to use the secrets via the secrets context (e.g., replace uses of the
customer_guid/access_key inputs with secrets.BACKEND_CUSTOMER_GUID and
secrets.BACKEND_ACCESS_KEY).


// Verify that profile exists and is partial
applicationProfile, err := fetchApplicationProfile(s.ksObjectConnection, s.testNamespace, "deployment", "scale-partial-test-deployment")
applicationProfile, err := fetchApplicationProfileFromCluster(s.ksObjectConnection, s.testNamespace, "deployment", "scale-partial-test-deployment")

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, let's look at the actual file and the lines mentioned
echo "=== File structure and content around lines 69, 92, 103, 107 ==="
cd tests/integration-test-suite || exit 1
wc -l case_scale_deployment_partial_completion_test.go
head -110 case_scale_deployment_partial_completion_test.go | tail -50

Repository: kubescape/storage

Length of output: 3054


🏁 Script executed:

#!/bin/bash
# Check how other test files handle fetchApplicationProfileFromCluster
echo "=== Pattern of fetchApplicationProfileFromCluster across test files ==="
rg -n 'fetchApplicationProfileFromCluster' --type=go -B2 -A2 tests/integration-test-suite/ | head -100

Repository: kubescape/storage

Length of output: 6740


🏁 Script executed:

#!/bin/bash
# Check the integration test setup to understand isRapid7
echo "=== Check integration_test.go for isRapid7 context ==="
rg -n 'isRapid7' --type=go tests/integration-test-suite/ | head -30

Repository: kubescape/storage

Length of output: 6308


Lines 69 and 92 unconditionally fetch from cluster while final verifications properly respect isRapid7 mode.

The intermediate profile fetches at lines 69 and 92 always call fetchApplicationProfileFromCluster regardless of the isRapid7 setting. In contrast, the final verifications at lines 103 and 107 correctly pass isRapid7 to verifyApplicationProfileCompleted and verifyNetworkNeighborProfileCompleted, which internally handle storage backend routing.

Other test files follow the correct pattern:

  • case_statefulset_profile_cleanup_test.go wraps fetchApplicationProfileFromCluster with if !s.isRapid7
  • case_init_sidecar_profile_create_test.go, case_crashloop_profile_incomplete_test.go, and similar tests conditionally call the storage backend when isRapid7 is true

When running in Rapid7 mode where profiles are stored in the backend, these unconditional cluster-based fetches will fail. Add conditional branching at lines 69 and 92 to check isRapid7 before fetching from the cluster, or skip these intermediate checks in Rapid7 mode.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@tests/integration-test-suite/case_scale_deployment_partial_completion_test.go`
at line 69, The intermediate test fetches call
fetchApplicationProfileFromCluster unconditionally but in Rapid7 mode profiles
live in the backend; wrap those intermediate cluster fetches in a conditional so
they only run when s.isRapid7 is false (e.g. if !s.isRapid7 {
applicationProfile, err := fetchApplicationProfileFromCluster(...) } ), or skip
the intermediate check when s.isRapid7 is true, keeping the final calls to
verifyApplicationProfileCompleted and verifyNetworkNeighborProfileCompleted
(which already accept isRapid7) unchanged.

Comment on lines 56 to +65
s.LogWithTimestamp("Killing node agent pod on same node as test pod for failover test")
DeleteNodeAgentPodOnSameNode(s.T(), s.clientset, s.testNamespace, "app=nodeagent-failover-test-deployment")
time.Sleep(10 * time.Second)
//DeleteNodeAgentPodOnSameNode(s.T(), s.clientset, s.testNamespace, "app=nodeagent-failover-test-deployment")

s.LogWithTimestamp("Waiting 4 more minutes for learning period to complete after failover")
time.Sleep(4 * time.Minute)

s.LogWithTimestamp("Verifying profiles are complete after failover")
verifyApplicationProfileCompleted(s.T(), s.ksObjectConnection, "partial", s.testNamespace, "deployment", "nodeagent-failover-test-deployment")
verifyNetworkNeighborProfileCompleted(s.T(), s.ksObjectConnection, false, false, "partial", s.testNamespace, "deployment", "nodeagent-failover-test-deployment")
verifyApplicationProfileCompleted(s.T(), s.ksObjectConnection, "partial", s.testNamespace, "deployment", "nodeagent-failover-test-deployment", s.accountID, s.accessKey, s.isRapid7)
verifyNetworkNeighborProfileCompleted(s.T(), s.ksObjectConnection, false, false, "partial", s.testNamespace, "deployment", "nodeagent-failover-test-deployment", s.accountID, s.accessKey, s.isRapid7)

⚠️ Potential issue | 🟠 Major

Commented-out pod deletion breaks the test's core scenario and invalidates the status assertion.

Four cascading issues all stem from Line 58 being commented out:

  1. The failover scenario is never exercised. DeleteNodeAgentPodOnSameNode is never called, so the test no longer validates recovery from a node-agent disruption. A 10-second sleep is not a meaningful substitute.
  2. Log message (Line 56) is misleading. "Killing node agent pod on same node as test pod for failover test" is emitted even though nothing is killed, making test logs untrustworthy for failure diagnosis.
  3. Test description (Lines 13-14) is stale. It still describes the killing step ("kills the node-agent pod on the same node as the test pod after 5 minutes") which no longer happens.
  4. Status assertion "partial" (Lines 64-65) is likely incorrect without a disruption. With an uninterrupted node-agent running the full learning period, profiles are expected to complete fully, not remain "partial". This assertion could produce a spurious pass (if the function accepts either state) or a spurious failure.

If this is intentionally skipped for a specific internal environment, the sleep at Line 57 should be removed and the test should at minimum be marked with a t.Skip(...) or a conditional guard, and the description/log should be updated. Otherwise, the deletion should be restored.

🛠️ Suggested minimal fix if intent is to conditionally skip the kill step
 s.LogWithTimestamp("Killing node agent pod on same node as test pod for failover test")
-time.Sleep(10 * time.Second)
-//DeleteNodeAgentPodOnSameNode(s.T(), s.clientset, s.testNamespace, "app=nodeagent-failover-test-deployment")
+if !s.isRapid7 {
+    DeleteNodeAgentPodOnSameNode(s.T(), s.clientset, s.testNamespace, "app=nodeagent-failover-test-deployment")
+} else {
+    s.T().Log("Skipping node-agent pod deletion: not supported in Rapid7 environment")
+}

And update the expected status on Lines 64-65 to account for whether a disruption actually occurred:

-verifyApplicationProfileCompleted(s.T(), s.ksObjectConnection, "partial", ...)
+expectedStatus := "partial"
+if s.isRapid7 {
+    expectedStatus = "complete"
+}
+verifyApplicationProfileCompleted(s.T(), s.ksObjectConnection, expectedStatus, ...)

Comment on lines +64 to +77
func getStorageUrl() (string, error) {
	serviceDiscoveryClient, err := v3.NewServiceDiscoveryClientV3("api.stage-us-east-1.r7.armo-cadr.com")
	if err != nil {
		return "", fmt.Errorf("failed to create service discovery client: %v", err)
	}

	services, err := servicediscovery.GetServices(serviceDiscoveryClient)
	if err != nil {
		return "", fmt.Errorf("failed to get services: %v", err)
	}

	storageUrl := services.GetStorageUrl()
	return storageUrl, nil
}

⚠️ Potential issue | 🟠 Major

Hardcoded staging URL will break in non-staging environments.

The service discovery URL api.stage-us-east-1.r7.armo-cadr.com on Line 65 is hardcoded. If these tests are ever run against production or a different staging region, they'll silently connect to the wrong backend. Consider making this configurable via an environment variable with this as a default.

Suggested fix
 func getStorageUrl() (string, error) {
-	serviceDiscoveryClient, err := v3.NewServiceDiscoveryClientV3("api.stage-us-east-1.r7.armo-cadr.com")
+	sdHost := os.Getenv("SERVICE_DISCOVERY_HOST")
+	if sdHost == "" {
+		sdHost = "api.stage-us-east-1.r7.armo-cadr.com"
+	}
+	serviceDiscoveryClient, err := v3.NewServiceDiscoveryClientV3(sdHost)
 	if err != nil {
 		return "", fmt.Errorf("failed to create service discovery client: %v", err)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration-test-suite/helpers.go` around lines 64 - 77, The code in
getStorageUrl hardcodes the service discovery host when calling
v3.NewServiceDiscoveryClientV3("api.stage-us-east-1.r7.armo-cadr.com"); change
this to read the host from an environment variable (e.g.,
SERVICE_DISCOVERY_HOST) with the current staging host as the default, and pass
that variable value into v3.NewServiceDiscoveryClientV3; ensure any errors
remain handled the same way and that tests use the env var to target non-staging
environments.

Comment on lines +79 to +89
func NewStorageClient(clusterName, accountID, accessKey, url string) *backendv1.StorageClient {
	client, err := backendv1.NewStorageClient(url, accountID, accessKey, clusterName)
	if err != nil {
		logger.L().Fatal("failed to create storage client", helpers.Error(err))
	}
	err = client.Connect()
	if err != nil {
		logger.L().Fatal("failed to connect to storage backend", helpers.Error(err))
	}
	return client
}

⚠️ Potential issue | 🟠 Major

logger.L().Fatal aborts the process without test cleanup.

Fatal-level log calls in most Go logging frameworks (including go-logger) call os.Exit(1), which bypasses t.Cleanup, deferred teardown, and testify suite TearDownTest/TearDownSuite. This means namespaces, deployments, and Helm releases created by the test won't be cleaned up on failure.

Return an error instead and let callers (or the verification functions) decide how to fail.

Suggested fix
-func NewStorageClient(clusterName, accountID, accessKey, url string) *backendv1.StorageClient {
+func NewStorageClient(clusterName, accountID, accessKey, url string) (*backendv1.StorageClient, error) {
 	client, err := backendv1.NewStorageClient(url, accountID, accessKey, clusterName)
 	if err != nil {
-		logger.L().Fatal("failed to create storage client", helpers.Error(err))
+		return nil, fmt.Errorf("failed to create storage client: %w", err)
 	}
 	err = client.Connect()
 	if err != nil {
-		logger.L().Fatal("failed to connect to storage backend", helpers.Error(err))
+		return nil, fmt.Errorf("failed to connect to storage backend: %w", err)
 	}
-	return client
+	return client, nil
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func NewStorageClient(clusterName, accountID, accessKey, url string) *backendv1.StorageClient {
	client, err := backendv1.NewStorageClient(url, accountID, accessKey, clusterName)
	if err != nil {
		logger.L().Fatal("failed to create storage client", helpers.Error(err))
	}
	err = client.Connect()
	if err != nil {
		logger.L().Fatal("failed to connect to storage backend", helpers.Error(err))
	}
	return client
}
func NewStorageClient(clusterName, accountID, accessKey, url string) (*backendv1.StorageClient, error) {
	client, err := backendv1.NewStorageClient(url, accountID, accessKey, clusterName)
	if err != nil {
		return nil, fmt.Errorf("failed to create storage client: %w", err)
	}
	err = client.Connect()
	if err != nil {
		return nil, fmt.Errorf("failed to connect to storage backend: %w", err)
	}
	return client, nil
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration-test-suite/helpers.go` around lines 79 - 89, The helper
NewStorageClient currently calls logger.L().Fatal on errors (logger.L().Fatal in
the NewStorageClient function), which aborts the process and prevents test
cleanup; change NewStorageClient to return (*backendv1.StorageClient, error)
instead of aborting: on backendv1.NewStorageClient or client.Connect errors,
return nil and the wrapped error (with context) rather than calling
logger.L().Fatal; update all callers (tests and helpers that call
NewStorageClient) to handle the returned error and fail the test via t.Fatalf or
propagate the error so deferred cleanup/TearDown logic can run.

clientset *kubernetes.Clientset
ksObjectConnection spdxv1beta1.SpdxV1beta1Interface
isRapid7 bool
// These are only used when storage backend is enbled (Rapid7)

⚠️ Potential issue | 🟡 Minor

Typo in comment: "enbled" → "enabled".

-	// These are only used when storage backend is enbled (Rapid7)
+	// These are only used when storage backend is enabled (Rapid7)
📝 Committable suggestion


Suggested change
// These are only used when storage backend is enbled (Rapid7)
// These are only used when storage backend is enabled (Rapid7)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration-test-suite/integration_test.go` at line 27, Fix the typo in
the inline comment that currently reads "These are only used when storage
backend is enbled (Rapid7)" by changing "enbled" to "enabled" so the comment
reads "These are only used when storage backend is enabled (Rapid7)"; locate the
comment string in integration_test.go and update it accordingly.


# List of failover tests to run sequentially
FAILOVER_TESTS="TestSimpleProfileStorageFailover TestSimpleProfileNodeAgentFailover TestLongStorageFailover TestScaleDeploymentPartialCompletion"
FAILOVER_TESTS="TestSimpleProfileNodeAgentFailover"

⚠️ Potential issue | 🟠 Major

Same test name as run-non-failover-tests.sh — duplicate test execution in CI.

FAILOVER_TESTS="TestSimpleProfileNodeAgentFailover" matches NON_FAILOVER_TESTS="TestSimpleProfileNodeAgentFailover" in the companion script. Whichever script is the intended owner of this test, the other should be updated to list its own tests. See the related comment on run-non-failover-tests.sh.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration-test-suite/run-failover-tests.sh` at line 47, The failover
test list duplicates a test defined in the non-failover script; update one of
the lists to remove the duplicate: decide whether
TestSimpleProfileNodeAgentFailover belongs in FAILOVER_TESTS (in
run-failover-tests.sh) or NON_FAILOVER_TESTS (in run-non-failover-tests.sh),
then remove it from the other script so each test appears only once; modify the
string assigned to FAILOVER_TESTS or NON_FAILOVER_TESTS accordingly (e.g.,
delete or replace "TestSimpleProfileNodeAgentFailover" from FAILOVER_TESTS) and
ensure the remaining test names reflect the intended grouping.


# List of non-failover tests to run in parallel
NON_FAILOVER_TESTS="TestSimpleProfileCreate TestJobProfileCreate TestCronJobProfileCreate TestInitContainerProfileCreate TestSidecarProfileCreate TestInitSidecarProfileCreate TestCrashLoopProfileIncomplete TestCrashLoopAndStableProfileIncomplete TestStatefulSetProfileCleanup"
NON_FAILOVER_TESTS="TestSimpleProfileNodeAgentFailover"

⚠️ Potential issue | 🟠 Major

NON_FAILOVER_TESTS contains a failover test — likely a copy-paste error.

TestSimpleProfileNodeAgentFailover is identical to what run-failover-tests.sh runs (line 47 of that file). Running both scripts in CI will execute the same test twice, and the NON_FAILOVER_TESTS variable name contradicts its contents. The non-failover script should contain the non-failover tests (e.g., TestJobProfileCreate, TestCronJobProfileCreate, TestCrashLoopAndStableProfileIncomplete).

🐛 Suggested fix
-NON_FAILOVER_TESTS="TestSimpleProfileNodeAgentFailover"
+NON_FAILOVER_TESTS="TestJobProfileCreate TestCronJobProfileCreate TestCrashLoopAndStableProfileIncomplete"
📝 Committable suggestion


Suggested change
NON_FAILOVER_TESTS="TestSimpleProfileNodeAgentFailover"
NON_FAILOVER_TESTS="TestJobProfileCreate TestCronJobProfileCreate TestCrashLoopAndStableProfileIncomplete"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration-test-suite/run-non-failover-tests.sh` at line 47, The
NON_FAILOVER_TESTS variable currently lists a failover test
(TestSimpleProfileNodeAgentFailover) likely due to copy/paste; update the value
of NON_FAILOVER_TESTS to include the actual non-failover test names (for example
TestJobProfileCreate, TestCronJobProfileCreate,
TestCrashLoopAndStableProfileIncomplete) and remove
TestSimpleProfileNodeAgentFailover so the non-failover script runs the correct
suite; locate the assignment to NON_FAILOVER_TESTS in run-non-failover-tests.sh
and replace the test list accordingly.

@github-actions

Summary:

  • License scan: failure
  • Credentials scan: failure
  • Vulnerabilities scan: failure
  • Unit test: success
  • Go linting: failure

@matthyx matthyx moved this to Needs Reviewer in KS PRs tracking Feb 23, 2026
@matthyx matthyx removed the status in KS PRs tracking Feb 23, 2026
@matthyx matthyx moved this to WIP in KS PRs tracking Feb 25, 2026