@petr-muller
Member

@petr-muller petr-muller commented Jan 14, 2026

Our current aggregation logic is too sensitive, leading to the rejection
of payloads for non-regressions. Analysis shows that a significant
portion of rejected payloads are failing due to infrastructure noise or
existing flakes rather than genuine code regressions.

We have component readiness as a backstop to identify regressions over
larger sample sizes.

I do not _entirely_ understand why the relaxation for disruption tests
exists here, but the TODO suggests it may be worth tightening. The
`pityFactor` will relax the threshold in some cases (high attempts,
high pass rate), so the additional relaxation would only apply in the
remaining cases.
Copilot AI review requested due to automatic review settings January 14, 2026 16:40
@openshift-ci-robot
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To manually trigger all second-stage jobs, use the /pipeline required command.

This repository is configured in: automatic mode

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jan 14, 2026
@openshift-ci-robot
Contributor

@petr-muller: This pull request explicitly references no jira issue.

In response to this:

I do not entirely understand why the relaxation for disruption tests exists
here, but the TODO suggests it may be worth tightening. The pityFactor will
relax the threshold in some cases (high attempts, high pass rate), so the
additional relaxation would only apply in the remaining cases.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai

coderabbitai bot commented Jan 14, 2026

Walkthrough

The changes introduce a "pity factor" mechanism that relaxes pass requirements for small sample sizes in job run aggregation analysis. A new helper function computes adjusted pass thresholds allowing up to 2 failures for smaller samples, reducing false failures from infrastructure noise. This affects percentile disruption checks and failure determination logic across multiple functions.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Implementation - Pity Factor Logic: pkg/jobrunaggregator/jobrunaggregatoranalyzer/pass_fail.go | Introduces pityFactor() helper function that relaxes required pass thresholds for small sample sizes by allowing up to 2 failures. Modifies CheckFailed() and innerCheckPercentileDisruptionWithGrace() to compute strictRequiredNumberOfPasses and apply pity factor logic. Returns adjusted required passes and descriptive messages. Propagates pity factor messaging into summary strings conditionally. |
| Testing - Pity Factor Coverage: pkg/jobrunaggregator/jobrunaggregatoranalyzer/pass_fail_test.go | Adds two new test functions: TestCheckFailedWithPityFactor() and TestInnerCheckPercentileDisruptionWithPityFactor() to validate pity factor behavior across varying passes, failures, skips, and percentile thresholds. Updates existing percentile-disruption test cases to reflect new expected pass requirements and introduces test scaffolding for dynamic test case construction. Adds jobrunaggregatorapi import. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 69ff8fa and f180675.

📒 Files selected for processing (2)
  • pkg/jobrunaggregator/jobrunaggregatoranalyzer/pass_fail.go
  • pkg/jobrunaggregator/jobrunaggregatoranalyzer/pass_fail_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • pkg/jobrunaggregator/jobrunaggregatoranalyzer/pass_fail_test.go
  • pkg/jobrunaggregator/jobrunaggregatoranalyzer/pass_fail.go
🧬 Code graph analysis (1)
pkg/jobrunaggregator/jobrunaggregatoranalyzer/pass_fail_test.go (2)
pkg/jobrunaggregator/jobrunaggregatoranalyzer/pass_fail.go (1)
  • TestKey (166-169)
pkg/jobrunaggregator/jobrunaggregatorapi/types_row_aggregatedtestrun.go (1)
  • AggregatedTestRunRow (9-22)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Agent
  • GitHub Check: CodeQL analysis (go)
🔇 Additional comments (9)
pkg/jobrunaggregator/jobrunaggregatoranalyzer/pass_fail.go (4)

540-554: LGTM!

The conditional logic correctly appends the pity factor message only when the relaxation actually helped the test pass (i.e., when numberOfPasses < strictRequiredNumberOfPasses). This provides useful diagnostic information in the summary.
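
The condition described above could look roughly like the following sketch. The helper name `buildSummary`, its parameters, and the summary wording are hypothetical and are not taken from pass_fail.go. The extra `numberOfPasses >= requiredPasses` check is an assumption added so the sketch stands alone; the review only mentions the comparison against the strict requirement.

```go
package main

import "fmt"

// buildSummary is a hypothetical helper; the real summary wording lives in pass_fail.go.
func buildSummary(numberOfPasses, requiredPasses, strictRequiredPasses int, pityMessage string) string {
	summary := fmt.Sprintf("passed %d times, needed %d", numberOfPasses, requiredPasses)
	// Mention the pity factor only when it changed the outcome: the run met the
	// relaxed requirement (assumed check) but would have missed the strict one.
	if numberOfPasses >= requiredPasses && numberOfPasses < strictRequiredPasses {
		summary += " (" + pityMessage + ")"
	}
	return summary
}

func main() {
	// Relaxation mattered: 8 passes satisfy the relaxed 8 but not the strict 10.
	fmt.Println(buildSummary(8, 8, 10, "pity factor applied"))
	// Relaxation did not matter: 10 passes satisfy the strict requirement anyway.
	fmt.Println(buildSummary(10, 8, 10, "pity factor applied"))
}
```

Keeping the message out of the summary when the strict requirement was met anyway avoids suggesting that a clean result depended on the relaxation.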


623-648: LGTM!

The implementation is consistent with innerCheckPercentileDisruptionWithGrace. The pity factor logic is applied correctly, and the message formatting provides clear diagnostic output.


651-659: LGTM!

The pityFactor function is well-documented and implements the intended relaxation correctly:

  • Allows up to 2 failures by capping required passes at numberOfAttempts - 2
  • Takes the minimum of this cap and the strict requirement, ensuring no over-relaxation
  • Returns both the adjusted requirement and a descriptive message for diagnostics

The implementation aligns with the PR objective of reducing aggregation failures caused by infrastructure noise while relying on component readiness for regression detection over larger samples.
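
As a minimal sketch of the behavior listed above, assuming nothing beyond what the review states, the relaxation might be written as follows. The name `pityFactorSketch`, the message text, and the clamp at zero are assumptions for illustration; the actual `pityFactor` signature in pass_fail.go may differ.

```go
package main

import "fmt"

// pityFactorSketch is a hypothetical reconstruction for illustration only;
// it is not copied from pass_fail.go.
func pityFactorSketch(numberOfAttempts, strictRequiredPasses int) (int, string) {
	// Tolerating up to 2 failures caps the demand at numberOfAttempts-2 passes.
	capped := numberOfAttempts - 2
	if capped < 0 {
		capped = 0 // assumption: never demand a negative number of passes
	}
	// Take the minimum so the strict requirement is never raised.
	if strictRequiredPasses <= capped {
		return strictRequiredPasses, ""
	}
	msg := fmt.Sprintf("pity factor lowered required passes from %d to %d",
		strictRequiredPasses, capped)
	return capped, msg
}

func main() {
	cases := []struct{ attempts, strict int }{{10, 10}, {10, 9}, {10, 7}, {3, 3}}
	for _, c := range cases {
		required, msg := pityFactorSketch(c.attempts, c.strict)
		fmt.Printf("attempts=%d strict=%d required=%d %q\n", c.attempts, c.strict, required, msg)
	}
}
```

Under this reading, the relaxation only takes effect when the strict requirement demands more than `numberOfAttempts - 2` passes, which matches the "high pass rate" cases mentioned in the PR description.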


504-506: Potential index out-of-bounds if numberOfAttempts exceeds 12.

The requiredPassesByPassPercentageByNumberOfAttempts array only has entries for attempts 0-12. If numberOfAttempts > 12 (i.e., more than 12 job runs are passed to innerCheckPercentileDisruptionWithGrace or CheckFailed), accessing the array at line 505 or 623 will panic. No bounds validation prevents this scenario. This is a pre-existing issue, not introduced by this PR, but it's exposed by the modified code paths.
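
One way to guard such a lookup is sketched below. The helper `rowForAttempts`, its error wording, and the placeholder table are hypothetical; the real table's shape and values are not shown in this conversation, so the sketch only assumes that the first index is the number of attempts.

```go
package main

import "fmt"

// rowForAttempts guards a lookup into a table indexed by number of attempts.
// Hypothetical helper: the name and error wording are illustrative.
func rowForAttempts(table [][]int, numberOfAttempts int) ([]int, error) {
	if numberOfAttempts < 0 || numberOfAttempts >= len(table) {
		return nil, fmt.Errorf("numberOfAttempts %d outside supported range 0-%d",
			numberOfAttempts, len(table)-1)
	}
	return table[numberOfAttempts], nil
}

func main() {
	// Placeholder table with 13 rows (attempts 0-12); contents are not the real values.
	table := make([][]int, 13)
	for i := range table {
		table[i] = []int{i}
	}
	for _, attempts := range []int{12, 13} {
		if _, err := rowForAttempts(table, attempts); err != nil {
			fmt.Println("rejected:", err)
		} else {
			fmt.Println("ok:", attempts)
		}
	}
}
```

Returning an error instead of indexing directly turns a run-time panic into a result the caller can report. Whether to fail or to clamp to the last row is a design decision for the maintainers.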

pkg/jobrunaggregator/jobrunaggregatoranalyzer/pass_fail_test.go (5)

9-9: LGTM!

The import is required for the AggregatedTestRunRow type used in the new pity factor tests.


61-87: LGTM!

Test expectations correctly updated to reflect the removal of the additional relaxation. The required passes and disruption arrays are consistent with the new pity factor calculation logic.


207-219: LGTM!

The updated expectations for the 80th percentile test correctly reflect 5 natural passes needed, with the disruption array producing exactly 5 successes (values ≤ 1).


295-486: Well-structured test coverage for pity factor behavior.

The table-driven tests comprehensively cover the pity factor logic across different reliability tiers (70%-100% working percentage). The test setup pattern using queryTestRunsOnce.Do(func() {}) to bypass the BigQuery client is a pragmatic approach for unit testing.
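
The `sync.Once` trick mentioned above can be illustrated with a small stand-in. The type `lazyRows` and its fields are hypothetical and only mimic the general shape of a lazily loaded data source; they are not the types used in pass_fail_test.go.

```go
package main

import (
	"fmt"
	"sync"
)

// lazyRows is a hypothetical stand-in for a type that loads rows once from a remote store.
type lazyRows struct {
	once sync.Once
	rows []string
	load func() []string // would query BigQuery in production
}

// get loads the rows on first use and returns the cached slice afterwards.
func (l *lazyRows) get() []string {
	l.once.Do(func() { l.rows = l.load() })
	return l.rows
}

func main() {
	l := &lazyRows{
		rows: []string{"fixture-run-1", "fixture-run-2"},
		load: func() []string { panic("remote load must not run in tests") },
	}
	// Spend the Once with a no-op so get() skips the loader and returns the fixture.
	l.once.Do(func() {})
	fmt.Println(l.get())
}
```

This works because `sync.Once.Do` runs only the first function it is ever given; later calls with a different function return without invoking it.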


488-571: LGTM!

Good test coverage for innerCheckPercentileDisruptionWithGrace. The invariant assertion on lines 567-568 ensuring totalAttempts == successes + failures is a valuable sanity check. Tests correctly validate both the required passes calculation and the final status.
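
The invariant can be sketched with a small tally function. The name `tallyDisruption`, the integer disruption values, and the rule that a run succeeds when its value is at or below the threshold are assumptions based on the "values ≤ 1" remark in the review; the sample data is made up.

```go
package main

import "fmt"

// tallyDisruption counts runs at or below the threshold as successes and the rest as failures.
func tallyDisruption(disruptionSeconds []int, threshold int) (successes, failures int) {
	for _, d := range disruptionSeconds {
		if d <= threshold {
			successes++
		} else {
			failures++
		}
	}
	return successes, failures
}

func main() {
	runs := []int{0, 1, 1, 0, 1, 3, 4} // made-up values
	successes, failures := tallyDisruption(runs, 1)
	// Every run must land in exactly one bucket.
	if successes+failures != len(runs) {
		panic("invariant violated: totalAttempts != successes + failures")
	}
	fmt.Println(successes, failures)
}
```

Because each run falls into exactly one branch, the two counters always sum to the number of attempts. Asserting this in a test catches accidental skips or double counting.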

@petr-muller
Member Author

/hold

Needs #4894

@petr-muller petr-muller marked this pull request as draft January 14, 2026 16:41
@openshift-ci openshift-ci bot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Jan 14, 2026
@openshift-ci
Contributor

openshift-ci bot commented Jan 14, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: petr-muller

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 14, 2026
@petr-muller
Member Author

/uncc @danilo-gemoli @droslean
/cc @neisw @xueqzhan

Copilot AI left a comment

Pull request overview

This pull request removes the simple -1 relaxation for disruption tests and replaces it with a more sophisticated pityFactor function that allows up to 2 failures regardless of the strict pass requirement. The change affects both disruption percentile checks and regular test failure checks.

Changes:

  • Replaced the -1 adjustment with a pityFactor function that allows up to 2 failures
  • Updated test cases to reflect new expected pass requirements (7 instead of 6, 5 instead of 4)
  • Added comprehensive test coverage for the pity factor behavior
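
To see how the two rules differ, the sketch below compares them side by side. Both function names are hypothetical, and the choice of 10 attempts is an assumption made so that the output lines up with the "7 instead of 6, 5 instead of 4" expectations quoted above; the real test cases may use a different number of runs.

```go
package main

import "fmt"

// oldRelaxation models the removed rule: one fewer pass than the strict requirement.
func oldRelaxation(strict int) int {
	return strict - 1
}

// newRequirement models the pity factor: the smaller of the strict requirement and attempts-2.
func newRequirement(attempts, strict int) int {
	capped := attempts - 2
	if strict < capped {
		return strict
	}
	return capped
}

func main() {
	const attempts = 10 // assumed sample size, not confirmed by the PR
	for strict := 4; strict <= attempts; strict++ {
		fmt.Printf("strict=%d old=%d new=%d\n",
			strict, oldRelaxation(strict), newRequirement(attempts, strict))
	}
}
```

With these assumptions the new rule is tighter than the old one whenever the strict requirement is at most `attempts - 2`, and looser only when the strict requirement demands every run to pass.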

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| pkg/jobrunaggregator/jobrunaggregatoranalyzer/pass_fail.go | Removed -1 relaxation, added pityFactor function, integrated pity factor into disruption and regular checks, updated summary messages |
| pkg/jobrunaggregator/jobrunaggregatoranalyzer/pass_fail_test.go | Updated existing test expectations, added comprehensive tests for CheckFailedWithPityFactor and innerCheckPercentileDisruptionWithPityFactor |

@petr-muller
Member Author

/test all

@openshift-ci
Contributor

openshift-ci bot commented Jan 15, 2026

@petr-muller: The following tests failed. Say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/breaking-changes | f180675 | link | false | /test breaking-changes |
| ci/prow/images | f180675 | link | true | /test images |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
