
Conversation


@ggordaniRed ggordaniRed commented Dec 7, 2025

Summary by CodeRabbit

  • New Features

    • Added Network Operator CI Dashboard alongside existing GPU Operator dashboard for comprehensive test coverage monitoring
    • Implemented conditional dashboard generation that regenerates dashboards only when underlying test data changes for improved efficiency
  • Chores

    • Updated development environment configuration files to exclude system and editor metadata
    • Enhanced documentation for dashboard generation infrastructure and workflows


Adding new utils for generating a new dashboard for NNO.
Critical fixes for NNO dashboard data collection:

1. Fixed file structure detection:
   - Network Operator jobs have finished.json at build root (not in artifacts/)
   - Fetch all finished.json files and filter for 'nvidia-network-operator' in path
   - Ensures finished.json is read from the correct location

2. Added robust version file detection:
   - Try multiple possible artifact paths for ocp.version and operator.version
   - Patterns: **/network-operator-e2e/, **/artifacts/, **/nvidia-network-operator*/
   - Filter results to only include Network Operator jobs
   - Handles various test configurations (DOCA4, bare-metal, hosted, etc.)

3. Improved file filtering:
   - Filter out nested artifacts to avoid duplicates
   - Verify files are in expected locations
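The file-structure filtering in step 1 can be sketched as a small helper. This is a minimal sketch, not the actual implementation; the function name, repo path, and build IDs below are illustrative, and it assumes the `pr-logs/pull/{repo}/{pr}/{job_name}/{build_id}` layout used by Prow job logs:

```python
def filter_network_operator_finished(paths):
    """Keep finished.json files that belong to Network Operator jobs and
    sit at the build root, not nested under artifacts/ (sketch only; the
    real script applies the same filter to GCS listings)."""
    kept = []
    for path in paths:
        if "nvidia-network-operator" not in path:
            continue  # not a Network Operator job
        parts = path.rstrip("/").split("/")
        # finished.json must sit directly under the build directory
        if parts[-1] == "finished.json" and "artifacts" not in parts:
            kept.append(path)
    return kept
```

Nested copies of finished.json under artifacts/ are dropped here, which is what prevents the duplicate counting mentioned in step 3.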

These changes fix the issue where GitHub Actions was finding 0 Network
Operator files despite PR #67673 containing Network Operator jobs.
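The multi-pattern version-file detection from step 2 can be sketched like this. The sketch only does local path matching (the real script queries the GCS listing API with these globs), and the example paths are illustrative:

```python
import fnmatch

# Glob patterns mirroring the ones listed in step 2
VERSION_PATH_PATTERNS = [
    "*/network-operator-e2e/*",
    "*/artifacts/*",
    "*nvidia-network-operator*/*",
]

def find_version_files(all_paths, filename):
    """Return paths ending in `filename` that match any pattern and
    belong to a Network Operator job."""
    found = set()  # a set deduplicates paths matched by several patterns
    for path in all_paths:
        if not path.endswith("/" + filename):
            continue
        if "nvidia-network-operator" not in path:
            continue  # drop GPU Operator and unrelated jobs
        if any(fnmatch.fnmatchcase(path, p) for p in VERSION_PATH_PATTERNS):
            found.add(path)
    return sorted(found)
```

Trying several patterns and then filtering by job name is what lets one code path handle the DOCA4, bare-metal, and hosted layouts.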

Related to: PR #67673 testing and dashboard generation

The version files were being fetched correctly, but the lookup dictionary
keys didn't match the build directory keys used during processing.

Problem:
- Version files are deeply nested: .../BUILD_ID/artifacts/.../ocp.version
- The lookup used the parent directory (just the artifacts dir) as the key
- Build processing looked entries up by the full build path
- Result: 'Missing ocp.version' warnings despite the files being found

Solution:
- Extract BUILD_ID from version file paths using regex
- Build lookup dictionary with full build path as key:
  pr-logs/pull/{repo}/{pr}/{job_name}/{build_id}
- Use same key format when looking up versions during processing
- Now keys match and versions are found!
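One way to make the keys line up, sketched with an illustrative regex (names, repo path, and build ID are hypothetical; the actual script may extract the build directory differently):

```python
import re

# Build layout taken from the description above:
#   pr-logs/pull/{repo}/{pr}/{job_name}/{build_id}/...
BUILD_DIR_RE = re.compile(r"^(?P<build_dir>pr-logs/pull/[^/]+/\d+/[^/]+/\d+)/")

def build_version_lookup(version_files):
    """Key each version file's content by the full build path, so build
    processing can look versions up with the exact same key."""
    lookup = {}
    for path, content in version_files.items():
        match = BUILD_DIR_RE.match(path)
        if match:
            lookup[match.group("build_dir")] = content.strip()
    return lookup
```

Requiring a numeric final segment (`\d+`) is what keeps the key anchored at the build ID rather than at some deeper directory.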

This fixes the issue where GitHub Actions found version files but
processed 0 builds due to lookup key mismatch.

Added debug logging to show:
- The build_dir key being used for lookup
- Available keys in ocp_lookup dictionary
- Available keys in operator_lookup dictionary

This will help identify why lookups are failing.

The issue was in how we extracted the build_id from version file paths:
- Version files are deeply nested: .../build_id/artifacts/.../ocp.version
- Using regex on the full path matched 'artifacts' as build_id
- Solution: Extract build_id from path position (parts[5])

This fixes the 'Missing ocp.version' warnings and allows proper data collection.
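The positional fix can be illustrated as follows (a minimal sketch assuming the `pr-logs/pull/{repo}/{pr}/{job_name}/{build_id}/...` layout quoted earlier; the example path is hypothetical):

```python
def extract_build_id(version_file_path):
    """Pick the build ID by position instead of by a regex over the whole
    path, which could latch onto 'artifacts' instead."""
    parts = version_file_path.split("/")
    # parts: [0]=pr-logs [1]=pull [2]=repo [3]=pr [4]=job_name [5]=build_id
    return parts[5]
```

Because the build ID always occupies the same slot in the path, this extraction cannot be fooled by deeper directories such as artifacts/.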

openshift-ci bot commented Dec 7, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ggordaniRed

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label Dec 7, 2025

coderabbitai bot commented Dec 7, 2025

Walkthrough

This PR extends CI dashboard generation to support NVIDIA Network Operator alongside the existing GPU Operator. It introduces a shared workflows/common/ module with reusable utilities for GCS access, data structures, validation, and HTML generation. The GPU Operator dashboard is refactored to use these utilities, and a complete new Network Operator dashboard pipeline is implemented. The GitHub Actions workflow conditionally generates both dashboards only when their JSON data changes.

Changes

Cohort / File(s) Change Summary
GitHub Actions Workflow
.github/workflows/generate_matrix_page.yaml
Adds parallel Network Operator dashboard handling: sets environment variables for NNO paths, installs requirements, fetches NNO CI data, and conditionally runs generation scripts based on JSON diffs (GPU_CHANGED and NNO_CHANGED).
Git & IDE Config
.gitignore
Adds .DS_Store and .vscode/ to ignored patterns.
Documentation
workflows/README.md
Documents the new nno_dashboard/ directory structure and refactors shared module description to describe new common/ package.
Shared Utilities Package
workflows/common/__init__.py, workflows/common/data_structures.py, workflows/common/data_fetching.py, workflows/common/gcs_utils.py, workflows/common/html_builders.py, workflows/common/validation.py
Establishes centralized workflows.common module with public APIs for: version/status constants and TestResult dataclass; GCS utilities (HTTP, file fetching, Prow URLs); data extraction and transformation (version lookups, timestamp/status extraction, repo inference); HTML generation (TOC, notes, history bars, footers, sanitization); and validation (OCP versions, semantic versions, infrastructure types).
GPU Operator Dashboard Refactoring
workflows/gpu_operator_dashboard/fetch_ci_data.py, workflows/gpu_operator_dashboard/generate_ci_dashboard.py
Refactors to import GCS utilities, data-fetching helpers, and validation from workflows.common; removes in-file implementations; updates merging logic to use shared merge_job_history_links; updates semantic-version validation calls to pass version-key parameters.
Network Operator Dashboard Implementation
workflows/nno_dashboard/__init__.py, workflows/nno_dashboard/README.md, workflows/nno_dashboard/fetch_ci_data.py, workflows/nno_dashboard/generate_ci_dashboard.py, workflows/nno_dashboard/requirements.txt
Introduces complete NNO dashboard pipeline: fetch script extracts test flavors, processes finished.json artifacts, constructs TestResult objects, and merges with baseline; generate script restructures data by NNO version, builds per-OCP HTML matrices, and assembles final dashboard. Includes comprehensive README and Python dependencies (requests, semver, pydantic).
NNO Dashboard Templates
workflows/nno_dashboard/templates/header.html, workflows/nno_dashboard/templates/main_table.html, workflows/nno_dashboard/templates/test_flavor_section.html
Provides minimal HTML template structure: header with title and stylesheet link; main table with OCP version container and placeholders for notes and flavor sections; test flavor section with labeled div and tbody for dynamic rows.
Output Data File
workflows/test_matrix_dashboard/output/network_operator_matrix.json
Initializes empty JSON object as placeholder for NNO test-matrix output.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Areas requiring extra attention:

  • Shared utilities correctness (workflows/common/data_structures.py, workflows/common/validation.py, workflows/common/gcs_utils.py): verify data transformation logic, error handling, and backward compatibility (especially TestResult.from_dict operator_version fallback)
  • GPU Operator refactoring (workflows/gpu_operator_dashboard/fetch_ci_data.py, workflows/gpu_operator_dashboard/generate_ci_dashboard.py): confirm that extracted utilities preserve original behavior and that semantic-version validation signature changes are correctly applied at all call sites
  • NNO dashboard implementation (workflows/nno_dashboard/fetch_ci_data.py, workflows/nno_dashboard/generate_ci_dashboard.py): review data aggregation logic, flavor extraction heuristics, and HTML matrix rendering for correctness and edge cases (missing artifacts, empty results)
  • Workflow YAML logic (.github/workflows/generate_matrix_page.yaml): verify JSON diff detection and conditional branch logic are correct

Possibly related PRs

  • PR #291: Establishes code-level reorganization of workflows into subdirectories; this PR builds on that structure by centralizing shared utilities into the new workflows/common/ package.
  • PR #295: Modifies GPU dashboard data model and introduces canonical version keys (OCP_FULL_VERSION, GPU_OPERATOR_VERSION) that this PR adopts across shared utilities and both dashboards.
  • PR #179: Implements conditional "generate HTML only if JSON changed" pattern; this PR extends that approach to both GPU and Network Operator dashboards.

Suggested labels

ok-to-test

Suggested reviewers

  • empovit
  • wabouhamad
  • TomerNewman

Poem

🐰 Hop, hop! A dashboard springs to life,
Network Operator joins the dance with flair,
Shared libraries ease the strife,
Parallel workflows float through the air!
CI tests now multiply with grace,

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)

  • Title check: ❓ Inconclusive. The title 'New nno dashboard' is vague and generic, using non-descriptive terminology that doesn't convey meaningful information about the changeset's scope or purpose. Resolution: consider a more descriptive title such as 'Add Network Operator CI dashboard generation' or 'Implement NNO dashboard with data fetching and HTML generation' to better summarize the primary change.

✅ Passed checks (2 passed)

  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 94.74%, which is sufficient; the required threshold is 80.00%.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (15)
workflows/nno_dashboard/README.md (1)

44-68: Solid documentation; consider adding a language to the job-name code fence

The README gives a clear end-to-end view of data flow, flavors, and CI wiring; it's very helpful for future contributors. One minor markdown nit: the fenced block showing the job-name pattern (lines 121–123) lacks a language spec, which markdownlint flags. You could mark it as text (or bash) to satisfy MD040:

-```
+```text
 pull-ci-rh-ecosystem-edge-nvidia-ci-main-{infrastructure}-nvidia-network-operator-{test-type}

Also applies to: 70-82, 95-99, 121-129, 160-175

workflows/common/html_builders.py (1)

31-55: Minor typing and robustness tweaks for HTML helpers

The helpers look good and centralize a lot of duplicated markup; a couple of small refinements could help:

- In `build_history_bar`, `int(result.get("job_timestamp", 0))` will raise if `job_timestamp` is a non-numeric string. If there’s any chance of malformed data, consider a `try`/`except` around the cast and either skip such entries or fall back to a safe default.
- You’re comparing against string literals (`"SUCCESS"`, `"FAILURE"`) here; if you want tighter coupling with `STATUS_*` constants from `data_structures`, you could pass those in or import them to avoid divergence over time.
- Ruff’s RUF013 can be resolved by making the `timestamp` annotation explicitly optional:

```diff
-def build_last_updated_footer(timestamp: str = None) -> str:
+def build_last_updated_footer(timestamp: str | None = None) -> str:
```

None of these are blocking; they're just small maintainability and safety improvements.

Also applies to: 57-122, 125-144, 147-161

workflows/common/gcs_utils.py (1)

21-39: GCS helpers look solid; consider explicit Optional typing for HTTP params

The GCS utilities are well factored: shared JSON fetching, file content retrieval with timeouts, prefix/glob-based listing with pagination, and clear URL builders. A couple of minor polish points:

  • To satisfy Ruff’s RUF013 and make the intent clearer, you can mark params and headers as explicitly optional:
-from typing import Dict, Any
+from typing import Any, Dict
@@
-def http_get_json(url: str, params: Dict[str, Any] = None, headers: Dict[str, str] = None) -> Dict[str, Any]:
+def http_get_json(
+    url: str,
+    params: Dict[str, Any] | None = None,
+    headers: Dict[str, str] | None = None,
+) -> Dict[str, Any]:
  • Using a gcsweb-based URL in build_prow_job_url is consistent with the project’s preference for artifact navigation over the Prow UI, while still leveraging Prow’s job-history endpoint where appropriate.

Functionally this looks good as-is; the above is optional cleanup.

Based on learnings, the gcsweb artifact URL choice aligns with earlier feedback about navigation UX.

Also applies to: 41-61, 64-76, 78-116, 119-129

workflows/common/__init__.py (1)

48-93: Optional: Consider sorting __all__ for linter compliance.

Static analysis (RUF022) suggests sorting __all__. The current logical grouping by category is readable, but if you want linter compliance, you could either:

  1. Sort alphabetically
  2. Add # noqa: RUF022 to suppress the warning while keeping the logical grouping

The logical grouping is a valid choice for maintainability.

workflows/common/validation.py (1)

28-29: Optional: Extract shared infrastructure types constant.

The invalid_keys list in is_valid_ocp_version duplicates infrastructure_types in is_infrastructure_type. Consider extracting a module-level constant to ensure consistency:

+INFRASTRUCTURE_TYPES = ["doca4", "bare-metal", "hosted", "unknown"]
+
+
 def is_valid_ocp_version(version: str) -> bool:
     ...
-    invalid_keys = ["doca4", "bare-metal", "hosted", "unknown"]
-    if version.lower() in invalid_keys:
+    if version.lower() in INFRASTRUCTURE_TYPES:
         return False
     ...

 def is_infrastructure_type(value: str) -> bool:
     ...
-    infrastructure_types = ["doca4", "bare-metal", "hosted", "unknown"]
-    return value.lower() in infrastructure_types
+    return value.lower() in INFRASTRUCTURE_TYPES

Also applies to: 88-89

workflows/nno_dashboard/generate_ci_dashboard.py (4)

1-1: Shebang without executable permission.

The file has a shebang (#!/usr/bin/env python3) but lacks executable permission. Either make the file executable or remove the shebang if it's only intended to be run as a module.


114-127: TODOs indicate incomplete data extraction.

Lines 116 and 126 have hardcoded placeholder values with TODO comments:

  • gpu_operator_version = "master-latest"
  • "hardware": "ConnectX-6 Dx"

These may produce misleading information in the dashboard. Consider extracting these values from actual test data or clearly marking them as "Unknown" until the data source is identified.

Would you like me to help identify where this data might be available in the test artifacts, or open an issue to track this improvement?


259-260: NNO version sorting has the same lexicographic issue.

The sorting at line 260 uses basic string comparison. Consider semantic version sorting for consistency with GPU operator version display.


324-327: Move import os to the top of the file.

The os module is imported inside main(). For consistency and clarity, place it at the top of the file with other imports.

 import argparse
 import json
+import os
 import semver
 from typing import Dict, List, Any, Set
workflows/common/data_fetching.py (1)

162-176: Consider adding error handling for invalid integer strings.

If value is not a valid integer string (e.g., "abc"), int(value) will raise a ValueError. While this is used with argparse where input is somewhat controlled, adding defensive handling would make the utility more robust.

 def int_or_none(value: Optional[str]) -> Optional[int]:
     if value is None:
         return None
     if value.lower() in ('none', 'unlimited'):
         return None
-    return int(value)
+    try:
+        return int(value)
+    except ValueError:
+        raise ValueError(f"Expected integer or 'none'/'unlimited', got: {value}")
workflows/nno_dashboard/fetch_ci_data.py (5)

1-1: Shebang without executable permission.

Same as the generate_ci_dashboard.py - either make executable or remove shebang.


142-142: Move imports to the top of the file.

The from workflows.common.data_fetching import ... at line 142 is inside the function body. For consistency and to avoid repeated import overhead, move this to the top-level imports.

 from workflows.common import (
     logger,
     fetch_gcs_file_content,
     fetch_filtered_files,
     build_prow_job_url,
     build_job_history_url,
     TestResult,
     OCP_FULL_VERSION,
     OPERATOR_VERSION,
     STATUS_SUCCESS,
     STATUS_FAILURE,
     STATUS_ABORTED,
     is_valid_ocp_version,
 )
+from workflows.common.data_fetching import (
+    extract_test_status,
+    extract_timestamp,
+    determine_repo_from_job_name,
+    merge_job_history_links,
+    convert_sets_to_lists_recursive,
+)

Then remove the inline imports at lines 142, 175, and 360.


271-274: Remove or prefix unused variables.

Variables pr_num and repo are extracted but never used. Either remove them or prefix with _ to indicate intentional non-use.

         job_name = match.group("job_name")
         build_id = match.group("build_id")
-        pr_num = match.group("pr_number")
-        repo = match.group("repo")
+        # pr_num and repo available via: match.group("pr_number"), match.group("repo")

202-222: Consider consolidating version file fetching.

The current approach makes 6 separate GCS API calls (3 patterns × 2 file types). Consider:

  1. Using a broader glob pattern and filtering locally
  2. Deduplicating results by file path to avoid potential duplicates

This is a performance consideration for PRs with many builds.


347-351: Fix type hint for optional parameter.

The existing_results parameter defaults to None but the type hint doesn't indicate it's optional. Use Optional or the union syntax.

+from typing import Optional  # if not already imported
+
 def merge_and_save_results(
     new_results: Dict[str, Dict[str, Any]],
     output_filepath: str,
-    existing_results: Dict[str, Dict[str, Any]] = None
+    existing_results: Optional[Dict[str, Dict[str, Any]]] = None
 ) -> None:
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4535eab and a38ffe2.

📒 Files selected for processing (20)
  • .github/workflows/generate_matrix_page.yaml (2 hunks)
  • .gitignore (1 hunks)
  • workflows/README.md (1 hunks)
  • workflows/common/__init__.py (1 hunks)
  • workflows/common/data_fetching.py (1 hunks)
  • workflows/common/data_structures.py (1 hunks)
  • workflows/common/gcs_utils.py (1 hunks)
  • workflows/common/html_builders.py (1 hunks)
  • workflows/common/validation.py (1 hunks)
  • workflows/gpu_operator_dashboard/fetch_ci_data.py (2 hunks)
  • workflows/gpu_operator_dashboard/generate_ci_dashboard.py (2 hunks)
  • workflows/nno_dashboard/README.md (1 hunks)
  • workflows/nno_dashboard/__init__.py (1 hunks)
  • workflows/nno_dashboard/fetch_ci_data.py (1 hunks)
  • workflows/nno_dashboard/generate_ci_dashboard.py (1 hunks)
  • workflows/nno_dashboard/requirements.txt (1 hunks)
  • workflows/nno_dashboard/templates/header.html (1 hunks)
  • workflows/nno_dashboard/templates/main_table.html (1 hunks)
  • workflows/nno_dashboard/templates/test_flavor_section.html (1 hunks)
  • workflows/test_matrix_dashboard/output/network_operator_matrix.json (1 hunks)
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: empovit
Repo: rh-ecosystem-edge/nvidia-ci PR: 295
File: workflows/gpu_operator_dashboard/fetch_ci_data.py:61-64
Timestamp: 2025-08-31T15:44:50.325Z
Learning: In the nvidia-ci project, gcsweb URLs are preferred over prow.ci.openshift.org URLs for the GPU operator dashboard because gcsweb allows easier navigation up the directory structure, while Prow UI requires switching to "Artifacts" first.
📚 Learning: 2025-08-31T15:44:50.325Z
Learnt from: empovit
Repo: rh-ecosystem-edge/nvidia-ci PR: 295
File: workflows/gpu_operator_dashboard/fetch_ci_data.py:61-64
Timestamp: 2025-08-31T15:44:50.325Z
Learning: In the nvidia-ci project, gcsweb URLs are preferred over prow.ci.openshift.org URLs for the GPU operator dashboard because gcsweb allows easier navigation up the directory structure, while Prow UI requires switching to "Artifacts" first.

Applied to files:

  • workflows/nno_dashboard/README.md
  • workflows/gpu_operator_dashboard/fetch_ci_data.py
📚 Learning: 2025-08-27T18:08:59.287Z
Learnt from: empovit
Repo: rh-ecosystem-edge/nvidia-ci PR: 291
File: workflows/gpu-operator-versions/nvidia_gpu_operator.py:58-58
Timestamp: 2025-08-27T18:08:59.287Z
Learning: In the nvidia-ci repository, the GH_AUTH_TOKEN environment variable is set using base64 encoding of the GitHub token: `export GH_AUTH_TOKEN=$(echo ${{ secrets.GITHUB_TOKEN }} | base64)` and this works successfully with GitHub Container Registry (GHCR) Bearer authentication in workflows/gpu-operator-versions/nvidia_gpu_operator.py.

Applied to files:

  • .github/workflows/generate_matrix_page.yaml
📚 Learning: 2025-11-09T13:12:43.631Z
Learnt from: empovit
Repo: rh-ecosystem-edge/nvidia-ci PR: 359
File: workflows/gpu_operator_versions/update_versions.py:100-106
Timestamp: 2025-11-09T13:12:43.631Z
Learning: In the nvidia-ci repository's `workflows/gpu_operator_versions/update_versions.py`, the `gpu_releases` variable is intentionally limited to the latest 2 GPU operator versions (via `get_latest_versions(list(gpu_versions.keys()), 2)` in main()). For maintenance OCP versions with pinned GPU operators, if the pinned version is not among the latest 2 releases, the test is intentionally skipped with a warning rather than falling back to test with any available version. This is because testing with outdated GPU operators would fail in their infrastructure, and they prefer to notice missing tests rather than deal with failed test runs, especially since GitHub organization repositories don't send notification emails about failed workflows.

Applied to files:

  • workflows/gpu_operator_dashboard/fetch_ci_data.py
  • workflows/gpu_operator_dashboard/generate_ci_dashboard.py
🧬 Code graph analysis (7)
workflows/common/html_builders.py (1)
workflows/gpu_operator_dashboard/generate_ci_dashboard.py (2)
  • build_toc (160-171)
  • build_notes (142-157)
workflows/common/data_fetching.py (1)
workflows/common/gcs_utils.py (1)
  • fetch_gcs_file_content (41-61)
workflows/common/data_structures.py (1)
workflows/gpu_operator_dashboard/fetch_ci_data.py (2)
  • TestResult (67-110)
  • to_dict (75-82)
workflows/common/__init__.py (7)
workflows/common/utils.py (1)
  • get_logger (5-15)
workflows/common/templates.py (1)
  • load_template (9-37)
workflows/common/data_structures.py (1)
  • TestResult (22-58)
workflows/common/gcs_utils.py (5)
  • http_get_json (21-38)
  • fetch_gcs_file_content (41-61)
  • build_prow_job_url (64-75)
  • fetch_filtered_files (78-116)
  • build_job_history_url (119-129)
workflows/common/html_builders.py (5)
  • build_toc (10-28)
  • build_notes (31-54)
  • build_history_bar (57-122)
  • build_last_updated_footer (125-144)
  • sanitize_id (147-161)
workflows/common/validation.py (3)
  • is_valid_ocp_version (10-43)
  • has_valid_semantic_versions (46-75)
  • is_infrastructure_type (78-89)
workflows/common/data_fetching.py (8)
  • build_version_lookups (12-40)
  • build_finished_lookup (43-66)
  • extract_test_status (69-90)
  • extract_timestamp (93-103)
  • determine_repo_from_job_name (106-116)
  • convert_sets_to_lists_recursive (119-136)
  • merge_job_history_links (139-159)
  • int_or_none (162-176)
workflows/gpu_operator_dashboard/fetch_ci_data.py (2)
workflows/common/gcs_utils.py (4)
  • http_get_json (21-38)
  • fetch_gcs_file_content (41-61)
  • build_prow_job_url (64-75)
  • fetch_filtered_files (78-116)
workflows/common/data_fetching.py (2)
  • int_or_none (162-176)
  • merge_job_history_links (139-159)
workflows/nno_dashboard/fetch_ci_data.py (5)
workflows/common/gcs_utils.py (4)
  • fetch_gcs_file_content (41-61)
  • fetch_filtered_files (78-116)
  • build_prow_job_url (64-75)
  • build_job_history_url (119-129)
workflows/common/data_structures.py (2)
  • TestResult (22-58)
  • to_dict (31-43)
workflows/gpu_operator_dashboard/fetch_ci_data.py (2)
  • TestResult (67-110)
  • to_dict (75-82)
workflows/common/validation.py (1)
  • is_valid_ocp_version (10-43)
workflows/common/data_fetching.py (5)
  • extract_test_status (69-90)
  • extract_timestamp (93-103)
  • determine_repo_from_job_name (106-116)
  • merge_job_history_links (139-159)
  • convert_sets_to_lists_recursive (119-136)
workflows/gpu_operator_dashboard/generate_ci_dashboard.py (1)
workflows/common/validation.py (1)
  • has_valid_semantic_versions (46-75)
🪛 LanguageTool
workflows/nno_dashboard/README.md

[uncategorized] ~20-~20: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...frastructure - Bare Metal: Tests on bare metal servers - Hosted: Tests in hosted e...

(EN_COMPOUND_ADJECTIVE_INTERNAL)


[style] ~105-~105: ‘with success’ might be wordy. Consider a shorter alternative.
Context: ...tor versions 5. Creates clickable links with success/failure styling ## Shared Utilities T...

(EN_WORDINESS_PREMIUM_WITH_SUCCESS)


[uncategorized] ~160-~160: The official name of this software platform is spelled with a capital “H”.
Context: ...alled by the GitHub Actions workflow in .github/workflows/generate_matrix_page.yaml: ...

(GITHUB)


[uncategorized] ~181-~181: The official name of this software platform is spelled with a capital “H”.
Context: ...ss dashboards - GitHub Actions Workflow - ...

(GITHUB)

🪛 markdownlint-cli2 (0.18.1)
workflows/nno_dashboard/README.md

121-121: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🪛 Ruff (0.14.7)
workflows/common/validation.py

41-41: Consider moving this statement to an else block

(TRY300)

workflows/common/gcs_utils.py

21-21: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


21-21: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

workflows/common/html_builders.py

125-125: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

workflows/common/data_fetching.py

36-36: Do not catch blind exception: Exception

(BLE001)


63-63: Do not catch blind exception: Exception

(BLE001)

workflows/common/__init__.py

48-93: __all__ is not sorted

Apply an isort-style sorting to __all__

(RUF022)

workflows/nno_dashboard/fetch_ci_data.py

1-1: Shebang is present but file is not executable

(EXE001)


242-242: Do not catch blind exception: Exception

(BLE001)


254-254: Do not catch blind exception: Exception

(BLE001)


273-273: Local variable pr_num is assigned to but never used

Remove assignment to unused variable pr_num

(F841)


274-274: Local variable repo is assigned to but never used

Remove assignment to unused variable repo

(F841)


282-282: Do not catch blind exception: Exception

(BLE001)


307-307: Do not catch blind exception: Exception

(BLE001)


350-350: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


374-374: Loop control variable flavor not used within loop body

(B007)

workflows/nno_dashboard/generate_ci_dashboard.py

1-1: Shebang is present but file is not executable

(EXE001)

🔇 Additional comments (29)
workflows/nno_dashboard/requirements.txt (1)

1-4: Requirements entries look appropriate for the new dashboard module

The requirements file is minimal and aligned with typical needs for HTTP access, version parsing, and data modeling in these workflows; no blocking issues from this review.

.gitignore (1)

5-6: Good to ignore local editor and OS artifacts

Adding .DS_Store and .vscode/ is a sensible safeguard against committing local environment files.

workflows/nno_dashboard/__init__.py (1)

1-3: Package docstring is clear and sufficient

The module-level docstring succinctly documents the purpose of the nno_dashboard package; no further initialization logic is needed here.

workflows/README.md (1)

9-11: Directory structure docs are up to date

Adding nno_dashboard/ and documenting the new common/ shared utilities keeps this overview aligned with the current workflows layout.

workflows/common/data_structures.py (1)

8-18: Tighten TestResult dict handling and guard against missing operator version

Two small robustness/consistency points:

  • to_dict hard-codes "operator_version" instead of using the shared OPERATOR_VERSION constant.
  • from_dict will set operator_version to None if both operator_version and gpu_operator_version are absent, which violates the type annotation and can blow up later in version-handling code.

You can address both with a small refactor:

@@
-        result = {
-            OCP_FULL_VERSION: self.ocp_full_version,
-            "operator_version": self.operator_version,
+        result = {
+            OCP_FULL_VERSION: self.ocp_full_version,
+            OPERATOR_VERSION: self.operator_version,
@@
-        # Handle backward compatibility with GPU operator data
-        operator_version = data.get("operator_version") or data.get(GPU_OPERATOR_VERSION)
-        
+        # Handle backward compatibility with GPU operator data
+        operator_version = data.get(OPERATOR_VERSION) or data.get(GPU_OPERATOR_VERSION)
+        if operator_version is None:
+            raise KeyError(
+                f"Missing '{OPERATOR_VERSION}' or '{GPU_OPERATOR_VERSION}' in TestResult data: {data}"
+            )
+
         return cls(
             ocp_full_version=data[OCP_FULL_VERSION],
             operator_version=operator_version,

This keeps the JSON schema centralized via constants and fails fast when the input is structurally invalid instead of propagating a None value.

Also applies to: 21-58

⛔ Skipped due to learnings

(The applied learning is the PR #359 note on update_versions.py quoted verbatim in the Learnings section above.)
workflows/test_matrix_dashboard/output/network_operator_matrix.json (1)

1-1: LGTM!

Valid placeholder for the new Network Operator dashboard data. The empty JSON object will be populated by the fetch_ci_data script.

workflows/nno_dashboard/templates/header.html (1)

1-12: LGTM!

Clean HTML template header with proper accessibility attributes (lang="en") and responsive viewport configuration. The unclosed body tag is intentional for template composition with the footer.

workflows/nno_dashboard/templates/test_flavor_section.html (1)

1-15: LGTM!

Well-structured HTML template with semantic table elements and appropriate placeholders for dynamic content injection.

workflows/nno_dashboard/templates/main_table.html (1)

1-9: LGTM!

Good structure with anchor-based navigation support. The anchor ID pattern (ocp-{ocp_key}) aligns with the build_toc function that generates #ocp-{ocp_version} links.

.github/workflows/generate_matrix_page.yaml (3)

71-74: LGTM on adding NNO requirements.

Good practice to install both dashboard requirements to support parallel generation.


84-88: LGTM on NNO data fetching.

Correctly mirrors the GPU Operator fetch pattern with NNO-specific paths.


104-111: The stated concern about git diff failing with an error is incorrect.

git diff --exit-code does not fail when a file is new or doesn't exist in the branch. Instead, it compares the working tree against HEAD and returns exit code 0 (no differences) or 1 (differences found). Untracked and new files are handled gracefully without errors. The suggested fix of adding 2>/dev/null is unnecessary and doesn't address any actual problem.

The existing logic is sound: when network_operator_matrix.json is generated in the working directory and compared against gh-pages, git diff will detect it as a new file (exit code 1), correctly setting NNO_CHANGED=true.

Likely an incorrect or invalid review comment.

workflows/common/__init__.py (1)

1-46: LGTM! Clean module organization.

Well-structured public API surface with clear categorization of utilities, data structures, GCS helpers, HTML builders, validation, and data fetching.

workflows/common/validation.py (2)

10-43: LGTM! Robust validation logic.

Good defensive programming with early returns and proper exception handling. The validation correctly handles edge cases like empty strings and non-numeric parts.
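As a minimal illustration of the validation style described (early returns plus guarded parsing), here is a hypothetical sketch; the function name and exact rules are assumptions, not the PR's actual code:

```python
def is_valid_version(value: str) -> bool:
    """Return True if value looks like a dotted numeric version, e.g. "4.16.2".

    Hypothetical re-creation of the pattern praised above: reject empty or
    whitespace-only input early, then parse each dot-separated part as an int.
    """
    if not value or not value.strip():
        return False
    for part in value.strip().split("."):
        try:
            int(part)
        except ValueError:
            return False
    return True
```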


46-75: LGTM! Proper semver validation with suffix handling.

Good handling of operator versions with suffixes like "23.9.0(bundle)" by splitting on "(" before parsing.
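The suffix-stripping step can be sketched as follows (a hypothetical helper using a plain integer tuple for comparison; the actual code parses with the semver library instead):

```python
def parse_operator_version(raw: str) -> tuple:
    """Strip a parenthesized suffix such as "(bundle)" and parse the
    remaining dotted version into a comparable tuple of ints.

    Sketch only: illustrates the split-on-"(" approach noted above.
    """
    base = raw.split("(")[0].strip()
    return tuple(int(part) for part in base.split("."))
```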

workflows/gpu_operator_dashboard/generate_ci_dashboard.py (2)

11-11: LGTM!

The import of has_valid_semantic_versions from the centralized workflows.common.validation module is appropriate and aligns with the PR's goal of consolidating shared utilities.


40-41: Good use of the refactored validation function.

Explicitly passing OCP_FULL_VERSION and GPU_OPERATOR_VERSION as key arguments makes the code more readable and ensures correct field lookup in the result dictionary.

workflows/nno_dashboard/generate_ci_dashboard.py (1)

24-61: Well-organized flavor mappings.

The FLAVOR_COLUMN_MAP and COLUMN_ORDER constants provide a clean separation between job naming conventions and display columns, making future additions straightforward.

workflows/common/data_fetching.py (7)

12-40: LGTM!

The build_version_lookups function handles failures gracefully by logging warnings and continuing with other files. This resilience is appropriate for batch processing of external resources.


43-66: LGTM!

Similar resilient error handling pattern. The function correctly continues processing when individual files fail.


69-90: LGTM!

Clean implementation that normalizes test status with sensible defaults. Passing status constants as parameters provides flexibility for different dashboards.
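A hypothetical sketch of the normalization pattern described (status constants passed in as parameters, with a sensible default for unknown input; names are assumptions, not the PR's code):

```python
def normalize_status(raw, success="SUCCESS", failure="FAILURE", aborted="ABORTED"):
    """Map a raw CI result string onto one of the provided status
    constants, defaulting to the aborted value for missing or
    unrecognized input."""
    if raw is None:
        return aborted
    value = str(raw).strip().upper()
    if value == success:
        return success
    if value == failure:
        return failure
    return aborted
```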


93-103: LGTM!

Simple and correct.


106-116: LGTM!

Clean implementation matching the pattern used in other fetch_ci_data modules.


119-136: LGTM!

Correct recursive implementation for JSON serialization preparation.
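The recursive preparation step can be sketched like this (hypothetical function name; sets become sorted lists so the structure is safe for json.dumps):

```python
def jsonable(obj):
    """Recursively convert sets to sorted lists, descending into dicts
    and lists, so the result can be serialized as JSON. Sketch of the
    pattern reviewed above, not the PR's exact implementation."""
    if isinstance(obj, set):
        return sorted(jsonable(v) for v in obj)
    if isinstance(obj, dict):
        return {k: jsonable(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [jsonable(v) for v in obj]
    return obj
```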


139-159: LGTM!

Robust implementation that handles mixed input types (set or list) gracefully.

workflows/gpu_operator_dashboard/fetch_ci_data.py (2)

14-25: Good refactor to use centralized utilities.

The imports from workflows.common.gcs_utils and workflows.common.data_fetching consolidate shared functionality, reducing code duplication across dashboards.


541-544: LGTM!

Correct usage of the centralized merge_job_history_links function, replacing the previous inline merging logic.

workflows/nno_dashboard/fetch_ci_data.py (2)

40-117: LGTM!

Comprehensive parsing logic that handles various job name patterns. The fallback to "Standard" is sensible for unknown patterns.
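The first-match-wins parsing with a "Standard" fallback can be sketched as below; the patterns, flavor names, and job names are illustrative assumptions, not the PR's actual FLAVOR_COLUMN_MAP:

```python
import re

# Hypothetical mapping from job-name fragments to display flavors.
_FLAVOR_PATTERNS = [
    (re.compile(r"doca", re.IGNORECASE), "DOCA"),
    (re.compile(r"bare-?metal", re.IGNORECASE), "Bare Metal"),
    (re.compile(r"hosted", re.IGNORECASE), "Hosted"),
]

def parse_flavor(job_name: str) -> str:
    """Return the first matching flavor for a job name, falling back
    to "Standard" when no pattern matches."""
    for pattern, flavor in _FLAVOR_PATTERNS:
        if pattern.search(job_name):
            return flavor
    return "Standard"
```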


414-440: LGTM!

The main function has proper error handling for loading existing data and follows the established pattern from the GPU operator dashboard.

Comment on lines +147 to +148
# Sort GPU operator versions
sorted_gpu_ops = sorted(gpu_operators_data.keys())

⚠️ Potential issue | 🟡 Minor

Lexicographic sorting may produce incorrect version ordering.

sorted(gpu_operators_data.keys()) performs alphabetical sorting, which is incorrect for semantic versions. For example, "25.10.0" would sort before "25.4.0".

Consider using semantic version sorting as done in the GPU operator dashboard:

+import semver
+
 def build_matrix_table(ocp_version: str, gpu_operators_data: Dict[str, Dict[str, Any]]) -> str:
     ...
     # Sort GPU operator versions
-    sorted_gpu_ops = sorted(gpu_operators_data.keys())
+    sorted_gpu_ops = sorted(
+        gpu_operators_data.keys(),
+        key=lambda v: semver.VersionInfo.parse(v.split("(")[0].strip()) if v != "master-latest" else semver.VersionInfo(0, 0, 0),
+        reverse=True
+    )

Committable suggestion skipped: line range outside the PR's diff.
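The ordering problem can be demonstrated directly; a numeric-tuple key gives the intended order (the semver-based key in the suggestion above would work equally well):

```python
versions = ["25.4.0", "25.10.0", "24.9.2"]

# Lexicographic sort: "25.10.0" lands before "25.4.0" because "1" < "4".
lexi = sorted(versions)

# Numeric-tuple key restores true version order.
numeric = sorted(versions, key=lambda v: tuple(int(p) for p in v.split(".")))
```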

@openshift-merge-robot
Collaborator

PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
