
Conversation


@ggordaniRed ggordaniRed commented Dec 7, 2025

Summary by CodeRabbit

  • New Features

    • Added Network Operator CI Dashboard alongside existing GPU Operator dashboard for comprehensive test coverage monitoring
    • Implemented conditional dashboard generation that regenerates dashboards only when underlying test data changes for improved efficiency
  • Chores

    • Updated development environment configuration files to exclude system and editor metadata
    • Enhanced documentation for dashboard generation infrastructure and workflows


Adding new utils for generating a new dashboard for NNO.
Critical fixes for NNO dashboard data collection:

1. Fixed file structure detection:
   - Network Operator jobs have finished.json at build root (not in artifacts/)
   - Fetch all finished.json files and filter for 'nvidia-network-operator' in path
   - Ensures finished.json is read from the correct location

2. Added robust version file detection:
   - Try multiple possible artifact paths for ocp.version and operator.version
   - Patterns: **/network-operator-e2e/, **/artifacts/, **/nvidia-network-operator*/
   - Filter results to only include Network Operator jobs
   - Handles various test configurations (DOCA4, bare-metal, hosted, etc.)

3. Improved file filtering:
   - Filter out nested artifacts to avoid duplicates
   - Verify files are in expected locations
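The file-structure filtering in step 1 can be sketched as a small helper. This is a minimal sketch, not the actual implementation; the function name, repo path, and build IDs below are illustrative, and it assumes the `pr-logs/pull/{repo}/{pr}/{job_name}/{build_id}` layout used by Prow job logs:

```python
def filter_network_operator_finished(paths):
    """Keep finished.json files that belong to Network Operator jobs and
    sit at the build root, not nested under artifacts/ (sketch only; the
    real script applies the same filter to GCS listings)."""
    kept = []
    for path in paths:
        if "nvidia-network-operator" not in path:
            continue  # not a Network Operator job
        parts = path.rstrip("/").split("/")
        # finished.json must sit directly under the build directory
        if parts[-1] == "finished.json" and "artifacts" not in parts:
            kept.append(path)
    return kept
```

Nested copies of finished.json under artifacts/ are dropped here, which is what prevents the duplicate counting mentioned in step 3.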

These changes fix the issue where GitHub Actions was finding 0 Network
Operator files despite PR #67673 containing Network Operator jobs.
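The multi-pattern version-file detection from step 2 can be sketched like this. The sketch only does local path matching (the real script queries the GCS listing API with these globs), and the example paths are illustrative:

```python
import fnmatch

# Glob patterns mirroring the ones listed in step 2
VERSION_PATH_PATTERNS = [
    "*/network-operator-e2e/*",
    "*/artifacts/*",
    "*nvidia-network-operator*/*",
]

def find_version_files(all_paths, filename):
    """Return paths ending in `filename` that match any pattern and
    belong to a Network Operator job."""
    found = set()  # a set deduplicates paths matched by several patterns
    for path in all_paths:
        if not path.endswith("/" + filename):
            continue
        if "nvidia-network-operator" not in path:
            continue  # drop GPU Operator and unrelated jobs
        if any(fnmatch.fnmatchcase(path, p) for p in VERSION_PATH_PATTERNS):
            found.add(path)
    return sorted(found)
```

Trying several patterns and then filtering by job name is what lets one code path handle the DOCA4, bare-metal, and hosted layouts.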

Related to: PR #67673 testing and dashboard generation

The version files were being fetched correctly, but the lookup dictionary
keys didn't match the build directory keys used during processing.

Problem:
- Version files are deeply nested: .../BUILD_ID/artifacts/.../ocp.version
- The lookup used the parent directory (just the artifacts dir) as the key
- Build processing looked entries up by the full build path
- Result: 'Missing ocp.version' warnings despite the files being found

Solution:
- Extract BUILD_ID from version file paths using regex
- Build lookup dictionary with full build path as key:
  pr-logs/pull/{repo}/{pr}/{job_name}/{build_id}
- Use same key format when looking up versions during processing
- Now keys match and versions are found!
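One way to make the keys line up, sketched with an illustrative regex (names, repo path, and build ID are hypothetical; the actual script may extract the build directory differently):

```python
import re

# Build layout taken from the description above:
#   pr-logs/pull/{repo}/{pr}/{job_name}/{build_id}/...
BUILD_DIR_RE = re.compile(r"^(?P<build_dir>pr-logs/pull/[^/]+/\d+/[^/]+/\d+)/")

def build_version_lookup(version_files):
    """Key each version file's content by the full build path, so build
    processing can look versions up with the exact same key."""
    lookup = {}
    for path, content in version_files.items():
        match = BUILD_DIR_RE.match(path)
        if match:
            lookup[match.group("build_dir")] = content.strip()
    return lookup
```

Requiring a numeric final segment (`\d+`) is what keeps the key anchored at the build ID rather than at some deeper directory.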

This fixes the issue where GitHub Actions found version files but
processed 0 builds due to lookup key mismatch.

Added debug logging to show:
- The build_dir key being used for lookup
- Available keys in ocp_lookup dictionary
- Available keys in operator_lookup dictionary

This will help identify why lookups are failing.

The issue was in how we extracted the build_id from version file paths:
- Version files are deeply nested: .../build_id/artifacts/.../ocp.version
- Using regex on the full path matched 'artifacts' as build_id
- Solution: Extract build_id from path position (parts[5])

This fixes the 'Missing ocp.version' warnings and allows proper data collection.
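The positional fix can be illustrated as follows (a minimal sketch assuming the `pr-logs/pull/{repo}/{pr}/{job_name}/{build_id}/...` layout quoted earlier; the example path is hypothetical):

```python
def extract_build_id(version_file_path):
    """Pick the build ID by position instead of by a regex over the whole
    path, which could latch onto 'artifacts' instead."""
    parts = version_file_path.split("/")
    # parts: [0]=pr-logs [1]=pull [2]=repo [3]=pr [4]=job_name [5]=build_id
    return parts[5]
```

Because the build ID always occupies the same slot in the path, this extraction cannot be fooled by deeper directories such as artifacts/.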

openshift-ci bot commented Dec 7, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ggordaniRed

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label Dec 7, 2025

coderabbitai bot commented Dec 7, 2025

Walkthrough

This PR extends CI dashboard generation to support NVIDIA Network Operator alongside the existing GPU Operator. It introduces a shared workflows/common/ module with reusable utilities for GCS access, data structures, validation, and HTML generation. The GPU Operator dashboard is refactored to use these utilities, and a complete new Network Operator dashboard pipeline is implemented. The GitHub Actions workflow conditionally generates both dashboards only when their JSON data changes.

Changes

Cohort / File(s) Change Summary
GitHub Actions Workflow
.github/workflows/generate_matrix_page.yaml
Adds parallel Network Operator dashboard handling: sets environment variables for NNO paths, installs requirements, fetches NNO CI data, and conditionally runs generation scripts based on JSON diffs (GPU_CHANGED and NNO_CHANGED).
Git & IDE Config
.gitignore
Adds .DS_Store and .vscode/ to ignored patterns.
Documentation
workflows/README.md
Documents the new nno_dashboard/ directory structure and refactors shared module description to describe new common/ package.
Shared Utilities Package
workflows/common/__init__.py, workflows/common/data_structures.py, workflows/common/data_fetching.py, workflows/common/gcs_utils.py, workflows/common/html_builders.py, workflows/common/validation.py
Establishes centralized workflows.common module with public APIs for: version/status constants and TestResult dataclass; GCS utilities (HTTP, file fetching, Prow URLs); data extraction and transformation (version lookups, timestamp/status extraction, repo inference); HTML generation (TOC, notes, history bars, footers, sanitization); and validation (OCP versions, semantic versions, infrastructure types).
GPU Operator Dashboard Refactoring
workflows/gpu_operator_dashboard/fetch_ci_data.py, workflows/gpu_operator_dashboard/generate_ci_dashboard.py
Refactors to import GCS utilities, data-fetching helpers, and validation from workflows.common; removes in-file implementations; updates merging logic to use shared merge_job_history_links; updates semantic-version validation calls to pass version-key parameters.
Network Operator Dashboard Implementation
workflows/nno_dashboard/__init__.py, workflows/nno_dashboard/README.md, workflows/nno_dashboard/fetch_ci_data.py, workflows/nno_dashboard/generate_ci_dashboard.py, workflows/nno_dashboard/requirements.txt
Introduces complete NNO dashboard pipeline: fetch script extracts test flavors, processes finished.json artifacts, constructs TestResult objects, and merges with baseline; generate script restructures data by NNO version, builds per-OCP HTML matrices, and assembles final dashboard. Includes comprehensive README and Python dependencies (requests, semver, pydantic).
NNO Dashboard Templates
workflows/nno_dashboard/templates/header.html, workflows/nno_dashboard/templates/main_table.html, workflows/nno_dashboard/templates/test_flavor_section.html
Provides minimal HTML template structure: header with title and stylesheet link; main table with OCP version container and placeholders for notes and flavor sections; test flavor section with labeled div and tbody for dynamic rows.
Output Data File
workflows/test_matrix_dashboard/output/network_operator_matrix.json
Initializes empty JSON object as placeholder for NNO test-matrix output.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Areas requiring extra attention:

  • Shared utilities correctness (workflows/common/data_structures.py, workflows/common/validation.py, workflows/common/gcs_utils.py): verify data transformation logic, error handling, and backward compatibility (especially TestResult.from_dict operator_version fallback)
  • GPU Operator refactoring (workflows/gpu_operator_dashboard/fetch_ci_data.py, workflows/gpu_operator_dashboard/generate_ci_dashboard.py): confirm that extracted utilities preserve original behavior and that semantic-version validation signature changes are correctly applied at all call sites
  • NNO dashboard implementation (workflows/nno_dashboard/fetch_ci_data.py, workflows/nno_dashboard/generate_ci_dashboard.py): review data aggregation logic, flavor extraction heuristics, and HTML matrix rendering for correctness and edge cases (missing artifacts, empty results)
  • Workflow YAML logic (.github/workflows/generate_matrix_page.yaml): verify JSON diff detection and conditional branch logic are correct

Possibly related PRs

  • PR #291: Establishes code-level reorganization of workflows into subdirectories; this PR builds on that structure by centralizing shared utilities into the new workflows/common/ package.
  • PR #295: Modifies GPU dashboard data model and introduces canonical version keys (OCP_FULL_VERSION, GPU_OPERATOR_VERSION) that this PR adopts across shared utilities and both dashboards.
  • PR #179: Implements conditional "generate HTML only if JSON changed" pattern; this PR extends that approach to both GPU and Network Operator dashboards.

Suggested labels

ok-to-test

Suggested reviewers

  • empovit
  • wabouhamad
  • TomerNewman

Poem

🐰 Hop, hop! A dashboard springs to life,
Network Operator joins the dance with flair,
Shared libraries ease the strife,
Parallel workflows float through the air!
CI tests now multiply with grace,

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)

  • Title check: ❓ Inconclusive. The title 'New nno dashboard' is vague and generic, using non-descriptive terminology that doesn't convey meaningful information about the changeset's scope or purpose. Resolution: consider a more descriptive title such as 'Add Network Operator CI dashboard generation' or 'Implement NNO dashboard with data fetching and HTML generation' to better summarize the primary change.

✅ Passed checks (2 passed)

  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 94.74%, which is sufficient; the required threshold is 80.00%.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (15)
workflows/nno_dashboard/README.md (1)

44-68: Solid documentation; consider adding a language to the job-name code fence

The README gives a clear end-to-end view of data flow, flavors, and CI wiring; it's very helpful for future contributors. One minor markdown nit: the fenced block showing the job-name pattern (lines 121–123) lacks a language spec, which markdownlint flags. You could mark it as text (or bash) to satisfy MD040:

-```
+```text
 pull-ci-rh-ecosystem-edge-nvidia-ci-main-{infrastructure}-nvidia-network-operator-{test-type}

Also applies to: 70-82, 95-99, 121-129, 160-175

workflows/common/html_builders.py (1)

31-55: Minor typing and robustness tweaks for HTML helpers

The helpers look good and centralize a lot of duplicated markup; a couple of small refinements could help:

- In `build_history_bar`, `int(result.get("job_timestamp", 0))` will raise if `job_timestamp` is a non-numeric string. If there’s any chance of malformed data, consider a `try`/`except` around the cast and either skip such entries or fall back to a safe default.
- You’re comparing against string literals (`"SUCCESS"`, `"FAILURE"`) here; if you want tighter coupling with `STATUS_*` constants from `data_structures`, you could pass those in or import them to avoid divergence over time.
- Ruff’s RUF013 can be resolved by making the `timestamp` annotation explicitly optional:

```diff
-def build_last_updated_footer(timestamp: str = None) -> str:
+def build_last_updated_footer(timestamp: str | None = None) -> str:
```

None of these are blocking; they're just small maintainability and safety improvements.

Also applies to: 57-122, 125-144, 147-161

workflows/common/gcs_utils.py (1)

21-39: GCS helpers look solid; consider explicit Optional typing for HTTP params

The GCS utilities are well factored: shared JSON fetching, file content retrieval with timeouts, prefix/glob-based listing with pagination, and clear URL builders. A couple of minor polish points:

  • To satisfy Ruff’s RUF013 and make the intent clearer, you can mark params and headers as explicitly optional:
-from typing import Dict, Any
+from typing import Any, Dict
@@
-def http_get_json(url: str, params: Dict[str, Any] = None, headers: Dict[str, str] = None) -> Dict[str, Any]:
+def http_get_json(
+    url: str,
+    params: Dict[str, Any] | None = None,
+    headers: Dict[str, str] | None = None,
+) -> Dict[str, Any]:
  • Using a gcsweb-based URL in build_prow_job_url is consistent with the project’s preference for artifact navigation over the Prow UI, while still leveraging Prow’s job-history endpoint where appropriate.

Functionally this looks good as-is; the above is optional cleanup.

Based on learnings, the gcsweb artifact URL choice aligns with earlier feedback about navigation UX.

Also applies to: 41-61, 64-76, 78-116, 119-129

workflows/common/__init__.py (1)

48-93: Optional: Consider sorting __all__ for linter compliance.

Static analysis (RUF022) suggests sorting __all__. The current logical grouping by category is readable, but if you want linter compliance, you could either:

  1. Sort alphabetically
  2. Add # noqa: RUF022 to suppress the warning while keeping the logical grouping

The logical grouping is a valid choice for maintainability.

workflows/common/validation.py (1)

28-29: Optional: Extract shared infrastructure types constant.

The invalid_keys list in is_valid_ocp_version duplicates infrastructure_types in is_infrastructure_type. Consider extracting a module-level constant to ensure consistency:

+INFRASTRUCTURE_TYPES = ["doca4", "bare-metal", "hosted", "unknown"]
+
+
 def is_valid_ocp_version(version: str) -> bool:
     ...
-    invalid_keys = ["doca4", "bare-metal", "hosted", "unknown"]
-    if version.lower() in invalid_keys:
+    if version.lower() in INFRASTRUCTURE_TYPES:
         return False
     ...

 def is_infrastructure_type(value: str) -> bool:
     ...
-    infrastructure_types = ["doca4", "bare-metal", "hosted", "unknown"]
-    return value.lower() in infrastructure_types
+    return value.lower() in INFRASTRUCTURE_TYPES

Also applies to: 88-89

workflows/nno_dashboard/generate_ci_dashboard.py (4)

1-1: Shebang without executable permission.

The file has a shebang (#!/usr/bin/env python3) but lacks executable permission. Either make the file executable or remove the shebang if it's only intended to be run as a module.


114-127: TODOs indicate incomplete data extraction.

Lines 116 and 126 have hardcoded placeholder values with TODO comments:

  • gpu_operator_version = "master-latest"
  • "hardware": "ConnectX-6 Dx"

These may produce misleading information in the dashboard. Consider extracting these values from actual test data or clearly marking them as "Unknown" until the data source is identified.

Would you like me to help identify where this data might be available in the test artifacts, or open an issue to track this improvement?


259-260: NNO version sorting has the same lexicographic issue.

The sorting at line 260 uses basic string comparison. Consider semantic version sorting for consistency with GPU operator version display.


324-327: Move import os to the top of the file.

The os module is imported inside main(). For consistency and clarity, place it at the top of the file with other imports.

 import argparse
 import json
+import os
 import semver
 from typing import Dict, List, Any, Set
workflows/common/data_fetching.py (1)

162-176: Consider adding error handling for invalid integer strings.

If value is not a valid integer string (e.g., "abc"), int(value) will raise a ValueError. While this is used with argparse where input is somewhat controlled, adding defensive handling would make the utility more robust.

 def int_or_none(value: Optional[str]) -> Optional[int]:
     if value is None:
         return None
     if value.lower() in ('none', 'unlimited'):
         return None
-    return int(value)
+    try:
+        return int(value)
+    except ValueError:
+        raise ValueError(f"Expected integer or 'none'/'unlimited', got: {value}")
workflows/nno_dashboard/fetch_ci_data.py (5)

1-1: Shebang without executable permission.

Same as the generate_ci_dashboard.py - either make executable or remove shebang.


142-142: Move imports to the top of the file.

The from workflows.common.data_fetching import ... at line 142 is inside the function body. For consistency and to avoid repeated import overhead, move this to the top-level imports.

 from workflows.common import (
     logger,
     fetch_gcs_file_content,
     fetch_filtered_files,
     build_prow_job_url,
     build_job_history_url,
     TestResult,
     OCP_FULL_VERSION,
     OPERATOR_VERSION,
     STATUS_SUCCESS,
     STATUS_FAILURE,
     STATUS_ABORTED,
     is_valid_ocp_version,
 )
+from workflows.common.data_fetching import (
+    extract_test_status,
+    extract_timestamp,
+    determine_repo_from_job_name,
+    merge_job_history_links,
+    convert_sets_to_lists_recursive,
+)

Then remove the inline imports at lines 142, 175, and 360.


271-274: Remove or prefix unused variables.

Variables pr_num and repo are extracted but never used. Either remove them or prefix with _ to indicate intentional non-use.

         job_name = match.group("job_name")
         build_id = match.group("build_id")
-        pr_num = match.group("pr_number")
-        repo = match.group("repo")
+        # pr_num and repo available via: match.group("pr_number"), match.group("repo")

202-222: Consider consolidating version file fetching.

The current approach makes 6 separate GCS API calls (3 patterns × 2 file types). Consider:

  1. Using a broader glob pattern and filtering locally
  2. Deduplicating results by file path to avoid potential duplicates

This is a performance consideration for PRs with many builds.


347-351: Fix type hint for optional parameter.

The existing_results parameter defaults to None but the type hint doesn't indicate it's optional. Use Optional or the union syntax.

+from typing import Optional  # if not already imported
+
 def merge_and_save_results(
     new_results: Dict[str, Dict[str, Any]],
     output_filepath: str,
-    existing_results: Dict[str, Dict[str, Any]] = None
+    existing_results: Optional[Dict[str, Dict[str, Any]]] = None
 ) -> None:
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4535eab and a38ffe2.

📒 Files selected for processing (20)
  • .github/workflows/generate_matrix_page.yaml (2 hunks)
  • .gitignore (1 hunks)
  • workflows/README.md (1 hunks)
  • workflows/common/__init__.py (1 hunks)
  • workflows/common/data_fetching.py (1 hunks)
  • workflows/common/data_structures.py (1 hunks)
  • workflows/common/gcs_utils.py (1 hunks)
  • workflows/common/html_builders.py (1 hunks)
  • workflows/common/validation.py (1 hunks)
  • workflows/gpu_operator_dashboard/fetch_ci_data.py (2 hunks)
  • workflows/gpu_operator_dashboard/generate_ci_dashboard.py (2 hunks)
  • workflows/nno_dashboard/README.md (1 hunks)
  • workflows/nno_dashboard/__init__.py (1 hunks)
  • workflows/nno_dashboard/fetch_ci_data.py (1 hunks)
  • workflows/nno_dashboard/generate_ci_dashboard.py (1 hunks)
  • workflows/nno_dashboard/requirements.txt (1 hunks)
  • workflows/nno_dashboard/templates/header.html (1 hunks)
  • workflows/nno_dashboard/templates/main_table.html (1 hunks)
  • workflows/nno_dashboard/templates/test_flavor_section.html (1 hunks)
  • workflows/test_matrix_dashboard/output/network_operator_matrix.json (1 hunks)
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: empovit
Repo: rh-ecosystem-edge/nvidia-ci PR: 295
File: workflows/gpu_operator_dashboard/fetch_ci_data.py:61-64
Timestamp: 2025-08-31T15:44:50.325Z
Learning: In the nvidia-ci project, gcsweb URLs are preferred over prow.ci.openshift.org URLs for the GPU operator dashboard because gcsweb allows easier navigation up the directory structure, while Prow UI requires switching to "Artifacts" first.
📚 Learning: 2025-08-31T15:44:50.325Z
Learnt from: empovit
Repo: rh-ecosystem-edge/nvidia-ci PR: 295
File: workflows/gpu_operator_dashboard/fetch_ci_data.py:61-64
Timestamp: 2025-08-31T15:44:50.325Z
Learning: In the nvidia-ci project, gcsweb URLs are preferred over prow.ci.openshift.org URLs for the GPU operator dashboard because gcsweb allows easier navigation up the directory structure, while Prow UI requires switching to "Artifacts" first.

Applied to files:

  • workflows/nno_dashboard/README.md
  • workflows/gpu_operator_dashboard/fetch_ci_data.py
📚 Learning: 2025-08-27T18:08:59.287Z
Learnt from: empovit
Repo: rh-ecosystem-edge/nvidia-ci PR: 291
File: workflows/gpu-operator-versions/nvidia_gpu_operator.py:58-58
Timestamp: 2025-08-27T18:08:59.287Z
Learning: In the nvidia-ci repository, the GH_AUTH_TOKEN environment variable is set using base64 encoding of the GitHub token: `export GH_AUTH_TOKEN=$(echo ${{ secrets.GITHUB_TOKEN }} | base64)` and this works successfully with GitHub Container Registry (GHCR) Bearer authentication in workflows/gpu-operator-versions/nvidia_gpu_operator.py.

Applied to files:

  • .github/workflows/generate_matrix_page.yaml
📚 Learning: 2025-11-09T13:12:43.631Z
Learnt from: empovit
Repo: rh-ecosystem-edge/nvidia-ci PR: 359
File: workflows/gpu_operator_versions/update_versions.py:100-106
Timestamp: 2025-11-09T13:12:43.631Z
Learning: In the nvidia-ci repository's `workflows/gpu_operator_versions/update_versions.py`, the `gpu_releases` variable is intentionally limited to the latest 2 GPU operator versions (via `get_latest_versions(list(gpu_versions.keys()), 2)` in main()). For maintenance OCP versions with pinned GPU operators, if the pinned version is not among the latest 2 releases, the test is intentionally skipped with a warning rather than falling back to test with any available version. This is because testing with outdated GPU operators would fail in their infrastructure, and they prefer to notice missing tests rather than deal with failed test runs, especially since GitHub organization repositories don't send notification emails about failed workflows.

Applied to files:

  • workflows/gpu_operator_dashboard/fetch_ci_data.py
  • workflows/gpu_operator_dashboard/generate_ci_dashboard.py
🧬 Code graph analysis (7)
workflows/common/html_builders.py (1)
workflows/gpu_operator_dashboard/generate_ci_dashboard.py (2)
  • build_toc (160-171)
  • build_notes (142-157)
workflows/common/data_fetching.py (1)
workflows/common/gcs_utils.py (1)
  • fetch_gcs_file_content (41-61)
workflows/common/data_structures.py (1)
workflows/gpu_operator_dashboard/fetch_ci_data.py (2)
  • TestResult (67-110)
  • to_dict (75-82)
workflows/common/__init__.py (7)
workflows/common/utils.py (1)
  • get_logger (5-15)
workflows/common/templates.py (1)
  • load_template (9-37)
workflows/common/data_structures.py (1)
  • TestResult (22-58)
workflows/common/gcs_utils.py (5)
  • http_get_json (21-38)
  • fetch_gcs_file_content (41-61)
  • build_prow_job_url (64-75)
  • fetch_filtered_files (78-116)
  • build_job_history_url (119-129)
workflows/common/html_builders.py (5)
  • build_toc (10-28)
  • build_notes (31-54)
  • build_history_bar (57-122)
  • build_last_updated_footer (125-144)
  • sanitize_id (147-161)
workflows/common/validation.py (3)
  • is_valid_ocp_version (10-43)
  • has_valid_semantic_versions (46-75)
  • is_infrastructure_type (78-89)
workflows/common/data_fetching.py (8)
  • build_version_lookups (12-40)
  • build_finished_lookup (43-66)
  • extract_test_status (69-90)
  • extract_timestamp (93-103)
  • determine_repo_from_job_name (106-116)
  • convert_sets_to_lists_recursive (119-136)
  • merge_job_history_links (139-159)
  • int_or_none (162-176)
workflows/gpu_operator_dashboard/fetch_ci_data.py (2)
workflows/common/gcs_utils.py (4)
  • http_get_json (21-38)
  • fetch_gcs_file_content (41-61)
  • build_prow_job_url (64-75)
  • fetch_filtered_files (78-116)
workflows/common/data_fetching.py (2)
  • int_or_none (162-176)
  • merge_job_history_links (139-159)
workflows/nno_dashboard/fetch_ci_data.py (5)
workflows/common/gcs_utils.py (4)
  • fetch_gcs_file_content (41-61)
  • fetch_filtered_files (78-116)
  • build_prow_job_url (64-75)
  • build_job_history_url (119-129)
workflows/common/data_structures.py (2)
  • TestResult (22-58)
  • to_dict (31-43)
workflows/gpu_operator_dashboard/fetch_ci_data.py (2)
  • TestResult (67-110)
  • to_dict (75-82)
workflows/common/validation.py (1)
  • is_valid_ocp_version (10-43)
workflows/common/data_fetching.py (5)
  • extract_test_status (69-90)
  • extract_timestamp (93-103)
  • determine_repo_from_job_name (106-116)
  • merge_job_history_links (139-159)
  • convert_sets_to_lists_recursive (119-136)
workflows/gpu_operator_dashboard/generate_ci_dashboard.py (1)
workflows/common/validation.py (1)
  • has_valid_semantic_versions (46-75)
🪛 LanguageTool
workflows/nno_dashboard/README.md

[uncategorized] ~20-~20: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...frastructure - Bare Metal: Tests on bare metal servers - Hosted: Tests in hosted e...

(EN_COMPOUND_ADJECTIVE_INTERNAL)


[style] ~105-~105: ‘with success’ might be wordy. Consider a shorter alternative.
Context: ...tor versions 5. Creates clickable links with success/failure styling ## Shared Utilities T...

(EN_WORDINESS_PREMIUM_WITH_SUCCESS)


[uncategorized] ~160-~160: The official name of this software platform is spelled with a capital “H”.
Context: ...alled by the GitHub Actions workflow in .github/workflows/generate_matrix_page.yaml: ...

(GITHUB)


[uncategorized] ~181-~181: The official name of this software platform is spelled with a capital “H”.
Context: ...ss dashboards - GitHub Actions Workflow - ...

(GITHUB)

🪛 markdownlint-cli2 (0.18.1)
workflows/nno_dashboard/README.md

121-121: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🪛 Ruff (0.14.7)
workflows/common/validation.py

41-41: Consider moving this statement to an else block

(TRY300)

workflows/common/gcs_utils.py

21-21: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


21-21: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

workflows/common/html_builders.py

125-125: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

workflows/common/data_fetching.py

36-36: Do not catch blind exception: Exception

(BLE001)


63-63: Do not catch blind exception: Exception

(BLE001)

workflows/common/__init__.py

48-93: __all__ is not sorted

Apply an isort-style sorting to __all__

(RUF022)

workflows/nno_dashboard/fetch_ci_data.py

1-1: Shebang is present but file is not executable

(EXE001)


242-242: Do not catch blind exception: Exception

(BLE001)


254-254: Do not catch blind exception: Exception

(BLE001)


273-273: Local variable pr_num is assigned to but never used

Remove assignment to unused variable pr_num

(F841)


274-274: Local variable repo is assigned to but never used

Remove assignment to unused variable repo

(F841)


282-282: Do not catch blind exception: Exception

(BLE001)


307-307: Do not catch blind exception: Exception

(BLE001)


350-350: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


374-374: Loop control variable flavor not used within loop body

(B007)

workflows/nno_dashboard/generate_ci_dashboard.py

1-1: Shebang is present but file is not executable

(EXE001)

🔇 Additional comments (29)
workflows/nno_dashboard/requirements.txt (1)

1-4: Requirements entries look appropriate for the new dashboard module

The requirements file is minimal and aligned with typical needs for HTTP access, version parsing, and data modeling in these workflows; no blocking issues from this review.

.gitignore (1)

5-6: Good to ignore local editor and OS artifacts

Adding .DS_Store and .vscode/ is a sensible safeguard against committing local environment files.

workflows/nno_dashboard/__init__.py (1)

1-3: Package docstring is clear and sufficient

The module-level docstring succinctly documents the purpose of the nno_dashboard package; no further initialization logic is needed here.

workflows/README.md (1)

9-11: Directory structure docs are up to date

Adding nno_dashboard/ and documenting the new common/ shared utilities keeps this overview aligned with the current workflows layout.

workflows/common/data_structures.py (1)

8-18: Tighten TestResult dict handling and guard against missing operator version

Two small robustness/consistency points:

  • to_dict hard-codes "operator_version" instead of using the shared OPERATOR_VERSION constant.
  • from_dict will set operator_version to None if both operator_version and gpu_operator_version are absent, which violates the type annotation and can blow up later in version-handling code.

You can address both with a small refactor:

@@
-        result = {
-            OCP_FULL_VERSION: self.ocp_full_version,
-            "operator_version": self.operator_version,
+        result = {
+            OCP_FULL_VERSION: self.ocp_full_version,
+            OPERATOR_VERSION: self.operator_version,
@@
-        # Handle backward compatibility with GPU operator data
-        operator_version = data.get("operator_version") or data.get(GPU_OPERATOR_VERSION)
-        
+        # Handle backward compatibility with GPU operator data
+        operator_version = data.get(OPERATOR_VERSION) or data.get(GPU_OPERATOR_VERSION)
+        if operator_version is None:
+            raise KeyError(
+                f"Missing '{OPERATOR_VERSION}' or '{GPU_OPERATOR_VERSION}' in TestResult data: {data}"
+            )
+
         return cls(
             ocp_full_version=data[OCP_FULL_VERSION],
             operator_version=operator_version,

This keeps the JSON schema centralized via constants and fails fast when the input is structurally invalid instead of propagating a None value.

Also applies to: 21-58

⛔ Skipped due to learnings

(The applied learning is the PR #359 note on update_versions.py quoted verbatim in the Learnings section above.)
workflows/test_matrix_dashboard/output/network_operator_matrix.json (1)

1-1: LGTM!

Valid placeholder for the new Network Operator dashboard data. The empty JSON object will be populated by the fetch_ci_data script.

workflows/nno_dashboard/templates/header.html (1)

1-12: LGTM!

Clean HTML template header with proper accessibility attributes (lang="en") and responsive viewport configuration. The unclosed body tag is intentional for template composition with the footer.

workflows/nno_dashboard/templates/test_flavor_section.html (1)

1-15: LGTM!

Well-structured HTML template with semantic table elements and appropriate placeholders for dynamic content injection.

workflows/nno_dashboard/templates/main_table.html (1)

1-9: LGTM!

Good structure with anchor-based navigation support. The anchor ID pattern (ocp-{ocp_key}) aligns with the build_toc function that generates #ocp-{ocp_version} links.

.github/workflows/generate_matrix_page.yaml (3)

71-74: LGTM on adding NNO requirements.

Good practice to install both dashboard requirements to support parallel generation.


84-88: LGTM on NNO data fetching.

Correctly mirrors the GPU Operator fetch pattern with NNO-specific paths.


104-111: The stated concern about git diff failing with an error is incorrect.

git diff --exit-code does not fail when a file is new or doesn't exist in the branch. Instead, it compares the working tree against HEAD and returns exit code 0 (no differences) or 1 (differences found). Untracked and new files are handled gracefully without errors. The suggested fix of adding 2>/dev/null is unnecessary and doesn't address any actual problem.

The existing logic is sound: when network_operator_matrix.json is generated in the working directory and compared against gh-pages, git diff will detect it as a new file (exit code 1), correctly setting NNO_CHANGED=true.

Likely an incorrect or invalid review comment.

workflows/common/__init__.py (1)

1-46: LGTM! Clean module organization.

Well-structured public API surface with clear categorization of utilities, data structures, GCS helpers, HTML builders, validation, and data fetching.

workflows/common/validation.py (2)

10-43: LGTM! Robust validation logic.

Good defensive programming with early returns and proper exception handling. The validation correctly handles edge cases like empty strings and non-numeric parts.
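As a minimal illustration of the validation style described (early returns plus guarded parsing), here is a hypothetical sketch; the function name and exact rules are assumptions, not the PR's actual code:

```python
def is_valid_version(value: str) -> bool:
    """Return True if value looks like a dotted numeric version, e.g. "4.16.2".

    Hypothetical re-creation of the pattern praised above: reject empty or
    whitespace-only input early, then parse each dot-separated part as an int.
    """
    if not value or not value.strip():
        return False
    for part in value.strip().split("."):
        try:
            int(part)
        except ValueError:
            return False
    return True
```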


46-75: LGTM! Proper semver validation with suffix handling.

Good handling of operator versions with suffixes like "23.9.0(bundle)" by splitting on "(" before parsing.
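The suffix-stripping step can be sketched as follows (a hypothetical helper using a plain integer tuple for comparison; the actual code parses with the semver library instead):

```python
def parse_operator_version(raw: str) -> tuple:
    """Strip a parenthesized suffix such as "(bundle)" and parse the
    remaining dotted version into a comparable tuple of ints.

    Sketch only: illustrates the split-on-"(" approach noted above.
    """
    base = raw.split("(")[0].strip()
    return tuple(int(part) for part in base.split("."))
```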

workflows/gpu_operator_dashboard/generate_ci_dashboard.py (2)

11-11: LGTM!

The import of has_valid_semantic_versions from the centralized workflows.common.validation module is appropriate and aligns with the PR's goal of consolidating shared utilities.


40-41: Good use of the refactored validation function.

Explicitly passing OCP_FULL_VERSION and GPU_OPERATOR_VERSION as key arguments makes the code more readable and ensures correct field lookup in the result dictionary.

workflows/nno_dashboard/generate_ci_dashboard.py (1)

24-61: Well-organized flavor mappings.

The FLAVOR_COLUMN_MAP and COLUMN_ORDER constants provide a clean separation between job naming conventions and display columns, making future additions straightforward.

workflows/common/data_fetching.py (7)

12-40: LGTM!

The build_version_lookups function handles failures gracefully by logging warnings and continuing with other files. This resilience is appropriate for batch processing of external resources.


43-66: LGTM!

Similar resilient error handling pattern. The function correctly continues processing when individual files fail.


69-90: LGTM!

Clean implementation that normalizes test status with sensible defaults. Passing status constants as parameters provides flexibility for different dashboards.
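A hypothetical sketch of the normalization pattern described (status constants passed in as parameters, with a sensible default for unknown input; names are assumptions, not the PR's code):

```python
def normalize_status(raw, success="SUCCESS", failure="FAILURE", aborted="ABORTED"):
    """Map a raw CI result string onto one of the provided status
    constants, defaulting to the aborted value for missing or
    unrecognized input."""
    if raw is None:
        return aborted
    value = str(raw).strip().upper()
    if value == success:
        return success
    if value == failure:
        return failure
    return aborted
```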


93-103: LGTM!

Simple and correct.


106-116: LGTM!

Clean implementation matching the pattern used in other fetch_ci_data modules.


119-136: LGTM!

Correct recursive implementation for JSON serialization preparation.
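The recursive preparation step can be sketched like this (hypothetical function name; sets become sorted lists so the structure is safe for json.dumps):

```python
def jsonable(obj):
    """Recursively convert sets to sorted lists, descending into dicts
    and lists, so the result can be serialized as JSON. Sketch of the
    pattern reviewed above, not the PR's exact implementation."""
    if isinstance(obj, set):
        return sorted(jsonable(v) for v in obj)
    if isinstance(obj, dict):
        return {k: jsonable(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [jsonable(v) for v in obj]
    return obj
```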


139-159: LGTM!

Robust implementation that handles mixed input types (set or list) gracefully.

workflows/gpu_operator_dashboard/fetch_ci_data.py (2)

14-25: Good refactor to use centralized utilities.

The imports from workflows.common.gcs_utils and workflows.common.data_fetching consolidate shared functionality, reducing code duplication across dashboards.


541-544: LGTM!

Correct usage of the centralized merge_job_history_links function, replacing the previous inline merging logic.

workflows/nno_dashboard/fetch_ci_data.py (2)

40-117: LGTM!

Comprehensive parsing logic that handles various job name patterns. The fallback to "Standard" is sensible for unknown patterns.
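The first-match-wins parsing with a "Standard" fallback can be sketched as below; the patterns, flavor names, and job names are illustrative assumptions, not the PR's actual FLAVOR_COLUMN_MAP:

```python
import re

# Hypothetical mapping from job-name fragments to display flavors.
_FLAVOR_PATTERNS = [
    (re.compile(r"doca", re.IGNORECASE), "DOCA"),
    (re.compile(r"bare-?metal", re.IGNORECASE), "Bare Metal"),
    (re.compile(r"hosted", re.IGNORECASE), "Hosted"),
]

def parse_flavor(job_name: str) -> str:
    """Return the first matching flavor for a job name, falling back
    to "Standard" when no pattern matches."""
    for pattern, flavor in _FLAVOR_PATTERNS:
        if pattern.search(job_name):
            return flavor
    return "Standard"
```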


414-440: LGTM!

The main function has proper error handling for loading existing data and follows the established pattern from the GPU operator dashboard.

Comment on lines +147 to +148
# Sort GPU operator versions
sorted_gpu_ops = sorted(gpu_operators_data.keys())

⚠️ Potential issue | 🟡 Minor

Lexicographic sorting may produce incorrect version ordering.

sorted(gpu_operators_data.keys()) performs alphabetical sorting, which is incorrect for semantic versions. For example, "25.10.0" would sort before "25.4.0".

Consider using semantic version sorting as done in the GPU operator dashboard:

+import semver
+
 def build_matrix_table(ocp_version: str, gpu_operators_data: Dict[str, Dict[str, Any]]) -> str:
     ...
     # Sort GPU operator versions
-    sorted_gpu_ops = sorted(gpu_operators_data.keys())
+    sorted_gpu_ops = sorted(
+        gpu_operators_data.keys(),
+        key=lambda v: semver.VersionInfo.parse(v.split("(")[0].strip()) if v != "master-latest" else semver.VersionInfo(0, 0, 0),
+        reverse=True
+    )

Committable suggestion skipped: line range outside the PR's diff.
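The ordering problem can be demonstrated directly; a numeric-tuple key gives the intended order (the semver-based key in the suggestion above would work equally well):

```python
versions = ["25.4.0", "25.10.0", "24.9.2"]

# Lexicographic sort: "25.10.0" lands before "25.4.0" because "1" < "4".
lexi = sorted(versions)

# Numeric-tuple key restores true version order.
numeric = sorted(versions, key=lambda v: tuple(int(p) for p in v.split(".")))
```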

@openshift-merge-robot
Collaborator

PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
