devtools: add node complexity analysis script #476
base: main
Conversation
📝 Walkthrough

Adds datapath node complexity tracking: a new GitHub Actions job, a complexity analysis script, and a checked-in baseline.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @.github/workflows/check.yml:
- Line 150: The checkout step using actions/checkout@v4 is missing the same
options used in other jobs; update that step to include persist-credentials:
false and the same fetch-depth value used elsewhere (e.g., fetch-depth: 0) so it
matches the other jobs' configuration and removes the credential persistence
inconsistency.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- .github/workflows/check.yml
- GNUmakefile
- devtools/complexity.json
- devtools/node-complexity
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: build-and-tests (clang-18, none, debugoptimized, ubuntu-24.04, false)
- GitHub Check: build-and-tests (clang-16, none, debugoptimized, ubuntu-24.04, false)
- GitHub Check: build-and-tests (gcc-13, none, debugoptimized, ubuntu-24.04, -Dfrr=enabled, false)
- GitHub Check: build-and-tests (clang-15, none, debugoptimized, ubuntu-22.04, false)
- GitHub Check: build-and-tests (gcc-14, address, debug, ubuntu-24.04, -Dfrr=enabled, true)
- GitHub Check: node-complexity
- GitHub Check: rpm
- GitHub Check: deb
🔇 Additional comments (3)
GNUmakefile (1)
108-108: LGTM! The exclusion pattern for JSON files is appropriate since the new `complexity.json` baseline file doesn't require license headers.

devtools/complexity.json (1)

1-344: LGTM! Baseline data file for complexity metrics. The `ip_output_process` function shows a notably high instruction count (7643) compared to other functions, which may warrant verifying that this is the expected baseline.

devtools/node-complexity (1)

1-396: LGTM! The script is well-structured, with proper cycle detection in the recursive instruction counting, safe dictionary iteration (by pre-collecting `.cold` function names), and appropriate error handling for missing tools.
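As a rough illustration of that `.cold` merge with pre-collected names (a minimal sketch under assumed data shapes; `merge_cold_functions` and the metric keys are hypothetical, not the script's actual code):

def merge_cold_functions(functions: dict[str, dict]) -> None:
    """Fold compiler-split ``foo.cold`` entries back into ``foo``.

    Pre-collect the names first so the dict is not mutated while iterating.
    """
    cold_names = [name for name in functions if name.endswith(".cold")]
    for cold in cold_names:
        parent = cold[: -len(".cold")]
        if parent not in functions:
            continue
        for key in ("instructions", "branches", "jumps", "calls"):
            functions[parent][key] += functions[cold][key]
        del functions[cold]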
    python3-pyelftools scdoc libmnl-devel \
    libcmocka-devel libedit-devel libevent-devel numactl-devel \
    libsmartcols-devel libarchive-devel rdma-core-devel
- uses: actions/checkout@v4
Missing checkout options compared to other jobs.
The actions/checkout@v4 step is missing persist-credentials: false and fetch-depth options that are explicitly set in the other jobs (build-and-tests, build-cross-aarch64, lint, commits). While the shallow clone may work for this job, the inconsistency with persist-credentials could be a security consideration.
Suggested fix
- uses: actions/checkout@v4
+ with:
+ persist-credentials: false
+ fetch-depth: 0
+ ref: ${{ github.event.pull_request.head.sha || github.ref }}

🤖 Prompt for AI Agents
In @.github/workflows/check.yml at line 150, The checkout step using
actions/checkout@v4 is missing the same options used in other jobs; update that
step to include persist-credentials: false and the same fetch-depth value used
elsewhere (e.g., fetch-depth: 0) so it matches the other jobs' configuration and
removes the credential persistence inconsistency.
Introduce a tool to measure and track the code complexity of datapath node processing functions. The script parses objdump disassembly output and reports instruction count, conditional branches, jumps, and calls for each node.

Node functions are identified by scanning source files for `.process = function_name` patterns in struct rte_node_register definitions. The script merges compiler-split `.cold` functions back into their parents and recursively traverses the call graph to compute total instruction counts.

Logging, tracing, and DPDK node infrastructure functions are excluded from the recursive count to focus on actual packet processing logic: rte_log, rte_vlog, gr_mbuf_trace*, trace_log_packet, and __rte_node_stream_alloc*.

Three output modes are supported:

- Table format (default): human-readable, sorted by instruction count
- JSON format (--json): for generating baseline files
- Compare mode (--compare): shows deltas against a baseline and exits with code 1 if any function exceeds the regression threshold

Add a CI job that builds grout and compares the current complexity metrics against the checked-in baseline. Force cpu_instruction_set to x86-64-v3 to ensure consistent results regardless of the CI runner CPU features.

Signed-off-by: Robin Jarry <rjarry@redhat.com>
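For context, the source scan described in the commit message could be approximated like this (a sketch only; the regex and the helper name find_node_functions are assumptions, not the script's exact code):

import re
from pathlib import Path

# Approximates the ".process = function_name" scan described above;
# the script's actual pattern may be stricter.
PROCESS_RE = re.compile(r"\.process\s*=\s*(\w+)")

def find_node_functions(src_dir: str) -> set[str]:
    """Collect candidate node process function names from C sources."""
    names: set[str] = set()
    for path in Path(src_dir).rglob("*.c"):
        names.update(PROCESS_RE.findall(path.read_text(errors="replace")))
    return names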
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @devtools/node-complexity:
- Around line 291-292: The open call that reads the baseline JSON
(open(baseline_path)) does not specify an encoding; update the file read so open
is called with an explicit encoding (e.g., encoding='utf-8') when loading the
baseline with json.load to avoid issues on systems with non-UTF-8 locales
(ensure you still pass baseline_path and use json.load(f)).
- Around line 168-217: The shared memoization `cache` in
`compute_total_instructions` causes stale totals when multiple target functions
participate in cycles; modify `compute_total_instructions` so the `cache` is
reset per target computation (e.g., move `cache = {}` inside the loop over
`target_funcs` or call `cache.clear()` before invoking `total_insns(func_name,
set())`) so cached totals cannot omit contributions that were excluded by a
prior `visited` set; ensure you still use `total_insns`, `func_excluded`, and
`visited` as before and only change cache scope/clearing.
🧹 Nitpick comments (1)
.github/workflows/check.yml (1)
138-140: Debian-specific env vars in Fedora container are harmless but unnecessary.
`DEBIAN_FRONTEND` and `NEEDRESTART_MODE` have no effect on Fedora/dnf and appear to be copy-pasted from other jobs. Not a bug, just noise.

env:
-  DEBIAN_FRONTEND: noninteractive
-  NEEDRESTART_MODE: l
   CC: ccache gcc
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- .github/workflows/check.yml
- GNUmakefile
- devtools/complexity.json
- devtools/node-complexity
🚧 Files skipped from review as they are similar to previous changes (2)
- devtools/complexity.json
- GNUmakefile
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: build-and-tests (gcc-13, none, debugoptimized, ubuntu-24.04, -Dfrr=enabled, false)
- GitHub Check: build-and-tests (clang-18, none, debugoptimized, ubuntu-24.04, false)
- GitHub Check: build-and-tests (clang-15, none, debugoptimized, ubuntu-22.04, false)
- GitHub Check: build-and-tests (gcc-14, address, debug, ubuntu-24.04, -Dfrr=enabled, true)
- GitHub Check: node-complexity
- GitHub Check: build-and-tests (clang-16, none, debugoptimized, ubuntu-24.04, false)
- GitHub Check: deb
- GitHub Check: rpm
🔇 Additional comments (3)
devtools/node-complexity (1)
1-396: LGTM overall — solid tool design. The script handles objdump fallback, .cold function merging, and call graph traversal correctly for the common case. The regex patterns for x86_64 disassembly look appropriate, and the exclusion list for infrastructure functions makes sense for isolating packet-processing logic.
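To make the reviewed approach concrete, a simplified version of parsing `objdump -d` output into per-function counters might look like the following (a sketch under an assumed output format; the real script's regexes and exclusions are more involved):

import re
import subprocess

# Function header lines look like: "0000000000001129 <main>:"
FUNC_RE = re.compile(r"^[0-9a-f]+ <(?P<name>[^>]+)>:$")
# Conditional jumps: je, jne, jg, ja, ... but not the unconditional jmp.
COND_JUMP_RE = re.compile(r"j(?!mp\b)[a-z]+")

def parse_disassembly(binary: str) -> dict[str, dict]:
    out = subprocess.run(
        ["objdump", "-d", binary], capture_output=True, text=True, check=True
    ).stdout
    functions: dict[str, dict] = {}
    current = None
    for line in out.splitlines():
        m = FUNC_RE.match(line)
        if m:
            current = {"instructions": 0, "branches": 0, "jumps": 0, "calls": 0}
            functions[m.group("name")] = current
            continue
        if current is None:
            continue
        parts = line.split("\t")
        if len(parts) < 3 or not parts[2].strip():
            continue  # skip blank and byte-continuation lines
        mnemonic = parts[2].split()[0]
        current["instructions"] += 1
        if mnemonic.startswith("call"):
            current["calls"] += 1
        elif mnemonic in ("jmp", "jmpq"):
            current["jumps"] += 1
        elif COND_JUMP_RE.fullmatch(mnemonic):
            current["branches"] += 1
    return functions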
.github/workflows/check.yml (2)
153-153: Missing checkout options that other jobs specify. Other jobs in this workflow include `persist-credentials: false`, `fetch-depth: 0`, and `ref: ${{ github.event.pull_request.head.sha || github.ref }}`. Without the explicit `ref`, the action defaults to the merge commit for PRs, which should be fine for analysis. However, the inconsistency could matter if future changes need git history or consistent credential handling. Consider aligning with other jobs for consistency:

- uses: actions/checkout@v4
  with:
    persist-credentials: false
    ref: ${{ github.event.pull_request.head.sha || github.ref }}
132-182: New CI job structure looks correct. The job correctly installs dependencies, uses ccache, builds the project, and runs the complexity comparison. Cache handling mirrors existing patterns.
def compute_total_instructions(functions, call_graph, target_funcs) -> dict[str, dict]:
    """
    Compute total instructions for each target function including called functions.
    Uses memoization and handles cycles. Excludes infrastructure functions.
    """
    cache = {}

    def func_excluded(name: str) -> bool:
        # Check PLT functions (libc)
        if "@plt" in name:
            return True
        # Handle .cold, .part, etc.
        base = name.split(".")[0]
        for pattern in EXCLUDED_FUNCS:
            if pattern in base:
                return True
        return False

    def total_insns(func_name: str, visited: set[str]):
        if func_name in cache:
            return cache[func_name]
        if func_name not in functions:
            return 0
        if func_name in visited:
            return 0  # Cycle detected, don't double count
        if func_excluded(func_name):
            return 0

        visited = visited | {func_name}
        total = functions[func_name]["instructions"]

        for called in call_graph.get(func_name, []):
            total += total_insns(called, visited)

        cache[func_name] = total

        return total

    result = {}
    for func_name in target_funcs:
        if func_name in functions:
            result[func_name] = {
                "instructions": total_insns(func_name, set()),
                "branches": functions[func_name]["branches"],
                "jumps": functions[func_name]["jumps"],
                "calls": functions[func_name]["calls"],
            }

    return result
Cache invalidation bug when cycles involve multiple target functions.
The memoization cache is shared across all target function computations, but cached values depend on the visited set at computation time. In a mutual cycle where both A and B are target functions (A→B→A):
- Computing A first: B gets cached with a value excluding A (since A was in visited)
- Computing B later: returns the stale cached value that's missing A's contribution
For typical node process functions that don't form mutual recursion, this won't manifest. However, if it does occur, reported metrics will be incorrect.
Possible fix: clear cache between target computations or track visited in cache key
result = {}
for func_name in target_funcs:
    if func_name in functions:
+       cache.clear()  # Ensure fresh computation for each target
        result[func_name] = {
            "instructions": total_insns(func_name, set()),
            "branches": functions[func_name]["branches"],
            "jumps": functions[func_name]["jumps"],
            "calls": functions[func_name]["calls"],
        }

🤖 Prompt for AI Agents
In @devtools/node-complexity around lines 168 - 217, The shared memoization
`cache` in `compute_total_instructions` causes stale totals when multiple target
functions participate in cycles; modify `compute_total_instructions` so the
`cache` is reset per target computation (e.g., move `cache = {}` inside the loop
over `target_funcs` or call `cache.clear()` before invoking
`total_insns(func_name, set())`) so cached totals cannot omit contributions that
were excluded by a prior `visited` set; ensure you still use `total_insns`,
`func_excluded`, and `visited` as before and only change cache scope/clearing.
with open(baseline_path) as f:
    baseline = json.load(f)
Specify encoding when opening JSON file.
Missing explicit encoding could cause issues on systems with non-UTF-8 default locale.
Suggested fix
- with open(baseline_path) as f:
+ with open(baseline_path, encoding="utf-8") as f:
      baseline = json.load(f)

🤖 Prompt for AI Agents
In @devtools/node-complexity around lines 291 - 292, The open call that reads
the baseline JSON (open(baseline_path)) does not specify an encoding; update the
file read so open is called with an explicit encoding (e.g., encoding='utf-8')
when loading the baseline with json.load to avoid issues on systems with
non-UTF-8 locales (ensure you still pass baseline_path and use json.load(f)).
Introduce a tool to measure and track the code complexity of datapath node processing functions. The script parses objdump disassembly output and reports instruction count, conditional branches, jumps, and calls for each node.
Node functions are identified by scanning source files for `.process = function_name` patterns in struct rte_node_register definitions. The script merges compiler-split `.cold` functions back into their parents and recursively traverses the call graph to compute total instruction counts.

Logging, tracing, and DPDK node infrastructure functions are excluded from the recursive count to focus on actual packet processing logic: rte_log, rte_vlog, gr_mbuf_trace*, trace_log_packet, and __rte_node_stream_alloc*.
Three output modes are supported:

- Table format (default): human-readable, sorted by instruction count
- JSON format (--json): for generating baseline files
- Compare mode (--compare): shows deltas against a baseline and exits with code 1 if any function exceeds the regression threshold
Add a CI job that builds grout and compares the current complexity metrics against the checked-in baseline. This enables early detection of unintended code size increases in the datapath.
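A minimal sketch of what the compare mode's threshold check could look like (`--compare` and the exit code are from the description above; `REGRESSION_THRESHOLD`, the helper name, and the metric field names are assumptions):

import json

REGRESSION_THRESHOLD = 0.05  # assumption: allowed relative growth over baseline

def compare_to_baseline(current: dict[str, dict], baseline_path: str) -> int:
    """Print per-function deltas; return 1 if any function regressed."""
    with open(baseline_path, encoding="utf-8") as f:
        baseline = json.load(f)
    status = 0
    for name, metrics in sorted(current.items()):
        old = baseline.get(name, {}).get("instructions")
        new = metrics["instructions"]
        if not old:
            continue  # new function, no baseline entry to compare against
        delta = (new - old) / old
        print(f"{name}: {old} -> {new} ({delta:+.1%})")
        if delta > REGRESSION_THRESHOLD:
            status = 1
    return status

# raise SystemExit(compare_to_baseline(metrics, "devtools/complexity.json"))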
Summary by CodeRabbit

Chores
- Added a CI job that compares current complexity metrics against the checked-in baseline.

New Features
- Added a devtools script to measure datapath node complexity, with table, JSON, and compare output modes.