
[PWCI] "[v13] net: optimize __rte_raw_cksum and add tests"#618

Open: ovsrobot wants to merge 1 commit into main from series_37013

@ovsrobot ovsrobot commented Jan 11, 2026

NOTE: This is an auto submission for "[v13] net: optimize __rte_raw_cksum and add tests".

See "http://patchwork.dpdk.org/project/dpdk/list/?series=37013" for details.

Summary by CodeRabbit

Release Notes

  • Tests

    • Added comprehensive fuzz testing for checksum validation, covering edge cases, various buffer alignments, random-length inputs, and multiple initial sum scenarios.
    • Expanded checksum performance tests to include additional buffer sizes for broader coverage.
  • Performance

    • Improved checksum calculation performance by rewriting the 16-bit summation loop so that both GCC and Clang can auto-vectorize it.

__rte_raw_cksum uses a loop with memcpy on each iteration.
GCC 15+ is able to vectorize the loop but Clang 18.1 is not.

Replace memcpy with direct pointer access using a packed struct with
__rte_may_alias attribute (same pattern as rte_memcpy.h). This enables
both GCC and Clang to vectorize with SSE/AVX/AVX-512 while avoiding
GCC strict-aliasing bugs without pragma workarounds.

This patch adds comprehensive fuzz testing and updates the performance
test to measure the optimization impact.

Performance results from cksum_perf_autotest on Intel Xeon
(Cascade Lake, AVX-512) built with Clang 18.1 (TSC cycles/byte):

  Block size    Before    After    Improvement
         100      0.40     0.24        ~40%
        1500      0.50     0.06        ~8x
        9000      0.49     0.06        ~8x

Signed-off-by: Scott Mitchell <scott.k.mitch1@gmail.com>
Signed-off-by: 0-day Robot <robot@bytheb.org>

@sourcery-ai sourcery-ai bot left a comment

Sorry @ovsrobot, you have reached your weekly rate limit of 500000 diff characters.

Please try again later or upgrade to continue using Sourcery

coderabbitai bot commented Jan 11, 2026

📝 Walkthrough

This PR optimizes the core checksum routine and adds fuzz tests to validate it. It adds a new 240-line fuzz test module covering edge cases and random data, expands the block sizes used by the performance test, rewrites the 16-bit summation loop in lib/net/rte_cksum.h to read words through a packed, may-alias struct so that compilers can auto-vectorize it, and registers the new test file in the test build configuration.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Build Configuration: app/test/meson.build | Added test file dependency for test_cksum_fuzz.c with ['net'] dependency mapping. |
| Fuzz Testing: app/test/test_cksum_fuzz.c | New 240-line test module implementing comprehensive fuzz testing for __rte_raw_cksum optimization. Includes reference checksum implementation, edge-case coverage (lengths 0–65536, including GRO boundaries), random-length tests, aligned/unaligned buffer variants, and diagnostic output on mismatches. Exposes public test entry cksum_fuzz_autotest via REGISTER_FAST_TEST macro. |
| Performance Testing: app/test/test_cksum_perf.c | Expanded data_sizes array with four additional block sizes: 9000, 9001, 65536, 65537. |
| Core Library Optimization: lib/net/rte_cksum.h | Refactored 16-bit processing loop to replace memcpy-based approach with vectorized loop using packed alias struct for safe unaligned 16-bit word access. Maintains overflow behavior and odd-length tail handling. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 Checksums dancing, tested with care,
Vectors now aligned, optimized pair!
Fuzz and perf together, edge-cases bare,
From 9000 to 65536, we go there!
Fast and reliable, a recipe most fair.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 63.64% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main changes: optimization of __rte_raw_cksum and addition of tests, which aligns with the changeset modifications across lib/net/rte_cksum.h and app/test files. |


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
app/test/test_cksum_fuzz.c (1)

182-202: Solid random testing approach.

The random length generation correctly covers the full range [0, 65536]. The 1000-iteration default balances coverage with test execution time.

For reproducibility when debugging failures, you might consider logging the random seed or providing a mechanism to replay specific seeds. However, this is optional since rte_rand() may have its own seeding infrastructure.
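One way to act on the reproducibility suggestion is sketched below. It is not taken from test_cksum_fuzz.c: the `fuzz_rng` type and the `fuzz_*` helpers are hypothetical, and a small xorshift64 generator stands in for rte_rand() so the sketch builds without DPDK headers. The idea is simply that the seed is the generator's entire state, so printing it once is enough to replay a failing run.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for rte_rand(): a xorshift64 generator whose
 * whole state is the seed, so a logged seed is enough to replay a run. */
struct fuzz_rng {
	uint64_t state;
};

static void
fuzz_rng_seed(struct fuzz_rng *rng, uint64_t seed)
{
	/* xorshift never leaves the all-zero state; use a fixed fallback. */
	rng->state = seed != 0 ? seed : UINT64_C(0x9E3779B97F4A7C15);
}

static uint64_t
fuzz_rng_next(struct fuzz_rng *rng)
{
	uint64_t x = rng->state;

	x ^= x << 13;
	x ^= x >> 7;
	x ^= x << 17;
	rng->state = x;
	return x;
}

/* Draw a length in the inclusive range [0, max_len]. The slight modulo
 * bias is acceptable for choosing test lengths. */
static size_t
fuzz_random_len(struct fuzz_rng *rng, size_t max_len)
{
	assert(max_len < SIZE_MAX);
	return (size_t)(fuzz_rng_next(rng) % ((uint64_t)max_len + 1));
}

/* Print the seed before the run so a failing run can be replayed. */
static void
fuzz_log_seed(uint64_t seed)
{
	printf("cksum fuzz seed: 0x%016" PRIx64 "\n", seed);
}
```

A test would call fuzz_log_seed() once, seed the generator, and then draw each length with fuzz_random_len(&rng, 65536); rerunning with the printed seed reproduces the same sequence of lengths.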

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cd60dcd and d9ac602.

📒 Files selected for processing (4)
  • app/test/meson.build
  • app/test/test_cksum_fuzz.c
  • app/test/test_cksum_perf.c
  • lib/net/rte_cksum.h
🧰 Additional context used
🧬 Code graph analysis (1)
app/test/test_cksum_fuzz.c (1)
lib/net/rte_cksum.h (1)
  • __rte_raw_cksum (42-68)
🔇 Additional comments (8)
app/test/meson.build (1)

40-42: LGTM!

The new fuzz test dependency entry correctly follows the existing pattern and maintains alphabetical ordering with the other checksum test files.

app/test/test_cksum_perf.c (1)

18-18: LGTM!

The expanded test sizes appropriately cover jumbo frame boundaries (9000/9001) and 64K GRO limits (65536/65537), aligning with the fuzz test coverage. The paired even/odd values ensure both aligned and odd-length paths are exercised.

lib/net/rte_cksum.h (2)

59-65: Odd-byte handling correctly preserved.

The byte-order independent approach using memcpy for the trailing byte maintains correctness across different endianness, consistent with the reference implementation.
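A minimal sketch of the trailing-byte pattern this comment describes, assuming a hypothetical helper name (`cksum_add_tail_byte`) that does not appear in the patch. The last byte is copied into the first byte of a zeroed 16-bit word, so it lands in the same memory position it would occupy in a full word on either endianness.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Add the final odd byte of a buffer to a running 32-bit sum.
 * Copying into the first byte of a zeroed uint16_t pads the missing
 * second byte with zero without any byte-order specific shifting. */
static uint32_t
cksum_add_tail_byte(const uint8_t *last, uint32_t sum)
{
	uint16_t left = 0;

	assert(last != NULL);
	memcpy(&left, last, 1);
	return sum + left;
}
```

On a little-endian host a tail byte of 0xAB contributes 0x00AB to the sum, and on a big-endian host it contributes 0xAB00; in both cases the bytes of the word in memory are AB 00, which is what keeps the result byte-order independent.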


45-57: Well-designed optimization using the packed alias pattern.

The approach correctly addresses:

  1. Strict aliasing: __rte_may_alias prevents UB from type-punning through uint16_t*
  2. Unaligned access: __rte_packed ensures safe reads regardless of buffer alignment
  3. Vectorization: Simple loop structure allows GCC/Clang to auto-vectorize with SSE/AVX

The local struct definition within function scope is unconventional but valid, and keeps the alias type private to this implementation.
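The three points above can be illustrated with a standalone sketch. This is not the code from lib/net/rte_cksum.h: the names `unaligned_u16` and `sum_words_alias` are hypothetical, and the raw GCC/Clang attributes are used in place of DPDK's __rte_packed and __rte_may_alias macros so that it compiles without DPDK headers. It covers only the 16-bit word loop and leaves any trailing odd byte to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical alias type: one 16-bit word that may sit at any address
 * (packed) and may alias an object of any other type (may_alias). */
struct unaligned_u16 {
	uint16_t value;
} __attribute__((__packed__, __may_alias__));

/* Sum the complete 16-bit words of buf into a 32-bit accumulator.
 * A trailing odd byte, if any, is deliberately not handled here. */
static uint32_t
sum_words_alias(const void *buf, size_t len, uint32_t sum)
{
	const struct unaligned_u16 *words = buf;
	size_t nwords = len / 2;
	size_t i;

	assert(buf != NULL);

	/* Plain indexed loop with a known trip count: the shape that
	 * compilers can typically auto-vectorize. */
	for (i = 0; i < nwords; i++)
		sum += words[i].value;

	return sum;
}
```

Whether a given compiler actually vectorizes this loop depends on its version and flags; inspecting the generated assembly or the compiler's vectorization report is the reliable way to confirm it.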

app/test/test_cksum_fuzz.c (4)

26-52: Good choice to retain the reference implementation for comparison.

The reference implementation correctly mirrors the original DPDK v23.11 __rte_raw_cksum using memcpy for safe unaligned access, providing a reliable baseline for validating the optimization.
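For readers without the patch at hand, a memcpy-based reference sum and a differential check against it might look roughly like the sketch below. The names `ref_raw_cksum`, `raw_cksum_fn`, and `cksum_matches_reference` are hypothetical, and the code is an approximation of the approach described here, not the contents of test_cksum_fuzz.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reference-style sum: read each 16-bit word with memcpy so that any
 * alignment is safe, then add a trailing odd byte if there is one. */
static uint32_t
ref_raw_cksum(const void *buf, size_t len, uint32_t sum)
{
	const uint8_t *p = buf;
	const uint8_t *end;
	uint16_t word;

	assert(buf != NULL);
	end = p + (len & ~(size_t)1); /* round len down to an even count */

	for (; p != end; p += sizeof(word)) {
		memcpy(&word, p, sizeof(word));
		sum += word;
	}

	if (len % 2) {
		uint16_t left = 0;

		memcpy(&left, end, 1);
		sum += left;
	}

	return sum;
}

/* Signature shared by the reference and the implementation under test. */
typedef uint32_t (*raw_cksum_fn)(const void *buf, size_t len, uint32_t sum);

/* Return nonzero when candidate agrees with the reference on this input. */
static int
cksum_matches_reference(raw_cksum_fn candidate, const void *buf,
			size_t len, uint32_t initial_sum)
{
	return candidate(buf, len, initial_sum) ==
	       ref_raw_cksum(buf, len, initial_sum);
}
```

A fuzz loop would then call cksum_matches_reference() with the optimized routine as the candidate for each generated buffer, length, and initial sum, and dump the buffer when it returns zero.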


72-114: Comprehensive test function with good diagnostics.

The test properly handles edge cases (zero length, allocation constraints) and provides helpful hexdump output on failures for debugging. Memory management is correct on both success and failure paths.


138-177: Excellent edge case coverage.

The test array strategically targets boundaries that commonly expose bugs: powers of 2 (vectorization boundaries), MTU sizes (1500/1501), and 64K GRO limits. Testing each length with both zero and random initial sums strengthens the validation.


204-240: Well-organized test harness.

The test progression from edge cases to random testing is logical—edge cases run quickly and catch common issues first. The progress output clearly indicates which phase is executing or has failed.
