feat(sync): 010 - LevelWise Sync Strategy by xilosada · Pull Request #1932 · calimero-network/core

xilosada · 2026-02-11T11:46:14Z

Summary

Adds LevelWise sync protocol types optimized for wide, shallow trees (depth ≤ 2 with many children per level)
Implements level-by-level breadth-first synchronization for efficient sync of wide tree structures
Ports the LevelWise protocol from PR feat(sync): 010 - LevelWise Sync Strategy #1873 to current master, following the same modular pattern as SubtreePrefetch (PR feat(sync): 009 - SubtreePrefetch Sync Strategy #1929)

Changes

New Module: `levelwise.rs`

LevelWiseRequest: Request nodes at a specific tree level, optionally filtered by parent IDs
LevelWiseResponse: Response containing level nodes with metadata
LevelNode: Individual node with ID, hash, optional parent, and optional leaf data
LevelCompareResult: Categorized comparison results (matching, differing, local_missing, remote_missing)
compare_level_nodes(): Compare local vs remote level nodes
should_use_levelwise(): Heuristic for protocol selection (depth ≤ 2 && avg_children > 10)

Security Hardening (DoS Prevention)

MAX_LEVELWISE_DEPTH = 64: Prevent depth exhaustion attacks
MAX_PARENTS_PER_REQUEST = 1000: Limit request size
MAX_NODES_PER_LEVEL = 10_000: Prevent memory exhaustion
is_valid() methods on all wire protocol types with nested validation

Test Coverage

Comprehensive test suite with 56 tests covering:

Basic functionality and Borsh serialization roundtrips
Boundary conditions and edge cases
Security/exploit prevention scenarios (overflow, memory exhaustion, malicious deserialization)
Cross-validation consistency
All CRDT type variants

Test Plan

cargo check -p calimero-node-primitives
cargo test -p calimero-node-primitives --lib levelwise (56 tests pass)
cargo fmt
cargo clippy (no new warnings in levelwise module)

Supersedes feat(sync): 010 - LevelWise Sync Strategy #1873 (stale LevelWise PR)
Follows pattern from feat(sync): 009 - SubtreePrefetch Sync Strategy #1929 (SubtreePrefetch)
Part of Sync Protocol issue series ([Sync Protocol] 001: Add CrdtType to Entity Metadata #1768 → [Sync Protocol] 018: Compliance Test Suite #1785)

Note

Medium Risk
Adds a new sync protocol surface and adjusts protocol-selection behavior, which could change which strategy is chosen in production. Includes explicit validation limits and tests, reducing DoS and correctness risk but still touches core sync negotiation.

Overview
Adds a new sync/levelwise.rs module implementing the wire types for level-by-level (breadth-first) synchronization (LevelWiseRequest/LevelWiseResponse and LevelNode), plus comparison/selection helpers (compare_level_nodes, should_use_levelwise).

Exposes the new module via sync.rs re-exports and updates select_protocol to delegate the “wide shallow tree” rule to should_use_levelwise, with built-in bounds/validation helpers and extensive tests aimed at preventing oversized/overflow inputs from untrusted peers.

^{Written by Cursor Bugbot for commit 46bb80c. This will update automatically on new commits. Configure here.}

meroreviewer

🤖 AI Code Reviewer

Reviewed by 1 agents | Quality score: 33% | Review time: 70.4s

🟡 1 warnings, 💡 1 suggestions, 📝 2 nitpicks. See inline comments.

_{🤖 Generated by AI Code Reviewer | Review ID: review-ba77570c}

crates/node/primitives/src/sync/levelwise.rs

meroreviewer

🤖 AI Code Reviewer

Reviewed by 1 agents | Quality score: 33% | Review time: 88.6s

🟡 1 warnings, 💡 1 suggestions, 📝 2 nitpicks. See inline comments.

_{🤖 Generated by AI Code Reviewer | Review ID: review-11782d7b}

crates/node/primitives/src/sync/levelwise.rs

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

^{Bugbot Autofix is ON. A Cloud Agent has been kicked off to fix the reported issue.}

crates/node/primitives/src/sync/levelwise.rs

cursor · 2026-02-11T12:24:00Z

Bugbot Autofix prepared fixes for 1 of the 1 bugs found in the latest run.

✅ Fixed: Redundant heuristic duplicates existing protocol selection logic
- Modified select_protocol to call the canonical should_use_levelwise function instead of duplicating the heuristic logic inline, ensuring a single authoritative location for the LevelWise selection heuristic.

Or push these changes by commenting:

@cursor push 612e3e030b

Preview (612e3e030b)

diff --git a/crates/node/primitives/src/sync/protocol.rs b/crates/node/primitives/src/sync/protocol.rs
--- a/crates/node/primitives/src/sync/protocol.rs
+++ b/crates/node/primitives/src/sync/protocol.rs
@@ -5,6 +5,7 @@
 use borsh::{BorshDeserialize, BorshSerialize};
 
 use super::handshake::{SyncCapabilities, SyncHandshake};
+use super::levelwise::should_use_levelwise;
 
 // =============================================================================
 // Protocol Kind (Discriminant-only)
@@ -258,17 +259,20 @@
     }
 
     // Rule 6: Wide shallow tree (depth 1-2 with many children per level)
-    // Skip depth=0 (no hierarchy) - LevelWise requires actual tree structure
-    if remote.max_depth >= 1 && remote.max_depth <= 2 {
-        let avg_children_per_level = remote.entity_count / u64::from(remote.max_depth);
-        if avg_children_per_level > 10 {
-            return ProtocolSelection {
-                protocol: SyncProtocol::LevelWise {
-                    max_depth: remote.max_depth,
-                },
-                reason: "wide shallow tree, using level-wise sync",
-            };
-        }
+    // Delegate to the canonical heuristic in levelwise module
+    let max_depth_usize = remote.max_depth as usize;
+    let avg_children_per_level = if remote.max_depth > 0 {
+        (remote.entity_count / u64::from(remote.max_depth)) as usize
+    } else {
+        0
+    };
+    if should_use_levelwise(max_depth_usize, avg_children_per_level) {
+        return ProtocolSelection {
+            protocol: SyncProtocol::LevelWise {
+                max_depth: remote.max_depth,
+            },
+            reason: "wide shallow tree, using level-wise sync",
+        };
     }
 
     // Rule 7: Default fallback

meroreviewer

🤖 AI Code Reviewer

Reviewed by 1 agents | Quality score: 33% | Review time: 50.9s

💡 2 suggestions, 📝 2 nitpicks. See inline comments.

_{🤖 Generated by AI Code Reviewer | Review ID: review-ff56a554}

meroreviewer · 2026-02-11T12:45:27Z

crates/node/primitives/src/sync/levelwise.rs

+    #[must_use]
+    pub fn is_valid(&self) -> bool {
+        // Validate leaf data if present
+        if let Some(ref leaf_data) = self.leaf_data {


💡 Non-deterministic iteration order in compare_level_nodes

Iterating over remote_by_id.values() produces non-deterministic order, causing the output vectors (matching, differing, etc.) to have inconsistent ordering across runs, which could complicate debugging and snapshot-based testing.

Suggested fix:

Consider sorting the result vectors by node ID before returning, or document that ordering is non-deterministic.

meroreviewer · 2026-02-11T12:45:29Z

crates/node/primitives/src/sync/levelwise.rs

+    }
+
+    /// Get number of parents being queried (None if full level).
+    #[must_use]


💡 is_valid() does not check for duplicate parent_ids

A malicious peer could send a request with thousands of duplicate parent IDs (all the same value), passing the length check but causing redundant work on the responder.

Suggested fix:

Consider adding a uniqueness check for parent_ids in is_valid(), or document that the responder must handle duplicates.

meroreviewer · 2026-02-11T12:45:31Z

crates/node/primitives/src/sync/levelwise.rs

+    /// Check if any sync work is needed.
+    #[must_use]
+    pub fn needs_sync(&self) -> bool {
+        !self.differing.is_empty() || !self.local_missing.is_empty()


📝 Nit: Docstring omits upper depth bound

The docstring says "Requires max_depth >= 1" but the implementation also requires max_depth <= 2; the full range 1..=2 should be documented for clarity.

Suggested fix:

Update docstring to: "Requires `1 <= max_depth <= 2` because depth-0 has no hierarchy and deeper trees favor HashComparison."

meroreviewer · 2026-02-11T12:45:33Z

crates/node/primitives/src/sync/levelwise.rs

+    /// Check if this is an internal node.
+    #[must_use]
+    pub fn is_internal(&self) -> bool {
+        self.leaf_data.is_none()


📝 Nit: Deduplication behavior not documented in compare_level_nodes

The function deduplicates remote nodes by ID (keeping the first occurrence) but this behavior is not mentioned in the docstring; callers may not expect silent deduplication.

Suggested fix:

Add to the docstring: "Deduplicates remote nodes by ID, keeping the first occurrence."

Add LevelWise sync protocol types optimized for wide, shallow trees (depth <= 2 with many children per level). Implements level-by-level breadth-first synchronization for efficient sync of wide tree structures. This ports the LevelWise protocol from PR #1873 to current master, following the same modular pattern as SubtreePrefetch (PR #1929). - Add `levelwise.rs` module with: - `LevelWiseRequest`: Request nodes at a specific tree level - `LevelWiseResponse`: Response containing level nodes with metadata - `LevelNode`: Individual node with optional leaf data - `LevelCompareResult`: Categorized comparison results - `compare_level_nodes()`: Compare local vs remote level nodes - `should_use_levelwise()`: Heuristic for protocol selection - Security hardening with DoS prevention: - `MAX_LEVELWISE_DEPTH = 64`: Prevent depth exhaustion attacks - `MAX_PARENTS_PER_REQUEST = 1000`: Limit request size - `MAX_NODES_PER_LEVEL = 10_000`: Prevent memory exhaustion - `is_valid()` methods on all wire protocol types - Comprehensive test suite (56 tests) covering: - Basic functionality and serialization roundtrips - Boundary conditions and edge cases - Security/exploit prevention scenarios - Cross-validation consistency

- Fix should_use_levelwise to require tree_depth >= 1 (not just <= 2) since depth-0 trees have no hierarchy to traverse breadth-first, matching the protocol selection logic in protocol.rs - Fix compare_level_nodes to deduplicate remote nodes by ID before comparison, preventing duplicate IDs from corrupting the result categories (matching, differing, local_missing, remote_missing) - Update tests to reflect the corrected behavior

- Rename `tree_depth` param to `max_depth` in `should_use_levelwise()` to align with documentation (fixes naming mismatch) - Add `debug_assert!` in `for_parents()` constructor to catch misuse early in development - Optimize `nodes_to_process()` to pre-allocate capacity instead of cloning (reduces allocations for large result sets) - Fix clippy warnings: inline format args, use RangeInclusive::contains

meroreviewer

🤖 AI Code Reviewer

Reviewed by 1 agents | Quality score: 33% | Review time: 223.2s

🟡 1 warnings, 💡 1 suggestions, 📝 2 nitpicks. See inline comments.

_{🤖 Generated by AI Code Reviewer | Review ID: review-bf7b2361}

meroreviewer · 2026-02-11T12:53:22Z

crates/node/primitives/src/sync/protocol.rs

 use borsh::{BorshDeserialize, BorshSerialize};

 use super::handshake::{SyncCapabilities, SyncHandshake};
+use super::levelwise::should_use_levelwise;


🟡 Unused import violates No Dead Code rule

The diff adds use super::levelwise::should_use_levelwise; but select_protocol implements the heuristic inline (lines 260-272) instead of calling this function.

Suggested fix:

Either remove the unused import, or refactor `select_protocol` Rule 6 to call `should_use_levelwise(remote.max_depth as usize, avg_children_per_level as usize)` instead of duplicating the logic.

meroreviewer · 2026-02-11T12:53:24Z

crates/node/primitives/src/sync/levelwise.rs

+    pub differing: Vec<[u8; 32]>,
+
+    /// Nodes missing locally - need to fetch.
+    pub local_missing: Vec<[u8; 32]>,


💡 Logic duplication between should_use_levelwise and select_protocol

The LevelWise selection heuristic is implemented twice: in should_use_levelwise() and inline in select_protocol(). If one changes without the other, behavior will silently diverge.

Suggested fix:

Have `select_protocol` call the exported `should_use_levelwise()` function to ensure a single source of truth for the heuristic.

meroreviewer · 2026-02-11T12:53:26Z

crates/node/primitives/src/sync/levelwise.rs

+    }
+
+    /// Check if node is within valid bounds.
+    ///


📝 Nit: Non-deterministic iteration order in compare_level_nodes

Iterating over HashMap::values() and HashMap::keys() produces non-deterministic ordering in the result vectors, which could complicate debugging or affect downstream code that assumes stable ordering.

Suggested fix:

Consider sorting the result vectors if deterministic ordering is desirable, or document that ordering is not guaranteed.

meroreviewer · 2026-02-11T12:53:28Z

crates/node/primitives/src/sync/levelwise.rs

+    pub differing: Vec<[u8; 32]>,
+
+    /// Nodes missing locally - need to fetch.
+    pub local_missing: Vec<[u8; 32]>,


📝 Nit: Heuristic thresholds could use named constants

Magic numbers 2 and 10 in should_use_levelwise could be extracted to named constants for consistency with other limits like MAX_LEVELWISE_DEPTH.

Suggested fix:

Consider `const LEVELWISE_MAX_DEPTH_THRESHOLD: usize = 2;` and `const LEVELWISE_MIN_AVG_CHILDREN: usize = 10;`

xilosada mentioned this pull request Feb 11, 2026

feat(sync): 010 - LevelWise Sync Strategy #1873

Closed