feat(sync): 009 - SubtreePrefetch Sync Strategy by xilosada · Pull Request #1929 · calimero-network/core

xilosada · 2026-02-11T11:34:25Z

Summary

- Add SubtreePrefetch sync protocol types for deep trees with clustered changes
- Optimized for tree depth > 3, divergence < 20%, clustered changes
- O(1) round trips per subtree vs HashComparison's O(depth)

Types Added

- SubtreePrefetchRequest: Request subtrees by root hash with depth limit
- SubtreePrefetchResponse: Contains fetched subtrees and not-found roots
- SubtreeData: Single subtree with entities for CRDT merge
- should_use_subtree_prefetch(): Heuristic for protocol selection

Security

- MAX_SUBTREE_DEPTH (64): Prevents deep traversal attacks
- MAX_SUBTREES_PER_REQUEST (100): Limits request size
- MAX_ENTITIES_PER_SUBTREE (10,000): Bounds per-subtree data
- MAX_TOTAL_ENTITIES (100,000): Caps total response size
- is_valid() methods on all types for post-deserialization validation
- Saturating arithmetic in total_entity_count() to prevent overflow
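
To make the limits above concrete, here is a minimal sketch of how bounds checks and a saturating total might fit together. `SubtreeSketch` and `ResponseSketch` are hypothetical stand-ins rather than the PR's actual types, and the `usize` constant types and the order of the checks are assumptions.

```rust
// Limits quoted in the PR description; the `usize` type is an assumption.
pub const MAX_SUBTREE_DEPTH: usize = 64;
pub const MAX_SUBTREES_PER_REQUEST: usize = 100;
pub const MAX_ENTITIES_PER_SUBTREE: usize = 10_000;
pub const MAX_TOTAL_ENTITIES: usize = 100_000;

/// Hypothetical stand-in for `SubtreeData`; only the counts matter here.
pub struct SubtreeSketch {
    pub entity_count: usize,
    pub depth: usize,
}

impl SubtreeSketch {
    /// Per-subtree bounds: entity count and depth.
    pub fn is_valid(&self) -> bool {
        self.entity_count <= MAX_ENTITIES_PER_SUBTREE && self.depth <= MAX_SUBTREE_DEPTH
    }
}

/// Hypothetical stand-in for `SubtreePrefetchResponse`.
pub struct ResponseSketch {
    pub subtrees: Vec<SubtreeSketch>,
}

impl ResponseSketch {
    /// Saturating sum: hostile per-subtree counts cannot wrap around to a
    /// small total that would slip under the limit.
    pub fn total_entity_count(&self) -> usize {
        self.subtrees
            .iter()
            .fold(0usize, |acc, s| acc.saturating_add(s.entity_count))
    }

    /// Meant to be called after deserializing data from an untrusted peer.
    pub fn is_valid(&self) -> bool {
        self.subtrees.len() <= MAX_SUBTREES_PER_REQUEST
            && self.subtrees.iter().all(SubtreeSketch::is_valid)
            && self.total_entity_count() <= MAX_TOTAL_ENTITIES
    }
}
```

With wrapping addition, two counts such as `usize::MAX` and `1` would sum to `0` and pass a total check; the saturating fold pins the sum at `usize::MAX` instead.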

Test Plan

39 unit tests covering:
- Request/Response construction and serialization
- Depth clamping and validation
- Entity count limits
- Memory exhaustion prevention
- Overflow protection
- Heuristic boundary conditions
- Edge cases (empty, zeros, max values)

Note

Low Risk
Mostly additive sync wire/type definitions and tests with explicit size/depth validation; minimal impact on existing behavior aside from exposing new protocol types/constants.

Overview
Adds a new SubtreePrefetch sync protocol primitives module (sync/subtree.rs) with Borsh-serializable request/response/data types and a heuristic (should_use_subtree_prefetch) for choosing this strategy on deep, low-divergence, clustered changes.

Updates sync.rs to register the new submodule and re-export its types/constants, including explicit is_valid() bounds checks (depth, subtree counts, per-subtree and total entity limits) plus extensive unit tests covering serialization roundtrips and resource-exhaustion edge cases.

^{Written by Cursor Bugbot for commit abdf8ee. This will update automatically on new commits.}

Add SubtreePrefetch sync protocol types for deep trees with clustered changes.

This protocol is optimized for scenarios where:
- Tree depth > 3 levels
- Divergence < 20%
- Changes are clustered in subtrees

Trade-off: O(1) round trips per subtree vs HashComparison's O(depth), but may over-fetch data compared to HashComparison's minimal transfer.

Types added:
- SubtreePrefetchRequest: Request subtrees by root hash with depth limit
- SubtreePrefetchResponse: Contains fetched subtrees and not-found roots
- SubtreeData: Single subtree with entities for CRDT merge
- should_use_subtree_prefetch(): Heuristic for protocol selection

Security:
- MAX_SUBTREE_DEPTH (64): Prevents deep traversal attacks
- MAX_SUBTREES_PER_REQUEST (100): Limits request size
- MAX_ENTITIES_PER_SUBTREE (10,000): Bounds per-subtree data
- MAX_TOTAL_ENTITIES (100,000): Caps total response size
- is_valid() methods on all types for post-deserialization validation
- Saturating arithmetic in total_entity_count() to prevent overflow

Includes 39 unit tests covering edge cases and exploit prevention.

meroreviewer

🤖 AI Code Reviewer

Reviewed by 1 agents | Quality score: 33% | Review time: 99.8s

🟡 1 warnings, 💡 1 suggestions, 📝 1 nitpicks. See inline comments.

_{🤖 Generated by AI Code Reviewer | Review ID: review-cb995cc9}

crates/node/primitives/src/sync/subtree.rs

Add LevelWise sync protocol types optimized for wide, shallow trees (depth <= 2 with many children per level). Implements level-by-level breadth-first synchronization for efficient sync of wide tree structures.

This ports the LevelWise protocol from PR #1873 to current master, following the same modular pattern as SubtreePrefetch (PR #1929).

## Changes

- Add `levelwise.rs` module with:
  - `LevelWiseRequest`: Request nodes at a specific tree level
  - `LevelWiseResponse`: Response containing level nodes with metadata
  - `LevelNode`: Individual node with optional leaf data
  - `LevelCompareResult`: Categorized comparison results
  - `compare_level_nodes()`: Compare local vs remote level nodes
  - `should_use_levelwise()`: Heuristic for protocol selection
- Security hardening with DoS prevention:
  - `MAX_LEVELWISE_DEPTH = 64`: Prevent depth exhaustion attacks
  - `MAX_PARENTS_PER_REQUEST = 1000`: Limit request size
  - `MAX_NODES_PER_LEVEL = 10_000`: Prevent memory exhaustion
  - `is_valid()` methods on all wire protocol types
- Comprehensive test suite (56 tests) covering:
  - Basic functionality and serialization roundtrips
  - Boundary conditions and edge cases
  - Security/exploit prevention scenarios
  - Cross-validation consistency

Fixes issues raised in PR review:

1. **depth() now always returns bounded value (High Severity)**
   - Changed return type from `Option<usize>` to `usize`
   - Returns `MAX_SUBTREE_DEPTH` when `max_depth` is `None`
   - Consumers always get a safe, bounded depth value
2. **is_valid() now validates max_depth (Medium Severity)**
   - Added check: `max_depth <= MAX_SUBTREE_DEPTH` when `Some`
   - Catches invalid values from untrusted deserialization
3. **max_depth field is now private (Medium Severity)**
   - Matches encapsulation pattern from hash_comparison module
   - Added `is_unlimited()` accessor to check if unlimited was requested
   - Prevents bypassing `depth()` accessor
4. **Extracted heuristic magic numbers to constants (Nitpick)**
   - `DEEP_TREE_THRESHOLD = 3`
   - `MAX_DIVERGENCE_RATIO = 0.20`
   - `MAX_CLUSTERED_SUBTREES = 5`

Tests updated:
- Added test_subtree_request_max_depth_validation
- Added test_heuristic_constants_are_sensible
- Updated existing tests to use new API (41 tests total)
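
The accessor pattern described in fixes 1 to 3 can be sketched roughly as follows. `RequestSketch`, its constructors, and `from_raw` (which stands in for untrusted deserialization) are hypothetical names for illustration. Clamping inside `depth()` is an extra defensive choice in this sketch, not something the commit message states.

```rust
pub const MAX_SUBTREE_DEPTH: usize = 64;

/// Hypothetical stand-in for `SubtreePrefetchRequest`.
pub struct RequestSketch {
    subtree_roots: Vec<[u8; 32]>,
    // Private, so callers must go through `depth()`.
    max_depth: Option<usize>,
}

impl RequestSketch {
    /// Request with an explicit depth, clamped at construction time.
    pub fn with_depth(subtree_roots: Vec<[u8; 32]>, depth: usize) -> Self {
        Self { subtree_roots, max_depth: Some(depth.min(MAX_SUBTREE_DEPTH)) }
    }

    /// Request with no explicit depth limit.
    pub fn unlimited(subtree_roots: Vec<[u8; 32]>) -> Self {
        Self { subtree_roots, max_depth: None }
    }

    /// Hypothetical helper that mimics deserialization: no clamping applied.
    pub fn from_raw(subtree_roots: Vec<[u8; 32]>, max_depth: Option<usize>) -> Self {
        Self { subtree_roots, max_depth }
    }

    pub fn roots(&self) -> &[[u8; 32]] {
        &self.subtree_roots
    }

    /// Always returns a bounded value; `None` maps to the maximum.
    pub fn depth(&self) -> usize {
        self.max_depth
            .map_or(MAX_SUBTREE_DEPTH, |d| d.min(MAX_SUBTREE_DEPTH))
    }

    /// True when the sender did not specify a depth.
    pub fn is_unlimited(&self) -> bool {
        self.max_depth.is_none()
    }

    /// Rejects an out-of-range `max_depth` that arrived over the wire.
    pub fn is_valid(&self) -> bool {
        self.max_depth.map_or(true, |d| d <= MAX_SUBTREE_DEPTH)
    }
}
```

A consumer that only ever calls `depth()` cannot observe an unbounded value, while `is_valid()` still lets the receiver reject a malformed request outright.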

meroreviewer

🤖 AI Code Reviewer

Reviewed by 1 agents | Quality score: 33% | Review time: 110.2s

🟡 1 warnings, 💡 1 suggestions, 📝 1 nitpicks. See inline comments.

_{🤖 Generated by AI Code Reviewer | Review ID: review-c5e2be46}

crates/node/primitives/src/sync/subtree.rs

cursor · 2026-02-11T11:58:02Z

Bugbot Autofix prepared fixes for 1 of the 1 bugs found in the latest run.

✅ Fixed: New threshold constants duplicate magic numbers in select_protocol
- Replaced hardcoded magic numbers (3 and 0.2) in select_protocol() with imports of DEEP_TREE_THRESHOLD and MAX_DIVERGENCE_RATIO from subtree.rs to ensure a single source of truth.

Or push these changes by commenting:

@cursor push 174dba2cc8

Preview (174dba2cc8)

diff --git a/crates/node/primitives/src/sync/protocol.rs b/crates/node/primitives/src/sync/protocol.rs
--- a/crates/node/primitives/src/sync/protocol.rs
+++ b/crates/node/primitives/src/sync/protocol.rs
@@ -5,6 +5,7 @@
 use borsh::{BorshDeserialize, BorshSerialize};
 
 use super::handshake::{SyncCapabilities, SyncHandshake};
+use super::subtree::{DEEP_TREE_THRESHOLD, MAX_DIVERGENCE_RATIO};
 
 // =============================================================================
 // Protocol Kind (Discriminant-only)
@@ -236,7 +237,11 @@
     }
 
     // Rule 4: Deep tree with localized changes
-    if remote.max_depth > 3 && divergence < 0.2 {
+    #[expect(
+        clippy::cast_possible_truncation,
+        reason = "DEEP_TREE_THRESHOLD is always small (currently 3)"
+    )]
+    if remote.max_depth > DEEP_TREE_THRESHOLD as u32 && divergence < MAX_DIVERGENCE_RATIO {
         return ProtocolSelection {
             protocol: SyncProtocol::SubtreePrefetch {
                 subtree_roots: vec![], // Will be populated during sync
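
As an alternative to the narrowing cast in the preview above, the comparison can be done in the wider type so that no lint suppression is needed. This is only a sketch: the function name is hypothetical, and the `usize` and `f64` constant types are assumptions inferred from the diff.

```rust
// Constant values come from the PR; their types are assumed.
pub const DEEP_TREE_THRESHOLD: usize = 3;
pub const MAX_DIVERGENCE_RATIO: f64 = 0.20;

/// Hypothetical helper mirroring "Rule 4: Deep tree with localized changes".
pub fn is_deep_with_localized_changes(remote_max_depth: u32, divergence: f64) -> bool {
    // Widen the u32 depth to usize instead of narrowing the threshold.
    // If the depth does not fit in usize, it certainly exceeds the threshold.
    let deep = usize::try_from(remote_max_depth).map_or(true, |d| d > DEEP_TREE_THRESHOLD);
    deep && divergence < MAX_DIVERGENCE_RATIO
}
```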

meroreviewer

🤖 AI Code Reviewer

Reviewed by 1 agents | Quality score: 33% | Review time: 133.2s

🟡 1 warnings, 📝 1 nitpicks. See inline comments.

_{🤖 Generated by AI Code Reviewer | Review ID: review-e7c1cabf}

meroreviewer · 2026-02-11T12:49:01Z

crates/node/primitives/src/sync/subtree.rs

+    }
+
+    #[test]
+    fn test_should_use_subtree_prefetch_edge_cases() {


🟡 Test doesn't actually verify total entity limit

The test creates 101 subtrees (100,000/1,000 + 1), which exceeds MAX_SUBTREES_PER_REQUEST (100), so is_valid() fails on the subtree count check before ever reaching the total entity count check.

Suggested fix:

Use fewer subtrees (e.g., 20) with more entities each (e.g., 5,001) so total exceeds 100,000 while each subtree stays under 10,000 and subtree count stays under 100.
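
The arithmetic behind this suggestion can be checked with a small sketch that reports which limit a given test configuration trips first. `first_rejection` and `Rejection` are hypothetical helpers, and the order of the checks (subtree count, then per-subtree size, then total) is an assumption based on the review comments.

```rust
const MAX_SUBTREES_PER_REQUEST: usize = 100;
const MAX_ENTITIES_PER_SUBTREE: usize = 10_000;
const MAX_TOTAL_ENTITIES: usize = 100_000;

#[derive(Debug, PartialEq, Eq)]
pub enum Rejection {
    TooManySubtrees,
    SubtreeTooLarge,
    TotalTooLarge,
}

/// Returns the first limit violated by `num_subtrees` equally sized subtrees,
/// or `None` if the configuration passes every check.
pub fn first_rejection(num_subtrees: usize, entities_per_subtree: usize) -> Option<Rejection> {
    if num_subtrees > MAX_SUBTREES_PER_REQUEST {
        return Some(Rejection::TooManySubtrees);
    }
    if entities_per_subtree > MAX_ENTITIES_PER_SUBTREE {
        return Some(Rejection::SubtreeTooLarge);
    }
    if num_subtrees.saturating_mul(entities_per_subtree) > MAX_TOTAL_ENTITIES {
        return Some(Rejection::TotalTooLarge);
    }
    None
}
```

Under these assumptions, the original configuration (101 subtrees of 1,000 entities) is rejected for its subtree count, whereas 20 subtrees of 5,001 entities total 100,020 and are rejected only by the total-entity check.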

meroreviewer · 2026-02-11T12:49:03Z

crates/node/primitives/src/sync/subtree.rs

+            truncated: false,
+        }
+    }
+


📝 Nit: Consider documenting NaN/Inf behavior for divergence_ratio

The heuristic function silently rejects NaN values (returns false) since NaN < anything is false; this is probably correct but undocumented.

Suggested fix:

Add a note in the doc comment that NaN/Inf values will cause the function to return false.
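
The floating-point behavior in question can be illustrated with a reduced version of the heuristic. The function names and the `f64` type are assumptions, and the real `should_use_subtree_prefetch()` may take other inputs. One detail worth noting: a bare `<` comparison rejects NaN and positive infinity, but negative infinity compares below any finite threshold, so rejecting it needs an explicit range check like the second function here.

```rust
pub const DEEP_TREE_THRESHOLD: usize = 3;
pub const MAX_DIVERGENCE_RATIO: f64 = 0.20;

/// Reduced heuristic using only a bare `<` comparison (hypothetical name).
pub fn prefers_prefetch(tree_depth: usize, divergence_ratio: f64) -> bool {
    // Any comparison involving NaN is false, so NaN yields `false` here.
    tree_depth > DEEP_TREE_THRESHOLD && divergence_ratio < MAX_DIVERGENCE_RATIO
}

/// Stricter variant that also requires the ratio to lie in [0.0, 1.0].
pub fn prefers_prefetch_strict(tree_depth: usize, divergence_ratio: f64) -> bool {
    // `contains` is false for NaN and for both infinities.
    (0.0..=1.0).contains(&divergence_ratio) && prefers_prefetch(tree_depth, divergence_ratio)
}
```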

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

^{Bugbot Autofix is ON, but it could not run because the branch was deleted or merged before Autofix could start.}

cursor · 2026-02-11T12:53:06Z

crates/node/primitives/src/sync/subtree.rs

+
+        let response = SubtreePrefetchResponse::complete(subtrees);
+        assert!(!response.is_valid()); // Should be invalid due to total entity count
+    }


Test passes for wrong reason, total entity limit untested

Medium Severity

The test_subtree_response_validation_total_entity_limit test doesn't actually exercise the MAX_TOTAL_ENTITIES check. With entities_per_subtree = 1000, num_subtrees computes to (100_000 / 1000) + 1 = 101, which exceeds MAX_SUBTREES_PER_REQUEST (100). So is_valid() returns false at the subtree count check (line 287) before ever reaching the total entity count check (line 295). The assertion passes for the wrong reason, leaving the MAX_TOTAL_ENTITIES validation effectively untested — if that check were removed, no test would catch it.

Additional Locations (1)

crates/node/primitives/src/sync/subtree.rs#L284-L301

This was referenced Feb 11, 2026

feat(sync): 010 - LevelWise Sync Strategy #1932

Merged

feat(sync): 010 - LevelWise Sync Strategy #1873

Closed

xilosada mentioned this pull request Feb 11, 2026

feat(sync): 011 - Snapshot Sync (Fresh Nodes Only) #1933

Merged

4 tasks

chefsale approved these changes Feb 11, 2026

Merge branch 'master' into feat/sync-009-subtree-prefetch

abdf8ee

xilosada merged commit f675578 into master Feb 11, 2026
9 checks passed

xilosada deleted the feat/sync-009-subtree-prefetch branch February 11, 2026 12:46
