Draft
Conversation
Implements issue #1774 - HashComparison Sync Strategy (P0 Primary) from the Sync Protocol series. Adds types for recursive Merkle tree comparison (CIP §4 STATE-BASED): - TreeNodeRequest for traversal (node_id, max_depth) - TreeNodeResponse with nodes and not_found flag - TreeNode (internal with children OR leaf with data) - TreeLeafData (key, value, metadata) - LeafMetadata (crdt_type, hlc, version, collection_id, parent_id) - CrdtType enum (LwwRegister, GCounter, PnCounter, etc.) - TreeCompareResult and compare_tree_nodes() function Key design decisions: - Metadata MUST include crdt_type for proper CRDT merge - TreeNode can be internal (has children) or leaf (has data) - compare_tree_nodes() identifies differing subtrees Protocol flow: 1. Start at root 2. Request children, compare hashes 3. Recurse on differing subtrees 4. Fetch leaf data, apply via CRDT merge Includes 13 new unit tests (79 total). CIP Reference: §4 - State Machine (STATE-BASED branch)
Fixes issues identified in PR #1870 review: 1. **Critical**: compare_tree_nodes now properly handles bidirectional sync - Changed signature to (Option<&TreeNode>, Option<&TreeNode>) - Returns RemoteMissing when local has data remote doesn't - Leaf nodes with different hashes return Different (not empty) 2. **Warnings**: - TreeCompareResult::Different now includes local_only_children for bidirectional sync support - O(n*m) → O(n+m) using HashSet for child comparison 3. **Suggestions**: - Added MAX_TREE_DEPTH = 64 constant - TreeNodeRequest::with_depth clamps to MAX_TREE_DEPTH 4. **Tests**: Added 4 new tests for edge cases: - test_compare_tree_nodes_leaf_content_differs - test_compare_tree_nodes_remote_missing - test_compare_tree_nodes_local_only_children - test_tree_node_request_max_depth_validation All 46 sync tests pass.
…th docs Address AI reviewer comments: - Add BorshSerialize/BorshDeserialize to TreeCompareResult for wire protocol consistency - Document max_depth security implications and MAX_TREE_DEPTH enforcement
Address AI reviewer comments (review-50843406): - Add depth() accessor that always clamps max_depth for safe deserialization - Add MAX_NODES_PER_RESPONSE constant with is_valid() for response validation - Add MAX_CHILDREN_PER_NODE constant with is_valid() for node validation - Add needs_push() method for bidirectional sync scenarios - Clarify TreeNode id vs hash semantic distinction in documentation - Add comprehensive tests for all new validation methods
Add TODO comment noting CrdtType is duplicated between sync and storage modules, with a recommendation to consolidate into calimero-primitives. Document LeafMetadata as a wire-protocol-optimized subset of storage Metadata, explaining the design decision to keep it separate for smaller payloads.
…for leaf nodes - Add common_children field to TreeCompareResult::Different to track children present on both sides that need recursive comparison - Rename differing_children to remote_only_children for clarity - Fix needs_push() to return true for differing leaf nodes (when all child vecs are empty but hashes differ) - Update tests to cover the new behavior Fixes: - Merkle comparison now properly reports shared children for recursive traversal when parent hashes differ - Bidirectional CRDT sync now works for differing leaf nodes
Cherry-pick Bugbot PR #1914: - Add common_children field to TreeCompareResult::Different for proper Merkle traversal - Rename differing_children → remote_only_children for clarity - Fix needs_push() to return true for differing leaf nodes Additional improvements: - TreeNodeResponse::is_valid() now validates all nested TreeNodes - TreeNode::is_valid() now checks structural invariant (children XOR leaf_data) - Add MAX_LEAF_VALUE_SIZE (1MB) constant with TreeLeafData::is_valid() - Add over-limit validation tests for all bounds - Move HashSet import to module level (StdExternalCrate pattern) - Change leaves() to return iterator instead of Vec (zero allocation) Tests: 52 sync tests passing
- Make max_depth field private to prevent bypass of depth() clamping - TreeNode::is_valid() now rejects empty nodes (no children AND no leaf_data) - Improve needs_push() documentation explaining leaf detection invariant - Simplify CrdtType TODO comment to just reference tracking issue #1912 - Update tests for stricter validation rules All 52 sync tests passing
…ergence The needs_push method now correctly returns true when local is a leaf node and remote is an internal node (structural divergence). Previously, the leaf detection heuristic only returned true when all child vecs were empty, but this missed the case where local is a leaf (no local_only or common children) while remote has children (remote_only is non-empty). The fix changes the final check from: remote_only_children.is_empty() && common_children.is_empty() to: common_children.is_empty() Since we already checked local_only_children.is_empty() above, if common_children is also empty, local has no children at all (is a leaf) and its data needs to be pushed for CRDT merge.
Contributor
Author
|
Cursor Agent can help with this pull request. Just |
|
Your PR title does not adhere to the Conventional Commits convention: Common errors to avoid:
|
There was a problem hiding this comment.
🤖 AI Code Reviewer
Reviewed by 1 agents | Quality score: 80% | Review time: 159.4s
✅ No Issues Found
All agents reviewed the code and found no issues. LGTM! 🎉
🤖 Generated by AI Code Reviewer | Review ID: review-9ce77f19
|
This pull request has been automatically marked as stale. If this pull request is still relevant, please leave any comment (for example, "bump"), and we'll keep it open. We are sorry that we haven't been able to prioritize reviewing it yet. Your contribution is very much appreciated. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fix:
needs_pushmisses local leaf data in structural divergenceDescription
This PR fixes a bug in the
TreeCompareResult::needs_pushmethod where it incorrectly returnedfalsewhen a local node was a leaf with data, but the remote node at the same position was an internal node with children. This scenario, representing structural divergence, meant local leaf data was not being pushed, violating the method's contract.The fix modifies the
Differentcase inneeds_pushto correctly identify when the local node is a leaf (i.e., has nolocal_only_childrenand nocommon_children). In such cases, the local leaf data needs to be pushed for proper bidirectional CRDT sync, even if the remote node has children. The inline documentation forneeds_pushhas been updated to reflect this behavior.This addresses bug
ref1_0ca2a992-7860-4c86-8856-6838e5d20455.Test plan
The existing unit test
with_remote_onlyincrates/node/primitives/src/sync.rswas updated to asserttrueforneeds_pushin theDifferentcase wherelocal_only_childrenandcommon_childrenare empty, butremote_only_childrenis non-empty. This test now correctly verifies the fix for structural divergence.All
cargo test,cargo fmt --check, andcargo clippychecks were run and passed.Documentation update
The inline documentation for the
needs_pushmethod incrates/node/primitives/src/sync.rshas been updated to clarify its behavior regarding structural divergence. No other documentation updates are immediately required.