Conversation
…tocol testing Replaces the flat `DigestCache` HashMap in `SimNode` with `SimStorage`, which uses the real `calimero-storage::Index<MainStorage>` implementation backed by `InMemoryDB`. This enables accurate simulation of sync protocols that depend on tree structure (e.g., HashComparison). Key changes: - Add `SimStorage` with in-memory Merkle tree using `Store + InMemoryDB` - Add `RuntimeEnv` bridge to connect storage Key operations - Update `SimNode` to use hybrid storage: real tree + metadata cache - Add `insert_entity_hierarchical()` for creating proper tree depth - Make `Index::get_index()` and `get_children_of()` public for traversal - Add tree structure verification tests for protocol selection - Fix: prevent self-referencing cycle in hierarchical insertion The entity counting now correctly excludes intermediate tree nodes, and `iter_entities()` returns only "real" entities (with metadata). Tree depth now affects protocol selection: - SubtreePrefetch scenarios have max_depth > 3 - LevelWise scenarios have max_depth <= 2 Spec reference: Simulation Framework Spec §5, §7, §11 Co-authored-by: cursor[bot] <cursor@calimero.network>
- Delegate apply_storage_op Insert/Update to insert_entity_with_metadata to avoid duplicating dual-write logic (cursor bugbot feedback) - Extract magic number 24 to MAX_HIERARCHICAL_DEPTH constant with docs - Add comprehensive documentation for max_depth() semantics explaining the difference between storage-level (root-inclusive) and protocol-level (root-exclusive) depth values
Implements the HashComparison sync protocol (CIP §4) with proper integration into SyncManager and comprehensive test coverage. Key changes: - Add HashComparison protocol implementation with iterative DFS traversal - Integrate with SyncManager for initiator and responder roles - Add wire protocol types (TreeNodeRequest, TreeNodeResponse) in new wire.rs - Implement CRDT merge at leaves with proper timestamp extraction (I5) - Add force_protocol mechanism in SimNode for testing - Add 10 HashComparison simulation tests - Add compliance tests for I4 (convergence) and I5 (CRDT merge) Invariants verified: - I5: CRDT merge only, no overwrite for initialized nodes - I4: Strategy equivalence verified via compliance tests Test coverage: 230 tests passing
Fix Bug 1: Child nodes incorrectly compared against parent's local version - Only compare the first node (index 0) in the response with local_node - For children (index > 0), skip comparison and let them be processed when popped from to_compare stack with their correct local versions - This prevents the debug_assert in compare_tree_nodes from firing Fix Bug 2: Missing RuntimeEnv causes incorrect LWW timestamp comparison - Pass runtime_env through apply_leaf_with_crdt_merge to merge_entity_values - Wrap Index::get_index call in with_runtime_env to access actual storage - This ensures existing timestamps are correctly retrieved for LWW merge Fix Bug 3: CRDT merge uses wrong storage key format - Use Key::Entry(entity_id).to_bytes() to apply SHA256 hashing - This matches the key format used by create_storage_callbacks - Without this fix, reads/writes target wrong storage locations
|
Cursor Agent can help with this pull request. Just |
There was a problem hiding this comment.
🤖 AI Code Reviewer
Reviewed by 3 agents | Quality score: 93% | Review time: 257.4s
🟡 2 warnings, 💡 3 suggestions. See inline comments.
🤖 Generated by AI Code Reviewer | Review ID: review-dbcfba6f
| @@ -573,11 +589,14 @@ impl SyncManager { | |||
|
|
|||
There was a problem hiding this comment.
🟡 Silent fallback to timestamp 0 on get_index failure masks errors
If with_runtime_env or get_index fails due to an unexpected error (not just missing entry), the code silently uses timestamp 0, which could cause incoming values to incorrectly win LWW resolution and violate invariant I5.
Suggested fix:
Log a warning when get_index returns an error (distinct from None) so storage failures during sync are observable, even if the fallback behavior is retained.
| context_id: ContextId, | ||
| leaf: &TreeLeafData, | ||
| runtime_env: &RuntimeEnv, | ||
| ) -> Result<()> { |
There was a problem hiding this comment.
💡 Critical key-hashing fix lacks unit test coverage
The storage key format fix is critical for correctness but the PR notes unit tests couldn't run; consider adding an integration test that verifies the hashed key matches the expected storage location.
Suggested fix:
Add a test that stores an entity via normal path and verifies `apply_leaf_with_crdt_merge` can read it back using the same key derivation.
| } else if idx == 0 { | ||
| // Internal node at index 0: this is the requested node, compare with local | ||
| // For children (idx > 0), skip comparison here - they will be processed | ||
| // when they are popped from the to_compare stack with their correct local versions. |
There was a problem hiding this comment.
💡 Children data in response is discarded and re-requested
Internal nodes at idx > 0 are skipped entirely even though their data is already in the response; they will be re-fetched when popped from to_compare stack, causing redundant network round-trips for max_depth=1 requests.
Suggested fix:
Consider caching child node data from the response in a local map keyed by node_id, then checking this cache before sending TreeNodeRequest to avoid duplicate fetches.
| self.apply_leaf_with_crdt_merge(context_id, leaf_data) | ||
| self.apply_leaf_with_crdt_merge(context_id, leaf_data, &runtime_env) | ||
| .await | ||
| .wrap_err("failed to merge leaf entity")?; |
There was a problem hiding this comment.
💡 Implicit fall-through for internal child nodes reduces clarity
Internal nodes at idx > 0 silently fall through the if-else chain; an explicit else branch with a trace log would make the intentional no-op explicit and aid future debugging.
Suggested fix:
Add `else { trace!(%context_id, node_id = ?remote_node.node_id, "Skipping child internal node, will process when popped from stack"); }` after line 390.
| @@ -573,11 +589,14 @@ impl SyncManager { | |||
|
|
|||
There was a problem hiding this comment.
🟡 Clone of runtime_env inside loop iteration may be expensive if called frequently
The runtime_env.clone() occurs for every LWW merge; if RuntimeEnv contains Rc-wrapped callbacks this is cheap, but this should be verified or documented.
Suggested fix:
Confirm RuntimeEnv::clone is cheap (Rc clones) or move the closure outside the match arm if performance matters.
|
This pull request has been automatically marked as stale. If this pull request is still relevant, please leave any comment (for example, "bump"), and we'll keep it open. We are sorry that we haven't been able to prioritize reviewing it yet. Your contribution is very much appreciated. |
Fixes for Hash Comparison and CRDT Merge Logic
Description
This PR addresses three high-severity logic bugs found in the hash comparison and CRDT merge protocol, improving correctness and efficiency.
Child nodes incorrectly compared against parent's local version (bug_id: 68164f61-f021-47f2-a59c-b8b4ad0a75c5):
max_depth=Some(1)request returns the parent node and its children, the original logic incorrectly compared child nodes against the parent's local version. This led todebug_assertfailures in debug builds and silent incorrect comparisons in release, breaking the subtree-skipping optimization.node_idis popped from theto_comparestack with their respective local versions. Leaf nodes continue to be merged via CRDT regardless of their position in the response.Missing RuntimeEnv causes incorrect LWW timestamp comparison (bug_id: 2ed1efc4-22e5-41b2-98b9-24a95e5fe207):
Index::<MainStorage>::get_indexcall withinmerge_entity_valueswas not wrapped inwith_runtime_env, causing it to fall back toMockedStorage. This resulted inexisting_tsalways defaulting to 0, leading to incoming values always overwriting existing local values, violating CRDT merge invariant I5.runtime_envis now passed toapply_leaf_with_crdt_mergeand subsequently tomerge_entity_values. Theget_indexcall inmerge_entity_valuesis now correctly wrapped withwith_runtime_envto ensure it accesses the actual storage.CRDT merge uses wrong storage key format (bug_id: ref1_5b9ac4a8-9873-4837-b10a-fa3ecdfafd84):
apply_leaf_with_crdt_mergewas constructingContextStateKeyusing raw entity ID bytes (leaf.key). However, the actual entity data is stored under a SHA256-hashed key (e.g.,SHA256(discriminant || entity_id)). This mismatch caused the merge to read from a nonexistent location, skip the merge, and write the incoming value to the wrong location, effectively preventing any CRDT merge or update to the correct entity data.apply_leaf_with_crdt_mergenow constructs theContextStateKeyby first creating acalimero_storage::Key::Entryfrom the entity ID and then calling.to_bytes()on it. This ensures the key is correctly SHA256-hashed, matching the format used for storing entity data.Test plan
The changes were verified by:
cargo checkcargo buildcargo clippyAll checks passed successfully.
Unit tests for the
hash_comparisonmodule were attempted, but environment-specific linking issues prevented their execution. The fixes address specific logic errors identified by static analysis. Existing end-to-end tests for synchronization should now behave more correctly.Documentation update
No public or internal documentation updates are required for these internal logic fixes.