Conversation
Add infrastructure to run sync protocols through in-memory channels, enabling testing of actual message flow and state convergence. Key changes: - Add SyncTransport trait abstracting network operations (send/recv/close) - Add StreamTransport for production Stream wrapper - Add SimStream for in-memory channel-based transport - Add protocol.rs with execute_hash_comparison_sync for simulation - Add SimNode::new_in_context for shared context testing - Fix entity_count to use storage leaf_count (source of truth) The simulation now uses the exact same storage code path as production (Index<MainStorage>, Interface<MainStorage>, RuntimeEnv callbacks), with only the Database implementation differing (InMemoryDB vs RocksDB). Phase 1 of sim-transport-abstraction plan.
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Introduces a trait-based architecture for sync protocols that enables: - Same protocol code to run in production and simulation - Shared storage bridge (create_runtime_env) for both backends - Standalone HashComparisonProtocol implementation Changes: - Add SyncProtocolExecutor trait in node-primitives - Add create_runtime_env shared helper in storage_bridge.rs - Extract HashComparisonProtocol to standalone module - Clean up hash_comparison.rs (responder only, ~875 → ~285 lines) - Fix wire protocol to use u64 for sequence_id (portability) - Update simulation tests to use production protocol directly This removes ~730 lines of duplicated code and ensures the simulation tests exercise the exact same protocol logic as production.
- Optimize RuntimeEnv: create once before responder loop instead of per-request, reducing allocations - Document breaking wire format change: sequence_id changed from usize to u64 for cross-platform portability - Improve _nonce parameter docs: clarify it's reserved for future encrypted sync
Add comprehensive 3-node sync tests to verify protocol logic: - test_three_node_chain_sync: Chain sync A←B, A←C, B←A, C←A - test_three_node_mesh_sync: Full mesh sync between all pairs - test_three_node_fresh_join: Empty node joining existing network - test_three_node_crdt_conflict: CRDT merge with conflicting updates All tests pass, confirming HashComparisonProtocol logic is correct for multi-node scenarios.
… dev docs - Remove duplicate MAX_REQUEST_DEPTH constants from both hash_comparison.rs and hash_comparison_protocol.rs, using MAX_TREE_REQUEST_DEPTH from primitives - Extract get_local_tree_node as shared canonical function in hash_comparison_protocol.rs, have SyncManager delegate to it - Make production responder use Index-based root hash detection consistent with standalone responder (both now use Index::get_hashes_for) - Remove development planning document plans/sim-transport-abstraction.md
|
Cursor Agent can help with this pull request. Just |
|
Your PR title does not adhere to the Conventional Commits convention: Common errors to avoid:
|
There was a problem hiding this comment.
🤖 AI Code Reviewer
Reviewed by 3 agents | Quality score: 100% | Review time: 160.1s
🟡 2 warnings, 💡 1 suggestions. See inline comments.
🤖 Generated by AI Code Reviewer | Review ID: review-36178718
| @@ -105,11 +98,27 @@ impl SyncManager { | |||
| let datastore = self.context_client.datastore_handle().into_inner(); | |||
There was a problem hiding this comment.
🟡 Silent error handling with unwrap_or([0; 32]) changes failure behavior
If Index::get_hashes_for fails, the code silently defaults to [0; 32], causing all requests to have is_root_request=false. The old code returned TreeNodeResponse::not_found() when context lookup failed, which is more explicit failure handling.
Suggested fix:
Consider logging a warning when the index lookup fails, or propagating the error to maintain explicit failure semantics.
| let datastore = self.context_client.datastore_handle().into_inner(); | ||
| let runtime_env = create_runtime_env(&datastore, context_id, our_identity); | ||
|
|
||
| // Get our root hash from Index (consistent with standalone responder) |
There was a problem hiding this comment.
🟡 Silent error handling for root hash may cause unexpected root-request behavior
If Index lookup fails, local_root_hash defaults to [0; 32]}, meaning a peer requesting node_id = [0; 32] would be incorrectly treated as a root request, potentially leaking context root data to unauthorized requests.
Suggested fix:
Return an error or log a warning when Index lookup fails instead of silently defaulting to zeros; alternatively ensure the zero-hash case is explicitly handled in `is_root_request` logic.
|
|
||
| /// Get a tree node from the local Merkle tree Index. | ||
| fn get_local_tree_node( | ||
| /// |
There was a problem hiding this comment.
💡 Public function lacks full doc comment
Now that get_local_tree_node is public and shared between modules, its doc comment could include the return semantics (None vs error) and example usage.
Suggested fix:
Expand the doc comment to describe when `Ok(None)` vs `Err` is returned and the expected `with_runtime_env` setup.
|
This pull request has been automatically marked as stale. If this pull request is still relevant, please leave any comment (for example, "bump"), and we'll keep it open. We are sorry that we haven't been able to prioritize reviewing it yet. Your contribution is very much appreciated. |
Node Sync: Refactor Hash Comparison Protocol and Fix Divergence
Description
This PR addresses several code quality and consistency issues within the node's synchronization hash comparison protocol and removes an accidentally committed development document.
Specifically, it fixes:
MAX_REQUEST_DEPTHconstant to useMAX_TREE_REQUEST_DEPTHfromcrates/node/src/sync/primitives.rs.get_local_tree_nodelogic into a shared public function inhash_comparison_protocol.rs, which is now called bySyncManager::get_local_tree_node_from_indexinhash_comparison.rs. This reduces code duplication and improves maintainability.SyncManager::handle_tree_node_request) with the standalone responder by updating it to useIndex::get_hashes_for(root_id)for root hash detection, ensuring consistent behavior and more accurate simulation testing.plans/sim-transport-abstraction.mdfile, which was an internal development document and not intended for the repository.The motivation is to improve code quality, reduce maintenance burden, and ensure that simulation tests accurately reflect the production environment's behavior.
Test plan
The changes were verified by successfully running
cargo checkandcargo fmton thenodecrate. No new end-to-end tests were added as these are refactoring and bug fixes to existing logic; existing simulation tests should now provide more accurate coverage. No user-interface changes were made.Documentation update
The
plans/sim-transport-abstraction.mddocument was removed. No other public or internal documentation requires updates.