Hash comparison protocol cleanup#1975

Draft
cursor[bot] wants to merge 6 commits into master from
cursor/hash-comparison-protocol-cleanup-305b

@cursor cursor bot commented Feb 13, 2026

Node Sync: Refactor Hash Comparison Protocol and Fix Divergence

Description

This PR addresses several code quality and consistency issues within the node's synchronization hash comparison protocol and removes an accidentally committed development document.

Specifically, it fixes:

  • Duplicate tree-node lookup function across two modules (Bug 1):
    • Consolidated the MAX_REQUEST_DEPTH constant to use MAX_TREE_REQUEST_DEPTH from crates/node/src/sync/primitives.rs.
    • Extracted the duplicated get_local_tree_node logic into a shared public function in hash_comparison_protocol.rs, which is now called by SyncManager::get_local_tree_node_from_index in hash_comparison.rs. This reduces code duplication and improves maintainability.
  • Simulation responder diverges from production responder path (Bug 2):
    • Aligned the production responder (SyncManager::handle_tree_node_request) with the standalone responder by updating it to use Index::get_hashes_for(root_id) for root hash detection, ensuring consistent behavior and more accurate simulation testing.
  • Development planning document committed to repository (Bug 3):
    • Removed the plans/sim-transport-abstraction.md file, which was an internal development document and not intended for the repository.

The motivation is to improve code quality, reduce maintenance burden, and ensure that simulation tests accurately reflect the production environment's behavior.
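
The delegation described under Bug 1 can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `Index`, `TreeNode`, and `NodeId` types are simplified stand-ins, the value of `MAX_TREE_REQUEST_DEPTH` is a placeholder, and rejecting an over-limit depth (rather than clamping it) is an assumption.

```rust
use std::collections::HashMap;

/// Stand-in for the shared limit in `primitives.rs`; the value 16 is a placeholder.
pub const MAX_TREE_REQUEST_DEPTH: u8 = 16;

pub type NodeId = [u8; 32];

/// Simplified tree node (assumption); the real type carries more fields.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TreeNode {
    pub id: NodeId,
    pub hash: [u8; 32],
}

/// Hypothetical in-memory index used only for this sketch.
#[derive(Default)]
pub struct Index {
    nodes: HashMap<NodeId, TreeNode>,
}

impl Index {
    pub fn insert(&mut self, node: TreeNode) {
        self.nodes.insert(node.id, node);
    }
}

/// Single shared lookup, as it might live in `hash_comparison_protocol.rs`.
pub fn get_local_tree_node(
    index: &Index,
    node_id: &NodeId,
    max_depth: u8,
) -> Result<Option<TreeNode>, String> {
    // Assumption: over-limit requests are rejected rather than clamped.
    if max_depth > MAX_TREE_REQUEST_DEPTH {
        return Err(format!(
            "requested depth {} exceeds limit {}",
            max_depth, MAX_TREE_REQUEST_DEPTH
        ));
    }
    Ok(index.nodes.get(node_id).cloned())
}

pub struct SyncManager {
    pub index: Index,
}

impl SyncManager {
    /// Thin wrapper: delegates to the shared function instead of duplicating it.
    pub fn get_local_tree_node_from_index(
        &self,
        node_id: &NodeId,
        max_depth: u8,
    ) -> Result<Option<TreeNode>, String> {
        get_local_tree_node(&self.index, node_id, max_depth)
    }
}
```

With one canonical lookup and one depth constant, the production and standalone paths cannot drift apart on either the limit or the lookup behavior.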

Test plan

The changes were verified by successfully running cargo check and cargo fmt on the node crate. No new end-to-end tests were added as these are refactoring and bug fixes to existing logic; existing simulation tests should now provide more accurate coverage. No user-interface changes were made.

Documentation update

The plans/sim-transport-abstraction.md document was removed. No other public or internal documentation requires updates.


xilosada and others added 6 commits February 12, 2026 16:21
Add infrastructure to run sync protocols through in-memory channels,
enabling testing of actual message flow and state convergence.

Key changes:
- Add SyncTransport trait abstracting network operations (send/recv/close)
- Add StreamTransport for production Stream wrapper
- Add SimStream for in-memory channel-based transport
- Add protocol.rs with execute_hash_comparison_sync for simulation
- Add SimNode::new_in_context for shared context testing
- Fix entity_count to use storage leaf_count (source of truth)

The simulation now uses the exact same storage code path as production
(Index<MainStorage>, Interface<MainStorage>, RuntimeEnv callbacks),
with only the Database implementation differing (InMemoryDB vs RocksDB).

Phase 1 of sim-transport-abstraction plan.
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Introduces a trait-based architecture for sync protocols that enables:
- Same protocol code to run in production and simulation
- Shared storage bridge (create_runtime_env) for both backends
- Standalone HashComparisonProtocol implementation

Changes:
- Add SyncProtocolExecutor trait in node-primitives
- Add create_runtime_env shared helper in storage_bridge.rs
- Extract HashComparisonProtocol to standalone module
- Clean up hash_comparison.rs (responder only, ~875 → ~285 lines)
- Fix wire protocol to use u64 for sequence_id (portability)
- Update simulation tests to use production protocol directly

This removes ~730 lines of duplicated code and ensures the simulation
tests exercise the exact same protocol logic as production.
- Optimize RuntimeEnv: create once before responder loop instead of
  per-request, reducing allocations
- Document breaking wire format change: sequence_id changed from
  usize to u64 for cross-platform portability
- Improve _nonce parameter docs: clarify it's reserved for future
  encrypted sync
Add comprehensive 3-node sync tests to verify protocol logic:
- test_three_node_chain_sync: Chain sync A←B, A←C, B←A, C←A
- test_three_node_mesh_sync: Full mesh sync between all pairs
- test_three_node_fresh_join: Empty node joining existing network
- test_three_node_crdt_conflict: CRDT merge with conflicting updates

All tests pass, confirming HashComparisonProtocol logic is correct
for multi-node scenarios.
… dev docs

- Remove duplicate MAX_REQUEST_DEPTH constants from both hash_comparison.rs
  and hash_comparison_protocol.rs, using MAX_TREE_REQUEST_DEPTH from primitives
- Extract get_local_tree_node as shared canonical function in
  hash_comparison_protocol.rs, have SyncManager delegate to it
- Make production responder use Index-based root hash detection consistent
  with standalone responder (both now use Index::get_hashes_for)
- Remove development planning document plans/sim-transport-abstraction.md

cursor bot commented Feb 13, 2026

Cursor Agent can help with this pull request. Just @cursor in comments and I'll start working on changes in this branch.
Learn more about Cursor Agents

@github-actions

Your PR title does not adhere to the Conventional Commits convention:

<type>(<scope>): <subject>

Common errors to avoid:

  1. The title must be in lower case.
  2. Allowed type values are: build, ci, docs, feat, fix, perf, refactor, test.

@meroreviewer meroreviewer bot left a comment

🤖 AI Code Reviewer

Reviewed by 3 agents | Quality score: 100% | Review time: 160.1s

🟡 2 warnings, 💡 1 suggestion. See inline comments.


🤖 Generated by AI Code Reviewer | Review ID: review-36178718

@@ -105,11 +98,27 @@ impl SyncManager {
let datastore = self.context_client.datastore_handle().into_inner();

🟡 Silent error handling with unwrap_or([0; 32]) changes failure behavior

If Index::get_hashes_for fails, the code silently defaults to [0; 32], causing all requests to have is_root_request=false. The old code returned TreeNodeResponse::not_found() when context lookup failed, which is more explicit failure handling.

Suggested fix:

Consider logging a warning when the index lookup fails, or propagating the error to maintain explicit failure semantics.
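
One way to keep the failure explicit is sketched below. It is an illustration only: the `Response` enum and the `respond` function are hypothetical, the lookup result is modelled as a plain `Result<Option<Hash>, E>` rather than the real return type of `Index::get_hashes_for`, and `eprintln!` stands in for whatever logging the crate uses. Treating a missing root entry as not-found is also an assumption.

```rust
use std::fmt::Display;

pub type Hash = [u8; 32];

/// Hypothetical, simplified response type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum Response {
    NotFound,
    Found { is_root_request: bool },
}

/// Decide how to answer a request given the outcome of the root hash lookup.
/// Failures are reported and answered with `NotFound` instead of being
/// replaced by an all-zero hash.
pub fn respond<E: Display>(lookup: Result<Option<Hash>, E>, requested: Hash) -> Response {
    let local_root_hash = match lookup {
        Ok(Some(hash)) => hash,
        Ok(None) => {
            // Assumption: a missing root entry is treated like a failed lookup.
            eprintln!("warning: no root hash found in index");
            return Response::NotFound;
        }
        Err(err) => {
            eprintln!("warning: index lookup failed: {}", err);
            return Response::NotFound;
        }
    };
    Response::Found {
        is_root_request: requested == local_root_hash,
    }
}
```

Compared with `unwrap_or([0; 32])`, this keeps the old not-found behavior on failure and leaves a trace in the logs.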

let datastore = self.context_client.datastore_handle().into_inner();
let runtime_env = create_runtime_env(&datastore, context_id, our_identity);

// Get our root hash from Index (consistent with standalone responder)

🟡 Silent error handling for root hash may cause unexpected root-request behavior

If the Index lookup fails, local_root_hash defaults to [0; 32], meaning a peer requesting node_id = [0; 32] would be incorrectly treated as a root request, potentially leaking context root data to unauthorized requests.

Suggested fix:

Return an error or log a warning when Index lookup fails instead of silently defaulting to zeros; alternatively ensure the zero-hash case is explicitly handled in `is_root_request` logic.
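
The alternative of handling the zero-hash case explicitly might look like the sketch below. The function signature and the `ZERO_HASH` constant are hypothetical; the real `is_root_request` logic is not shown in this review, and whether an all-zero root hash can ever be legitimate (for example, for an empty context) is an open assumption here.

```rust
pub type Hash = [u8; 32];

/// Sentinel that must never match a request (name is hypothetical).
pub const ZERO_HASH: Hash = [0; 32];

/// Returns true only when a real, non-zero local root hash is known
/// and the requested id equals it.
pub fn is_root_request(requested: &Hash, local_root_hash: Option<&Hash>) -> bool {
    match local_root_hash {
        // A known, non-zero root hash: compare normally.
        Some(root) if *root != ZERO_HASH => requested == root,
        // Unknown or zeroed root hash: never treat the request as a root request.
        _ => false,
    }
}
```

Passing the root hash as an `Option` makes the "lookup failed" state visible in the type, so a request for `[0; 32]` cannot match by accident.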


/// Get a tree node from the local Merkle tree Index.
fn get_local_tree_node(
///

💡 Public function lacks full doc comment

Now that get_local_tree_node is public and shared between modules, its doc comment could include the return semantics (None vs error) and example usage.

Suggested fix:

Expand the doc comment to describe when `Ok(None)` vs `Err` is returned and the expected `with_runtime_env` setup.
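
A doc comment along these lines could spell out the return semantics. This is a sketch under assumptions: the `NodeStore` trait, `LookupError`, and the two example stores are hypothetical and exist only to make the snippet self-contained, and the real function's parameters and error type may differ.

```rust
pub type NodeId = [u8; 32];

/// Simplified node type (assumption).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TreeNode {
    pub id: NodeId,
    pub hash: [u8; 32],
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum LookupError {
    Storage(String),
}

/// Hypothetical storage abstraction standing in for the real index.
pub trait NodeStore {
    fn read(&self, id: &NodeId) -> Result<Option<TreeNode>, LookupError>;
}

/// Get a tree node from the local Merkle tree index.
///
/// # Returns
///
/// - `Ok(Some(node))` if the index contains an entry for `node_id`.
/// - `Ok(None)` if the lookup succeeded but no such node exists locally.
///   Callers would typically answer the peer with a not-found response.
///
/// # Errors
///
/// Returns `Err` only when the underlying storage read itself fails.
///
/// # Runtime environment
///
/// The review mentions a `with_runtime_env` setup that callers are expected
/// to provide; this sketch does not model it.
pub fn get_local_tree_node<S: NodeStore>(
    store: &S,
    node_id: &NodeId,
) -> Result<Option<TreeNode>, LookupError> {
    store.read(node_id)
}

/// Example store that never finds anything.
pub struct EmptyStore;

impl NodeStore for EmptyStore {
    fn read(&self, _id: &NodeId) -> Result<Option<TreeNode>, LookupError> {
        Ok(None)
    }
}

/// Example store whose reads always fail.
pub struct FailingStore;

impl NodeStore for FailingStore {
    fn read(&self, _id: &NodeId) -> Result<Option<TreeNode>, LookupError> {
        Err(LookupError::Storage("read failed".to_string()))
    }
}
```

Separating "absent" (`Ok(None)`) from "could not read" (`Err`) lets both responders handle the two cases the same way.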

Base automatically changed from feat/sim-transport-abstraction to master February 13, 2026 09:52
@github-actions

This pull request has been automatically marked as stale. If this pull request is still relevant, please leave any comment (for example, "bump"), and we'll keep it open. We are sorry that we haven't been able to prioritize reviewing it yet. Your contribution is very much appreciated.

@github-actions github-actions bot added the Stale label Feb 20, 2026

2 participants