Skip to content

Hash comparison sync issues#1974

Draft
cursor[bot] wants to merge 4 commits intomasterfrom
cursor/hash-comparison-sync-issues-8f24
Draft

Hash comparison sync issues#1974
cursor[bot] wants to merge 4 commits intomasterfrom
cursor/hash-comparison-sync-issues-8f24

Conversation

@cursor
Copy link
Contributor

@cursor cursor bot commented Feb 12, 2026

Fix: Sync for non-Public storage types and deduplicate tree node lookup

Description

This PR addresses two issues related to the hash comparison synchronization protocol:

  1. Metadata defaults break sync for non-Public storage types (d747f08a-6a0d-41d7-8d53-6c1786b87b5f): The apply_leaf_with_crdt_merge function was creating Metadata::default() (which has StorageType::Public) for updates. This caused sync failures for entities with StorageType::User or StorageType::Frozen because Interface::apply_action would reject the update due to a StorageType mismatch. The fix ensures that the existing entity's storage_type is read from the index and preserved in the update action's metadata, allowing correct synchronization of all storage types.
  2. Duplicated tree node lookup across two modules (6b5c2a0f-a43b-4ad9-bb73-3a2c47185407): The logic for retrieving a Merkle tree node from the local index was duplicated in hash_comparison.rs and hash_comparison_protocol.rs. This common logic has been extracted into a new shared helper function, get_local_tree_node, within crates/node/primitives/src/sync/storage_bridge.rs. Both modules now import and utilize this single, shared implementation, improving code quality and maintainability.

Test plan

The changes were verified by:

  • Successful compilation of all affected crates (crates/node/primitives, crates/node/src/sync).
  • Running cargo fmt and cargo clippy to ensure code style and catch potential issues.
  • The fix for bug (1) resolves a logic error in Interface::apply_action's verify_action_update for non-Public storage types during sync. This would typically require an integration test with User or Frozen entities.
  • The fix for bug (2) is a refactoring to deduplicate code, which was verified by successful compilation and ensuring correct usage of the new shared helper.

Documentation update

No public or internal documentation updates are required as these are internal code fixes and refactorings.


xilosada and others added 4 commits February 12, 2026 16:21
Add infrastructure to run sync protocols through in-memory channels,
enabling testing of actual message flow and state convergence.

Key changes:
- Add SyncTransport trait abstracting network operations (send/recv/close)
- Add StreamTransport for production Stream wrapper
- Add SimStream for in-memory channel-based transport
- Add protocol.rs with execute_hash_comparison_sync for simulation
- Add SimNode::new_in_context for shared context testing
- Fix entity_count to use storage leaf_count (source of truth)

The simulation now uses the exact same storage code path as production
(Index<MainStorage>, Interface<MainStorage>, RuntimeEnv callbacks),
with only the Database implementation differing (InMemoryDB vs RocksDB).

Phase 1 of sim-transport-abstraction plan.
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Introduces a trait-based architecture for sync protocols that enables:
- Same protocol code to run in production and simulation
- Shared storage bridge (create_runtime_env) for both backends
- Standalone HashComparisonProtocol implementation

Changes:
- Add SyncProtocolExecutor trait in node-primitives
- Add create_runtime_env shared helper in storage_bridge.rs
- Extract HashComparisonProtocol to standalone module
- Clean up hash_comparison.rs (responder only, ~875 → ~285 lines)
- Fix wire protocol to use u64 for sequence_id (portability)
- Update simulation tests to use production protocol directly

This removes ~730 lines of duplicated code and ensures the simulation
tests exercise the exact same protocol logic as production.
- Bug 1: Preserve existing storage_type when updating entities in
  apply_leaf_with_crdt_merge to prevent 'Cannot change StorageType'
  errors for User/Frozen storage types during HashComparison sync

- Bug 2: Extract get_local_tree_node helper to storage_bridge.rs to
  deduplicate identical Merkle tree lookup logic between
  hash_comparison_protocol.rs and hash_comparison.rs
@cursor
Copy link
Contributor Author

cursor bot commented Feb 12, 2026

Cursor Agent can help with this pull request. Just @cursor in comments and I'll start working on changes in this branch.
Learn more about Cursor Agents

@github-actions
Copy link

Your PR title does not adhere to the Conventional Commits convention:

<type>(<scope>): <subject>

Common errors to avoid:

  1. The title must be in lower case.
  2. Allowed type values are: build, ci, docs, feat, fix, perf, refactor, test.

Copy link

@meroreviewer meroreviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Code Reviewer

Reviewed by 3 agents | Quality score: 100% | Review time: 154.4s

🟡 1 warnings, 💡 1 suggestions. See inline comments.


🤖 Generated by AI Code Reviewer | Review ID: review-37b8c80f

metadata.crdt_type = Some(leaf.metadata.crdt_type.clone());
metadata.updated_at = leaf.metadata.hlc_timestamp.into();

// Preserve existing storage_type for updates to avoid "Cannot change StorageType" error.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Bug fix lacks regression test

The storage_type preservation fix addresses a specific failure mode but has no accompanying test to prevent regression.

Suggested fix:

Add a unit or integration test that syncs an entity with StorageType::User or StorageType::Frozen to verify the fix works and prevent future regression.

};

// Get the entity's index from the Merkle tree
let index = match Index::<MainStorage>::get_index(entity_id) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Error swallowing may mask storage failures

Returning Ok(None) on Index::get_index error makes it indistinguishable from 'entity not found', potentially masking real storage failures.

Suggested fix:

Consider returning the error or using a distinct return variant so callers can differentiate between 'not found' and 'storage error'.

Base automatically changed from feat/sim-transport-abstraction to master February 13, 2026 09:52
@github-actions
Copy link

This pull request has been automatically marked as stale. If this pull request is still relevant, please leave any comment (for example, "bump"), and we'll keep it open. We are sorry that we haven't been able to prioritize reviewing it yet. Your contribution is very much appreciated.

@github-actions github-actions bot added the Stale label Feb 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants