Draft
Conversation
Add SubtreePrefetch sync protocol types for deep trees with clustered changes. This protocol is optimized for scenarios where: - Tree depth > 3 levels - Divergence < 20% - Changes are clustered in subtrees Trade-off: O(1) round trips per subtree vs HashComparison's O(depth), but may over-fetch data compared to HashComparison's minimal transfer. Types added: - SubtreePrefetchRequest: Request subtrees by root hash with depth limit - SubtreePrefetchResponse: Contains fetched subtrees and not-found roots - SubtreeData: Single subtree with entities for CRDT merge - should_use_subtree_prefetch(): Heuristic for protocol selection Security: - MAX_SUBTREE_DEPTH (64): Prevents deep traversal attacks - MAX_SUBTREES_PER_REQUEST (100): Limits request size - MAX_ENTITIES_PER_SUBTREE (10,000): Bounds per-subtree data - MAX_TOTAL_ENTITIES (100,000): Caps total response size - is_valid() methods on all types for post-deserialization validation - Saturating arithmetic in total_entity_count() to prevent overflow Includes 39 unit tests covering edge cases and exploit prevention.
Fixes issues raised in PR review: 1. **depth() now always returns bounded value (High Severity)** - Changed return type from `Option<usize>` to `usize` - Returns `MAX_SUBTREE_DEPTH` when `max_depth` is `None` - Consumers always get a safe, bounded depth value 2. **is_valid() now validates max_depth (Medium Severity)** - Added check: `max_depth <= MAX_SUBTREE_DEPTH` when `Some` - Catches invalid values from untrusted deserialization 3. **max_depth field is now private (Medium Severity)** - Matches encapsulation pattern from hash_comparison module - Added `is_unlimited()` accessor to check if unlimited was requested - Prevents bypassing `depth()` accessor 4. **Extracted heuristic magic numbers to constants (Nitpick)** - `DEEP_TREE_THRESHOLD = 3` - `MAX_DIVERGENCE_RATIO = 0.20` - `MAX_CLUSTERED_SUBTREES = 5` Tests updated: - Added test_subtree_request_max_depth_validation - Added test_heuristic_constants_are_sensible - Updated existing tests to use new API (41 tests total)
Replace hardcoded magic numbers (3 and 0.2) in select_protocol() with DEEP_TREE_THRESHOLD and MAX_DIVERGENCE_RATIO constants from subtree.rs. This ensures protocol selection logic stays in sync with the canonical threshold definitions and avoids having two sources of truth for the same values.
Contributor
Author
|
Cursor Agent can help with this pull request. Just |
1 task
|
Your PR title does not adhere to the Conventional Commits convention: Common errors to avoid:
|
|
This pull request has been automatically marked as stale. If this pull request is still relevant, please leave any comment (for example, "bump"), and we'll keep it open. We are sorry that we haven't been able to prioritize reviewing it yet. Your contribution is very much appreciated. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Node Primitives Sync Protocol: Refactor
select_protocolthresholdsDescription
This PR refactors the
select_protocolfunction inprotocol.rsto use the canonicalDEEP_TREE_THRESHOLDandMAX_DIVERGENCE_RATIOconstants fromsubtree.rsinstead of hardcoded magic numbers.This fixes bug
aae0e2bf-d69e-4a39-8987-278f96e4d8f3, whereselect_protocolduplicated threshold values, creating two sources of truth. The motivation is to ensure consistency and maintainability, preventing silent divergence if these thresholds are updated in the future.Test plan
The existing unit tests for the
primitivescrate were run usingcargo test -p primitivesand all passed, verifying that the change did not alter the expected behavior ofselect_protocol. No new test cases were added as this is a refactor of existing logic.Documentation update
No public or internal documentation updates are required as this is an internal code quality improvement and refactor that does not change external behavior or APIs.