Feat/distributed backend #50
…l to distributed cache

- Add hinted handoff mechanism to queue writes for temporarily unavailable replicas
  * Queue hints with TTL and replay them when nodes recover
  * Add configuration options for hint TTL, replay interval, and max hints per node
  * Include metrics for queued, replayed, expired, and dropped hints
- Implement parallel reads for improved performance
  * Enable concurrent fan-out to replica nodes for quorum/all consistency
  * Early termination when quorum is satisfied
  * Configurable via WithDistParallelReads option
- Add simple gossip protocol for membership information sharing
  * Periodic random peer selection for gossip exchange
  * Automatic node state synchronization based on incarnation numbers
  * Configurable gossip interval
- Enhance replica failure handling in replicateTo method
- Add comprehensive metrics tracking for new distributed features
- Update cspell configuration to include nosec directive

This significantly improves the resilience and performance of the distributed cache system by handling temporary node failures gracefully and enabling more efficient read operations.
Add comprehensive Merkle tree-based anti-entropy mechanism for distributed cache synchronization:

- Implement BuildMerkleTree() to create hash trees from cache data with configurable chunk sizes
- Add SyncWith() method for comparing local/remote trees and pulling newer versions
- Extend DistTransport interface with FetchMerkle() for remote tree retrieval
- Add /internal/merkle HTTP endpoint for tree access over network
- Include merkle sync metrics (operations count and keys pulled)
- Add comprehensive test coverage for sync convergence scenarios
- Support both in-process and HTTP transports (HTTP fetch merkle marked as unsupported)

This enables efficient detection and repair of data inconsistencies between distributed cache nodes by comparing compact tree representations rather than full data sets.
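To make the idea of comparing compact tree representations concrete, here is a minimal sketch of chunked leaf hashing, a root fold, and a leaf-level diff. It is not the PR's `BuildMerkleTree()` or `SyncWith()`; the function names, the SHA-256 choice, and the position-based chunking are all assumptions made for illustration.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// leafHashes sorts the keys, groups them into chunks of chunkSize, and hashes
// each chunk's key/value pairs into one leaf.
func leafHashes(data map[string]string, chunkSize int) [][32]byte {
	if chunkSize < 1 {
		chunkSize = 1
	}
	keys := make([]string, 0, len(data))
	for k := range data {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var leaves [][32]byte
	for i := 0; i < len(keys); i += chunkSize {
		end := i + chunkSize
		if end > len(keys) {
			end = len(keys)
		}
		h := sha256.New()
		for _, k := range keys[i:end] {
			// Length prefixes keep "ab"+"c" distinct from "a"+"bc".
			fmt.Fprintf(h, "%d:%s=%d:%s;", len(k), k, len(data[k]), data[k])
		}
		var sum [32]byte
		copy(sum[:], h.Sum(nil))
		leaves = append(leaves, sum)
	}
	return leaves
}

// root folds a level of hashes pairwise until a single hash remains.
func root(level [][32]byte) [32]byte {
	if len(level) == 0 {
		return sha256.Sum256(nil)
	}
	for len(level) > 1 {
		var next [][32]byte
		for i := 0; i < len(level); i += 2 {
			if i+1 == len(level) {
				next = append(next, level[i]) // odd node is promoted unchanged
				continue
			}
			pair := make([]byte, 0, 64)
			pair = append(pair, level[i][:]...)
			pair = append(pair, level[i+1][:]...)
			next = append(next, sha256.Sum256(pair))
		}
		level = next
	}
	return level[0]
}

// diffChunks returns the indices of leaves that differ or exist on one side only.
func diffChunks(a, b [][32]byte) []int {
	n := len(a)
	if len(b) > n {
		n = len(b)
	}
	var out []int
	for i := 0; i < n; i++ {
		if i >= len(a) || i >= len(b) || a[i] != b[i] {
			out = append(out, i)
		}
	}
	return out
}

func main() {
	local := map[string]string{"k1": "v1", "k2": "v2", "k3": "v3"}
	remote := map[string]string{"k1": "v1", "k2": "v2", "k3": "v3-new"}
	l, r := leafHashes(local, 2), leafHashes(remote, 2)
	fmt.Println("roots equal:", root(l) == root(r))
	fmt.Println("differing chunks:", diffChunks(l, r))
}
```

Equal roots let a node skip the sync entirely; only when they differ does it need to look at individual chunks. Chunking by sorted position is a simplification: a key present on only one side shifts every later chunk, so a production design would more likely bucket keys by hash range.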
- Add FetchMerkle() and ListKeys() methods to HTTP transport
- Implement periodic auto-sync with configurable intervals and peer limits
- Add /internal/keys endpoint for key enumeration
- Refactor sync logic into modular helper methods
- Add comprehensive HTTP Merkle sync test coverage
- Enhance distributed metrics with auto-sync tracking
- Clean up HTTP transport code and remove redundant comments

This enables full anti-entropy synchronization over HTTP transport and provides automatic background sync capabilities for distributed cache consistency.
…semantics

- Add Merkle tree synchronization with timing metrics (build, diff, fetch durations)
- Implement tombstone versioning to prevent key resurrection during anti-entropy
- Add new HTTP endpoints for Merkle tree inspection (/internal/merkle, /internal/keys)
- Introduce configuration options for Merkle chunk size, auto-sync, and key enumeration caps
- Enhance delete operations with versioned tombstones to maintain consistency
- Add comprehensive test suite for Merkle sync edge cases (empty trees, no-diff, single missing keys, tombstone preservation)
- Update documentation with new distributed memory capabilities and configuration options

This enables robust distributed consistency by preventing stale data resurrection and providing efficient anti-entropy synchronization between cache nodes.
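The anti-resurrection guard can be illustrated with a small sketch: a delete is stored as a versioned tombstone, and an incoming copy is applied only when its version is strictly newer than what is held locally. The `entry` and `store` types are hypothetical and stand in for whatever the PR uses internally.

```go
package main

import "fmt"

// entry is either a live value or a tombstone, each carrying a version.
type entry struct {
	value     string
	version   uint64
	tombstone bool
}

// store maps keys to their latest known entry (hypothetical type).
type store map[string]entry

// apply accepts an incoming entry only if it is strictly newer than the local one.
func (s store) apply(key string, in entry) bool {
	if cur, ok := s[key]; ok && in.version <= cur.version {
		return false // stale or duplicate: keep the local entry, tombstone included
	}
	s[key] = in
	return true
}

// get hides tombstones from readers.
func (s store) get(key string) (string, bool) {
	e, ok := s[key]
	if !ok || e.tombstone {
		return "", false
	}
	return e.value, true
}

func main() {
	s := store{}
	s.apply("k", entry{value: "v1", version: 1})
	s.apply("k", entry{version: 2, tombstone: true}) // delete recorded as version 2

	// A lagging replica offers its old copy during anti-entropy.
	accepted := s.apply("k", entry{value: "v1", version: 1})
	_, found := s.get("k")
	fmt.Println("stale write accepted:", accepted, "key visible:", found)
}
```

Without the tombstone, the local node would have no record of the key and would accept the stale version 1 copy as new data.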
…ogress table

- Add comprehensive tombstone versioning and anti-resurrection guard details
- Document Merkle phase timing metrics and anti-entropy pull counters
- Include roadmap progress table showing current implementation status
- Expand descriptions of delete semantics and remote sync behavior
- Clarify DebugInject tombstone clearing functionality for testing

This update provides much clearer documentation for users and contributors about the current state of distributed cache features and deletion handling.
- Add configurable tombstone TTL and periodic compaction to reclaim memory
- Implement WithDistTombstoneTTL and WithDistTombstoneSweep options
- Add tombstone metrics tracking (TombstonesActive, TombstonesPurged)
- Enhance quorum reads with targeted stale owner repair
- Refactor consistency logic with collectQuorum helper method
- Update README with new configuration options and metrics
- Add comprehensive test coverage for stale quorum scenarios
- Improve error handling and code formatting in existing tests

This addresses memory management concerns with tombstone accumulation while improving distributed consistency guarantees through better read repair mechanisms.
…stale tracking

- Fix variable capture in goroutines for Go <1.22 compatibility
- Add owner tracking to parallel fetch results to enable targeted repairs
- Implement stale owner detection during parallel consensus building
- Add targeted repair mechanism before full replica repair
- Improve code structure and comments for better maintainability

This ensures parallel quorum reads correctly identify and repair stale replicas while maintaining compatibility with older Go versions that require explicit variable capture in goroutine closures.
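The combination of explicit loop-variable capture, owner tracking, and stale owner detection can be sketched as follows. This is an illustration under assumptions, not the PR's `collectQuorum`: the `reply` type, the `readQuorum` function, and the fetch callback signature are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// reply carries the owner alongside the data so stale replicas can be targeted.
type reply struct {
	owner   string
	value   string
	version uint64
}

// readQuorum fans out to every replica concurrently, stops collecting once
// quorum replies have arrived, and reports which responders hold an older version.
func readQuorum(replicas map[string]func() (string, uint64), quorum int) (reply, []string) {
	ch := make(chan reply, len(replicas)) // buffered so late replies never block
	for owner, fetch := range replicas {
		owner, fetch := owner, fetch // explicit capture, required before Go 1.22
		go func() {
			v, ver := fetch()
			ch <- reply{owner: owner, value: v, version: ver}
		}()
	}
	if quorum > len(replicas) {
		quorum = len(replicas)
	}

	var got []reply
	var best reply
	for len(got) < quorum { // early termination once quorum is satisfied
		r := <-ch
		got = append(got, r)
		if len(got) == 1 || r.version > best.version {
			best = r
		}
	}

	var stale []string
	for _, r := range got {
		if r.version < best.version {
			stale = append(stale, r.owner)
		}
	}
	sort.Strings(stale)
	return best, stale
}

func main() {
	replicas := map[string]func() (string, uint64){
		"node-a": func() (string, uint64) { return "new", 3 },
		"node-b": func() (string, uint64) { return "new", 3 },
		"node-c": func() (string, uint64) { return "old", 1 },
	}
	best, stale := readQuorum(replicas, 3)
	fmt.Println("value:", best.value, "version:", best.version, "stale owners:", stale)
}
```

Before Go 1.22 a `for` loop reused one variable per loop, so without the `owner, fetch := owner, fetch` line every goroutine could observe the values of the final iteration. With a quorum smaller than the replica count, which replicas answer first is not deterministic, so the stale list only covers the replies that were actually collected.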
Pull Request Overview
This PR introduces a comprehensive distributed backend system with Merkle tree-based anti-entropy synchronization. The implementation adds Merkle trees for efficient divergence detection, tombstone-based delete semantics to prevent resurrection of deleted keys, and various supporting features like hinted handoff, gossip, and automatic synchronization.
- Merkle tree anti-entropy for efficient sync between distributed nodes
- Tombstone-based delete semantics with version ordering and TTL-based compaction
- Auto-sync mechanism with configurable intervals and peer limits
Reviewed Changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tests/merkle_sync_test.go | Test for Merkle sync convergence between nodes |
| tests/merkle_single_missing_key_test.go | Test for detecting and pulling single remote-only keys |
| tests/merkle_no_diff_test.go | Test for handling identical trees (no-op sync) |
| tests/merkle_empty_tree_test.go | Test for syncing between empty trees |
| tests/merkle_delete_tombstone_test.go | Test for tombstone-based delete semantics |
| tests/hypercache_http_merkle_test.go | HTTP transport Merkle tree operations test |
| tests/hypercache_distmemory_stale_quorum_test.go | Test for quorum reads with stale replica repair |
| tests/hypercache_distmemory_versioning_test.go | Minor spacing adjustment |
| tests/hypercache_distmemory_remove_readrepair_test.go | Minor spacing adjustment |
| tests/hypercache_distmemory_integration_test.go | Minor spacing adjustment |
| pkg/backend/dist_memory.go | Core distributed memory implementation with Merkle trees and tombstones |
| pkg/backend/dist_http_transport.go | HTTP transport with Merkle tree and key listing endpoints |
| pkg/backend/dist_http_server.go | HTTP server endpoints for Merkle and key listing |
| cspell.config.yaml | Spell check configuration updates |
| README.md | Documentation updates for new features |
| .github/instructions/instructions.md | Development guidelines |
    if t == nil {
        return nil, errNoTransport
    }
Copilot AI, Aug 23, 2025
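The function checks `t == nil`. Assuming `t` is a pointer receiver, Go does allow the method to be called on a nil pointer without panicking; a panic occurs only when the method dereferences the receiver, so this guard is reachable. If callers are never expected to hold a nil transport, however, the check may be redundant and could point to a design issue worth addressing at the call site.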
No description provided.