Migrate from openark/raft fork to upstream hashicorp/raft v1.7#62
renecannao merged 3 commits into master
Conversation
Remove the replace directive that redirected github.com/hashicorp/raft to the openark/raft fork (pinned at a 2017 commit). This switches to upstream hashicorp/raft v1.7.3, which includes nine years of security fixes, bug fixes, and improvements including the pre-vote protocol, improved snapshot handling, and numerous race condition fixes. Closes #61 (dependency update portion)
Migrate all orchestrator raft code from the removed openark/raft fork
APIs to their upstream hashicorp/raft v1.7 equivalents:
- Remove PeerStore field; replace with raft.BootstrapCluster() using
raft.Configuration{Servers: [...]} for initial cluster setup
- Replace raft.AddUniquePeer() with local addUniquePeer() helper
- Remove EnableSingleNode/DisableBootstrapAfterElect config fields;
single-node mode is now handled by bootstrapping with one server
- Update NewRaft() from 7-arg to 6-arg (no PeerStore parameter)
- Set config.LocalID = raft.ServerID(advertise) as required by upstream
- Replace AddPeer(addr) with AddVoter(ServerID, ServerAddress, 0, 0)
- Replace RemovePeer(addr) with RemoveServer(ServerID, 0, 0)
- Replace GetPeers() via PeerStore with GetConfiguration() to extract
server addresses from the raft configuration
- Replace Yield() and StepDown() with LeadershipTransfer()
- Cast raft.Leader() return from ServerAddress to string
- Update FileSnapshotStore.Create() signature to match upstream:
(version, index, term, configuration, configurationIndex, trans)
replacing the old (index, term, peers []byte) signature
Closes #61
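The bootstrap changes listed above (replacing the PeerStore with `raft.BootstrapCluster()` and a `raft.Configuration` server list) can be sketched as a runnable fragment. This is an illustrative sketch, not code from the PR: the stand-in types below replace hashicorp/raft's `ServerID`/`ServerAddress`/`Server` so it runs without the dependency, and `buildBootstrapServers` is a hypothetical helper name.

```go
package main

import "fmt"

// Local stand-ins for raft.ServerID, raft.ServerAddress and raft.Server,
// so this sketch runs without the hashicorp/raft dependency.
type ServerID string
type ServerAddress string
type Server struct {
	ID      ServerID
	Address ServerAddress
}

// buildBootstrapServers (hypothetical helper) turns the configured peer
// addresses into the initial cluster configuration, deduplicating entries
// and guaranteeing the local advertise address is included. With no peers
// configured, this degrades to the single-node bootstrap described above.
func buildBootstrapServers(advertise string, peers []string) []Server {
	seen := map[string]bool{}
	servers := []Server{}
	for _, peer := range append(append([]string{}, peers...), advertise) {
		if seen[peer] {
			continue
		}
		seen[peer] = true
		servers = append(servers, Server{ID: ServerID(peer), Address: ServerAddress(peer)})
	}
	return servers
}

func main() {
	servers := buildBootstrapServers("10.0.0.1:10008", []string{"10.0.0.2:10008", "10.0.0.1:10008"})
	fmt.Println(len(servers)) // local node appears exactly once
}
```

In the real migration, such a list would populate `raft.Configuration{Servers: ...}` passed to `raft.BootstrapCluster()` before constructing the node with the 6-arg `raft.NewRaft()`.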
Pull request overview
Migrates the project from the long-pinned openark/raft fork to upstream github.com/hashicorp/raft v1.7.x, updating APIs and bootstrapping/snapshot behavior to match the upstream Raft interfaces.
Changes:
- Switch dependency wiring to upstream `hashicorp/raft` and update the msgpack dependency to `/v2`.
- Replace deprecated peer-store APIs with `BootstrapCluster`, `GetConfiguration`, and voter/server management APIs.
- Update the snapshot store `Create()` signature and snapshot metadata to use configuration fields.
Reviewed changes
Copilot reviewed 4 out of 82 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| go/raft/store.go | Removes PeerStore usage; adds cluster bootstrapping via BootstrapCluster; updates join/remove to AddVoter/RemoveServer. |
| go/raft/raft.go | Updates leader/peer discovery and leadership yield/stepdown to upstream APIs. |
| go/raft/file_snapshot.go | Updates snapshot creation signature and metadata to the upstream raft snapshot format. |
| go.mod | Removes raft fork replace and updates dependencies for upstream raft. |
```go
log.Infof("bootstrapping cluster with peers: %+v", peers)
for _, peer := range peers {
	servers = append(servers, raft.Server{
		ID:      raft.ServerID(peer),
		Address: raft.ServerAddress(peer),
	})
}
```
In the multi-node bootstrap path, the local server is only included if peerNodes happens to contain store.raftAdvertise. If the local address is not in peers, BootstrapCluster will create an initial configuration that excludes this node, which can prevent it from participating correctly (and can break join/leadership behavior). Ensure the bootstrapped configuration.Servers always includes the local server (and avoid duplicates if it’s already present).
Suggested change:

```go
log.Infof("bootstrapping cluster with peers: %+v", peers)
localIncluded := false
for _, peer := range peers {
	if peer == store.raftAdvertise {
		localIncluded = true
	}
	servers = append(servers, raft.Server{
		ID:      raft.ServerID(peer),
		Address: raft.ServerAddress(peer),
	})
}
if !localIncluded {
	// Ensure the local server is always part of the initial configuration
	servers = append(servers, raft.Server{
		ID:      raft.ServerID(store.raftAdvertise),
		Address: raft.ServerAddress(store.raftAdvertise),
	})
}
```
```go
config.SnapshotThreshold = 1
config.SnapshotInterval = snapshotInterval
config.ShutdownOnRemove = false
config.LocalID = raft.ServerID(store.raftAdvertise)
```
Using store.raftAdvertise as LocalID couples server identity to its network address. If the advertised address changes across restarts (IP change, port change, DNS change, etc.), raft will treat the node as a different server ID, which can complicate recovery and membership operations. Prefer a stable node identity (e.g., configured node name/UUID) for LocalID, while keeping ServerAddress as the address.
Suggested change:

```go
config.LocalID = raft.ServerID(store.raftDir)
```
```diff
  log.Infof("received join request for remote node %s", addr)

- f := store.raft.AddPeer(addr)
+ f := store.raft.AddVoter(raft.ServerID(addr), raft.ServerAddress(addr), 0, 0)
```
Passing 0 as the timeout to AddVoter/RemoveServer can result in the call waiting indefinitely under some failure modes (unreachable peer, stalled leadership, network partition), which is risky for request handlers and operations. Consider supplying a bounded timeout (and optionally a prevIndex when appropriate) so join/remove requests fail predictably.
```diff
  log.Infof("received remove request for remote node %s", addr)

- f := store.raft.RemovePeer(addr)
+ f := store.raft.RemoveServer(raft.ServerID(addr), 0, 0)
```
The same unbounded-timeout concern applies here: prefer a bounded timeout (and optionally a prevIndex) for RemoveServer as well, so remove requests fail predictably.
```diff
- return store.peerStore.Peers()
+ future := store.raft.GetConfiguration()
+ if err := future.Error(); err != nil {
+ 	return nil, err
```
GetPeers() returns []string{} on ErrRaftNotRunning, but returns nil on GetConfiguration() errors. This inconsistency can surprise callers that expect a non-nil slice even when err != nil (especially since the function already establishes an explicit empty-slice convention). Consider returning []string{} consistently on error paths, or consistently returning nil slices on all error paths (including ErrRaftNotRunning).
Suggested change:

```diff
- return nil, err
+ return []string{}, err
```
Summary of Changes (Gemini Code Assist): This pull request upgrades the project's core consensus mechanism by transitioning from a long-outdated fork to the actively maintained upstream HashiCorp Raft library. The change modernizes the codebase by adopting current API standards and addresses the security and maintenance risks of depending on an unmaintained fork, ensuring the project benefits from ongoing improvements and bug fixes from the official Raft project.
📝 Walkthrough

This pull request migrates the Raft consensus library from the deprecated openark/raft fork (unmaintained since 2017) to upstream hashicorp/raft v1.7.x. The migration updates API calls across three core files, removes the PeerStore abstraction, implements new cluster bootstrapping logic, and upgrades go-msgpack from v0.5.5 to v2.1.2 as a transitive dependency. The upstream library is vendored into the repository.

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
Summary
- Removes the `replace` directive that redirected `github.com/hashicorp/raft` to the `openark/raft` fork (pinned at a September 2017 commit), switching to upstream `hashicorp/raft` v1.7.3
- Migrates `PeerStore` to `BootstrapCluster`/`GetConfiguration`, `AddPeer`/`RemovePeer` to `AddVoter`/`RemoveServer`, `Yield`/`StepDown` to `LeadershipTransfer`, and updates `FileSnapshotStore.Create()` to the new signature

Changes by file

go.mod / go.sum / vendor/
- Removes the `openark/raft` replace directive
- Vendors `hashicorp/raft` v1.7.3 and its new dependency `hashicorp/go-msgpack/v2`

go/raft/store.go
- Removes the `peerStore` field from the `Store` struct
- Replaces `raft.StaticPeers`/`SetPeers` with `raft.BootstrapCluster()` using `raft.Configuration`
- Replaces `raft.AddUniquePeer()` with a local `addUniquePeer()` helper
- Removes `EnableSingleNode`/`DisableBootstrapAfterElect` config; single-node mode is handled via bootstrap
- Sets `config.LocalID` (required by upstream)
- Updates `NewRaft()` from the 7-arg to the 6-arg constructor
- Replaces `AddPeer(addr)` with `AddVoter(ServerID, ServerAddress, 0, 0)`
- Replaces `RemovePeer(addr)` with `RemoveServer(ServerID, 0, 0)`

go/raft/raft.go
- Casts the `raft.Leader()` return from `ServerAddress` to `string`
- Replaces `GetPeers()` via `PeerStore.Peers()` with `GetConfiguration()`, extracting server addresses
- Replaces `StepDown()` and `Yield()` with `LeadershipTransfer()`

go/raft/file_snapshot.go
- Updates the `FileSnapshotStore.Create()` signature to `(version, index, term, configuration, configurationIndex, trans)`, replacing the old `(index, term, peers)`
- Updates `SnapshotMeta` initialization to use the `Configuration`/`ConfigurationIndex` fields instead of `Peers`

Test plan
- `go build -o /dev/null ./go/cmd/orchestrator` — full binary compiles cleanly
- `go test ./go/... -vet=off -count=1` — all 12 test packages pass

Closes #61