Helix is a high-availability dual-database client library for Go, designed for a "Shared Nothing" architecture with active-active dual writes, sticky reads, and asynchronous reconciliation.
**Biomimetic Fault Tolerance**
Helix is named after the DNA double helix: two independent strands carrying the same genetic code.
In this architecture, your database clusters are the strands. They share nothing—no state, no gossip, no master-slave tether. They exist in parallel universes.
- Dual Writes replicate the same "genetic code" (your data) to both strands simultaneously.
- Sticky Reads latch onto a single strand for maximum locality.
- Replay acts as the repair enzyme, asynchronously healing "mutations" (inconsistencies) when a strand temporarily fails.
If one strand snaps, the other keeps the organism alive. It's 4 billion years of evolution applied to high-availability engineering. 🧬
- Dual Active-Active Writes - Concurrent writes to two independent clusters for maximum availability
- Sticky Read Routing - Per-client sticky reads to maximize cache hits across clusters
- Active Failover - Immediate failover to secondary cluster on read failures
- Replay System - Asynchronous reconciliation via in-memory queue or NATS JetStream
- Drop-in Replacement - Interface-based design mirrors the `gocql` API for minimal migration effort
- SQL Support - Simple wrapper for `database/sql` with dual-write semantics
```bash
go get github.com/arloliu/helix
```

```go
package main
import (
	"log"

	"github.com/arloliu/helix"
	v1 "github.com/arloliu/helix/adapter/cql/v1"
	"github.com/arloliu/helix/policy"
	"github.com/arloliu/helix/replay"
	"github.com/gocql/gocql"
)

func main() {
	// Create gocql sessions for both clusters
	clusterA := gocql.NewCluster("cluster-a.example.com")
	clusterA.Keyspace = "myapp"
	sessionA, _ := clusterA.CreateSession()
	defer sessionA.Close()

	clusterB := gocql.NewCluster("cluster-b.example.com")
	clusterB.Keyspace = "myapp"
	sessionB, _ := clusterB.CreateSession()
	defer sessionB.Close()

	// Create Helix client
	client, err := helix.NewCQLClient(
		v1.NewSession(sessionA),
		v1.NewSession(sessionB),
		helix.WithReplayer(replay.NewMemoryReplayer()),
		helix.WithReadStrategy(policy.NewStickyRead()),
		helix.WithWriteStrategy(policy.NewConcurrentDualWrite()),
		helix.WithFailoverPolicy(policy.NewActiveFailover()),
	)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Dual-write to both clusters
	userID := gocql.TimeUUID()
	err = client.Query(
		"INSERT INTO users (id, name, email) VALUES (?, ?, ?)",
		userID, "Alice", "alice@example.com",
	).Exec()
	if err != nil {
		log.Printf("Both clusters failed: %v", err)
	}
	// If only one cluster failed, it's automatically queued for replay

	// Read with sticky routing and failover
	var name, email string
	err = client.Query(
		"SELECT name, email FROM users WHERE id = ?",
		userID,
	).Scan(&name, &email)
	if err != nil {
		log.Printf("Read failed: %v", err)
	}
}
```

```go
package main
import (
	"database/sql"
	"log"

	"github.com/arloliu/helix"
	"github.com/arloliu/helix/replay"

	_ "github.com/lib/pq"
)

func main() {
	// Connect to both databases
	primary, _ := sql.Open("postgres", "host=primary.example.com ...")
	secondary, _ := sql.Open("postgres", "host=secondary.example.com ...")

	// Create Helix SQL client
	client, err := helix.NewSQLClientFromDB(primary, secondary,
		helix.WithReplayer(replay.NewMemoryReplayer()),
	)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Dual-write; partial failures are queued for replay
	_, err = client.Exec(
		"INSERT INTO users (id, name) VALUES ($1, $2)",
		"user-1", "Alice",
	)
	if err != nil {
		log.Printf("Both databases failed: %v", err)
	}
}
```
```mermaid
%%{init:{'theme':'neutral'}}%%
flowchart TD
Client[Dual-Session Client]
subgraph Clusters [Cassandra Clusters]
CA[(Cassandra Cluster A)]
CB[(Cassandra Cluster B)]
end
subgraph ReplaySys [Replay System]
NATS["NATS JetStream<br/>(DLQ / Replay Log)"]
Worker[Background Replay Worker]
end
%% Dual Write Path
Client -- "1. Dual Write (Concurrent)" --> CA
Client -- "1. Dual Write (Concurrent)" --> CB
%% Failure Path
Client -- "2. On Failure (e.g., B fails)" --> NATS
%% Replay Path
NATS -- "3. Consume Failed Write" --> Worker
Worker -- "4. Replay Write (Idempotent)" --> CB
%% --- Stylesheet ---
classDef app fill:#e3f2fd,stroke:#1565c0,stroke-width:2px,color:#000;
classDef db fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px,color:#000;
classDef infra fill:#fff3e0,stroke:#e65100,stroke-width:2px,color:#000;
class Client app;
class CA,CB db;
class NATS,Worker infra;
```
| Write Strategy | Description |
|---|---|
| `ConcurrentDualWrite` | Writes to both clusters concurrently (default) |
| `SyncDualWrite` | Writes sequentially (A then B, or B then A) |
| `AdaptiveDualWrite` | Latency-aware: healthy clusters wait, degraded clusters fire-and-forget |
| Read Strategy | Description |
|---|---|
| `StickyRead` | Sticks to one cluster per client instance (default) |
| `PrimaryOnlyRead` | Always reads from Cluster A |
| `RoundRobinRead` | Alternates between clusters |
| Failover Policy | Description |
|---|---|
| `ActiveFailover` | Immediately tries secondary on failure (default) |
| `CircuitBreaker` | Switches after N consecutive failures |
| `LatencyCircuitBreaker` | `CircuitBreaker` + treats slow responses as soft failures |
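Swapping any of these in is a one-line change at client construction. A combined sketch, assuming `NewSyncDualWrite`, `NewRoundRobinRead`, and `NewCircuitBreaker` follow the constructor naming pattern of the strategies shown in this README's examples (check the `policy` package for the actual signatures):

```go
// Sketch: a client assembled from non-default strategies in the tables above.
// Only NewConcurrentDualWrite, NewStickyRead, NewAdaptiveDualWrite, and
// NewActiveFailover appear elsewhere in this README; the three constructors
// below are assumed to follow the same naming pattern.
client, err := helix.NewCQLClient(
	v1.NewSession(sessionA),
	v1.NewSession(sessionB),
	helix.WithWriteStrategy(policy.NewSyncDualWrite()),   // sequential writes
	helix.WithReadStrategy(policy.NewRoundRobinRead()),   // alternate reads
	helix.WithFailoverPolicy(policy.NewCircuitBreaker()), // switch after N failures
	helix.WithReplayer(replay.NewMemoryReplayer()),
)
if err != nil {
	log.Fatal(err)
}
defer client.Close()
```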
See Strategy & Policy Documentation for detailed configuration and interaction patterns.
Helix provides two replay implementations for handling partial write failures:
| Implementation | Durability | Use Case |
|---|---|---|
| `MemoryReplayer` | Volatile | Development, testing |
| `NATSReplayer` | Durable | Production (requires NATS JetStream) |
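A sketch of wiring either implementation, assuming `github.com/nats-io/nats.go` is imported as `nats`; the constructors and the `NewWorker` helper appear in the production example below, but `replay.Replayer` as the interface name and the `inProduction` flag are assumptions for illustration:

```go
// Sketch: pick a replayer per environment. replay.Replayer as the interface
// name is an assumption; NewMemoryReplayer, NewNATSReplayer, and NewWorker
// are taken from the examples in this README.
var replayer replay.Replayer
if inProduction {
	nc, err := nats.Connect("nats://nats.example.com:4222")
	if err != nil {
		log.Fatal(err)
	}
	js, _ := nc.JetStream()                   // JetStream context from nats.go
	replayer = replay.NewNATSReplayer(nc, js) // durable: survives restarts
} else {
	replayer = replay.NewMemoryReplayer() // volatile: fine for tests
}

client, err := helix.NewCQLClient(
	v1.NewSession(sessionA),
	v1.NewSession(sessionB),
	helix.WithReplayer(replayer),
	helix.WithReplayWorker(replay.NewWorker(replayer)), // drains the queue in the background
)
```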
See Replay System Documentation for detailed usage patterns.
For production dual-cluster deployments, always configure:
| Component | Why It Matters |
|---|---|
| `Replayer` | **Critical**: without a replayer, partial write failures are lost permanently. Use `NATSReplayer` for durability. |
| `ReadStrategy` | Improves read performance. `StickyRead` maximizes cache hits by routing reads to the same cluster. |
| `WriteStrategy` | Controls write behavior. `AdaptiveDualWrite` handles degraded clusters gracefully. |
| `FailoverPolicy` | Enables automatic read failover. `ActiveFailover` immediately retries on the secondary cluster. |
**Warning**: Helix logs a warning if you create a dual-cluster client without a replayer configured.
```go
// Durable replayer backed by NATS JetStream (nc is an existing *nats.Conn)
js, _ := nc.JetStream()
replayer := replay.NewNATSReplayer(nc, js)

client, err := helix.NewCQLClient(
	v1.NewSession(sessionA),
	v1.NewSession(sessionB),
	// REQUIRED for production: enables failure recovery
	helix.WithReplayer(replayer),
	helix.WithReplayWorker(replay.NewWorker(replayer)),
	// RECOMMENDED: optimizes read/write behavior
	helix.WithReadStrategy(policy.NewStickyRead()),
	helix.WithWriteStrategy(policy.NewAdaptiveDualWrite()),
	helix.WithFailoverPolicy(policy.NewActiveFailover()),
)
```

A fuller configuration example:

```go
helix.NewCQLClient(sessionA, sessionB,
	// Strategies
	helix.WithReadStrategy(policy.NewStickyRead(
		policy.WithStickyReadCooldown(5*time.Minute), // Prevent rapid cluster switching
	)),
	helix.WithWriteStrategy(policy.NewConcurrentDualWrite()),
	helix.WithFailoverPolicy(policy.NewActiveFailover()),

	// Replay
	helix.WithReplayer(replayer),
	helix.WithReplayWorker(worker), // Optional: auto-start worker

	// Timestamps (critical for idempotency)
	helix.WithTimestampProvider(func() int64 {
		return time.Now().UnixMicro()
	}),
)
```

See the examples directory:
- basic - Simple dual-write and read operations
- failover - Failover behavior demonstration
- custom-strategy - Creating custom strategies
- replay - Replay system usage
- AdaptiveDualWrite Guide - Latency-aware write strategy tuning
- Replay System - Replay patterns and best practices
- Strategy & Policy - Read/write strategies and failover policies
- Go 1.25+
- For CQL: v1: `github.com/gocql/gocql` or v2: `github.com/apache/cassandra-gocql-driver`
- For NATS Replay: `github.com/nats-io/nats.go`
MIT License - see LICENSE for details.