A lightweight Go CDC handler that reads PostgreSQL logical replication changes and publishes normalized events to NATS JetStream. It comes with a Docker-based local stack (Postgres + NATS JetStream + Redis) for quick end-to-end testing.
- PostgreSQL logical replication via `pgoutput` (default) or `wal2json`.
- Deterministic event IDs, before/after images, commit timestamps.
- NATS JetStream publisher with automatic stream creation and publish retries.
- High-throughput pipeline with configurable buffered channels and batch async publishing.
- Checkpoint persistence (Redis by default, in-memory fallback).
- Table allowlisting and publication support.
- Simple health endpoint and lightweight metrics logger.
- Start infra (Postgres with logical replication, NATS, Redis):

      docker compose up -d

- Run the CDC app:

      export DATABASE_URL=postgres://postgres:postgres@localhost:5432/postgres
      export CDC_SLOT_NAME=better_cdc_slot
      export CDC_PLUGIN=pgoutput
      export CDC_PUBLICATIONS=better_cdc_pub
      go run ./cmd/cdc-handler

- Generate changes:

      psql -h localhost -U postgres -d postgres \
        -c "insert into public.orders(account_id,total_cents,status) values (1,2999,'pending');"

- Inspect NATS (JetStream) messages on the `CDC` stream (subjects `cdc.>`). For example, using the `nats` CLI:

      nats --server localhost:4222 sub 'cdc.>'

- Debug logging: set `DEBUG=true` to enable verbose zap logging of WAL events, publishes, and checkpoints.
Environment variables (defaults in internal/config):
Database & Replication:
- `DATABASE_URL` (default `postgres://postgres:postgres@localhost:5432/postgres`)
- `CDC_SLOT_NAME` (default `better_cdc_slot`)
- `CDC_PLUGIN` (`pgoutput` | `wal2json`)
- `CDC_PUBLICATIONS` (comma-separated; default `better_cdc_pub`)
- `TABLE_FILTERS` (comma-separated `schema.table` allowlist)
Batching & Throughput:
- `BATCH_SIZE` (default `500`) - events per batch before flush
- `BATCH_TIMEOUT` (default `100ms`) - max time before flush
- `RAW_MESSAGE_BUFFER_SIZE` (default `5000`) - buffer between WAL reader and parser
- `PARSED_EVENT_BUFFER_SIZE` (default `5000`) - buffer between parser and engine
Checkpoint:
- `CHECKPOINT_INTERVAL` (default `1s`)
- `CHECKPOINT_KEY` (default `better-cdc:checkpoint`)
- `CHECKPOINT_TTL` (default `24h`)
- `REDIS_URL` (default `redis://localhost:6379`)
NATS:
- `NATS_URL` (comma-separated; default `nats://localhost:4222`)
- `NATS_USERNAME`, `NATS_PASSWORD`
- `NATS_TIMEOUT` (default `5s`)
Other:
- `HEALTH_ADDR` (default `:8080`)
- `DEBUG` (set `true` for verbose logging)
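As a rough illustration of how defaults like these can be applied, here is a minimal sketch using only the standard library. The `Config` struct and the `envOr`, `splitCSV`, and `LoadConfig` helpers are hypothetical names for this example and are not taken from `internal/config`; only a few of the variables are covered.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
	"time"
)

// Config holds a small subset of the settings listed above.
// Hypothetical type: the real struct in internal/config may differ.
type Config struct {
	DatabaseURL  string
	SlotName     string
	Publications []string
	BatchSize    int
	BatchTimeout time.Duration
}

// envOr returns the value of key, or def when the variable is unset or empty.
func envOr(key, def string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return def
}

// splitCSV splits a comma-separated value, trimming spaces and dropping empty entries.
func splitCSV(s string) []string {
	var out []string
	for _, part := range strings.Split(s, ",") {
		if p := strings.TrimSpace(part); p != "" {
			out = append(out, p)
		}
	}
	return out
}

// LoadConfig reads the environment and falls back to the documented defaults.
func LoadConfig() (Config, error) {
	size, err := strconv.Atoi(envOr("BATCH_SIZE", "500"))
	if err != nil {
		return Config{}, fmt.Errorf("BATCH_SIZE: %w", err)
	}
	timeout, err := time.ParseDuration(envOr("BATCH_TIMEOUT", "100ms"))
	if err != nil {
		return Config{}, fmt.Errorf("BATCH_TIMEOUT: %w", err)
	}
	return Config{
		DatabaseURL:  envOr("DATABASE_URL", "postgres://postgres:postgres@localhost:5432/postgres"),
		SlotName:     envOr("CDC_SLOT_NAME", "better_cdc_slot"),
		Publications: splitCSV(envOr("CDC_PUBLICATIONS", "better_cdc_pub")),
		BatchSize:    size,
		BatchTimeout: timeout,
	}, nil
}

func main() {
	cfg, err := LoadConfig()
	if err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Durations such as `100ms`, `1s`, and `24h` are all valid inputs to `time.ParseDuration`, which is why the sketch parses them that way.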
    PostgreSQL WAL ──► [buffer] ──► Parser ──► [buffer] ──► Engine ──► JetStream
                                                              │
                                                              ▼
                                                         Checkpoint
- WAL Reader (`internal/wal`): replication connection, `pgoutput`/`wal2json`, emits begin/commit markers and row changes; configurable output buffer for backpressure handling.
- Parser (`internal/parser`): decodes WAL messages into structured events; buffered output channel.
- Transformer (`internal/transformer`): builds normalized CDC events with deterministic IDs.
- Publisher (`internal/publisher`): connects to NATS JetStream, ensures stream exists (defaults: name `CDC`, subjects `cdc.>`). Supports batch async publishing for high throughput.
- Engine (`internal/engine`): orchestrates the pipeline; batches events by size/timeout/commit boundaries; uses async batch publishing when available.
- Checkpoint Manager (`internal/checkpoint`): saves LSNs (Redis/in-memory) on commit boundaries only after all events are acknowledged.
- Health/Metrics: `/health` endpoint; periodic counters/gauges logger.
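To make the idea of a deterministic event ID concrete, the sketch below hashes fields that identify a change within the WAL, so that replaying the same change yields the same ID. The choice of inputs (commit LSN, table, operation, position within the transaction) and the names `ChangeKey` and `EventID` are assumptions for illustration; the transformer in this repository may derive its IDs differently.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// ChangeKey lists the fields this sketch assumes are enough to identify a
// row change uniquely. Hypothetical type, not the project's own.
type ChangeKey struct {
	CommitLSN uint64 // LSN of the transaction's commit record
	Table     string // schema-qualified table name, e.g. "public.orders"
	Op        string // "insert", "update" or "delete"
	Index     int    // position of the change within its transaction
}

// EventID returns the same 32-character hex string for the same key, so an
// event replayed after a restart carries the ID it had the first time.
func EventID(k ChangeKey) string {
	raw := fmt.Sprintf("%d|%s|%s|%d", k.CommitLSN, k.Table, k.Op, k.Index)
	sum := sha256.Sum256([]byte(raw))
	return hex.EncodeToString(sum[:16]) // first 16 bytes keep the ID short
}

func main() {
	k := ChangeKey{CommitLSN: 23456789, Table: "public.orders", Op: "insert", Index: 0}
	fmt.Println(EventID(k) == EventID(k)) // same input, same ID

	k.Index = 1
	fmt.Println(EventID(k)) // a different change gets a different ID
}
```

Because the ID depends only on the change's position in the log and not on wall-clock time or random values, it stays stable across restarts.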
- Postgres replication feedback is only advanced to the last durably persisted checkpoint, so a crash/restart can replay already-published events (at-least-once).
- Consumers that need exactly-once processing should de-duplicate using the deterministic `event_id` (or an equivalent idempotency key).
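On the consumer side, de-duplication can be as simple as remembering which event IDs have already been handled. The sketch below is a minimal in-memory version; `Deduper` and `FirstTime` are hypothetical names, and a real consumer would more likely keep the seen set in a durable store with expiry so that it survives restarts.

```go
package main

import (
	"fmt"
	"sync"
)

// Deduper remembers event IDs that were already processed.
// Assumption: an unbounded in-memory set is enough for illustration only.
type Deduper struct {
	mu   sync.Mutex
	seen map[string]struct{}
}

// NewDeduper returns an empty Deduper ready for use.
func NewDeduper() *Deduper {
	return &Deduper{seen: make(map[string]struct{})}
}

// FirstTime records eventID and reports whether it had not been seen before.
func (d *Deduper) FirstTime(eventID string) bool {
	d.mu.Lock()
	defer d.mu.Unlock()
	if _, dup := d.seen[eventID]; dup {
		return false
	}
	d.seen[eventID] = struct{}{}
	return true
}

func main() {
	d := NewDeduper()
	// "evt-1" arrives twice, as it could after a crash and replay.
	for _, id := range []string{"evt-1", "evt-2", "evt-1"} {
		if d.FirstTime(id) {
			fmt.Println("process", id)
		} else {
			fmt.Println("skip duplicate", id)
		}
	}
}
```

The check and the insert happen under one lock, so two goroutines receiving the same replayed event cannot both treat it as new.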
The pipeline uses buffered channels and batch async publishing to achieve high throughput (target: 1-2k TPS).
Buffer sizes control backpressure between pipeline stages:
    RAW_MESSAGE_BUFFER_SIZE=5000     # Increase for bursty workloads
    PARSED_EVENT_BUFFER_SIZE=5000    # Increase if parser is faster than publisher

Batch settings control how events are grouped before publishing:

    BATCH_SIZE=500         # Larger = higher throughput, more latency
    BATCH_TIMEOUT=100ms    # Lower = less latency, more flushes

Set buffer sizes to 0 to revert to unbuffered (sequential) behavior for debugging.
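The interaction between batch size and batch timeout can be sketched with a buffered channel and a timer. This is a simplified illustration, not the engine's code: the `batch` function is a hypothetical name, events are plain strings, and commit boundaries are ignored.

```go
package main

import (
	"fmt"
	"time"
)

// batch reads events from in and calls flush when size events have
// accumulated, when timeout has passed since the first event of the current
// batch, or when in is closed. It returns once in is closed and drained.
func batch(in <-chan string, size int, timeout time.Duration, flush func([]string)) {
	var buf []string
	var deadline <-chan time.Time // nil while the batch is empty, so it never fires

	emit := func() {
		if len(buf) > 0 {
			flush(buf)
			buf = nil // start a fresh slice so flush may keep the old one
		}
		deadline = nil
	}

	for {
		select {
		case ev, ok := <-in:
			if !ok {
				emit() // flush the partial batch before exiting
				return
			}
			if len(buf) == 0 {
				deadline = time.After(timeout) // clock starts at the first event
			}
			buf = append(buf, ev)
			if len(buf) >= size {
				emit()
			}
		case <-deadline:
			emit()
		}
	}
}

func main() {
	// The channel capacity plays the role of PARSED_EVENT_BUFFER_SIZE.
	in := make(chan string, 8)
	for i := 1; i <= 5; i++ {
		in <- fmt.Sprintf("event-%d", i)
	}
	close(in)

	batch(in, 2, 50*time.Millisecond, func(b []string) {
		fmt.Println("flush", len(b), b)
	})
}
```

A larger `size` means fewer flush calls per event at the cost of waiting longer for a batch to fill, while a shorter `timeout` bounds how long a partially filled batch can sit, which mirrors the trade-off described in the settings above.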
- Postgres in compose is configured with `wal_level=logical`, `max_wal_senders=10`, `max_replication_slots=10` and initializes schema, publication, and replication slot via `docker/postgres/init/001_init.sql`.
- If you change the publication/slot names, update both the init SQL and env vars.
- The JetStream stream can be customized via `JetStreamOptions` (stream name, subjects); the defaults target `cdc.>` subjects.
- The wal2json path now emits begin/commit markers so checkpoints persist; `pgoutput` remains the recommended plugin for RDS-like environments.
- Requires Go 1.23+
- Run `go test ./...` (unit-level only; no integration harness yet).
- Code lives under `internal/`; entrypoint `cmd/cdc-handler/main.go`.