Commit d2f859b

Replication docs (#45)
1 parent a5a33a2 commit d2f859b

File tree

6 files changed: 157 additions & 2 deletions

docs/administration/index.md

Lines changed: 1 addition & 0 deletions
@@ -27,6 +27,7 @@ The admin database name is [configurable](../configuration/pgdog.toml/admin.md).
 | `SHOW PREPARED` | List prepared statements currently in the cache. |
 | `SHOW QUERY_CACHE` | List statements currently in the AST cache used for query routing. |
 | [`MAINTENANCE`](maintenance_mode.md) | Pause all queries to synchronize configuration changes across multiple instances of PgDog. |
+| [`SHOW REPLICATION`](replication.md) | Show the status of PostgreSQL replication for each database, including replica lag. |
 
 ## Shutting down PgDog
 

docs/administration/replication.md

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+# Replication
+
+PgDog provides a real-time view into PostgreSQL replication, used for monitoring [replication delay](../features/load-balancer/replication-failover.md#replication) and performing query traffic [failover](../features/load-balancer/replication-failover.md#failover).
+
+You can view this data by connecting to the [admin](index.md) database and running this query:
+
+```
+SHOW REPLICATION;
+```
+
+The following information is returned for each database:
+
+| Name | Description | Example |
+|------|-------------|---------|
+| `id` | Connection pool identifier matching the pools in [`SHOW POOLS`](pools.md). | `4` |
+| `database` | Name of the PostgreSQL database. | `postgres` |
+| `user` | User used to connect to the database. | `postgres` |
+| `addr` | IP address or DNS name of the server. | `127.0.0.1` |
+| `port` | TCP port of the server. | `45001` |
+| `shard` | Shard number of the database. | `0` |
+| `role` | Database role, either `primary` or `replica`. | `replica` |
+| `replica_lag` | Replication lag in milliseconds. | `0` |
+| `pg_lsn` | Current LSN (Log Sequence Number) of the database. | `0/21000168` |
+| `lsn_age` | Time since the last transaction, in milliseconds. | `2245` |
+| `pg_is_in_recovery` | Whether the database is in recovery mode (`t` = true, `f` = false). | `t` |
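
For reference, a single replica row assembled from the example values above might look like this; the exact column layout of `SHOW REPLICATION` output depends on your client and cluster:

```
 id | database |   user   |   addr    | port  | shard |  role   | replica_lag |   pg_lsn   | lsn_age | pg_is_in_recovery
----+----------+----------+-----------+-------+-------+---------+-------------+------------+---------+-------------------
  4 | postgres | postgres | 127.0.0.1 | 45001 |     0 | replica |           0 | 0/21000168 |    2245 | t
```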

docs/configuration/pgdog.toml/databases.md

Lines changed: 1 addition & 2 deletions
@@ -48,8 +48,7 @@ Default: **`5432`**
 
 ### `role`
 
-Type of role this host performs in your database cluster. This can be either `primary` for primary databases that serve writes (and reads),
-and `replica` for PostgreSQL replicas that can only serve reads.
+Type of role this host performs in your database cluster. This can be `primary` for primary databases that serve writes (and reads), `replica` for PostgreSQL replicas that can only serve reads, or `auto` to let PgDog decide (see [failover](../../features/load-balancer/replication-failover.md#failover) for more details).
 
 Default: **`primary`**
 
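To illustrate the `role` options, here is a sketch of a two-host cluster with explicit roles; the database name and host addresses are placeholders reused from the failover example below:

```toml
[[databases]]
name = "prod"
host = "10.0.0.1"
role = "primary"   # serves writes (and reads)

[[databases]]
name = "prod"
host = "10.0.0.2"
role = "replica"   # serves reads only; use "auto" to let PgDog decide
```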

docs/configuration/pgdog.toml/general.md

Lines changed: 20 additions & 0 deletions
@@ -424,3 +424,23 @@ Default: **`15_000`** (15s)
 Overrides the TTL set on DNS records received from DNS servers. Allows for faster failover when the primary/replica hostnames are changed by the database hosting provider.
 
 Default: **none** (disabled)
+
+## Replication
+
+### `lsn_check_delay`
+
+How long to delay checking for [replication delay](../../features/load-balancer/replication-failover.md) after startup.
+
+Default: **`infinity`** (disabled)
+
+### `lsn_check_interval`
+
+How frequently to run the [replication delay](../../features/load-balancer/replication-failover.md) check.
+
+Default: **`5_000`** (5s)
+
+### `lsn_check_timeout`
+
+Maximum amount of time allowed for the [replication delay](../../features/load-balancer/replication-failover.md) query to return a result.
+
+Default: **`5_000`** (5s)
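
Putting these together, a `[general]` block that enables the checks could look like the following sketch; the specific values are illustrative, not recommendations:

```toml
[general]
# Run the first LSN check 10 seconds after launch (default: infinity, i.e. disabled).
lsn_check_delay = 10_000

# Re-fetch replication status every second (default: 5_000 ms).
lsn_check_interval = 1_000

# Give up on the check query after one second (default: 5_000 ms).
lsn_check_timeout = 1_000
```
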
Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
+---
+icon: material/chart-timeline-variant
+---
+
+# Replication and failover
+
+!!! note "Experimental feature"
+    This feature is new and experimental. Make sure to test it before deploying
+    to production.
+
+PgDog has built-in functionality for monitoring the state of Postgres replica databases. If configured, it can also automatically detect when a replica is promoted and redirect write queries to the new primary.
+
+## Replication
+
+When enabled, PgDog will periodically query all databases configured in [`pgdog.toml`](../../configuration/pgdog.toml/databases.md) to fetch the following information:
+
+1. Log Sequence Number (LSN)
+2. Value returned by [`pg_is_in_recovery()`](https://www.postgresql.org/docs/18/functions-admin.html#FUNCTIONS-RECOVERY-CONTROL)
+3. Timestamp of the last transaction
+
+This information can be viewed in real time by querying the [admin](../../administration/index.md) database with the [`SHOW REPLICATION`](../../administration/replication.md) command.
+
+### Replication lag
+
+In addition to fetching raw metrics, PgDog can calculate the replication lag (also known as "replication delay") between the primary and each replica. The lag is measured in milliseconds and calculated with the following steps:
+
+| Step | Description |
+|-|-|
+| Primary LSN | Get the LSN from the primary using `pg_current_wal_lsn()`. |
+| Replica LSN | Get the LSN from each replica using `pg_last_wal_replay_lsn()` or `pg_last_wal_receive_lsn()`. |
+| LSN check | If the two LSNs are identical, replication lag is 0. |
+| Calculate lag | If the two LSNs differ, replication lag is `now() - pg_last_xact_replay_timestamp()`. |
+
+This formula assumes that when the replica's LSN is behind the primary, the primary is still receiving write requests. While this is not always the case, it will show replication lag growing over time if the replication stream is falling behind or is broken.
+
+!!! note "Formula accuracy"
+    It is possible to calculate the exact replication delay in bytes by subtracting a replica's LSN from the primary's. While this provides an exact measurement,
+    that metric isn't very useful: it's hard to translate bytes into a measure of how stale the data on the replica truly is.
+
+    Approximating the lag in milliseconds is more informative and will be reasonably accurate the majority of the time.
+
+### Configuration
+
+By default, PgDog does not query databases for their replication status. To enable this feature, configure it in [`pgdog.toml`](../../configuration/pgdog.toml/general.md#replication):
+
+```toml
+[general]
+# Start running the LSN check immediately.
+lsn_check_delay = 0
+
+# Run the LSN check every second.
+lsn_check_interval = 1_000
+```
+
+| Setting | Description |
+|-|-|
+| `lsn_check_delay` | How long to delay fetching replication status after PgDog launches. By default, this is set to infinity, so the feature is disabled. |
+| `lsn_check_interval` | How frequently to re-fetch the replication status. The query used is fast, so you can configure it to run frequently. |
+
+Decreasing the value of `lsn_check_interval` produces more accurate statistics, at the cost of running additional queries through the same connection pool used by normal client connections.
+
+It's common for PgDog deployments to serve upwards of 30,000-50,000 queries per second per pooler process, so you can run the LSN check query quite frequently without a noticeable impact on system latency.
+
+## Failover
+
+<center>
+<img src="/images/failover.png" width="95%" alt="Failover" />
+</center>
+
+If the `pg_is_in_recovery()` function returns `true`, the database is configured as a standby. It can only serve read queries (e.g. `SELECT`) and is expected to be reasonably up-to-date with the primary database.
+
+Replica databases can be promoted to serve write queries. If that happens, `pg_is_in_recovery()` will start returning `false`. You can read more about this in the [PostgreSQL documentation](https://www.postgresql.org/docs/18/functions-admin.html#FUNCTIONS-RECOVERY-CONTROL).
+
+!!! warning "Failover trigger"
+    PgDog does not detect primary failure and **will not** call `pg_promote()`. It is expected that the databases are managed externally by another tool, like Patroni or AWS RDS, which handles replica promotion.
+
+If the `pg_is_in_recovery()` function returns `false`, PgDog will assume that the database is the primary and will start sending it write query traffic. The old primary is demoted to the replica role.
+
+### Configuration
+
+Failover is disabled by default. To enable it, change all configured databases in [`pgdog.toml`](../../configuration/pgdog.toml/databases.md) to use the `"auto"` role, for example:
+
+```toml
+[[databases]]
+name = "prod"
+host = "10.0.0.1"
+role = "auto"
+
+[[databases]]
+name = "prod"
+host = "10.0.0.2"
+role = "auto"
+```
+
+On startup, PgDog will connect to each database, find out whether it is in recovery, and automatically reload its configuration with the determined roles.
+
+### Split brain
+
+If a replica is promoted while the existing primary is alive and serving queries, write queries can be routed to either database, causing data loss. This type of error is called "split brain", indicating that the database cluster no longer has an authoritative source for the data it's managing.
+
+PgDog doesn't currently protect against this condition: it relies solely on the value returned by `pg_is_in_recovery()` to make its routing decisions.
+
+To avoid causing split-brain failures, make sure to use tools like Patroni or managed offerings like AWS RDS, Aurora and others, which correctly orchestrate failovers. If managing replicas manually, make sure to **shut down** the primary _before_ calling `pg_promote()` on a replica.
+
+!!! note "Rewinding"
+    Since PgDog doesn't trigger failovers, it doesn't decide which (most up-to-date) replica should be the failover candidate. It relies exclusively on the value of `pg_is_in_recovery()` to update its routing tables.
+
+### Logical replication
+
+Replica databases that use [logical replication](https://www.postgresql.org/docs/current/logical-replication.html) to synchronize data are, underneath, regular primaries. PgDog is currently unable to detect which database in a logically replicated cluster is the primary and which are replicas.
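
The replication checks described above can be approximated by hand with the standard Postgres functions this page references; the sketch below is illustrative and not necessarily the exact query PgDog runs:

```sql
-- On the primary: the current write position.
SELECT pg_current_wal_lsn();

-- On each replica: recovery status, replayed/received LSNs,
-- and the time since the last replayed transaction (the lag approximation).
SELECT pg_is_in_recovery()                     AS in_recovery,
       pg_last_wal_replay_lsn()                AS replayed_lsn,
       pg_last_wal_receive_lsn()               AS received_lsn,
       now() - pg_last_xact_replay_timestamp() AS approx_lag;
```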

docs/images/failover.png

61.2 KB

0 commit comments
