Add storage.wal.failover.write_and_sync.latency
#20566
This change marks the `storage.wal.failover.write_and_sync.latency` metric as "Essential" so that it is automatically pulled into the 'Essential Metrics' documentation (e.g., https://www.cockroachlabs.com/docs/stable/essential-metrics-self-hosted.html#storage). This is necessary because we are adding some words about this metric to the docs via cockroachdb/docs#20566. We would then like to backport this change to all supported versions of CockroachDB that have WAL failover (i.e., v24.1 and later).
@sumeerbhola reviewed 1 of 1 files at r1, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @rmloveland)
src/current/_includes/v25.4/wal-failover-metrics.md line 7 at r1 (raw file):
- `storage.wal.failover.switch.count`: Count of the number of times WAL writing has switched from primary to secondary store, and vice versa.
- `storage.wal.fsync.latency` monitors the latencies of WAL files. If you have WAL failover enabled and are failing over, `storage.wal.fsync.latency` will include the latency of the stalled primary.
- `storage.wal.failover.write_and_sync.latency` metric is up one level from `storage.wal.fsync.latency`, and during the failover will report the latency actually observed by higher levels (which should be ~equivalent to the latency of the secondary).
It is not just during the failover. We should say something like:
When WAL failover is configured in a cluster, the operator should monitor this metric which shows the effective latency observed by the higher layer writing to the WAL. This metric is expected to stay low in a healthy system, regardless of whether WAL files are being written to the primary or secondary.
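As a rough illustration of what monitoring this metric could look like, here is a minimal sketch that reads Prometheus-style histogram buckets and checks whether a high quantile stays under a threshold. It rests on several assumptions that are not stated in this thread: that the metric is exported under the name `storage_wal_failover_write_and_sync_latency` (dots replaced by underscores), that values are in nanoseconds, that there is a single series (one store), and that 100 ms is a sensible alert threshold. The helper names are hypothetical.

```python
import re

# Assumed Prometheus-style name (dots replaced by underscores); verify it
# against your cluster's actual metrics output before relying on it.
METRIC = "storage_wal_failover_write_and_sync_latency"

_BUCKET_RE = re.compile(r'^(\w+)_bucket\{(.*)\}\s+(\S+)$')
_LE_RE = re.compile(r'le="([^"]+)"')


def parse_buckets(text, metric=METRIC):
    """Return sorted (upper_bound, cumulative_count) pairs for one histogram.

    Assumes a single series; multiple stores would need grouping by label.
    """
    buckets = []
    for line in text.splitlines():
        m = _BUCKET_RE.match(line.strip())
        if not m or m.group(1) != metric:
            continue
        le = _LE_RE.search(m.group(2))
        if le is None:
            continue
        buckets.append((float(le.group(1)), float(m.group(3))))
    return sorted(buckets)


def quantile_upper_bound(buckets, q):
    """Smallest bucket upper bound whose cumulative count covers quantile q."""
    if not buckets or buckets[-1][1] == 0:
        return None  # no samples recorded yet
    target = q * buckets[-1][1]
    for upper, count in buckets:
        if count >= target:
            return upper
    return buckets[-1][0]


def is_healthy(text, q=0.99, threshold_ns=100e6):
    """True if the q-quantile bound is at or below the (hypothetical) threshold."""
    bound = quantile_upper_bound(parse_buckets(text), q)
    return bound is None or bound <= threshold_ns
```

Because the check looks only at the effective latency seen by the layer writing to the WAL, it applies the same way whether WAL files are currently going to the primary or the secondary.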
Thanks @sumeerbhola, I've updated in 998c744. PTAL. Once we're happy with this change and it's had docs review, I would like to backport it to the WAL failover docs for all versions where this metric is supported. Which previous versions have this metric available? Is it everything v24.1+ or only a subset?
155395: storage: mark add'l WAL latency metric essential r=rmloveland a=rmloveland
This change marks the `storage.wal.failover.write_and_sync.latency` metric as "Essential" so it gets automatically pulled into the 'Essential Metrics' documentation at e.g., https://www.cockroachlabs.com/docs/stable/essential-metrics-self-hosted.html#storage
This is necessary since we are adding some words about this metric to the docs via cockroachdb/docs#20566
We would like to then backport this change to all supported versions of CockroachDB which have WAL failover (i.e., v24.1 and later).
Addresses part of DOC-13184
Co-authored-by: Rich Loveland <rich@cockroachlabs.com>
which previous versions have this metric available? is it everything v24.1+ or only a subset?
Since v24.1; see cockroachdb/cockroach#123232.
@sumeerbhola reviewed 1 of 1 files at r2, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @rmloveland)
Lgtm!
TFTRs! Backporting to all supported versions v24.1+ before merge
Fixes DOC-13184
Summary of changes:
- Add a mention of the `storage.wal.failover.write_and_sync.latency` metric to the `wal-failover-metrics.md` include file, which will pull it into the 'WAL failover' and 'cockroach start' pages.
- We're also doing a cockroachdb/cockroach PR to mark this metric as 'essential', so it shows up in the list of Storage essential metrics at e.g. https://www.cockroachlabs.com/docs/v25.3/essential-metrics-self-hosted.html#storage
998c744 to d0d710f