Add PgBouncer metrics collection to agent #178

Merged
Dhanrajkshirsagar merged 3 commits into main from feature/pgbouncer-metrics
Apr 27, 2026

@Dhanrajkshirsagar

Collects connection pool statistics from the PgBouncer admin console (port 6432) using SHOW POOLS and SHOW STATS. Uses a try-connect approach: when PgBouncer is not running, the collector silently reports Up: false, so the server can track pooler state without any config flag changes.

  • New pgbouncermetrics package: connects via selfhostadmin credential, aggregates pool/stats counters into PgBouncerMetrics struct
  • domain/metrics: adds MetricTypePgBouncer constant and PgBouncerMetrics type
  • metrics service: includes pgbouncer.stats metric set in every push cycle
  • metrics_test: adds MockPgBouncerCollector and updates all Push tests for the new always-included metric set

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@iAziz786 iAziz786 left a comment

[Comment removed by reviewer]


@iAziz786 iAziz786 left a comment

Code review findings for PgBouncer metrics collection

Comment thread internal/pgbouncermetrics/collector.go Outdated
Comment thread internal/pgbouncermetrics/collector.go Outdated
…rages

maxwait_us is the sub-second microsecond remainder of the wait, not the full
duration. Combine with maxwait (whole seconds) using the correct formula:
maxwait*1000 + maxwait_us/1000. Previously maxwait_us alone was used whenever
nonzero, dropping the whole-seconds component entirely.
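
As a minimal sketch of the formula above (the helper name maxWaitMs and the choice of a float64 result are assumptions, not taken from the PR):

```go
package main

import "fmt"

// maxWaitMs combines PgBouncer's maxwait (whole seconds) with maxwait_us
// (the sub-second remainder in microseconds) into milliseconds.
// The function name and float64 return type are illustrative assumptions.
func maxWaitMs(maxwaitSec, maxwaitUs int64) float64 {
	return float64(maxwaitSec)*1000 + float64(maxwaitUs)/1000
}

func main() {
	// 2 s plus a 500000 µs remainder is 2500 ms. Using maxwait_us alone
	// would have reported only 500 ms.
	fmt.Println(maxWaitMs(2, 500000))
}
```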

Latency averages in collectStats are now weighted by avg_query_count so
high-traffic databases dominate the aggregate rather than each database row
contributing equally regardless of query volume.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@iAziz786 iAziz786 left a comment

Second-pass review findings after the follow-up fix

Comment thread internal/pgbouncermetrics/collector.go Outdated
Comment on lines +142 to +151
        qps := parseFloat(row["avg_query_count"])
        m.TotalQueriesPerSec += qps
        weightedQueryTime += parseFloat(row["avg_query_time"]) * qps // microseconds · qps
        weightedWaitTime += parseFloat(row["avg_wait_time"]) * qps // microseconds · qps
        totalWeight += qps
    }

    if totalWeight > 0 {
        m.AvgQueryTimeMs = weightedQueryTime / totalWeight / 1000
        m.AvgWaitTimeMs = weightedWaitTime / totalWeight / 1000

Bug: avg_wait_time is being weighted by query rate, but PgBouncer defines it per server assignment

avg_query_time should be weighted by avg_query_count, but avg_wait_time should not. In PgBouncer source (calc_average in src/stats.c), avg_wait_time = delta wait_time / server_assignment_count, not / query_count.

Weighting wait time by avg_query_count overweights databases with many queries per server assignment (session pooling / multi-query transactions). AvgWaitTimeMs should use avg_server_assignment_count as the weight, or be derived from SHOW TOTALS as total_wait_time / total_server_assignment_count.
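
One way the first suggestion might look, as a sketch only: query time keeps avg_query_count as its weight, while wait time is weighted by avg_server_assignment_count. The statsRow type and aggregate function are hypothetical, the rows are assumed to be already parsed, and the avg_server_assignment_count column is assumed to be present in the SHOW STATS output of the PgBouncer version in use.

```go
package main

import "fmt"

// statsRow is a hypothetical, already-parsed SHOW STATS row.
// Times are in microseconds, matching PgBouncer's units.
type statsRow struct {
	AvgQueryCount            float64
	AvgQueryTimeUs           float64
	AvgServerAssignmentCount float64
	AvgWaitTimeUs            float64
}

// aggregate returns cross-database averages in milliseconds, using a
// separate weight for each metric: query count for query time, server
// assignment count for wait time.
func aggregate(rows []statsRow) (avgQueryMs, avgWaitMs float64) {
	var querySum, queryWeight, waitSum, waitWeight float64
	for _, r := range rows {
		querySum += r.AvgQueryTimeUs * r.AvgQueryCount
		queryWeight += r.AvgQueryCount
		waitSum += r.AvgWaitTimeUs * r.AvgServerAssignmentCount
		waitWeight += r.AvgServerAssignmentCount
	}
	if queryWeight > 0 {
		avgQueryMs = querySum / queryWeight / 1000
	}
	if waitWeight > 0 {
		avgWaitMs = waitSum / waitWeight / 1000
	}
	return avgQueryMs, avgWaitMs
}

func main() {
	rows := []statsRow{
		{AvgQueryCount: 100, AvgQueryTimeUs: 2000, AvgServerAssignmentCount: 10, AvgWaitTimeUs: 1000},
		{AvgQueryCount: 100, AvgQueryTimeUs: 4000, AvgServerAssignmentCount: 30, AvgWaitTimeUs: 3000},
	}
	q, w := aggregate(rows)
	fmt.Println(q, w)
}
```

In this made-up data both databases run the same number of queries, so weighting wait time by query count would give 2 ms, while weighting by server assignments gives 2.5 ms because the second database acquires servers three times as often.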

lib/pq returns the NUMERIC columns of PgBouncer's SHOW STATS and SHOW POOLS
output as []byte. fmt.Sprintf("%v", v) on a []byte yields "[49 50 46 53]"
rather than "12.5", so parseFloat/parseInt returned 0. Add a type switch
that calls string(v) for []byte values so avg_query_count, avg_query_time,
and avg_wait_time parse correctly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@Dhanrajkshirsagar Dhanrajkshirsagar merged commit 76d6375 into main Apr 27, 2026
1 check passed
@Dhanrajkshirsagar Dhanrajkshirsagar deleted the feature/pgbouncer-metrics branch April 27, 2026 16:21