Conversation
Signed-off-by: Tadas Sutkaitis <tadas.sutkaitis@vexxhost.com>
Pull request overview
Adds a Heat (OpenStack Orchestration) collector backed by sqlc-generated MySQL queries to export stack status metrics to Prometheus.
Changes:
- Added sqlc configuration plus Heat schema/query definitions and generated Go DB accessors.
- Implemented Heat orchestration and stacks collectors emitting per-stack status, totals, and an up metric.
- Wired the Heat collector into the global registry and CLI/config flags.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| sqlc.yaml | Adds a new sqlc target for Heat (MySQL) to generate Go query code. |
| sql/heat/schema.sql | Defines the Heat stack table schema used by the collector queries. |
| sql/heat/queries.sql | Introduces GetStackMetrics query for collecting stack attributes/status. |
| internal/db/heat/db.go | sqlc-generated DB wrapper for Heat queries. |
| internal/db/heat/models.go | sqlc-generated models for Heat tables. |
| internal/db/heat/queries.sql.go | sqlc-generated implementation for GetStackMetrics. |
| internal/collector/heat/heat.go | Registers Heat collectors with a registry when configured. |
| internal/collector/heat/orchestration.go | Adds orchestration up metric and delegates to stacks collector. |
| internal/collector/heat/stacks.go | Implements stack metrics collection and status mapping. |
| internal/collector/heat/orchestration_test.go | Tests up behavior based on stacks collection success/failure. |
| internal/collector/heat/stacks_test.go | Tests stack metrics output for success/empty/error cases. |
| internal/collector/collector.go | Adds Heat DB URL to config and registers Heat collectors. |
| cmd/openstack-database-exporter/main.go | Adds --heat.database-url flag and plumbs it into config. |
```go
// Initialize status counters
stackStatusCounter := make(map[string]int, len(knownStackStatuses))
for _, status := range knownStackStatuses {
	stackStatusCounter[status] = 0
}
```
Emitting stack_status_counter by iterating over a map produces non-deterministic metric output order, which can make scrapes and unit tests flaky. Prefer emitting counters in a stable order (e.g., iterate over knownStackStatuses and read stackStatusCounter[status]).
internal/collector/heat/stacks.go
Outdated
```go
// Stack status counter metrics
for status, count := range stackStatusCounter {
	ch <- prometheus.MustNewConstMetric(
		stackStatusCounterDesc,
		prometheus.GaugeValue,
		float64(count),
		status,
	)
}
```
Emitting stack_status_counter by iterating over a map produces non-deterministic metric output order, which can make scrapes and unit tests flaky. Prefer emitting counters in a stable order (e.g., iterate over knownStackStatuses and read stackStatusCounter[status]).
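A minimal, self-contained sketch of what the reviewer is suggesting (the `knownStackStatuses` slice is stubbed here with a few illustrative values, not the PR's full list): ranging over the slice instead of the map makes the emission order deterministic.

```go
package main

import "fmt"

// Stubbed stand-in for the PR's knownStackStatuses slice.
var knownStackStatuses = []string{"CREATE_COMPLETE", "CREATE_FAILED", "DELETE_COMPLETE"}

// emitStable renders counters in the fixed slice order rather than
// ranging over the map, so the output order never varies between runs.
func emitStable(counts map[string]int) []string {
	out := make([]string, 0, len(knownStackStatuses))
	for _, status := range knownStackStatuses {
		out = append(out, fmt.Sprintf("%s=%d", status, counts[status]))
	}
	return out
}

func main() {
	counts := map[string]int{"CREATE_FAILED": 2, "CREATE_COMPLETE": 5}
	fmt.Println(emitStable(counts)) // [CREATE_COMPLETE=5 CREATE_FAILED=2 DELETE_COMPLETE=0]
}
```

In the real collector, `emitStable`'s loop body would wrap each value in `prometheus.MustNewConstMetric` as the PR already does; only the iteration source changes.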
internal/collector/heat/stacks.go
Outdated
```go
// Count status occurrences
stackStatusCounter[stack.Status]++
```
Incrementing stackStatusCounter[stack.Status] will create new entries for unexpected/empty statuses, which then get exported as additional time series (and also impacts output ordering). If you only want the known statuses, guard the increment by checking membership in knownStackStatuses (or normalize unknowns into a single label such as UNKNOWN).
Suggested change:
```diff
-// Count status occurrences
-stackStatusCounter[stack.Status]++
+// Count status occurrences only for known statuses
+if _, ok := stackStatusCounter[stack.Status]; ok {
+	stackStatusCounter[stack.Status]++
+}
```
```go
func mapStackStatusValue(status string) int {
	for idx, s := range knownStackStatuses {
		if status == s {
			return idx
		}
	}
	return -1
}
```
mapStackStatusValue performs a linear scan of knownStackStatuses for every stack. For large numbers of stacks this becomes unnecessary overhead; consider building a map[string]int (status -> value) once (e.g., in init() or as a package-level var) and doing O(1) lookups.
```go
func (c *StacksCollector) Collect(ch chan<- prometheus.Metric) error {
	ctx := context.Background()

	stacks, err := c.queries.GetStackMetrics(ctx)
```
Using context.Background() means the DB query cannot be cancelled and has no upper bound on runtime during a scrape. Consider using context.WithTimeout (with a reasonable scrape-aligned timeout) so a slow/bad DB doesn’t hang collection.
Signed-off-by: Tadas Sutkaitis <tadas.sutkaitis@vexxhost.com>
Pull request overview
Copilot reviewed 13 out of 13 changed files in this pull request and generated 6 comments.
```go
stackStatusDesc = prometheus.NewDesc(
	prometheus.BuildFQName(Namespace, Subsystem, "stack_status"),
	"stack_status",
	[]string{
		"id",
		"name",
		"project_id",
		"status",
		"action",
	},
	nil,
)

stackStatusCounterDesc = prometheus.NewDesc(
	prometheus.BuildFQName(Namespace, Subsystem, "stack_status_counter"),
	"stack_status_counter",
	[]string{
		"status",
	},
	nil,
)
```
The metric help strings are identical to the metric names, which makes them much less useful in Prometheus/UIs. Also, stack_status_counter is emitted as a gauge (counts per scrape) but its name suggests a Prometheus counter; consider renaming to something like stack_status_count (or similar) and updating the help text to describe what the metric represents (e.g., “Current number of stacks in each status”).
Skip - original exporter is using stack_status_counter
internal/collector/heat/stacks.go
Outdated
```go
stackStatusValueMap = func() map[string]int {
	m := make(map[string]int, len(knownStackStatuses))
	for idx, s := range knownStackStatuses {
		m[s] = idx
	}
	return m
}()
```
stack_status encodes the status as an index based on slice order, which makes the metric values fragile: inserting/reordering statuses will silently change the numeric meaning and break dashboards/alerts. Consider using a stable, explicit mapping (map literal with fixed values), or avoid numeric encoding entirely (e.g., emit value 1 and rely on the status label since it’s already present per stack).
Suggested change:
```diff
-stackStatusValueMap = func() map[string]int {
-	m := make(map[string]int, len(knownStackStatuses))
-	for idx, s := range knownStackStatuses {
-		m[s] = idx
-	}
-	return m
-}()
+// NOTE: Values are explicitly assigned to keep metric encodings stable even
+// if knownStackStatuses is reordered or modified.
+stackStatusValueMap = map[string]int{
+	"INIT_IN_PROGRESS":     0,
+	"INIT_FAILED":          1,
+	"INIT_COMPLETE":        2,
+	"CREATE_IN_PROGRESS":   3,
+	"CREATE_FAILED":        4,
+	"CREATE_COMPLETE":      5,
+	"DELETE_IN_PROGRESS":   6,
+	"DELETE_FAILED":        7,
+	"DELETE_COMPLETE":      8,
+	"UPDATE_IN_PROGRESS":   9,
+	"UPDATE_FAILED":        10,
+	"UPDATE_COMPLETE":      11,
+	"ROLLBACK_IN_PROGRESS": 12,
+	"ROLLBACK_FAILED":      13,
+	"ROLLBACK_COMPLETE":    14,
+	"SUSPEND_IN_PROGRESS":  15,
+	"SUSPEND_FAILED":       16,
+	"SUSPEND_COMPLETE":     17,
+	"RESUME_IN_PROGRESS":   18,
+	"RESUME_FAILED":        19,
+	"RESUME_COMPLETE":      20,
+	"ADOPT_IN_PROGRESS":    21,
+	"ADOPT_FAILED":         22,
+	"ADOPT_COMPLETE":       23,
+	"SNAPSHOT_IN_PROGRESS": 24,
+	"SNAPSHOT_FAILED":      25,
+	"SNAPSHOT_COMPLETE":    26,
+	"CHECK_IN_PROGRESS":    27,
+	"CHECK_FAILED":         28,
+	"CHECK_COMPLETE":       29,
+}
```
```go
// Collect metrics from stacks collector and track errors
if err := c.stacksCollector.Collect(ch); err != nil {
	c.logger.Error("stacks collector failed", "error", err)
```
This logs errors that are already logged inside StacksCollector.Collect (which logs "failed to query stacks" before returning the error), resulting in duplicate error logs for the same failure. Prefer logging in one place (either remove logging inside StacksCollector and let the orchestrator own it, or keep it in StacksCollector and avoid logging again here).
```go
c.logger.Error("stacks collector failed", "error", err)
```
```go
return &OrchestrationCollector{
	logger: logger,
```
StacksCollector enriches its logger with namespace/subsystem/collector fields, but OrchestrationCollector uses the raw logger, so its logs may miss consistent context. Consider applying the same logger.With(...) pattern here (e.g., include namespace/subsystem/collector like "collector", "orchestration"), especially since this component emits the service-level up metric.
Suggested change:
```diff
-return &OrchestrationCollector{
-	logger: logger,
+orchestrationLogger := logger.With(
+	"namespace", Namespace,
+	"subsystem", Subsystem,
+	"collector", "orchestration",
+)
+return &OrchestrationCollector{
+	logger: orchestrationLogger,
```
sql/heat/queries.sql
Outdated
```sql
COALESCE(s.tenant, '') as tenant,
s.created_at,
s.updated_at,
s.deleted_at,
COALESCE(s.nested_depth, 0) as nested_depth,
COALESCE(s.disable_rollback, false) as disable_rollback
```
The collector currently only uses id, name, status, action, and tenant, but the query also selects timestamps, deleted_at, nested_depth, and disable_rollback. Selecting and scanning unused columns increases row size and scan overhead; consider trimming the SELECT list to only what the collector consumes (and regenerate sqlc output accordingly).
Suggested change:
```diff
-COALESCE(s.tenant, '') as tenant,
-s.created_at,
-s.updated_at,
-s.deleted_at,
-COALESCE(s.nested_depth, 0) as nested_depth,
-COALESCE(s.disable_rollback, false) as disable_rollback
+COALESCE(s.tenant, '') as tenant
```
```go
func mapStackStatusValue(status string) int {
	if v, ok := stackStatusValueMap[status]; ok {
		return v
	}
	return -1
}
```
The “unknown status” behavior (returning -1) isn’t covered by the new unit tests. Add a test row with an unexpected status value and assert the exported openstack_orchestration_stack_status sample uses -1 (and that it does not affect stack_status_counter counts).
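A sketch of the assertion the reviewer is asking for, reduced to the mapping function itself (a full exporter test would exercise the collector output instead). The trimmed two-entry map is illustrative, not the PR's full status map.

```go
package main

import "fmt"

// Trimmed stand-in for the PR's explicit status-value map.
var stackStatusValueMap = map[string]int{
	"CREATE_COMPLETE": 5,
	"CREATE_FAILED":   4,
}

func mapStackStatusValue(status string) int {
	if v, ok := stackStatusValueMap[status]; ok {
		return v
	}
	return -1
}

func main() {
	// An unexpected status should encode as -1, leaving known values untouched.
	fmt.Println(mapStackStatusValue("TOTALLY_UNEXPECTED")) // -1
	fmt.Println(mapStackStatusValue("CREATE_COMPLETE"))    // 5
}
```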
Signed-off-by: Tadas Sutkaitis <tadas.sutkaitis@vexxhost.com>
No description provided.