Skip to content

fix: calculate value of total field in OpenObserve logs, traces and spans response using a count query#65

Merged
nilushancosta merged 1 commit intoopenchoreo:mainfrom
nilushancosta:observability
Mar 20, 2026
Merged

fix: calculate value of total field in OpenObserve logs, traces and spans response using a count query#65
nilushancosta merged 1 commit intoopenchoreo:mainfrom
nilushancosta:observability

Conversation

@nilushancosta
Copy link
Copy Markdown
Contributor

@nilushancosta nilushancosta commented Mar 20, 2026

Purpose

The total field in query responses for both traces and logs endpoints returns an incorrect count that is capped at the limit parameter value, rather than the true number of matching records. This breaks pagination logic for API consumers.
openchoreo/openchoreo#2889

Goals

Return the accurate total count of matching records in the total field, independent of the limit/size parameter, for all query endpoints across both the tracing and logs modules.

Approach

Each query endpoint now executes a separate count query against OpenObserve's _search API with size: 0 and a SELECT count(*) as total SQL statement (using count(distinct trace_id) for traces) with the same filter conditions as the original query. The true total is extracted from hits[0].total in the count response.

Summary by CodeRabbit

  • Chores

    • Updated observability-logs-openobserve Helm chart version to 0.4.2
    • Updated observability-tracing-openobserve Helm chart version to 0.2.1
  • Documentation

    • Updated installation examples with new Helm chart versions
  • New Features

    • Enhanced accuracy of total result counts in component and workflow logs
    • Enhanced accuracy of total result counts in traces and spans

…pans response using a count query

Signed-off-by: Nilushan Costa <nilushan@wso2.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 20, 2026

📝 Walkthrough

Walkthrough

Version updates for two Helm charts (logs 0.4.1→0.4.2, tracing 0.2.0→0.2.1). Added separate count query execution paths in both logging and tracing clients to retrieve total counts independently from paginated search results, with new query builders and count-response parsing helpers.

Changes

Cohort / File(s) Summary
Version Bumps (Logs)
observability-logs-openobserve/README.md, observability-logs-openobserve/helm/Chart.yaml
Updated Helm chart version from 0.4.1 to 0.4.2 in all installation examples and chart metadata.
Version Bumps (Tracing)
observability-tracing-openobserve/README.md, observability-tracing-openobserve/helm/Chart.yaml
Updated Helm chart version from 0.2.0 to 0.2.1 in installation commands and chart metadata.
Observability Logs - Count Query Logic
observability-logs-openobserve/internal/openobserve/client.go, observability-logs-openobserve/internal/openobserve/queries.go, observability-logs-openobserve/internal/openobserve/client_test.go
Added separate count query execution for GetComponentLogs and GetWorkflowLogs methods. Introduced generateComponentLogsCountQuery and generateWorkflowLogsCountQuery builders. Added extractTotalCount helper and test utilities (isCountQuery) to differentiate and handle count-focused responses.
Observability Tracing - Count Query Logic
observability-tracing-openobserve/internal/openobserve/client.go, observability-tracing-openobserve/internal/openobserve/queries.go, observability-tracing-openobserve/internal/openobserve/client_test.go
Added separate count query execution for GetTraces and GetSpans methods. Introduced generateTracesCountQuery and generateSpansCountQuery builders. Added extractTotalCount helper and test utilities (isCountQuery) to parse count responses and handle branching logic in test handlers.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A count query hops through the logs and traces so fine,
Separate paths now tally totals, no need to divide,
Charts ascend to newer versions with grace,
Each response extracted with care and with place!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 65.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: implementing count queries to fix the total field calculation in OpenObserve responses for logs, traces and spans.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov-commenter
Copy link
Copy Markdown

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (4)
observability-tracing-openobserve/internal/openobserve/client.go (1)

327-342: Consider adding a nil check for defensive programming.

The function handles missing hits and missing "total" field gracefully, but will panic if resp is nil. While current callers always pass non-nil responses after successful executeSearchQuery, a nil check would make this helper more robust for future use.

♻️ Optional: Add nil check
 func extractTotalCount(resp *OpenObserveResponse) int {
+	if resp == nil {
+		return 0
+	}
 	if len(resp.Hits) > 0 {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@observability-tracing-openobserve/internal/openobserve/client.go` around
lines 327 - 342, The helper extractTotalCount should guard against a nil
response to avoid panics; update extractTotalCount to check if resp == nil (and
return 0) before accessing resp.Hits, keeping the existing logic for handling
empty hits and missing "total"; reference the extractTotalCount function and the
resp parameter to locate where to add the nil check.
observability-logs-openobserve/internal/openobserve/client.go (1)

187-198: Silent fallback to 0 when count extraction fails.

The extractTotalCount function returns 0 when it cannot extract the total count (empty hits, missing "total" key, or wrong type). This could mask issues where the count query returns an unexpected response format.

Consider logging a warning when extraction fails, to aid debugging:

♻️ Suggested improvement with logging
 // extractTotalCount extracts the total count from a count query response.
 // The response is expected to have hits[0].total as the count value.
-func extractTotalCount(resp *OpenObserveResponse) int {
+func extractTotalCount(resp *OpenObserveResponse, logger *slog.Logger) int {
 	if len(resp.Hits) > 0 {
 		if total, ok := resp.Hits[0]["total"]; ok {
 			if v, ok := total.(float64); ok {
 				return int(v)
 			}
+			logger.Warn("Unexpected type for total count", slog.Any("total", total))
 		}
+	} else {
+		logger.Warn("Count query returned no hits")
 	}
 	return 0
 }

This would require updating the call sites to pass the logger:

TotalCount: extractTotalCount(countResp, c.logger),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@observability-logs-openobserve/internal/openobserve/client.go` around lines
187 - 198, The extractTotalCount function currently swallows failures and
returns 0; change its signature to accept a logger (e.g., extractTotalCount(resp
*OpenObserveResponse, logger Logger)) and inside, when hits are empty, "total"
missing, or type assertion fails, emit a warning via logger with contextual info
(e.g., what part failed and resp contents) before returning 0; update all call
sites (e.g., where TotalCount: extractTotalCount(countResp, c.logger) is set) to
pass c.logger, and ensure the logger type matches the existing c.logger (or
adapt to the project's logging interface).
observability-logs-openobserve/internal/openobserve/queries.go (1)

171-176: Debug logging uses fmt.Printf instead of structured logger.

The debug logging in both count query functions uses fmt.Printf which bypasses the structured logger and goes directly to stdout. This is inconsistent with the rest of the codebase and makes log aggregation harder.

Consider using the logger parameter for consistency:

♻️ Suggested refactor for generateComponentLogsCountQuery (lines 171-176)
 	if logger.Enabled(nil, slog.LevelDebug) {
 		if prettyJSON, err := json.MarshalIndent(query, "", "    "); err == nil {
-			fmt.Printf("Generated count query for component logs:\n")
-			fmt.Println(string(prettyJSON))
+			logger.Debug("Generated count query for component logs", slog.String("query", string(prettyJSON)))
 		}
 	}
♻️ Suggested refactor for generateWorkflowLogsCountQuery (lines 217-222)
 	if logger.Enabled(nil, slog.LevelDebug) {
 		if prettyJSON, err := json.MarshalIndent(query, "", "    "); err == nil {
-			fmt.Printf("Generated count query for workflow logs:\n")
-			fmt.Println(string(prettyJSON))
+			logger.Debug("Generated count query for workflow logs", slog.String("query", string(prettyJSON)))
 		}
 	}

Note: The same pattern exists in the other query generation functions (lines 284-288, 352-357, 436-441), so this is a pre-existing pattern. Consider addressing all of them in a follow-up.

Also applies to: 217-222

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@observability-logs-openobserve/internal/openobserve/queries.go` around lines
171 - 176, Replace the direct stdout debug prints in
generateComponentLogsCountQuery and generateWorkflowLogsCountQuery with
structured logs using the provided logger: when logger.Enabled(nil,
slog.LevelDebug) and json.MarshalIndent succeeds, call the logger's debug-level
method (e.g., logger.Debug / logger.Log with slog.LevelDebug) and pass a clear
message plus the pretty JSON as a structured field (e.g., "query" or
"pretty_query") instead of using fmt.Printf / fmt.Println; apply the same change
to the other similar helpers (the other query generation functions flagged) for
consistency.
observability-logs-openobserve/internal/openobserve/client_test.go (1)

63-155: Consider adding test coverage for count query failures.

The current tests cover the happy path where both the search query and count query succeed. However, the client code can fail at the count query stage (lines 226-233 and 266-273 in client.go). Consider adding tests that verify proper error handling when:

  1. The count query generation fails
  2. The count query execution fails (e.g., server returns error for count query but not for main query)

Would you like me to generate test cases for count query failure scenarios?

Also applies to: 175-229

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@observability-logs-openobserve/internal/openobserve/client.go`:
- Around line 187-198: The extractTotalCount function currently swallows
failures and returns 0; change its signature to accept a logger (e.g.,
extractTotalCount(resp *OpenObserveResponse, logger Logger)) and inside, when
hits are empty, "total" missing, or type assertion fails, emit a warning via
logger with contextual info (e.g., what part failed and resp contents) before
returning 0; update all call sites (e.g., where TotalCount:
extractTotalCount(countResp, c.logger) is set) to pass c.logger, and ensure the
logger type matches the existing c.logger (or adapt to the project's logging
interface).

In `@observability-logs-openobserve/internal/openobserve/queries.go`:
- Around line 171-176: Replace the direct stdout debug prints in
generateComponentLogsCountQuery and generateWorkflowLogsCountQuery with
structured logs using the provided logger: when logger.Enabled(nil,
slog.LevelDebug) and json.MarshalIndent succeeds, call the logger's debug-level
method (e.g., logger.Debug / logger.Log with slog.LevelDebug) and pass a clear
message plus the pretty JSON as a structured field (e.g., "query" or
"pretty_query") instead of using fmt.Printf / fmt.Println; apply the same change
to the other similar helpers (the other query generation functions flagged) for
consistency.

In `@observability-tracing-openobserve/internal/openobserve/client.go`:
- Around line 327-342: The helper extractTotalCount should guard against a nil
response to avoid panics; update extractTotalCount to check if resp == nil (and
return 0) before accessing resp.Hits, keeping the existing logic for handling
empty hits and missing "total"; reference the extractTotalCount function and the
resp parameter to locate where to add the nil check.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d6293c27-2e35-49db-98e5-e1dbf3b6490c

📥 Commits

Reviewing files that changed from the base of the PR and between db55c2a and 4dc5c77.

⛔ Files ignored due to path filters (4)
  • observability-logs-openobserve/internal/api/gen/models.gen.go is excluded by !**/gen/**
  • observability-logs-openobserve/internal/api/gen/server.gen.go is excluded by !**/gen/**
  • observability-tracing-openobserve/internal/api/gen/models.gen.go is excluded by !**/gen/**
  • observability-tracing-openobserve/internal/api/gen/server.gen.go is excluded by !**/gen/**
📒 Files selected for processing (10)
  • observability-logs-openobserve/README.md
  • observability-logs-openobserve/helm/Chart.yaml
  • observability-logs-openobserve/internal/openobserve/client.go
  • observability-logs-openobserve/internal/openobserve/client_test.go
  • observability-logs-openobserve/internal/openobserve/queries.go
  • observability-tracing-openobserve/README.md
  • observability-tracing-openobserve/helm/Chart.yaml
  • observability-tracing-openobserve/internal/openobserve/client.go
  • observability-tracing-openobserve/internal/openobserve/client_test.go
  • observability-tracing-openobserve/internal/openobserve/queries.go

@nilushancosta nilushancosta merged commit 3744f20 into openchoreo:main Mar 20, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants