@aptend aptend commented Nov 25, 2025

Optimize CLUSTER BY for statement_info and metric tables

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

Related to https://github.com/matrixorigin/MO-Cloud/issues/5327

What this PR does / why we need it:

This PR optimizes the CLUSTER BY configuration for statement_info and metric tables to improve query performance and fix zonemap filtering issues.

Changes

  1. statement_info table: Changed the CLUSTER BY key from one based on account (varchar) to (request_at, account_id)
  2. metric table: Changed CLUSTER BY from (account, metric_name, collecttime) to (collecttime, account_id, metric_name)

Problem

The previous CLUSTER BY configuration used account (varchar(1024)) as the first clustering key. This caused issues because:

  • Long account names can overflow the zonemap, making it ineffective for filtering
  • String-based clustering is less efficient than integer-based clustering
  • Time-based queries are common but time fields were not prioritized in the clustering order
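
The first two points can be illustrated with a toy model. This is only a sketch under a stated assumption: the zonemap is modeled as keeping a fixed-length prefix of each block's min and max string. `PREFIX_LEN`, the function names, and the sample account names are all hypothetical and are not taken from the MatrixOne code.

```python
# Toy zonemap model (assumption): only a fixed-length prefix of the
# min and max string is kept per block. PREFIX_LEN is illustrative.
PREFIX_LEN = 8


def string_zonemap(values, prefix_len=PREFIX_LEN):
    """Return the truncated (min, max) pair for a block of strings."""
    return min(values)[:prefix_len], max(values)[:prefix_len]


def string_block_may_match(zonemap, target, prefix_len=PREFIX_LEN):
    """Truncation preserves order non-strictly, so comparing prefixes
    never skips a matching block, but it may fail to skip others."""
    lo, hi = zonemap
    return lo <= target[:prefix_len] <= hi


def int_zonemap(values):
    """Return the exact (min, max) pair for a block of integers."""
    return min(values), max(values)


def int_block_may_match(zonemap, target):
    lo, hi = zonemap
    return lo <= target <= hi


# Two blocks holding the same logical accounts, once by name, once by id.
name_blocks = [
    ["customer-account-0001", "customer-account-0002"],
    ["customer-account-0003", "customer-account-0004"],
]
id_blocks = [[1, 2], [3, 4]]

names_scanned = sum(
    string_block_may_match(string_zonemap(b), "customer-account-0003")
    for b in name_blocks
)
ids_scanned = sum(int_block_may_match(int_zonemap(b), 3) for b in id_blocks)

print(names_scanned, ids_scanned)  # prints "2 1"
```

In this model every account name shares the same first eight characters, so the string zonemap cannot exclude any block, while the integer zonemap skips the block that cannot contain the target id.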

Solution

  • Replace account (varchar) with account_id (int) in CLUSTER BY to avoid zonemap overflow
  • Place time fields (request_at/collecttime) first in the clustering order to optimize time-range queries
  • This ensures zonemap filtering works correctly and improves query performance for time-based and account-based filters
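
The effect of putting the time field first can be sketched with a small simulation. It assumes rows are sorted by the clustering key, split into fixed-size blocks, and that each block keeps a min/max zonemap on the time column. The row layout, block size, and function names are hypothetical, chosen only to make the pruning visible.

```python
from itertools import product


def make_blocks(rows, key, block_size):
    """Sort rows by the clustering key and cut them into blocks."""
    ordered = sorted(rows, key=key)
    return [ordered[i:i + block_size] for i in range(0, len(ordered), block_size)]


def blocks_to_scan(blocks, t_lo, t_hi):
    """Count blocks whose time min/max overlaps the range [t_lo, t_hi]."""
    count = 0
    for block in blocks:
        times = [t for t, _ in block]
        if min(times) <= t_hi and max(times) >= t_lo:
            count += 1
    return count


# 100 timestamps x 10 accounts; each row is (time, account_id).
rows = [(t, acc) for t, acc in product(range(100), range(10))]

# Time first, as in the new key order.
time_first = make_blocks(rows, key=lambda r: (r[0], r[1]), block_size=100)
# Account first, as in the old key order (with an integer id for simplicity).
account_first = make_blocks(rows, key=lambda r: (r[1], r[0]), block_size=100)

print(blocks_to_scan(time_first, 20, 29))     # prints 1
print(blocks_to_scan(account_first, 20, 29))  # prints 10
```

With the account first, every block spans the full time range, so a time-range filter cannot skip anything. With the time first, only the block covering the requested range is read. The trade-off in this model is that a filter on account alone prunes less under the time-first order.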

Implementation Details

  • Added upgrade scripts in pkg/bootstrap/versions/v4_0_0/ to migrate existing tables
  • Updated table definitions in pkg/util/trace/impl/motrace/schema.go and pkg/util/metric/mometric/metric.go
  • Updated related query builder logic to handle the new clustering configuration
  • Added test cases to verify the changes

@matrix-meow matrix-meow added the size/XL label (Denotes a PR that changes [1000, 1999] lines) Nov 25, 2025

Labels

kind/enhancement, size/XL (Denotes a PR that changes [1000, 1999] lines)