Conversation

@Alex-Burmak
Owner

@Alex-Burmak Alex-Burmak commented May 23, 2025

Summary by Sourcery

Rewrite ClickHouse database migration to use direct ZooKeeper transactions for Replicated databases, remove the temporary-database workaround, and add robust schema consistency checks.

New Features:

  • Introduce create_database_nodes to initialize Replicated database nodes in ZooKeeper without creating a temporary database.
  • Implement migrate_as_first_replica and migrate_as_non_first_replica flows with ZooKeeper transactions, including counter generation and query/log node setup.
  • Add get_shard_and_replica_from_macros helper to extract shard and replica identifiers from ClickHouse macros.
  • Implement schema consistency checking (is_table_schema_equal) to abort migration when local table metadata diverges from ZooKeeper.

Enhancements:

  • Refactor migration logic to remove the temporary database approach and consolidate detach/attach helpers.
  • Refactor change_table_uuid to return a change flag and introduce read_local_table_metadata for metadata comparison.
  • Simplify migrate_engine_command to detect first vs. non-first replica based on NodeExistsError from create_database_nodes.
  • Improve logging, error handling, and transaction result validation in migration routines.

Tests:

  • Clean up existing database migration feature tests by removing redundant restarts and sleeps.
  • Add a new scenario to validate schema-drift detection for MergeTree tables before and after migration.

@sourcery-ai

sourcery-ai bot commented May 23, 2025

Reviewer's Guide

This PR overhauls the database migration flow by replacing the temporary-database approach with ZooKeeper transactions for initializing Replicated databases. It also streamlines the first- and non-first-replica logic, adds metadata consistency checks, improves macro handling, and updates the related CLI commands and tests.

Sequence Diagram: First Replica Database Migration

sequenceDiagram
    actor User
    participant CLI as migrate_engine_command
    participant ZK as Zookeeper
    participant CH as ClickHouse
    participant FS as LocalMetadataStorage

    User->>CLI: Execute migrate_engine_command(database)
    CLI->>ZK: create_database_nodes(database) # Initial ZK setup for DB
    ZK-->>CLI: Nodes created (determines first replica)

    CLI->>ZK: migrate_as_first_replica(database)
    activate ZK
    ZK->>ZK: Generate counter for DDL queue
    ZK->>ZK: Start Transaction
    ZK->>ZK: Create first_replica_database_name node
    ZK->>ZK: Create query log nodes
    ZK->>ZK: Create database replica nodes (incl. serverUUID from CH)
    ZK->>FS: Read local table metadata files
    FS-->>ZK: Table metadata content
    ZK->>ZK: Create table metadata nodes in ZK
    ZK->>ZK: Commit Transaction
    deactivate ZK
    ZK-->>CLI: ZK setup successful

    CLI->>CH: DETACH DATABASE database
    CH-->>CLI: Database detached
    CLI->>FS: Update local DB .sql file (set_replicated)
    FS-->>CLI: Local metadata updated
    CLI->>CH: ATTACH DATABASE database
    CH-->>CLI: Database attached

Sequence Diagram: Non-First Replica Database Migration

sequenceDiagram
    actor User
    participant CLI as migrate_engine_command
    participant ZK as Zookeeper
    participant CH as ClickHouse
    participant FS as LocalMetadataStorage

    User->>CLI: Execute migrate_engine_command(database)
    CLI->>ZK: create_database_nodes(database) # Attempt initial ZK setup
    ZK-->>CLI: NodeExistsError (determines non-first replica)

    CLI->>ZK: migrate_as_non_first_replica(database)
    activate ZK
    ZK->>ZK: Generate counter for DDL queue
    ZK->>ZK: Start Transaction
    ZK->>ZK: Create query log nodes
    ZK->>ZK: Create database replica nodes (incl. serverUUID from CH)
    ZK->>ZK: Commit Transaction
    deactivate ZK
    ZK-->>CLI: ZK replica setup successful

    CLI->>CH: Get tables info (name, uuid, metadata_path, engine)
    CH-->>CLI: Tables info
    CLI->>CH: DETACH DATABASE database
    CH-->>CLI: Database detached

    CLI->>ZK: Check table schema consistency
    activate ZK
    ZK->>FS: Read local table metadata file
    FS-->>ZK: Local table metadata content
    ZK->>ZK: Compare local metadata (sans UUID) with ZK metadata
    deactivate ZK
    ZK-->>CLI: Consistency check result

    CLI->>FS: Update local DB .sql file (set_replicated)
    FS-->>CLI: Local DB metadata updated
    CLI->>FS: Change table UUIDs in local .sql files if different from ZK
    FS-->>CLI: Local table UUIDs updated (or not)

    alt Table UUIDs were changed
        CLI->>User: Inform ClickHouse restart needed
    else No UUID changes or non-critical differences
        CLI->>CH: ATTACH DATABASE database
        CH-->>CLI: Database attached
    end
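
The first-versus-non-first decision shown in the two diagrams above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `NodeExistsError` is a local stand-in for the kazoo exception, and `detect_first_replica`, `fake_create_database_nodes`, and the `/example/db` path are hypothetical names used only to exercise the flow without a real ZooKeeper.

```python
from typing import Callable, Set


class NodeExistsError(Exception):
    """Local stand-in for kazoo.exceptions.NodeExistsError (assumption)."""


def detect_first_replica(create_database_nodes: Callable[[], None]) -> bool:
    """Return True if this host created the database nodes, False if they existed."""
    try:
        create_database_nodes()
    except NodeExistsError:
        # Another replica already initialized the nodes: migrate as non-first.
        return False
    return True


# Hypothetical in-memory substitute for ZooKeeper, used only for this sketch.
existing_paths: Set[str] = set()


def fake_create_database_nodes(path: str = "/example/db") -> None:
    """Create the path once; raise NodeExistsError on every later attempt."""
    if path in existing_paths:
        raise NodeExistsError(path)
    existing_paths.add(path)


first = detect_first_replica(fake_create_database_nodes)
second = detect_first_replica(fake_create_database_nodes)
```

Here the first call plays the role of the first replica and the second call the role of any later replica.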

Class Diagram: Update to DatabaseMetadata and Zookeeper Interaction

classDiagram
    class DatabaseMetadata {
        +str database_name
        +DatabaseEngine database_engine
        +str replica_path
        +str shard
        +str replica_name
        +set_replicated() void
        +update_metadata_file() void
    }
    class KazooClient {
        +transaction() TransactionRequest
    }
    class TransactionRequest {
        +create(path, value)
        +delete(path)
        +commit() list
    }
    DatabaseMetadata ..> KazooClient : Uses (indirectly via zk_client context)
    KazooClient ..> TransactionRequest : Creates

File-Level Changes

Change Details Files
Adopt transactional Zookeeper API for node creation
  • Introduce _generate_counter and _check_result_txn to manage ZK sequences and transaction results
  • Add create_database_nodes and multiple create* helper functions using TransactionRequest
  • Replace ad-hoc update_zk_nodes and list_zk_nodes calls with txn.create operations
ch_tools/chadmin/internal/migration.py
ch_tools/chadmin/internal/zookeeper.py
Refactor migration logic and remove temporary‐database flow
  • Eliminate create_temp_db and temp_db parameters in migrate functions
  • Use create_database_nodes to initialize first replica, catching NodeExistsError for others
  • Simplify migrate_as_first_replica and migrate_as_non_first_replica with shared detach/attach and metadata updates
ch_tools/chadmin/internal/migration.py
ch_tools/chadmin/cli/database_group.py
Implement local vs ZK metadata consistency checks
  • Add read_local_table_metadata and is_table_schema_equal to compare schemas
  • Introduce _check_tables_consistent to abort migration on drift
  • Integrate consistency check into non-first replica flow before committing
ch_tools/chadmin/internal/migration.py
ch_tools/chadmin/internal/table.py
Enhance shard and replica identification via macros
  • Add get_shard_and_replica_from_macros using get_macros and replace_macros
  • Use resolved shard and replica values in query and replica node creation
Update CLI migrate command to use new ZK init and error handling
  • Replace create_temp_db calls with create_database_nodes in migrate_engine_command
  • Handle NodeExistsError to detect non-first replica
  • Remove temp_db variable and improve exception logging
ch_tools/chadmin/cli/database_group.py
Refresh database_migrate.feature tests
  • Remove redundant sleep steps and consolidate restart commands
  • Add scenario for MergeTree schema drift and expected migration failure
  • Ensure coverage for both first and non-first replica flows
tests/features/database_migrate.feature

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

@Alex-Burmak
Owner Author

@sourcery-ai review

@sourcery-ai sourcery-ai bot left a comment

Hey @Alex-Burmak - I've reviewed your changes - here's some feedback:

  • There are many repeated txn.create() calls for different ZK nodes. Consider extracting common path/value creation into helper functions to reduce duplication.
  • The fixed‐offset slicing to remove UUIDs in is_table_schema_equal is brittle; using a regex or structured parser would be more robust than hard-coded start/length values.
  • The new feature scenarios duplicate restart/sleep steps—factor those into a Background or Scenario Outline in database_migrate.feature to DRY up the tests.
Here's what I looked at during the review
  • 🟡 General issues: 1 issue found
  • 🟢 Security: all looks good
  • 🟢 Review instructions: all looks good
  • 🟢 Testing: all looks good
  • 🟢 Documentation: all looks good
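
As a rough illustration of the regex-based alternative to fixed-offset slicing mentioned above, the sketch below strips the UUID clause before comparing two metadata statements. It is not the PR's `is_table_schema_equal`: the function names, the regex, and the sample statements are assumptions chosen for the example.

```python
import re

# Matches a UUID clause such as: UUID '123e4567-e89b-12d3-a456-426614174000'
UUID_CLAUSE = re.compile(
    r"\s+UUID\s+'[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}'"
)


def strip_uuid(metadata: str) -> str:
    """Remove the first UUID clause from a table metadata statement."""
    return UUID_CLAUSE.sub("", metadata, count=1)


def schemas_match_ignoring_uuid(local_metadata: str, zk_metadata: str) -> bool:
    """Compare two metadata statements while ignoring their table UUIDs."""
    return strip_uuid(local_metadata) == strip_uuid(zk_metadata)


# Illustrative statements only; real metadata files may differ in layout.
LOCAL = (
    "ATTACH TABLE _ UUID '11111111-1111-1111-1111-111111111111'\n"
    "(\n    `a` Int32\n)\nENGINE = MergeTree\nORDER BY a\n"
)
IN_ZK = LOCAL.replace("11111111-1111", "22222222-2222")
DRIFTED = LOCAL.replace("`a` Int32", "`a` Int32,\n    `b` String")
```

Because the pattern does not depend on where the UUID sits in the statement, it keeps working if the prefix before the UUID changes length.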


    logging.info("result contains NodeExistsError.")
    raise NodeExistsError()
if isinstance(result, Exception):
    logging.error("result contains ex={}, type=P{}.", result, type(result))

issue (typo): Fix placeholder typo in logging

The format string should use type={} instead of type=P{} to log the exception type correctly.

Suggested change
- logging.error("result contains ex={}, type=P{}.", result, type(result))
+ logging.error("result contains ex={}, type={}.", result, type(result))
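
For context, the result check that this excerpt belongs to can be sketched as below. It assumes, as kazoo does for `TransactionRequest.commit()`, that the commit returns one entry per operation and that failed operations appear as exception instances. `check_txn_results` is a hypothetical name, and `NodeExistsError` is a local stand-in so the sketch runs without kazoo installed.

```python
from typing import Any, Iterable


class NodeExistsError(Exception):
    """Local stand-in for kazoo.exceptions.NodeExistsError (assumption)."""


def check_txn_results(results: Iterable[Any]) -> None:
    """Raise if any per-operation result of a committed transaction is an error."""
    for result in results:
        if isinstance(result, NodeExistsError):
            # Signals that another replica has already created the node.
            raise NodeExistsError()
        if isinstance(result, Exception):
            raise RuntimeError(
                f"transaction failed: ex={result}, type={type(result).__name__}"
            )
```

Checking `NodeExistsError` before the generic `Exception` branch matters, since the first is a subclass of the second.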

Comment on lines +123 to +124
missing = [macro for macro in ["shard", "replica"] if macro not in macros]
if missing:

suggestion (code-quality): Use named expression to simplify assignment and conditional (use-named-expression)

Suggested change
- missing = [macro for macro in ["shard", "replica"] if macro not in macros]
- if missing:
+ if missing := [
+     macro for macro in ["shard", "replica"] if macro not in macros
+ ]:

Comment on lines +181 to +183
was_changed = _change_tables_uuid(ctx, tables_info, migrating_database)

if was_changed:

suggestion (code-quality): Use named expression to simplify assignment and conditional (use-named-expression)

Suggested change
- was_changed = _change_tables_uuid(ctx, tables_info, migrating_database)
- if was_changed:
+ if was_changed := _change_tables_uuid(
+     ctx, tables_info, migrating_database
+ ):

logging.info(
    "New metadata for node from mapping table:\n{}\n===", target_metadata
)
txn.create(

issue (code-quality): Use f-string instead of string concatenation [×4] (use-fstring-for-concatenation)

metadata_path = row["metadata_path"]

if match_str_ch_version(get_version(ctx), "25.1"):
    metadata_path = CLICKHOUSE_PATH + "/" + metadata_path

suggestion (code-quality): Use f-string instead of string concatenation [×2] (use-fstring-for-concatenation)

Suggested change
- metadata_path = CLICKHOUSE_PATH + "/" + metadata_path
+ metadata_path = f"{CLICKHOUSE_PATH}/{metadata_path}"

@sourcery-ai sourcery-ai bot left a comment

Hey @Alex-Burmak - I've reviewed your changes - here's some feedback:

  • Consider centralizing Zookeeper path templates (e.g. "/clickhouse/..." strings) into constants or helper functions to reduce duplication and simplify maintenance.
  • Extract repeated magic values (like "0", "1", default log pointers, etc.) into named constants or configuration entries to improve readability and avoid hard-coding.
  • migration.py has grown quite large—consider splitting it into smaller modules or classes (e.g. separating first-replica vs non-first-replica logic) for clearer structure and easier navigation.
Here's what I looked at during the review
  • 🟡 General issues: 4 issues found
  • 🟢 Security: all looks good
  • 🟢 Review instructions: all looks good
  • 🟢 Testing: all looks good
  • 🟢 Documentation: all looks good
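
One possible shape for the centralization suggested above is sketched below. Everything in it is an assumption made for illustration: the constant names, the `replicas` child node, and the `shard|replica` naming are hypothetical and are not taken from the PR, and the root path is passed in by the caller rather than hard-coded.

```python
# Hypothetical names for literals that the review says are repeated.
DEFAULT_LOG_PTR = "0"
DEFAULT_MAX_LOG_PTR = "1"


def zk_path(root: str, *parts: str) -> str:
    """Join components into a normalized absolute ZooKeeper path."""
    cleaned = [p.strip("/") for p in (root, *parts) if p.strip("/")]
    return "/" + "/".join(cleaned)


def replica_node(root: str, shard: str, replica: str) -> str:
    """Build the path of a replica node under a database root (assumed layout)."""
    return zk_path(root, "replicas", f"{shard}|{replica}")
```

With helpers like these, each `txn.create()` call would name its path through one function instead of repeating string concatenation.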


Comment on lines +127 to +128
shard = replace_macros("{shard}", get_macros(ctx))
replica = replace_macros("{replica}", get_macros(ctx))

suggestion: Unnecessary second call to get_macros

Pass the previously assigned macros variable to replace_macros instead of calling get_macros(ctx) again.

Suggested change
- shard = replace_macros("{shard}", get_macros(ctx))
- replica = replace_macros("{replica}", get_macros(ctx))
+ shard = replace_macros("{shard}", macros)
+ replica = replace_macros("{replica}", macros)

Comment on lines +124 to +125
if missing:
    raise RuntimeError(f"Failed replace marcos. {missing}")

nitpick (typo): Typo in exception message ('marcos')

Suggested change
- if missing:
-     raise RuntimeError(f"Failed replace marcos. {missing}")
+ if missing:
+     raise RuntimeError(f"Failed replace macros. {missing}")
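
Taken together, the two suggestions on this helper (reuse the fetched macros, fix the message) might look like the sketch below. It is a self-contained approximation: `replace_macros` here is a hypothetical stand-in built on `str.format_map`, not the project's function, and `get_shard_and_replica` takes the macros mapping directly instead of a ClickHouse context.

```python
from typing import Dict, Tuple


def replace_macros(template: str, macros: Dict[str, str]) -> str:
    """Hypothetical stand-in: substitute {name} placeholders from the mapping."""
    return template.format_map(macros)


def get_shard_and_replica(macros: Dict[str, str]) -> Tuple[str, str]:
    """Resolve shard and replica from an already-fetched macros mapping."""
    missing = [name for name in ("shard", "replica") if name not in macros]
    if missing:
        raise RuntimeError(f"Failed to replace macros. Missing: {missing}")
    # Reuse the same mapping for both lookups instead of fetching it twice.
    return replace_macros("{shard}", macros), replace_macros("{replica}", macros)
```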

Comment on lines +193 to +201
    logging.info(
        "create_database_nodes failed with NodeExistsError. {}, type={}. Migrate as second replica",
        ex,
        type(ex),
    )

    first_replica = False
except Exception as ex:
    logging.info("create_database_nodes failed with ex={}", type(ex))

nitpick: Logging only the exception type

Log the exception message as well to provide more context on the failure.
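
A minimal sketch of what that could look like is shown here. It uses the standard-library `logging` module with an f-string, whereas the project's own logging wrapper takes `{}` placeholders, so the call style is an assumption. `log_failure` is a hypothetical helper name.

```python
import logging

logger = logging.getLogger("migration_sketch")


def log_failure(ex: Exception) -> str:
    """Log both the exception message and its type; return the rendered text."""
    message = f"create_database_nodes failed with ex={ex}, type={type(ex).__name__}"
    logger.error(message)
    return message
```

Including `ex` itself keeps details such as the conflicting path in the log, which the type alone does not convey.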

)


def _get_tables_info_and_detach(ctx: Context, database_name: str) -> dict:

nitpick: Incorrect return type annotation

Update the function signature to use -> List[dict] to match the actual return type.

Comment on lines +123 to +124
missing = [macro for macro in ["shard", "replica"] if macro not in macros]
if missing:

suggestion (code-quality): Use named expression to simplify assignment and conditional (use-named-expression)

Suggested change
- missing = [macro for macro in ["shard", "replica"] if macro not in macros]
- if missing:
+ if missing := [
+     macro for macro in ["shard", "replica"] if macro not in macros
+ ]:

Owner Author

How can I disable "use-named-expression" suggestions?


You have two options:

  1. Globally turn off the use-named-expression rule in your repo’s .sourcery.yaml. For example, at the root of your project add:

    # .sourcery.yaml
    rules:
      use-named-expression:
        enabled: false

    (or, depending on your Sourcery version, you may have to use disabled_rules: [use-named-expression])

  2. Suppress it inline on a per-function or per-file basis by inserting a comment:

    # sourcery skip: use-named-expression
    def get_shard_and_replica_from_macros(ctx: Context) -> Tuple[str, str]:
        …

That will tell Sourcery to ignore that particular refactoring.

Owner Author

How can I disable "use-named-expression" suggestions when Sourcery is installed as a GitHub application?


You still need to tell Sourcery what to ignore, even when it’s running as a GitHub App. There isn’t a toggle in the GitHub-App UI—you have to ship a config in your repo (or use inline skips). Two options:

  1. Repo-wide via .sourcery.yaml
    At the root of your repo add something like:

    # .sourcery.yaml
    rules:
      use-named-expression:
        enabled: false
    # or, depending on your Sourcery version:
    # disabled_rules: [use-named-expression]
  2. File- or function-level inline suppression
    Add a # sourcery skip: comment right before the function or file you want Sourcery to ignore:

    # sourcery skip: use-named-expression
    def get_shard_and_replica_from_macros(ctx: Context) -> Tuple[str, str]:
        …

Once you commit one of those, Sourcery (even as a GitHub App) will stop suggesting the use-named-expression refactoring.

Comment on lines +181 to +183
was_changed = _change_tables_uuid(ctx, tables_info, migrating_database)

if was_changed:

suggestion (code-quality): Use named expression to simplify assignment and conditional (use-named-expression)

Suggested change
- was_changed = _change_tables_uuid(ctx, tables_info, migrating_database)
- if was_changed:
+ if was_changed := _change_tables_uuid(
+     ctx, tables_info, migrating_database
+ ):

Owner Author

How can I disable this type of suggestion?


You can turn off that particular refactoring in a couple of ways:

  1. Per-project (via .sourcery.yaml)
    Create a .sourcery.yaml at the root of your repo and blacklist the rule:

    # .sourcery.yaml
    ignore_rules:
      - use-named-expression
  2. In-file or per-code block
    • To skip a single suggestion on the next statement:

      # sourcery skip: use-named-expression
      if _change_tables_uuid(ctx, tables_info, migrating_database):
          …
    • To skip all Sourcery suggestions in a file, put at the top:

      # sourcery skip-file

With either of those in place Sourcery will stop proposing the use-named-expression change.
