-
Notifications
You must be signed in to change notification settings - Fork 12
Refactor database migration to use SYSTEM RESTORE DATABASE REPLICA #443
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Reviewer's GuideRefactors database migration and replica restore logic to prioritize ClickHouse’s native SYSTEM RESTORE DATABASE REPLICA (CH >= 25.8), encapsulate migration behavior in a DatabaseMigrator, centralize ZooKeeper transaction handling, and update CLI/tests for the new flow with a ZooKeeper-based fallback for older versions. Sequence diagram for restore_replica_command with SYSTEM RESTORE and fallbacksequenceDiagram
actor Operator
participant CLI as restore_replica_command
participant DBMeta as parse_database_metadata
participant DBUtil as DatabaseModule
participant DBReplica as DatabaseReplicaModule
participant ZKMgr as ZookeeperDatabaseManager
participant ZK as ZooKeeper
Operator->>CLI: invoke restore_replica_command(database)
CLI->>DBUtil: is_database_exists(ctx, database)
DBUtil-->>CLI: exists or not
CLI->>DBMeta: parse_database_metadata(database)
DBMeta-->>CLI: db_metadata (engine, zookeeper_path)
CLI->>CLI: validate engine is_replicated
CLI->>DBReplica: supports_system_restore_database_replica(ctx)
DBReplica-->>CLI: bool
alt CH >= 25.8 and supported
CLI->>DBReplica: system_restore_database_replica(ctx, database)
DBReplica->>DBReplica: execute SYSTEM RESTORE DATABASE REPLICA
DBReplica-->>CLI: success or exception
alt restore failed
CLI->>CLI: log error and raise
else restore ok
CLI-->>Operator: completed
end
else CH < 25.8 (fallback)
CLI->>DBReplica: _restore_replica_fallback(ctx, database, zookeeper_path)
DBReplica->>ZKMgr: __init__(ctx)
DBReplica->>DBReplica: check_database_exists_in_zk(ctx, database, zookeeper_path)
DBReplica-->>CLI: first_replica flag
alt first replica
DBReplica->>ZKMgr: create_database_structure(database, zookeeper_path)
ZKMgr->>ZK: create database structure nodes
ZK-->>ZKMgr: ok or NodeExistsError
alt NodeExistsError
ZKMgr-->>DBReplica: signal concurrent creation
DBReplica->>DBReplica: treat as non-first replica
end
end
DBReplica->>ZKMgr: create_replica_nodes(database, first_replica, zookeeper_path)
ZKMgr->>ZK: transaction with ZKTransactionBuilder
ZK-->>ZKMgr: committed
ZKMgr-->>DBReplica: success
DBReplica-->>CLI: fallback restore complete
CLI->>DBUtil: detach_database(ctx, database)
DBUtil-->>CLI: detached
CLI->>DBUtil: attach_database(ctx, database)
DBUtil-->>CLI: attached
CLI-->>Operator: completed via fallback
end
Class diagram for database migration and replica management refactorclassDiagram
class AttacherContext {
-ctx: Context
-database: str
+__init__(ctx: Context, database: str)
+__enter__() None
+__exit__(exc_type: Optional[type], exc_value: Optional[Exception], traceback: Optional[Any]) bool
}
class DatabaseMigrator {
-ctx: Context
+__init__(ctx: Context)
+migrate_to_atomic(database: str, clean_zookeeper: bool) None
+migrate_to_replicated(database: str) None
-_validate_version_support() None
-_validate_database_engine(database: str) Any
-_check_tables_consistent(database_name: str, local_tables: list~TableInfo~) None
-_compare_table_schemas(table_name: str, local_metadata: str, zk_metadata: str) bool
-_sync_table_uuids(tables_info: list~TableInfo~) bool
}
class ZookeeperDatabaseManager {
-ctx: Context
+__init__(ctx: Context)
+create_database_structure(database_name: str, db_replica_path: Optional[str]) None
+create_replica_nodes(database_name: str, first_replica: bool, db_replica_path: Optional[str]) None
-_generate_counter(zk: KazooClient, db_zk_path: str) str
-_create_first_replica_name_node(builder: ZKTransactionBuilder, prefix_db_zk_path: str, database_name: str) None
-_create_query_log_entry(builder: ZKTransactionBuilder, prefix_db_zk_path: str, counter: str) None
-_create_replica_registration(builder: ZKTransactionBuilder, database_name: str, prefix_db_zk_path: str) None
-_create_table_metadata_nodes(builder: ZKTransactionBuilder, database_name: str, prefix_db_zk_path: str) None
-_get_host_id(database_name: str, replica: str) str
-_get_server_uuid() str
-_get_shard_and_replica() Tuple~str, str~
-_get_default_db_path(database_name: str) str
}
class ZKTransactionBuilder {
-ctx: Context
-zk: KazooClient
-txn
-path_to_nodes: List~str~
-_committed: bool
+__init__(ctx: Context, zk: KazooClient)
+__enter__() ZKTransactionBuilder
+__exit__(exc_type: Any, exc_val: Any, exc_tb: Any) None
+create_node(path: str, value: str) ZKTransactionBuilder
+delete_node(path: str) ZKTransactionBuilder
+commit() None
+reset() None
+_check_result_txn(results: List, no_throw: bool) bool
}
class DatabaseModule {
+list_databases(ctx: Context, database: Optional[str], like: Optional[str], engine: Optional[str], with_stats: bool, with_active_parts: bool, skip_temporary: bool, format_: Optional[str]) Any
+is_database_exists(ctx: Context, database_name: str) bool
+detach_database(ctx: Context, database: str) None
+attach_database(ctx: Context, database: str) None
}
class DatabaseReplicaModule {
+system_database_drop_replica(ctx: Context, database_zk_path: str, replica: str, dry_run: bool) None
+supports_system_restore_database_replica(ctx: Context) bool
+system_restore_database_replica(ctx: Context, database_name: str) None
+check_database_exists_in_zk(ctx: Context, database_name: str, db_replica_path: Optional[str]) bool
+get_default_table_in_db_path(database_name: str, table_name: str) str
+get_tables_metadata(ctx: Context, database_name: str) dict~str, str~
+_restore_replica_fallback(ctx: Context, database_name: str, db_replica_path: Optional[str]) None
}
AttacherContext --> DatabaseModule : uses detach_database
AttacherContext --> DatabaseModule : uses attach_database
DatabaseMigrator --> AttacherContext : manages detach_attach
DatabaseMigrator --> DatabaseReplicaModule : uses check_database_exists_in_zk
DatabaseMigrator --> DatabaseReplicaModule : uses get_default_table_in_db_path
DatabaseMigrator --> DatabaseReplicaModule : uses get_tables_metadata
DatabaseMigrator --> DatabaseReplicaModule : uses supports_system_restore_database_replica
DatabaseMigrator --> DatabaseReplicaModule : uses system_restore_database_replica
DatabaseReplicaModule --> ZookeeperDatabaseManager : uses in _restore_replica_fallback
ZookeeperDatabaseManager --> ZKTransactionBuilder : builds_transactions
ZookeeperDatabaseManager --> DatabaseModule : uses is_database_exists via queries
Flow diagram for DatabaseMigrator.migrate_to_replicatedflowchart TD
A["Start migrate_to_replicated(database)"] --> B[Validate ClickHouse version with supports_system_restore_database_replica]
B -->|version < 25.8| C[Raise RuntimeError: unsupported version]
B -->|version >= 25.8| D[Validate database engine is Atomic]
D -->|not Atomic| E[Raise RuntimeError: only Atomic supported]
D -->|Atomic| F[Check ZooKeeper for existing database<br/>first_replica = not check_database_exists_in_zk]
F -->|first_replica true| G[Log migrating as first replica]
F -->|first_replica false| H[Log migrating as non-first replica]
G --> I[Detach database]
H --> I
I -->|first_replica false| J[List tables in database]
J --> K[Check tables consistent with ZooKeeper metadata]
K --> L{UUID sync needed?}
L -->|no| N[Set engine to Replicated in metadata]
L -->|yes| M[Sync table UUIDs via ZooKeeper metadata]
M --> O[need_restart = True]
G --> N
N --> P{need_restart?}
P -->|yes| Q[Restart ClickHouse via restart_command]
P -->|no| R[Attach database]
Q --> S[Run SYSTEM RESTORE DATABASE REPLICA]
R --> S
S --> T[Log success and finish]
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
Method get_default_table_in_db_path is called via self.zk_manager, so it cannot be a static method.
Replace f-string interpolation with Jinja2 template parameters in: - is_database_exists() - create_table_metadata_nodes() - _get_host_id() BREAKING CHANGE: None
Add comprehensive module-level documentation for: - zookeeper.py - database_replica.py - database.py
- Implement __enter__ and __exit__ for 'with' statement support - Add method chaining (return self from create_node/delete_node) - Add transaction state protection with _committed flag - Add reset() method for builder reuse - Improve documentation with usage examples BREAKING CHANGE: None (fully backward compatible)
Update all ZKTransactionBuilder usage to use context manager pattern: - database_replica.py: create_database_structure(), create_replica_nodes() - migration.py: _migrate_non_first_replica() Improves resource management and code readability.
…REPLICA - Add support for SYSTEM RESTORE DATABASE REPLICA command (CH >= 25.8) - Implement version detection and automatic fallback for older versions - Simplify migration flow by using built-in restore command - Refactor restore_replica_command to use new restoration methods - Add _restore_replica_fallback for legacy ZooKeeper-based restoration - Update migrate_to_replicated to use SYSTEM RESTORE DATABASE REPLICA - Remove manual ZooKeeper transaction handling from migration flow - Improve error handling and logging throughout restoration process - Clean up docstrings and remove redundant code
Replace detailed ZooKeeper error trace with single meaningful error message about unexpected digest value in replica node
…d version check test - Update @require_version from 24.8 to 25.8 for all migration scenarios - Add new scenario to test migration on unsupported version (< 25.8) - Remove unnecessary clickhouse-server restart commands from tests - Simplify test scenarios by removing manual restart steps
- Simplify database_replica.py logic (220 -> reduced lines) - Refactor migration.py implementation (81 -> reduced lines) - Clean up zookeeper.py utilities (35 -> reduced lines) - Update database_group.py CLI commands (11 changes) Total: -81 lines of code
…and comprehensive documentation - Add creation of 'first_replica_database_name' ZK node in create_database_structure() to match ClickHouse implementation (DatabaseReplicated.cpp:689-692) - Add escape_for_zookeeper() for table names in metadata nodes to properly handle special characters (matches ClickHouse escapeForFileName logic) - Add comprehensive ZooKeeper structure documentation to ZookeeperDatabaseManager class describing all nodes, their purpose, and replica registration process - Add detailed docstrings to all key methods explaining ZK node creation and structure - Improve inline comments for better code maintainability All changes verified against ClickHouse source code and tested: - Lint: passed (176 files) - Unit tests: 120 passed - Integration tests: 4 scenarios, 97 steps passed
Test validates escape_for_zookeeper() for tables with special characters (dash, dots, percent). Ensures compatibility with ClickHouse escapeForFileName() logic.
532a79e to
9cdda97
Compare
Removed TODO and replaced execute() call with ctx.invoke(restart_command) in migration.py
Automatically perform DETACH/ATTACH after restore. Extract reusable detach_database() and attach_database() functions to database.py.
Remove unused ZookeeperDatabaseManager and fix early return in restore-replica command
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey - I've found 3 issues, and left some high level feedback:
- In
DatabaseMigrator._sync_table_uuidsyou build ZK paths withget_default_table_in_db_path(database_name, table_name), butZookeeperDatabaseManager._create_table_metadata_nodesnow stores metadata under an escaped table name, so UUID sync will fail for tables with special characters; consider using the same escaping logic (or a shared helper) when reading and writing table metadata paths. - The
system_restore_database_replicahelper constructsSYSTEM RESTORE DATABASE REPLICA {database_name}without quoting the database identifier, which may break for database names with special characters; consider wrapping the name in backticks or using parameterized / macro-based substitution similar to other queries.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `DatabaseMigrator._sync_table_uuids` you build ZK paths with `get_default_table_in_db_path(database_name, table_name)`, but `ZookeeperDatabaseManager._create_table_metadata_nodes` now stores metadata under an escaped table name, so UUID sync will fail for tables with special characters; consider using the same escaping logic (or a shared helper) when reading and writing table metadata paths.
- The `system_restore_database_replica` helper constructs `SYSTEM RESTORE DATABASE REPLICA {database_name}` without quoting the database identifier, which may break for database names with special characters; consider wrapping the name in backticks or using parameterized / macro-based substitution similar to other queries.
## Individual Comments
### Comment 1
<location> `ch_tools/chadmin/internal/database_replica.py:60-69` </location>
<code_context>
+def get_default_table_in_db_path(database_name: str, table_name: str) -> str:
</code_context>
<issue_to_address>
**issue (bug_risk):** Table name escaping for ZooKeeper metadata is inconsistent between readers and writers.
`_create_table_metadata_nodes` escapes `table_name` via `escape_for_zookeeper(table_name)`, but `get_default_table_in_db_path` uses the raw `table_name` when building the path used by `_sync_table_uuids`. For tables whose raw and escaped names differ, this will cause `get_zk_node` lookups to fail. Align these call sites so both read and write paths use the same escaping semantics for the ZK node name.
</issue_to_address>
### Comment 2
<location> `ch_tools/chadmin/internal/database_replica.py:64-73` </location>
<code_context>
+
-from ch_tools.chadmin.internal.utils import execute_query
+def get_tables_metadata(ctx: Context, database_name: str) -> dict[str, str]:
+ """Retrieve table metadata from ZooKeeper, returns dict mapping table names to CREATE statements."""
+ zk_tables_metadata: dict[str, str] = {}
+
+ with zk_client(ctx) as zk:
+ zk_metadata_path = f"{DEFAULT_ZK_ROOT}/{database_name}/{ZK_METADATA_SUBPATH}"
+ children = zk.get_children(zk_metadata_path)
+ if not children:
+ return zk_tables_metadata
+
+ for table_name in children:
+ try:
+ table_metadata_path = f"{zk_metadata_path}/{table_name}"
+ metadata_data = zk.get(table_metadata_path)
+ if metadata_data and metadata_data[0]:
+ zk_tables_metadata[table_name] = metadata_data[0].decode().strip()
+ except NoNodeError:
+ logging.warning(f"Table {table_name} metadata removed concurrently")
+
</code_context>
<issue_to_address>
**suggestion:** Handling of missing/empty ZooKeeper metadata nodes may hide partial failures.
If `get_children` returns tables but individual `get` calls fail or return empty data, those tables are dropped from the result and only appear as `missing_in_zk`, losing the signal that their ZK nodes existed but were unreadable. It would help to distinguish “no children” from “children with missing/empty metadata” by at least logging the affected table name and path when `metadata_data` is missing/empty, so partial ZK corruption is easier to diagnose.
Suggested implementation:
```python
import logging
from typing import Optional, Tuple
from click import Context
from kazoo.client import KazooClient
from kazoo.exceptions import NodeExistsError, NoNodeError
```
```python
def get_default_table_in_db_path(database_name: str, table_name: str) -> str:
return f"{DEFAULT_ZK_ROOT}/{database_name}/{ZK_METADATA_SUBPATH}/{table_name}"
def get_tables_metadata(ctx: Context, database_name: str) -> dict[str, str]:
"""Retrieve table metadata from ZooKeeper, returns dict mapping table names to CREATE statements."""
zk_tables_metadata: dict[str, str] = {}
with zk_client(ctx) as zk:
zk_metadata_path = f"{DEFAULT_ZK_ROOT}/{database_name}/{ZK_METADATA_SUBPATH}"
children = zk.get_children(zk_metadata_path)
if not children:
return zk_tables_metadata
for table_name in children:
table_metadata_path = f"{zk_metadata_path}/{table_name}"
try:
metadata_data = zk.get(table_metadata_path)
if metadata_data and metadata_data[0]:
zk_tables_metadata[table_name] = metadata_data[0].decode().strip()
else:
# Node exists but is empty or unreadable; log for easier diagnosis of partial failures.
logging.warning(
"Empty or missing ZooKeeper metadata for table %s at path %s",
table_name,
table_metadata_path,
)
except NoNodeError:
logging.warning(
"ZooKeeper metadata node for table %s was removed concurrently at path %s",
table_name,
table_metadata_path,
)
return zk_tables_metadata
```
</issue_to_address>
### Comment 3
<location> `ch_tools/chadmin/internal/zookeeper.py:37-46` </location>
<code_context>
return False
-class ZKTransactionBuilder:
- ctx: Context
- txn: TransactionRequest
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Context manager behavior for ZKTransactionBuilder may surprise callers who expect automatic commit/rollback.
`__enter__` currently returns the builder and `__exit__` always calls `reset()`, so `with ZKTransactionBuilder(...) as b:` will drop all operations unless `b.commit()` is explicitly called inside the block. This is easy to misuse given the context manager interface. Consider either committing automatically on successful exit (and resetting afterward), or clearly documenting this pattern and/or warning when exiting with uncommitted operations.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
…ilder behavior, and database name quoting - Add escape/unescape functions for ZK table names matching ClickHouse escapeForFileName logic - Improve logging for missing/empty ZK metadata nodes - Make ZKTransactionBuilder auto-commit on successful context exit - Add backticks to database name in SYSTEM RESTORE DATABASE REPLICA query
Summary
Refactored database migration to use native
SYSTEM RESTORE DATABASE REPLICAcommand (CH >= 25.8) instead of manual ZooKeeper management.Key Changes
DatabaseMigratorclass with simplified migration flow and version validationrestore_replica_with_system_command()as preferred method, legacy code moved toZookeeperDatabaseManager(deprecated)ZKTransactionBuilderclass for atomic ZooKeeper operationsBenefits
TODO
supervisorctl restart clickhouse-serverwith proper restart functionality once mergedSummary by Sourcery
Refactor ClickHouse database migration and replica restore flows to use the native SYSTEM RESTORE DATABASE REPLICA command when available, with a structured ZooKeeper-based fallback and updated helpers.
New Features:
Bug Fixes:
Enhancements:
Tests: