-
Notifications
You must be signed in to change notification settings - Fork 0
Test .sourcery.yaml #4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@sourcery-ai review |
Reviewer's GuideThis PR replaces the temporary-database migration approach with direct ZooKeeper transactions, reorganizes imports, introduces ZK helper functions and metadata utilities for schema consistency, updates the CLI command accordingly, and extends the BDD tests with restart steps and a new MergeTree scenario. Sequence Diagram for First Replica ZooKeeper InteractionsequenceDiagram
participant M as migrate_as_first_replica
participant KZC as KazooClient
participant TXN as ZKTransaction
participant DBMetadata as DatabaseMetadata
participant CH as ClickHouse
M->>KZC: Get client
M->>KZC: transaction()
KZC-->>TXN: ZKTransaction object
M->>TXN: create("/clickhouse/db/first_replica_database_name", ...)
M->>TXN: create(query_node_path, ...)
M->>TXN: create(database_replica_path, ...)
M->>TXN: create(table_metadata_in_zk_path, ...)
M->>TXN: commit()
TXN-->>M: Commit results
M->>M: _update_local_metadata_first_replica(db)
Note right of M: Calls DBMetadata.set_replicated()
Note right of M: Calls DBMetadata.update_metadata_file()
M->>CH: DETACH DATABASE db
M->>CH: ATTACH DATABASE db
Class Diagram for Updated Metadata and Migration LogicclassDiagram
class DatabaseMetadata {
+database_name: string
+database_engine: DatabaseEngine
+replica_path: string
+shard: string
+replica_name: string
+set_replicated()
+update_metadata_file()
}
class migration_module {
<<Module>>
+migrate_as_first_replica(ctx, migrating_database)
+migrate_as_non_first_replica(ctx, migrating_database)
+create_database_nodes(ctx, migrating_database)
+is_table_schema_equal(ctx, db, table, path) bool
+get_shard_and_replica_from_macros(ctx) Tuple[str, str]
-_update_local_metadata_first_replica(ctx, db)
-_check_tables_consistent(ctx, db, tables_info)
-_generate_counter(ctx, zk, db) string
-_create_query_node(ctx, txn, db, counter)
-_create_database_replica(ctx, txn, db)
-_create_database_metadata_nodes(ctx, txn, db)
-_attach_dbs(ctx, dbs)
-_detach_dbs(ctx, dbs)
}
class table_module {
<<Module>>
+read_local_table_metadata(ctx, table_local_metadata_path) string
+change_table_uuid(ctx, db, table, path, new_uuid, attached) bool
}
class KazooClient {
<<External Library>>
+transaction() TransactionRequest
+create(path, value, sequence, ephemeral, makepath)
+exists(path)
+commit()
}
class TransactionRequest {
<<External Library>>
+create(path, value)
+delete(path)
+commit()
}
migration_module ..> DatabaseMetadata : uses
migration_module ..> table_module : uses
migration_module ..> KazooClient : uses
migration_module ..> TransactionRequest : uses
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @Alex-Burmak - I've reviewed your changes - here's some feedback:
- migration.py has grown into a very large file—consider splitting it into focused modules (e.g. ZK transaction builders, metadata comparators, and replica workflows) to improve readability and maintainability.
- Several try/except blocks catch broad Exception only to re‐raise—narrow these to specific exceptions (e.g. NodeExistsError) to avoid masking unexpected errors and clarify your error flows.
- The function
_change_tables_uuidreturns a boolean but its name suggests only an action—consider renaming it to convey its status return (e.g.has_tables_uuid_changed) for clearer intent.
Here's what I looked at during the review
- 🟡 General issues: 6 issues found
- 🟢 Security: all looks good
- 🟢 Review instructions: all looks good
- 🟢 Testing: all looks good
- 🟢 Documentation: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| logging.info("result contains NodeExistsError.") | ||
| raise NodeExistsError() | ||
| if isinstance(result, Exception): | ||
| logging.error("result contains ex={}, type=P{}.", result, type(result)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (typo): Fix logging placeholder typo in _check_result_txn
The format string should use 'type={}' instead of 'type=P{}'. Please correct the placeholder to avoid formatting errors.
| logging.error("result contains ex={}, type=P{}.", result, type(result)) | |
| logging.error("result contains ex={}, type={}.", result, type(result)) |
|
|
||
| first_replica = False | ||
| except Exception as ex: | ||
| logging.info("create_database_nodes failed with ex={}", type(ex)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: Correct logging arguments for exception details
The log message expects one argument but receives type(ex). To log both the exception and its type, include both in the message or use an f-string for clarity.
| shard = replace_macros("{shard}", get_macros(ctx)) | ||
| replica = replace_macros("{replica}", get_macros(ctx)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: Use the loaded macros dict instead of calling get_macros again
Pass the existing macros variable to replace_macros instead of calling get_macros(ctx) again.
| if "Replicated" in row["engine"]: | ||
| logging.warning( | ||
| "Replicated table engine {} can have different schema. Continue.", | ||
| row["engine"], | ||
| ) | ||
| continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (bug_risk): Strengthen replicated-engine detection
Literal substring matching may fail if the engine string's case or format changes. Use lowercasing or a regex for more reliable detection of ReplicatedMergeTree and similar engines.
| if "Replicated" in row["engine"]: | |
| logging.warning( | |
| "Replicated table engine {} can have different schema. Continue.", | |
| row["engine"], | |
| ) | |
| continue | |
| import re | |
| if re.search(r"replicated", row["engine"], re.IGNORECASE): | |
| logging.warning( | |
| "Replicated table engine {} can have different schema. Continue.", | |
| row["engine"], | |
| ) | |
| continue |
| local_table_metadata = ( | ||
| local_table_metadata[:start_pos_uuid] + local_table_metadata[finish_pos_uuid:] | ||
| ) | ||
| zk_table_metadata = ( | ||
| zk_table_metadata[:start_pos_uuid] + zk_table_metadata[finish_pos_uuid:] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Avoid hardcoding UUID length slicing in metadata comparison
Relying on a fixed UUID length may cause issues if the format changes. Use parsing or regex to handle the UUID portion more robustly.
Suggested implementation:
import re
metadata_prefix = "ATTACH TABLE _ UUID '"
# Regex to match the UUID in the metadata string
uuid_regex = re.compile(r"(ATTACH TABLE _ UUID ')([0-9a-fA-F-]{36})(')")
def remove_uuid(metadata):
return uuid_regex.sub(r"\1\3", metadata, count=1)
local_table_metadata = remove_uuid(local_table_metadata)
zk_table_metadata = remove_uuid(zk_table_metadata)| ex, | ||
| ) | ||
| sys.exit(1) | ||
|
|
||
|
|
||
| def read_local_table_metadata(ctx: Context, table_local_metadata_path: str) -> str: | ||
| if match_str_ch_version(get_version(ctx), "25.1"): | ||
| table_local_metadata_path = f"{CLICKHOUSE_PATH}/{table_local_metadata_path}" | ||
|
|
||
| with open(table_local_metadata_path, "r", encoding="utf-8") as f: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Extract version-specific path logic to a helper
Consider moving the version check and metadata-path prefix logic from both change_table_uuid and read_local_table_metadata into a shared utility function to avoid duplication.
| missing = [macro for macro in ["shard", "replica"] if macro not in macros] | ||
| if missing: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (code-quality): Use named expression to simplify assignment and conditional (use-named-expression)
| missing = [macro for macro in ["shard", "replica"] if macro not in macros] | |
| if missing: | |
| if missing := [ | |
| macro for macro in ["shard", "replica"] if macro not in macros | |
| ]: |
| was_changed = _change_tables_uuid(ctx, tables_info, migrating_database) | ||
|
|
||
| if was_changed: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (code-quality): Use named expression to simplify assignment and conditional (use-named-expression)
| was_changed = _change_tables_uuid(ctx, tables_info, migrating_database) | |
| if was_changed: | |
| if was_changed := _change_tables_uuid( | |
| ctx, tables_info, migrating_database | |
| ): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How to suppress code-quality suggestions "use-named-expression"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can turn off that rule in two ways:
- Suppress it inline, for just this one spot, by adding a skip comment above the code you don’t want rewritten:
# sourcery skip: use-named-expression
if was_changed:
_change_tables_uuid(ctx, tables_info, migrating_database)- Disable it globally (or per-file) in your .sourcery.yaml. For example, to disable it project-wide:
# .sourcery.yaml
rules:
use-named-expression: falseOr to disable it only for a specific file:
# .sourcery.yaml
files:
ch_tools/chadmin/internal/migration.py:
exclude_rules:
- use-named-expressionEither approach will prevent Sourcery from suggesting that particular refactoring.
| logging.info( | ||
| "New metadata for node from mapping table:\n{}\n===", target_metadata | ||
| ) | ||
| txn.create( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (code-quality): Use f-string instead of string concatenation [×4] (use-fstring-for-concatenation)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How to configure current github repository to suppress this type of code-quality suggestions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can tune Sourcery’s behavior at the repo level by adding a config file in your project root (either .sourcery.yaml, .sourcery.yml or .sourcery.json) and disabling just the rule(s) you don’t want. For example, to silence only the “use-fstring-for-concatenation” suggestions:
# .sourcery.yaml
analysis:
exclude-rules:
- use-fstring-for-concatenationIf you’d rather turn off all “code-quality” suggestions, you can also filter by category:
# .sourcery.yaml
analysis:
exclude-categories:
- code-qualityAfter you commit that file to your default branch, Sourcery’s GitHub Action / Bot will pick it up automatically and stop reporting the excluded rules.
–––
If you need to suppress a single suggestion inline you can also sprinkle one-off skips in your code:
# sourcery skip: use-fstring-for-concatenation
some_var = "foo" + barBut for a global repo-wide setting, the .sourcery.yaml method above is the way to go.
| metadata_path = row["metadata_path"] | ||
|
|
||
| if match_str_ch_version(get_version(ctx), "25.1"): | ||
| metadata_path = CLICKHOUSE_PATH + "/" + metadata_path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (code-quality): Use f-string instead of string concatenation [×2] (use-fstring-for-concatenation)
| metadata_path = CLICKHOUSE_PATH + "/" + metadata_path | |
| metadata_path = f"{CLICKHOUSE_PATH}/{metadata_path}" |
Summary by Sourcery
Overhaul the database migration workflow to eliminate the temporary database step and drive replication setup directly through Zookeeper transactions, while adding robust schema consistency checks and simplifying the CLI command logic.
New Features:
Enhancements:
Tests: