Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@ Features:
and check if they are allowed in the current context.
- Can identify "unsafe" operations, e.g. operations that acquire locks
that can be dangerous for production database.
- Checks if newly created tables have primary keys to ensure better performance and data integrity.

## Installation

Expand All @@ -30,4 +31,4 @@ pip install "migration-lint"

## Documentation

Read the docs on [GitHub Pages](https://pandadoc.github.io/migration-lint/)
Read the docs on [GitHub Pages](https://pandadoc.github.io/migration-lint/)
14 changes: 13 additions & 1 deletion docs/classification.md
Original file line number Diff line number Diff line change
Expand Up @@ -125,9 +125,17 @@ Backward-compatible migration
> **WARNING**: If there are foreign keys, table creation requires
> `ShareRowExclusiveLock` on the child tables, so use `lock_timeout`
> if the table to create contains foreign keys. `ADD FOREIGN KEY ... NOT VALID`
> does require the same lock, so it doesnt make much sense
> does require the same lock, so it doesn't make much sense
> to create foreign keys separately.

> **BEST PRACTICE**: Always include a primary key when creating tables.
> Tables without primary keys can lead to performance issues and complicate
> replication. The migration-lint primary key linter will warn about tables
> created without primary keys. Common patterns include:
> * `id SERIAL PRIMARY KEY` (auto-incrementing integer)
> * `id UUID PRIMARY KEY DEFAULT gen_random_uuid()` (UUID)
> * Composite primary keys for junction tables

### Drop Table

Backward-incompatible migration
Expand Down Expand Up @@ -522,6 +530,10 @@ Backward-compatible migration
* Drop foreign keys in other tables
* `ALTER TABLE <tablename> DROP CONSTRAINT IF EXISTS <cname>`.

> **NOTE**: The migration-lint primary key linter only checks for missing
> primary keys in newly created tables. It does not validate operations
> that modify or drop existing primary keys.

### Add Check Constraints

Backward-compatible migration
Expand Down
64 changes: 63 additions & 1 deletion docs/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,8 @@ Features:
and check if they are allowed in the current context.
- Can identify "unsafe" operations, e.g. operations that acquire locks
that can be dangerous for production database.
- Primary Key Validation: Checks if newly created tables have primary keys
to ensure better performance and data integrity.

## Installation

Expand All @@ -31,7 +33,8 @@ pip install "migration-lint"
so it depends on the framework you use for migrations.
- **Linter** - class that checks migration's SQL and context
and returns errors if any. We have implemented our linter
for backward incompatible migrations as well as integrated `squawk` linter.
for backward incompatible migrations, integrated `squawk` linter,
and added a primary key validation linter.

## Run

Expand Down Expand Up @@ -74,6 +77,65 @@ migration-lint --loader=gitlab_branch --extractor=<your extractor>

I uses env variable CI_MERGE_REQUEST_ID or option --mr-id.

## Configuration Options

### Primary Key Validation

The primary key linter can be configured using the following option:

```shell
--primary-key-check-dry-run
```

- **Default**: `True` (dry-run mode enabled)
- **Environment variable**: `MIGRATION_LINTER_PRIMARY_KEY_CHECK_DRY_RUN`
- **Description**: When enabled (default), the primary key linter runs in dry-run mode,
logging warnings about tables created without primary keys but not failing the overall
linting process. When disabled, missing primary keys are treated as errors.

**Example usage:**

```shell
# Run with primary key validation in dry-run mode (default)
migration-lint --loader=local_git --extractor=django

# Disable dry-run mode to treat missing primary keys as errors
migration-lint --loader=local_git --extractor=django --primary-key-check-dry-run=false
```

## Linters

### Primary Key Linter

The Primary Key Linter checks if newly created tables have primary keys defined.
This is important for database performance and data integrity.

**What it checks:**
- Analyzes `CREATE TABLE` statements in migrations
- Detects tables created without a `PRIMARY KEY` constraint
- Provides helpful warnings or errors depending on configuration

**Example output in dry-run mode:**
```
🔍 Primary Key Check (dry-run mode):
⚠️ Table 'user_sessions' created without a primary key:
CREATE TABLE user_sessions (session_id VARCHAR(255), user_id INTEGER, created_at TIMESTAMP)
💡 Consider adding a primary key for better performance and data integrity.
```

**Example output when dry-run is disabled:**
```
Errors found in migrations:

- Table 'user_sessions' is created without a primary key: CREATE TABLE user_sessions (session_id VARCHAR(255), user_id INTEGER, created_at TIMESTAMP)
Consider adding a primary key for better performance and data integrity.
```

**Best practices:**
- Every table should have a primary key for optimal database performance
- Primary keys ensure row uniqueness and enable efficient indexing
- Common primary key patterns include auto-incrementing integers or UUIDs

## Feedback

We value feedback and are committed to supporting engineers throughout
Expand Down
2 changes: 2 additions & 0 deletions migration_lint/analyzer/__init__.py
Original file line number Diff line number Diff line change
@@ -1,9 +1,11 @@
from migration_lint.analyzer.base import Analyzer
from migration_lint.analyzer.compat import CompatibilityLinter
from migration_lint.analyzer.primary_key import PrimaryKeyLinter
from migration_lint.analyzer.squawk import SquawkLinter

__all__ = (
"Analyzer",
"CompatibilityLinter",
"SquawkLinter",
"PrimaryKeyLinter",
)
120 changes: 120 additions & 0 deletions migration_lint/analyzer/primary_key.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,120 @@
from __future__ import annotations

from typing import List

from migration_lint import logger
from migration_lint.analyzer.base import BaseLinter
from migration_lint.extractor.model import ExtendedSourceDiff
from migration_lint.sql.parser import classify_migration
from migration_lint.util.colors import blue, yellow


class PrimaryKeyLinter(BaseLinter):
"""
Linter that checks if newly created tables have primary keys.

This linter operates in dry-run mode by default, meaning it only logs
warnings but doesn't cause the overall linting process to fail.
"""

def __init__(self, dry_run: bool = True):
"""
Initialize the PrimaryKeyLinter.

Args:
dry_run: If True, only log warnings without failing the linter.
If False, treat missing primary keys as errors.
"""
self.dry_run = dry_run

def lint(
self,
migration_sql: str,
changed_files: List[ExtendedSourceDiff],
) -> List[str]:
"""Perform primary key validation on CREATE TABLE statements."""

errors = []

try:
classification_result = classify_migration(migration_sql)

create_table_statements = [
(statement_sql, statement_type)
for statement_sql, statement_type in classification_result
if self._is_create_table_statement(statement_sql)
]

tables_without_pk = []
for statement_sql, _ in create_table_statements:
table_name = self._extract_table_name_from_sql(statement_sql)
if not self._sql_has_primary_key(statement_sql):
tables_without_pk.append({
'table_name': table_name or 'unknown',
'sql': statement_sql
})

if tables_without_pk:
if self.dry_run:
logger.info(yellow("\n🔍 Primary Key Check (dry-run mode):"))
for table_info in tables_without_pk:
logger.info(
yellow(
f" ⚠️ Table '{table_info['table_name']}' "
f"created without a primary key:"
)
)
logger.info(blue(f" {table_info['sql']}"))
logger.info(
yellow(
" 💡 Consider adding a primary key for better "
"performance and data integrity."
)
)
else:
for table_info in tables_without_pk:
errors.append(
f"- Table '{table_info['table_name']}' is created "
f"without a primary key: {table_info['sql']}\n"
f" Consider adding a primary key for better "
f"performance and data integrity."
)
else:
logger.info(
blue("✅ Primary Key Check: All created tables have primary keys.")
)

except Exception as e:
logger.info(
yellow(f"Primary Key Check: Could not analyze SQL - {str(e)}")
)

return errors

def _is_create_table_statement(self, statement_sql: str) -> bool:
"""Check if a SQL statement is a CREATE TABLE statement."""
normalized_sql = statement_sql.strip().upper()
return normalized_sql.startswith("CREATE TABLE")

def _extract_table_name_from_sql(self, statement_sql: str) -> str:
"""Extract table name from CREATE TABLE SQL statement."""
try:
import re

# Match CREATE TABLE [IF NOT EXISTS] table_name
match = re.search(
r'CREATE\s+TABLE\s+(?:IF\s+NOT\s+EXISTS\s+)?([^\s\(]+)',
statement_sql,
re.IGNORECASE
)
if match:
return match.group(1).strip('"`')
return "unknown"
except Exception:
return "unknown"

def _sql_has_primary_key(self, statement_sql: str) -> bool:
"""Check if CREATE TABLE SQL statement includes a primary key."""
normalized_sql = statement_sql.upper()

return "PRIMARY KEY" in normalized_sql
40 changes: 29 additions & 11 deletions migration_lint/main.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,9 +3,14 @@
import click

from migration_lint import logger
from migration_lint.analyzer import Analyzer, CompatibilityLinter, SquawkLinter
from migration_lint.extractor import Extractor, DjangoExtractor
from migration_lint.source_loader import SourceLoader, LocalLoader
from migration_lint.analyzer import (
Analyzer,
CompatibilityLinter,
PrimaryKeyLinter,
SquawkLinter,
)
from migration_lint.extractor import DjangoExtractor, Extractor
from migration_lint.source_loader import LocalLoader, SourceLoader
from migration_lint.util.env import get_bool_env


Expand Down Expand Up @@ -87,30 +92,43 @@
"--ignore-extractor-not-found",
"ignore_extractor_not_found",
is_flag=True,
help="Don't fail the whole linter if extraction went fine, but info about particular migration couldn't be found",
help="Don't fail the whole linter if extraction went fine, but info "
"about particular migration couldn't be found",
default=os.getenv("MIGRATION_LINTER_IGNORE_EXTRACTOR_NOT_FOUND", False),
)
@click.option(
"--primary-key-check-dry-run",
"primary_key_check_dry_run",
is_flag=True,
help="Primary key validation dry-run mode",
default=os.getenv("MIGRATION_LINTER_PRIMARY_KEY_CHECK_DRY_RUN", True),
)
def main(
loader_type: str,
extractor_type: str,
squawk_config_path: str,
squawk_pg_version: str,
primary_key_check_dry_run: bool,
**kwargs,
) -> None:
logger.info("Start analysis..")

loader = SourceLoader.get(loader_type)(**kwargs)
extractor = Extractor.get(extractor_type)(**kwargs)

linters = [
CompatibilityLinter(),
SquawkLinter(
config_path=squawk_config_path,
pg_version=squawk_pg_version,
),
PrimaryKeyLinter(dry_run=primary_key_check_dry_run),
]

analyzer = Analyzer(
loader=loader,
extractor=extractor,
linters=[
CompatibilityLinter(),
SquawkLinter(
config_path=squawk_config_path,
pg_version=squawk_pg_version,
),
],
linters=linters,
)
analyzer.analyze()

Expand Down
Loading
Loading