Skip to content

Releases: valehdba/pgclone

v4.0.0 — Schema Namespace

15 Apr 12:17

Choose a tag to compare

⚠️ BREAKING CHANGE — All pgclone functions now live under the pgclone schema.

What Changed

The pgclone schema is created automatically when you run CREATE EXTENSION pgclone. All functions have been moved into this schema and the pgclone_ prefix has been removed:

Before (v3.x) After (v4.0.0)
pgclone_table(...) pgclone.table(...)
pgclone_schema(...) pgclone.schema(...)
pgclone_database(...) pgclone.database(...)
pgclone_database_create(...) pgclone.database_create(...)
pgclone_table_async(...) pgclone.table_async(...)
pgclone_schema_async(...) pgclone.schema_async(...)
pgclone_progress(...) pgclone.progress(...)
pgclone_cancel(...) pgclone.cancel(...)
pgclone_resume(...) pgclone.resume(...)
pgclone_jobs() pgclone.jobs()
pgclone_clear_jobs() pgclone.clear_jobs()
pgclone_progress_detail() pgclone.progress_detail()
pgclone_jobs_view pgclone.jobs_view
pgclone_discover_sensitive(...) pgclone.discover_sensitive(...)
pgclone_mask_in_place(...) pgclone.mask_in_place(...)
pgclone_create_masking_policy(...) pgclone.create_masking_policy(...)
pgclone_drop_masking_policy(...) pgclone.drop_masking_policy(...)
pgclone_clone_roles(...) pgclone.clone_roles(...)
pgclone_verify(...) pgclone.verify(...)
pgclone_masking_report(...) pgclone.masking_report(...)
pgclone_table_ex(...) pgclone.table_ex(...)
pgclone_schema_ex(...) pgclone.schema_ex(...)
pgclone_functions(...) pgclone.functions(...)
pgclone_version() pgclone.version()

Upgrade Path

This is a breaking change. To upgrade from v3.x:

DROP EXTENSION pgclone;
CREATE EXTENSION pgclone;
SELECT pgclone.version();

All application queries must be updated to use the pgclone. prefix.

Quick Start

CREATE EXTENSION pgclone;

SELECT pgclone.table(
'host=source-server dbname=mydb user=postgres password=secret',
'public', 'customers', true
);

SELECT pgclone.schema(
'host=source-server dbname=mydb user=postgres password=secret',
'sales', true
);

Compatibility

  • PostgreSQL 14, 15, 16, 17, 18
  • No C API changes — only SQL function names affected
  • All 115 tests pass across all supported versions

v3.6.0 - GDPR/Compliance Masking

10 Apr 04:35

Choose a tag to compare

Generate audit reports proving that sensitive data is properly masked in your cloned databases. Essential for GDPR, HIPAA, and SOX compliance.

Usage

SELECT * FROM pgclone_masking_report('public');
 schema_name | table_name | column_name | sensitivity  | mask_status   | recommendation
-------------+------------+-------------+--------------+---------------+--------------------------------------
 public      | employees  | full_name   | PII - Name   | UNMASKED      | Apply mask strategy: name
 public      | employees  | email       | Email        | UNMASKED      | Apply mask strategy: email
 public      | employees  | phone       | Phone        | UNMASKED      | Apply mask strategy: phone
 public      | employees  | salary      | Financial    | UNMASKED      | Apply mask strategy: random_int
 public      | employees  | ssn         | National ID  | UNMASKED      | Apply mask strategy: null
 public      | users      | email       | Email        | MASKED (view) | OK - masked via users_masked view
 public      | users      | password    | Credential   | MASKED (view) | OK - masked via users_masked view

Sensitivity Categories

Email, PII - Name, Phone, National ID, Financial, Credential, Address, Date of Birth, Credit Card, IP Address — ~40 patterns across 10 categories.

Mask Status

Status Meaning
MASKED (view) A _masked view exists — data is protected via dynamic masking
UNMASKED Sensitive column has no masking — action needed

Compliance Workflow

-- 1. Clone production data
SELECT pgclone_database('host=prod dbname=myapp user=postgres', true);

-- 2. Find unmasked PII
SELECT * FROM pgclone_masking_report('public') WHERE mask_status = 'UNMASKED';

-- 3. Apply masking
SELECT pgclone_create_masking_policy('public', 'employees',
'{"email": "email", "full_name": "name", "ssn": "null"}', 'dba_team');

-- 4. Confirm — all sensitive columns now masked
SELECT * FROM pgclone_masking_report('public');

Also in This Release

  • Refactored sensitivity rules into shared SensitivityRule struct and pgclone_match_sensitivity() helper — used by both pgclone_discover_sensitive and pgclone_masking_report

Install

git clone https://github.com/valehdba/pgclone.git
cd pgclone && make && sudo make install
DROP EXTENSION IF EXISTS pgclone; CREATE EXTENSION pgclone;
SELECT pgclone_version();  -- pgclone 3.6.0

v3.5.0 — Clone Verification

09 Apr 18:28

Choose a tag to compare

Compare row counts between source and target databases after cloning to verify completeness.

Usage

-- Verify a specific schema
SELECT * FROM pgclone_verify(
    'host=source-server dbname=prod user=postgres',
    'app_schema'
);
 schema_name |  table_name  | source_rows | target_rows | match
-------------+--------------+-------------+-------------+--------------
 app_schema  | customers    |       15230 |       15230 | ✓
 app_schema  | orders       |      148920 |      148920 | ✓
 app_schema  | payments     |       98100 |       97855 | ✗
 app_schema  | audit_log    |     1204500 |           0 | ✗ (missing)
-- Verify all schemas
SELECT * FROM pgclone_verify(
    'host=source-server dbname=prod user=postgres'
);

Match Indicators

Indicator Meaning
Row counts are equal
Row counts differ
✗ (missing) Table exists on source but not on target

How It Works

  • Uses pg_class.reltuples for fast approximate counts — no full table scans needed
  • Connects to both source (via libpq) and local (via loopback) in a single call
  • Returns a standard PostgreSQL table result — can be filtered, sorted, joined
  • Works with regular and partitioned tables

Typical Workflow

-- 1. Clone a schema
SELECT pgclone_schema('host=prod dbname=myapp user=postgres', 'sales', true);

-- 2. Verify the clone
SELECT * FROM pgclone_verify('host=prod dbname=myapp user=postgres', 'sales');

-- 3. Check for mismatches
SELECT * FROM pgclone_verify('host=prod dbname=myapp user=postgres', 'sales')
WHERE match != '✓';

What's New

  • pgclone_verify(conninfo) — all user schemas
  • pgclone_verify(conninfo, schema) — single schema
  • SET-RETURNING FUNCTION using SRF pattern with VerifyState in multi_call_memory_ctx
  • 5 new pgTAP tests (84 total, 23 test groups, 22 C functions)

Install

git clone https://github.com/valehdba/pgclone.git
cd pgclone && make && sudo make install
DROP EXTENSION IF EXISTS pgclone; CREATE EXTENSION pgclone;
SELECT pgclone_version();  -- pgclone 3.5.0

v3.4.0 — Clone Roles with Permissions and Passwords

09 Apr 08:30

Choose a tag to compare

Clone database roles from a source PostgreSQL instance with their encrypted passwords, attributes, memberships, and all privilege grants.

Usage

-- Clone all non-system roles
SELECT pgclone_clone_roles(
    'host=prod dbname=myapp user=postgres password=secret'
);
-- OK: 8 roles created, 2 roles updated, 45 grants applied

-- Clone specific roles (comma-separated)
SELECT pgclone_clone_roles(
'host=prod dbname=myapp user=postgres password=secret',
'app_user, reporting_user, api_service'
);
-- OK: 3 roles created, 0 roles updated, 18 grants applied

-- Clone a single role
SELECT pgclone_clone_roles(
'host=prod dbname=myapp user=postgres password=secret',
'app_user'
);
-- OK: 1 roles created, 0 roles updated, 7 grants applied

What Gets Cloned

Category Details
Role attributes LOGIN, SUPERUSER, CREATEDB, CREATEROLE, REPLICATION, INHERIT, CONNECTION LIMIT, VALID UNTIL
Passwords Encrypted password (SCRAM-SHA-256 / md5 hash) copied as-is from pg_authid
Role memberships GRANT role TO role relationships
Schema privileges USAGE, CREATE
Table privileges SELECT, INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER
Sequence privileges USAGE, SELECT, UPDATE
Function privileges EXECUTE on functions and procedures

Existing Role Behavior

If a role already exists on the target — no DROP, no recreate:

  • Password is updated to match source (hash preserved)
  • Role attributes (LOGIN, CREATEDB, etc.) synced via ALTER ROLE
  • Permissions applied additively (existing grants not revoked)

Requirements

  • Superuser on both source and target (pg_authid access)
  • System roles (pg_*) and postgres are excluded automatically

Bug Fixes in This Release

  • PASSWORD instead of ENCRYPTED PASSWORD to correctly preserve pre-hashed SCRAM/md5 passwords
  • aclexplode() schema privilege query was calling the function twice (SELECT + LATERAL) causing duplicate grants
  • Function EXECUTE grant counter was incrementing even on failure

Install

git clone https://github.com/valehdba/pgclone.git
cd pgclone && make && sudo make install
DROP EXTENSION IF EXISTS pgclone; CREATE EXTENSION pgclone;
SELECT pgclone_version();  -- pgclone 3.4.0

v3.3.0: Dynamic data masking via views

08 Apr 20:32

Choose a tag to compare

Role-based masking policies that preserve original data while presenting masked views to unprivileged users. Unlike static masking (v3.2.0), dynamic masking keeps the base table intact — different roles see different data.

New Functions

Create a masking policy

SELECT pgclone_create_masking_policy(
    'public', 'employees',
    '{"email": "email", "full_name": "name", "ssn": "null",
      "salary": {"type": "random_int", "min": 40000, "max": 200000}}',
    'dba_team'
);

This does four things:

  1. Creates view public.employees_masked with mask expressions
  2. Revokes SELECT on public.employees from PUBLIC
  3. Grants SELECT on public.employees_masked to PUBLIC
  4. Grants SELECT on public.employees to dba_team

Drop a masking policy

SELECT pgclone_drop_masking_policy('public', 'employees');

Drops the masked view and restores base table access to PUBLIC.

How It Works

┌─────────────────────────────────────────────────────┐
│                   Base Table                        │
│  employees: Alice, alice@example.com, 95000, 123-.. │
├─────────────────────────────────────────────────────┤
│         ▲ SELECT granted                            │
│         │ to dba_team only                          │
│                                                     │
│   ┌─────────────────────────────────────────┐       │
│   │          Masked View                     │       │
│   │  employees_masked:                       │       │
│   │    XXXX, a***@example.com, 87432, NULL   │       │
│   ├─────────────────────────────────────────┤       │
│   │  SELECT granted to PUBLIC                │       │
│   └─────────────────────────────────────────┘       │
└─────────────────────────────────────────────────────┘

Regular users query employees_masked → see anonymized data dba_team role queries employees directly → see raw data

Complete Masking Toolkit (v3.0–v3.3)

pgclone now offers three complementary masking modes:

Mode Function When to Use
Clone-time (v3.0.0) "mask" in options JSON Data never reaches local disk unmasked
Static (v3.2.0) pgclone_mask_in_place() Permanently modify existing tables
Dynamic (v3.3.0) pgclone_create_masking_policy() Preserve originals, role-based access

Plus pgclone_discover_sensitive() (v3.1.0) to automatically detect which columns need masking.

What's New

  • pgclone_create_masking_policy(schema, table, mask_json, privileged_role)
  • pgclone_drop_masking_policy(schema, table)
  • Uses CREATE OR REPLACE VIEW for idempotent policy updates
  • All 8 mask strategies work in dynamic masking
  • 7 new pgTAP tests (73 total, 21 test groups)
  • 20 C functions, ~4135 lines in pgclone.c

Install

git clone https://github.com/valehdba/pgclone.git
cd pgclone && make && sudo make install
DROP EXTENSION IF EXISTS pgclone; CREATE EXTENSION pgclone;
SELECT pgclone_version();  -- pgclone 3.3.0

v3.2.0: Static data masking on cloned data

08 Apr 20:31

Choose a tag to compare

Apply masking to already-cloned local tables via UPDATE — no source connection needed. Useful for sanitizing existing dev/staging databases or applying different mask strategies after an initial clone.

New Function

SELECT pgclone_mask_in_place(
    'public', 'employees',
    '{"email": "email", "full_name": "name", "ssn": "null"}'
);
-- Returns: OK: masked 1000 rows in public.employees (3 columns)

Same mask JSON format as clone-time masking. All 8 strategies work.

Use Cases

  • Mask tables that were cloned without masking
  • Apply different mask strategies to an already-cloned table
  • Sanitize existing development/staging databases
  • Re-mask after schema changes

Bug Fixes

This release also includes 4 bug fixes to the masking engine:

Bug Impact Fix
random_int range inversion min > max produced out-of-range values Swap when inverted
random_int zero defaults Unset min/max always returned 0 Default to [0, 99999]
constant without value Simple string "constant" produced empty string Default to "REDACTED"
mask_in_place buffer overflow Large mask JSON silently truncated on stack Replaced with StringInfo
Missing simple-string types "random_int" and "constant" not recognized as simple strings Added to parser

What's New

  • pgclone_mask_in_place(schema, table, mask_json) SQL function
  • Reuses pgclone_build_mask_expr() for consistent behavior across all masking paths
  • 7 new pgTAP tests (66 total, 20 test groups)

v3.1.0: Auto-discovery of sensitive data

08 Apr 20:31

Choose a tag to compare

pgclone v3.1.0 — Auto-Discovery of Sensitive Data

Automatically scan a source database schema for columns that look like sensitive data and get suggested masking strategies — ready to paste into your clone command.

New Function

SELECT pgclone_discover_sensitive(
    'host=prod dbname=myapp user=readonly',
    'public'
);

Returns JSON grouped by table:

{
  "employees": {"email": "email", "full_name": "name", "phone": "phone", "salary": "random_int", "ssn": "null"},
  "users": {"password": "hash", "api_key": "hash", "ip_address": "hash"}
}

Detection Patterns (~40 rules across 10 categories)

Category Detected Patterns
Email email, e_mail
Name first_name, last_name, full_name, firstname, lastname, *_name
Phone phone, mobile, telephone, fax
SSN / National ID ssn, social_security, national_id, tax_id
Financial salary, income, wage, compensation
Credentials password, passwd, secret, token, api_key, apikey
Address address, street, zip, zipcode, postal
Date of Birth birth, dob, date_of_birth
Credit Card card_number, credit_card, ccn, cvv
IP Address ip_address, ipaddress

Typical Workflow

-- 1. Discover what needs masking
SELECT pgclone_discover_sensitive(conn, 'public');

-- 2. Use the output directly in a masked clone
SELECT pgclone_table(conn, 'public', 'employees', true, 'employees_dev',
'{"mask": {"email": "email", "full_name": "name", "salary": "random_int", "ssn": "null"}}');

What's New

  • pgclone_discover_sensitive(conninfo, schema) SQL function
  • Case-insensitive pattern matching against column names
  • Scans regular tables and partitioned tables
  • 6 new pgTAP tests (59 total, 19 test groups)

v3.0.0: Data masking / anonymization engine

08 Apr 20:30

Choose a tag to compare

pgclone v3.0.0 — Data Masking / Anonymization Engine

Clone PostgreSQL tables with column-level data anonymization applied during transfer. Masking happens server-side as SQL expressions inside COPY (SELECT ...) TO STDOUT — zero row-by-row processing overhead.

8 Masking Strategies

Strategy Example Output NULL Handling
email a***@example.com Preserves NULL
name XXXX Preserves NULL
phone ***-4567 Preserves NULL
partial Jo***son (configurable prefix/suffix) Preserves NULL
hash 5d41402abc4b2a76b9719d911017c592 (MD5, deterministic) Preserves NULL
null NULL Always NULL
random_int Random in [min, max] Always produces value
constant Fixed replacement value Always produces value

Usage

-- Clone with email and name masking
SELECT pgclone_table(
    'host=prod dbname=myapp user=readonly',
    'public', 'users', true, 'users',
    '{"mask": {"email": "email", "full_name": "name", "ssn": "null"}}'
);

-- Parameterized masks
SELECT pgclone_table(conn, 'hr', 'employees', true, 'employees_dev',
'{"mask": {"salary": {"type": "random_int", "min": 40000, "max": 200000},
"last_name": {"type": "partial", "prefix": 1, "suffix": 2}}}'
);

-- Combine with WHERE filter
SELECT pgclone_table(conn, 'public', 'users', true, 'users_dev',
'{"mask": {"email": "email"}, "where": "status = ''active''"}'
);

What's New

  • MaskRule struct and pgclone_build_mask_expr() for extensible mask strategy architecture
  • Masking applied server-side in COPY (SELECT ...) TO STDOUT — no external dependencies
  • Queries pg_catalog.pg_attribute for column names when masks active without explicit column list
  • Fully composable with existing columns, where, indexes, constraints, triggers options
  • test_schema.employees test fixture with realistic sensitive data
  • 15 new pgTAP tests (53 total, 18 test groups)

Install

git clone https://github.com/valehdba/pgclone.git
cd pgclone && make && sudo make install
CREATE EXTENSION pgclone;
SELECT pgclone_version();  -- pgclone 3.0.0

v2.2.0 - Worker pool for parallel cloning

04 Apr 21:53

Choose a tag to compare

What's New

Parallel cloning ("parallel": N) now uses a fixed-size worker pool instead of spawning one background worker per table. This is a major architectural improvement that dramatically reduces resource consumption on large schemas.

How It Works

When you run SELECT pgclone_schema_async(conn, 'sales', true, '{"parallel": 4}'), pgclone now:

  1. Queries the source for the full table list and populates a shared-memory task queue
  2. Launches exactly 4 background workers (not one per table)
  3. Each worker pulls the next unclaimed table from the queue, clones it, then grabs the next
  4. Faster workers automatically handle more tables (dynamic load balancing)
  5. Parent job tracks aggregate progress across all workers in real time

Before vs After

v2.0.0 (per-table workers) v2.2.0 (worker pool)
100 tables, parallel=4 100 bgworkers launched 4 bgworkers
DB connections 200 (100 × 2) 8 (4 × 2)
Job slots used 101 (1 parent + 100 child) 5 (1 parent + 4 pool)
max_worker_processes needed ≥ 100 ≥ 4
Load balancing Static (pre-assigned chunks) Dynamic (work-stealing)

No API Changes

The SQL interface is identical — same "parallel" JSON option, same pgclone_schema_async() function signature. Existing scripts work without modification.

Changes

  • Fixed-size worker pool replaces per-table bgworker spawning
  • Exactly N workers launched, each pulling from shared task queue
  • Dynamic load balancing: faster workers handle more tables
  • Resource usage: O(N) instead of O(tables) for bgworkers/connections
  • Explicit worker_job_ids tracking for safe pool finalization
  • Max 512 tables per pool operation
  • No API changes

Upgrade Notes

  • PostgreSQL restart required — shared memory layout has changed
  • Requires shared_preload_libraries = 'pgclone' (same as before for async operations)
  • Maximum 512 tables per pool operation (PGCLONE_MAX_POOL_TASKS)
  • Only one pool-based parallel operation can run at a time per cluster

Bug Fix Included

  • Pool worker finalization now uses explicit worker_job_ids[] tracking instead of sentinel-based slot scanning, preventing potential false matches when other async jobs run concurrently

Installation

sudo make install
# Restart PostgreSQL, then:
ALTER EXTENSION pgclone UPDATE TO '2.2.0';

##Supported PostgreSQL Versions
PostgreSQL 14, 15, 16, 17, 18
Full Changelog: https://github.com/valehdba/pgclone/blob/main/CHANGELOG.md

v2.1.3: async bugfixes, clear_jobs, async tests, docs restructure

04 Apr 08:27

Choose a tag to compare

Bugfixes

  • bgw_copy_data: COPY pipeline error handling — failures now logged, connection leak fixed
  • WaitForBackgroundWorkerStartup added to all async functions — jobs no longer stuck in pending
  • pgclone_schema_async parallel: fixed hardcoded jobs[0] shared memory corruption
  • pgclone_schema_async parallel: parent job status transition fix

New

  • pgclone_clear_jobs() — free completed/cancelled job slots
  • Async test suite (test/test_async.sh)
  • CONTRIBUTING.md, SECURITY.md

Documentation Restructured

  • docs/USAGE.md, docs/ASYNC.md, docs/TESTING.md, docs/ARCHITECTURE.md
  • CHANGELOG.md — full version history

Full Changelog: v2.1.2...v2.1.3