benzsevern/goldenpipe

GoldenPipe

Golden Suite orchestrator -- Check quality, fix issues, deduplicate records. One command. Built by Ben Severn.

Requires Python 3.11+. MIT licensed.

What It Does

Raw Data
  | GoldenCheck   -- profile & discover quality issues
  | GoldenFlow    -- fix issues, standardize, reshape
  | GoldenMatch   -- deduplicate, match, create golden records
  v
Golden Records

GoldenPipe orchestrates the full pipeline with adaptive logic:

  • Skips transformation if no quality issues found
  • Routes to privacy-preserving matching if sensitive fields detected
  • Reports reasoning for every decision
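The three adaptive rules above can be sketched as a small planning function. This is an illustration only, not GoldenPipe's implementation: the `Plan` class, `plan_pipeline`, the `SENSITIVE_HINTS` set, and the `"standard"` strategy name are all hypothetical. Only the `pprl` strategy name comes from the CLI section of this README.

```python
from dataclasses import dataclass, field

# Hypothetical list of column names treated as sensitive (an assumption).
SENSITIVE_HINTS = {"ssn", "dob", "email", "phone"}


@dataclass
class Plan:
    stages: list[str] = field(default_factory=list)
    strategy: str = "standard"  # assumed name for the non-private default
    reasoning: list[str] = field(default_factory=list)


def plan_pipeline(columns: list[str], issue_count: int) -> Plan:
    """Decide which stages to run and record why, one reason per decision."""
    plan = Plan(stages=["check"])

    # Rule 1: skip transformation when the check stage found nothing to fix.
    if issue_count > 0:
        plan.stages.append("transform")
        plan.reasoning.append(f"transform: {issue_count} quality issue(s) found")
    else:
        plan.reasoning.append("transform skipped: no quality issues found")

    # Rule 2: route to privacy-preserving matching if sensitive fields exist.
    sensitive = sorted(c for c in columns if c.lower() in SENSITIVE_HINTS)
    if sensitive:
        plan.strategy = "pprl"
        plan.reasoning.append(f"match: pprl strategy, sensitive fields {sensitive}")
    else:
        plan.reasoning.append("match: standard strategy, no sensitive fields")

    plan.stages.append("match")
    return plan


if __name__ == "__main__":
    for line in plan_pipeline(["name", "SSN"], issue_count=0).reasoning:
        print(line)
```

Recording a reason at every branch, rather than only on failure, is what makes the third rule ("reports reasoning for every decision") hold by construction.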

Install

pip install goldenpipe

Quick Start

import goldenpipe as gp

result = gp.run("customers.csv")

print(result.status)        # "success"
print(result.check)         # Quality findings
print(result.transform)     # What was fixed
print(result.match)         # Deduplicated clusters
print(result.reasoning)     # Why each decision was made
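In a script, it may be useful to stop early when the run did not succeed. The sketch below relies only on the `status` and `reasoning` attributes shown above; `require_success` is a hypothetical helper, and the README documents no status value other than `"success"`, so the failure case here is an assumption. A stand-in object is used so the example runs without GoldenPipe installed.

```python
from types import SimpleNamespace


def require_success(result):
    """Return the result unchanged, or raise with the reasoning attached."""
    if result.status != "success":
        # Any status other than "success" is treated as a failure (assumption).
        raise RuntimeError(
            f"pipeline status {result.status!r}: {result.reasoning}"
        )
    return result


if __name__ == "__main__":
    # Stand-in for the object returned by gp.run("customers.csv").
    fake = SimpleNamespace(status="success", reasoning=["no issues found"])
    print(require_success(fake).status)
```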

CLI

goldenpipe run customers.csv                  # Full pipeline
goldenpipe run customers.csv --verbose        # Show reasoning
goldenpipe run customers.csv --skip-flow      # Check + Match only
goldenpipe run customers.csv --strategy pprl  # Force privacy mode
goldenpipe run customers.csv -o golden.csv    # Save golden records
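To call the CLI from a Python script, the commands above can be assembled as an argument list and passed to `subprocess`. This is a minimal sketch: `build_command` and `run_pipeline` are hypothetical helpers, they use only the flags listed above, and they assume those flags can be combined in a single invocation, which this README does not confirm.

```python
import subprocess


def build_command(path, *, verbose=False, skip_flow=False,
                  strategy=None, output=None):
    """Build the argv list for `goldenpipe run` from keyword options."""
    cmd = ["goldenpipe", "run", path]
    if verbose:
        cmd.append("--verbose")
    if skip_flow:
        cmd.append("--skip-flow")
    if strategy is not None:
        cmd += ["--strategy", strategy]
    if output is not None:
        cmd += ["-o", output]
    return cmd


def run_pipeline(path, **options):
    """Run the CLI; check=True raises CalledProcessError on a non-zero exit."""
    return subprocess.run(build_command(path, **options), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so no install is needed.
    print(" ".join(build_command("customers.csv", output="golden.csv")))
```

Passing a list rather than a shell string avoids quoting problems when file names contain spaces.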

Remote MCP Server

GoldenPipe is available as a hosted MCP server on Smithery. Connect from any MCP client without installing anything.

Claude Desktop / Claude Code:

{
  "mcpServers": {
    "goldenpipe": {
      "url": "https://goldenpipe-mcp-production.up.railway.app/mcp/"
    }
  }
}
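If a client configuration already lists other servers, the entry above needs to be merged in rather than pasted over the file. The sketch below does that merge on the JSON text. The `add_goldenpipe` helper is hypothetical, and where the configuration file lives depends on the client, so the example works on strings and leaves file handling to the reader.

```python
import json

# The server entry exactly as given in this README.
GOLDENPIPE_ENTRY = {
    "url": "https://goldenpipe-mcp-production.up.railway.app/mcp/"
}


def add_goldenpipe(config_text: str) -> str:
    """Return the config JSON with a goldenpipe entry under mcpServers."""
    # Treat an empty or whitespace-only file as an empty configuration.
    config = json.loads(config_text) if config_text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers["goldenpipe"] = dict(GOLDENPIPE_ENTRY)  # existing servers are kept
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    existing = '{"mcpServers": {"other": {"url": "https://example.com/mcp/"}}}'
    print(add_goldenpipe(existing))
```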

Local server:

pip install goldenpipe[mcp]
goldenpipe mcp-serve

The server exposes four tools: list pipeline stages, validate wiring, run the full check-transform-match pipeline, and explain configs.

Part of the Golden Suite

Tool         Purpose                          Install
GoldenCheck  Validate & profile data quality  pip install goldencheck
GoldenFlow   Transform & standardize data     pip install goldenflow
GoldenMatch  Deduplicate & match records      pip install goldenmatch
GoldenPipe   Orchestrate the full pipeline    pip install goldenpipe

Author

Ben Severn

License

MIT

About

Golden Suite orchestrator — chains validation (GoldenCheck), transformation (GoldenFlow), and entity resolution (GoldenMatch). 4 MCP tools on Smithery. DQBench Pipeline: 88.07.
