# Feather-Flow

A lightweight dbt-like CLI tool built in Rust for SQL templating, compilation, and execution against DuckDB.
## Features

- SQL Templating: Jinja-style templating with `config()` and `var()` functions
- Custom Macros: Reusable SQL macros loaded from `macro_paths` directories
- AST-based Dependencies: Automatically extracts dependencies from SQL using `sqlparser-rs`; no `ref()` or `source()` functions needed
- Dependency-aware Execution: Builds a DAG and executes models in topological order
- Schema Testing: Built-in support for 8 test types (`unique`, `not_null`, `positive`, `non_negative`, `accepted_values`, `min_value`, `max_value`, `regex`) with sample failing rows
- Source Definitions: Document and test external data sources
- Documentation Generation: Generate markdown or JSON docs from schema files
- DuckDB Backend: In-memory or file-based DuckDB database execution
- Multiple Output Formats: JSON, table, and tree output formats
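"Executes models in topological order" boils down to a standard DAG sort over the extracted dependencies. A minimal Python sketch of the idea (illustrative only, not Featherflow's actual code; the model names are hypothetical):

```python
from collections import deque

def execution_order(deps: dict[str, set[str]]) -> list[str]:
    """Kahn's algorithm: a model runs only after all of its dependencies."""
    pending = {model: set(d) for model, d in deps.items()}
    # Models with no unmet dependencies can run immediately.
    ready = deque(sorted(m for m, d in pending.items() if not d))
    order = []
    while ready:
        model = ready.popleft()
        order.append(model)
        for other, unmet in pending.items():
            if model in unmet:
                unmet.remove(model)
                if not unmet:
                    ready.append(other)
    if len(order) != len(pending):
        raise ValueError("cycle detected in model DAG")
    return order

# Only model-to-model edges appear here; external tables like
# raw_orders are not models, so they are not nodes in the DAG.
deps = {
    "stg_orders": set(),
    "stg_customers": set(),
    "fct_orders": {"stg_orders", "stg_customers"},
}
print(execution_order(deps))
# ['stg_customers', 'stg_orders', 'fct_orders']
```

Because the DAG is built from the SQL itself (via `sqlparser-rs`), no explicit `ref()` calls are needed to produce these edges.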
## Installation

```bash
curl -fsSL https://raw.githubusercontent.com/datastx/Feather-Flow/main/install.sh | bash
```

This detects your OS and architecture, downloads the correct binary from the latest GitHub Release, verifies the SHA256 checksum, and installs to `~/.local/bin`.
Options:

```bash
# Install a specific version
curl -fsSL https://raw.githubusercontent.com/datastx/Feather-Flow/main/install.sh | FF_VERSION=0.1.0 bash

# Install to a custom directory
curl -fsSL https://raw.githubusercontent.com/datastx/Feather-Flow/main/install.sh | INSTALL_DIR=/usr/local/bin bash
```

Pre-built binaries are available on the Releases page:
| Platform | Artifact |
|---|---|
| Linux x86_64 | ff-x86_64-linux-gnu |
| macOS x86_64 | ff-x86_64-apple-darwin |
| macOS ARM (Apple Silicon) | ff-aarch64-apple-darwin |
```bash
# Example: download latest for Linux x86_64
curl -fsSL https://github.com/datastx/Feather-Flow/releases/latest/download/ff-x86_64-linux-gnu -o ff
chmod +x ff
sudo mv ff /usr/local/bin/
```

Or pull the Docker image:

```bash
docker pull ghcr.io/datastx/feather-flow:latest
```
```bash
# Run any ff command
docker run --rm -v "$(pwd)":/workspace -w /workspace ghcr.io/datastx/feather-flow validate
docker run --rm -v "$(pwd)":/workspace -w /workspace ghcr.io/datastx/feather-flow run
docker run --rm -v "$(pwd)":/workspace -w /workspace ghcr.io/datastx/feather-flow test
```

Or build from source:

```bash
git clone https://github.com/datastx/Feather-Flow.git
cd Feather-Flow
make build-release
# Binary is at target/release/ff

# Or install directly
cargo install --path crates/ff-cli
```

## Quick Start

1. Create a project directory with a `featherflow.yml` configuration:
```yaml
name: my_project
version: "1.0"
database:
  type: duckdb
  path: ":memory:"
model_paths:
  - models
seed_paths:
  - seeds
source_paths:
  - sources
macro_paths:
  - macros
vars:
  env: dev
  start_date: "2024-01-01"
```

2. Create SQL models in the `models/` directory:
```sql
-- models/staging/stg_orders.sql
{{ config(materialized='view', schema='staging') }}

SELECT
    id AS order_id,
    user_id AS customer_id,
    created_at AS order_date,
    amount,
    status
FROM raw_orders
WHERE created_at >= '{{ var("start_date") }}'
```

3. Create a schema file with the same name as your model (1:1 convention):
```yaml
# models/staging/stg_orders.yml
version: 1
description: "Staged orders from raw source"
owner: data-team
tags:
  - staging
  - orders
columns:
  - name: order_id
    tests:
      - unique
      - not_null
  - name: customer_id
    tests:
      - not_null
  - name: order_date
  - name: amount
  - name: status
```

4. Optionally define external sources in `sources/`:
```yaml
# sources/raw_ecommerce.yml
kind: sources
version: 1
name: raw_ecommerce
description: "Raw e-commerce data"
schema: main
tables:
  - name: raw_orders
    description: "Raw order data"
    columns:
      - name: id
        type: INTEGER
        tests:
          - unique
          - not_null
      - name: user_id
        type: INTEGER
      - name: amount
        type: DECIMAL(10,2)
```

5. Run Featherflow commands:
```bash
# Load seed data
ff seed

# Compile models (renders Jinja, extracts dependencies, builds manifest)
ff compile

# List models and sources
ff ls

# Execute models in dependency order
ff run

# Run schema tests
ff test

# Validate project without execution
ff validate

# Generate documentation
ff docs
```

## Commands

### ff compile

Compiles SQL models by rendering Jinja templates, extracting dependencies, and generating a manifest.

```bash
ff compile [--project-dir <DIR>] [--models <MODELS>]
```

### ff run

Executes compiled models in topological order.

```bash
ff run [--target <DB_PATH>] [--select <SELECTOR>] [--full-refresh]
```

Selectors:

- `model_name`: Run a specific model
- `+model_name`: Run model and all ancestors
- `model_name+`: Run model and all descendants
- `+model_name+`: Run model, ancestors, and descendants
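The `+` prefix/suffix selectors amount to graph traversals over the model DAG. A sketch of how they can be resolved (illustrative Python, not Featherflow's real API; `parents` maps each model to its direct upstream models):

```python
def select(selector: str, parents: dict[str, set[str]]) -> set[str]:
    """Resolve a selector like '+fct_orders+' to a set of models."""
    with_ancestors = selector.startswith("+")
    with_descendants = selector.endswith("+")
    name = selector.strip("+")

    # Invert the parent map so we can also walk downstream edges.
    children: dict[str, set[str]] = {m: set() for m in parents}
    for model, ps in parents.items():
        for p in ps:
            children.setdefault(p, set()).add(model)

    def walk(start: str, edges: dict[str, set[str]]) -> set[str]:
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in edges.get(node, set()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    selected = {name}
    if with_ancestors:
        selected |= walk(name, parents)
    if with_descendants:
        selected |= walk(name, children)
    return selected

parents = {"stg_orders": set(),
           "fct_orders": {"stg_orders"},
           "rpt_orders": {"fct_orders"}}
print(select("+fct_orders+", parents))
# {'stg_orders', 'fct_orders', 'rpt_orders'} (set order may vary)
```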
### ff ls

Lists models and sources with their dependencies and materialization settings.

```bash
ff ls [--output <FORMAT>] [--select <SELECTOR>]
```

Output formats: table (default), json, tree

### ff parse

Parses models and outputs AST or dependency information.

```bash
ff parse [--models <MODELS>] [--output <FORMAT>]
```

Output formats: pretty (default), json, deps

### ff test

Runs schema tests defined in model and source schema files.

```bash
ff test [--target <DB_PATH>] [--models <MODELS>] [--fail-fast]
```

### ff seed

Loads CSV seed files into the database.

```bash
ff seed [--seeds <NAMES>] [--full-refresh]
```

### ff validate

Validates project configuration, SQL syntax, and schema files without executing.

```bash
ff validate [--models <MODELS>] [--strict]
```

### ff docs

Generates documentation from schema files.

```bash
ff docs [--output <PATH>] [--format <FORMAT>] [--models <MODELS>]
```

Output formats: markdown (default), json, html

Generates a `lineage.dot` Graphviz diagram for visualization.
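A lineage file is plain Graphviz DOT, which is essentially one printed edge per dependency. An illustrative Python sketch (not the actual implementation; the model names are hypothetical):

```python
def lineage_dot(deps: dict[str, set[str]]) -> str:
    """Render model dependencies as a Graphviz digraph (edges point dep -> model)."""
    lines = ["digraph lineage {"]
    for model in sorted(deps):
        for dep in sorted(deps[model]):
            lines.append(f'    "{dep}" -> "{model}";')
    lines.append("}")
    return "\n".join(lines)

print(lineage_dot({"stg_orders": set(), "fct_orders": {"stg_orders"}}))
```

The resulting file renders with standard Graphviz tooling, e.g. `dot -Tsvg lineage.dot -o lineage.svg`.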
Global options:

- `--project-dir, -p <DIR>`: Project directory (default: current directory)
- `--config, -c <FILE>`: Config file path
- `--target, -t <PATH>`: Database path (overrides config)
- `--verbose, -v`: Enable verbose output
## Configuration

```yaml
name: project_name        # Project name
version: "1.0"            # Project version
database:
  type: duckdb            # Database type (duckdb, snowflake)
  path: ":memory:"        # Database path (":memory:" for in-memory)
dialect: duckdb           # SQL dialect (duckdb, snowflake)
materialization: view     # Default materialization (view, table)
schema: main              # Default schema
model_paths:              # Directories containing SQL models
  - models
seed_paths:               # Directories containing CSV seed files
  - seeds
source_paths:             # Directories containing source definitions
  - sources
macro_paths:              # Directories containing macro files
  - macros
target_path: target       # Output directory for compiled files
vars:                     # Variables accessible via var()
  env: dev
  start_date: "2024-01-01"
```

Use the `config()` function in your SQL models:
```sql
{{ config(materialized='table', schema='staging') }}

SELECT * FROM raw_data
```

Supported config options:

- `materialized`: `'view'` or `'table'`
- `schema`: Target schema name
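Per-model `config()` values take precedence over the project-level defaults shown in the configuration reference. A simplified sketch of that merge (illustrative only; `view` and `main` are the documented project defaults):

```python
# Project-wide defaults, as in the configuration reference above.
PROJECT_DEFAULTS = {"materialized": "view", "schema": "main"}

def resolve_config(model_overrides: dict) -> dict:
    """Per-model config() values win over the project-level defaults."""
    return {**PROJECT_DEFAULTS, **model_overrides}

print(resolve_config({"materialized": "table"}))
# {'materialized': 'table', 'schema': 'main'}
```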
Access variables with `var()`; the second argument is an optional default:

```sql
SELECT * FROM {{ var('schema') }}.users
WHERE env = '{{ var('env', 'prod') }}'
```

Create reusable macros in `macros/`:
```sql
-- macros/date_utils.sql
{% macro date_trunc(date_col, granularity) %}
DATE_TRUNC('{{ granularity }}', {{ date_col }})
{% endmacro %}
```

Use in models:
```sql
{% from "date_utils.sql" import date_trunc %}

SELECT
    {{ date_trunc('order_date', 'month') }} AS order_month,
    SUM(amount) AS total
FROM orders
GROUP BY 1
```

## Project Structure

```text
my_project/
├── featherflow.yml
├── models/
│   ├── staging/
│   │   ├── stg_orders.sql
│   │   ├── stg_orders.yml      # 1:1 schema file
│   │   ├── stg_customers.sql
│   │   └── stg_customers.yml   # 1:1 schema file
│   └── marts/
│       ├── fct_orders.sql
│       └── fct_orders.yml      # 1:1 schema file
├── seeds/
│   ├── raw_orders.csv
│   └── raw_customers.csv
├── sources/
│   └── raw_ecommerce.yml       # kind: sources
├── macros/
│   └── date_utils.sql
└── target/
    ├── compiled/
    ├── manifest.json
    ├── run_results.json
    └── docs/
```
## Development

```bash
# Build all crates
make build

# Run tests
make test

# Run CI checks (format, clippy, test, doc)
make ci
```

```bash
# CLI commands
make ff-seed      # Load seed data
make ff-compile   # Compile models
make ff-run       # Execute models
make ff-ls        # List models
make ff-test      # Run tests
make ff-validate  # Validate project
make ff-docs      # Generate documentation

# Development workflows
make dev-cycle    # seed -> run -> test
make dev-validate # compile -> validate
```

### Crates

- `ff-cli`: CLI binary and commands
- `ff-core`: Core library (config, project, DAG, sources)
- `ff-sql`: SQL parsing and dependency extraction
- `ff-jinja`: Jinja templating layer with macro support
- `ff-db`: Database abstraction and DuckDB backend
- `ff-test`: Test generation and execution
## License

MIT