# FRED Data Pipeline

A production-grade data pipeline for fetching, storing, and transforming Federal Reserve Economic Data (FRED) in Databricks, using the medallion architecture and Delta Live Tables.

## Overview

This project builds and maintains FRED economic data in a Databricks lakehouse with three layers:
- Bronze Layer: Raw data ingestion using Streaming Tables with Auto Loader
- Silver Layer: Cleaned, typed data using Streaming Tables with data quality constraints
- Gold Layer: Business-ready denormalized Materialized View
## Architecture

```
FRED API ──▶ Bronze (Raw) ──▶ Silver (Clean) ──▶ Gold (Business)
             Streaming Table  Streaming Table    Materialized View
```
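The layer transformations can be illustrated with plain Python on a sample FRED payload. This is a sketch only: the real layers are Spark Streaming Tables and a Materialized View, and the field and function names here are assumptions, not the project's actual code.

```python
import json

# Bronze: keep the raw API payload as-is, plus ingestion metadata.
def to_bronze(raw_json: str, source_file: str) -> dict:
    return {"payload": raw_json, "source_file": source_file}

# Silver: parse and type the payload. FRED encodes missing
# observations as ".", which becomes NULL/None in the Silver layer.
def to_silver(bronze: dict, series_id: str) -> list[dict]:
    rows = []
    for obs in json.loads(bronze["payload"])["observations"]:
        value = None if obs["value"] == "." else float(obs["value"])
        rows.append({"series_id": series_id, "date": obs["date"], "value": value})
    return rows

# Gold: denormalize to one row per date with a column per series.
def to_gold(silver_rows: list[dict]) -> dict:
    gold: dict = {}
    for row in silver_rows:
        gold.setdefault(row["date"], {})[row["series_id"]] = row["value"]
    return gold
```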
## Features

- Automated FRED API data fetching for 15 economic rate series
- Medallion architecture (Bronze → Silver → Gold)
- Delta Live Tables with Streaming Tables and Materialized Views
- Data quality validation with DLT expectations
- Primary/Foreign key constraints
- Databricks Asset Bundles for infrastructure as code
- Environment-specific configurations (dev/prod)
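In DLT, the data quality checks take the form of expectations such as `@dlt.expect_or_drop("valid_value", "value IS NOT NULL")` on the Silver table. The predicates below are a hedged sketch of the kind of rules involved, written as plain functions so they can be unit-tested outside Databricks; the rule names are assumptions, not the project's actual constraints.

```python
import datetime

def has_valid_value(row: dict) -> bool:
    """Mirror of an expectation like 'value IS NOT NULL'."""
    return row.get("value") is not None

def has_valid_date(row: dict) -> bool:
    """Mirror of an expectation requiring an ISO-formatted date."""
    try:
        datetime.date.fromisoformat(row.get("date", ""))
        return True
    except ValueError:
        return False
```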
## Tracked Series

| Series ID | Description |
|---|---|
| DFF | Federal Funds Effective Rate |
| DTB3 | 3-Month Treasury Bill Rate |
| DGS1 | 1-Year Treasury Rate |
| DGS2 | 2-Year Treasury Rate |
| DGS5 | 5-Year Treasury Rate |
| DGS10 | 10-Year Treasury Rate |
| DGS30 | 30-Year Treasury Rate |
| SOFR | Secured Overnight Financing Rate |
| DPRIME | Bank Prime Loan Rate |
| MORTGAGE30US | 30-Year Fixed Rate Mortgage |
| MORTGAGE15US | 15-Year Fixed Rate Mortgage |
| BAMLC0A0CM | ICE BofA US Corporate Index Option-Adjusted Spread |
| BAMLH0A0HYM2 | ICE BofA US High Yield Index Option-Adjusted Spread |
| T10YIE | 10-Year Breakeven Inflation Rate |
| T5YIE | 5-Year Breakeven Inflation Rate |
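The 15 series IDs from the table above, as a Python constant (the constant name is illustrative, not necessarily what the project's config uses):

```python
# FRED series IDs tracked by the pipeline.
RATE_SERIES = [
    "DFF", "DTB3", "DGS1", "DGS2", "DGS5", "DGS10", "DGS30",
    "SOFR", "DPRIME", "MORTGAGE30US", "MORTGAGE15US",
    "BAMLC0A0CM", "BAMLH0A0HYM2", "T10YIE", "T5YIE",
]
```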
## Prerequisites

- Databricks workspace with Unity Catalog enabled
- Python 3.9+
- FRED API key (available for free from the St. Louis Fed)
## Quick Start

### 1. Clone and install

```shell
git clone https://github.com/dmkern03/FRED.git
cd FRED
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt
```

### 2. Store the FRED API key

```shell
databricks secrets create-scope fred-api
databricks secrets put-secret fred-api api-key --string-value "YOUR_API_KEY"
```

### 3. Deploy and run

```shell
databricks bundle deploy --target dev
databricks bundle run fred_dlt_pipeline --target dev
```

### Manual pipeline setup (alternative)

- Navigate to Delta Live Tables in Databricks
- Click Create Pipeline
- Add notebooks: `02_Bronze_Setup.py`, `03_Silver_Setup.py`, `04_Gold_Setup.py`
- Set target catalog: `investments`, schema: `fred`
- Create and start the pipeline

See the DLT Pipeline Setup Guide for detailed instructions.
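For the bundle-based path, the pipeline is declared as a Databricks Asset Bundles resource. A minimal sketch of what such a definition in `resources/` might look like (the resource key, paths, and settings here are assumptions based on the names used elsewhere in this README):

```yaml
# Hypothetical DAB resource definition for the DLT pipeline.
resources:
  pipelines:
    fred_dlt_pipeline:
      name: fred_dlt_pipeline
      catalog: investments
      target: fred
      libraries:
        - notebook:
            path: ../notebooks/02_Bronze_Setup.py
        - notebook:
            path: ../notebooks/03_Silver_Setup.py
        - notebook:
            path: ../notebooks/04_Gold_Setup.py
```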
## Initial Setup

Run these notebooks once before starting the DLT pipeline:

| Step | Notebook | Purpose |
|---|---|---|
| 1 | `01_API_Setup.py` | Create volumes and configure secrets |
| 2 | `05_Daily_API_Call.py` | Fetch initial data from FRED API |

## Daily Operation

Schedule `05_Daily_API_Call.py` as a daily Databricks Workflow. The DLT pipeline automatically processes new data.
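The daily fetch boils down to one call per series against FRED's `series/observations` endpoint. A minimal stdlib sketch, assuming the real helpers in `src/utils/api_helpers.py` look nothing in particular like this (function names and parameter choices are illustrative):

```python
import json
import urllib.parse
import urllib.request

FRED_BASE = "https://api.stlouisfed.org/fred/series/observations"

def build_url(series_id: str, api_key: str, start: str) -> str:
    """Build the observations URL for one series."""
    params = {
        "series_id": series_id,
        "api_key": api_key,
        "file_type": "json",
        "observation_start": start,
    }
    return f"{FRED_BASE}?{urllib.parse.urlencode(params)}"

def fetch_observations(series_id: str, api_key: str, start: str) -> list:
    """Fetch and decode observations (performs a network call)."""
    with urllib.request.urlopen(build_url(series_id, api_key, start)) as resp:
        return json.load(resp)["observations"]
```

In the real pipeline the API key would come from the `fred-api` secret scope rather than being passed around in plain text.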
## Project Structure

```
FRED/
├── notebooks/                  # Databricks notebooks
│   ├── 01_API_Setup.py         # One-time setup
│   ├── 02_Bronze_Setup.py      # DLT Bronze layer
│   ├── 03_Silver_Setup.py      # DLT Silver layer
│   ├── 04_Gold_Setup.py        # DLT Gold layer
│   └── 05_Daily_API_Call.py    # Daily API fetch
│
├── src/                        # Python source code
│   └── utils/
│       └── api_helpers.py      # FRED API utilities
│
├── tests/                      # Unit tests
├── docs/                       # Documentation
├── config/                     # Pipeline configuration
├── resources/                  # DAB resource definitions
└── environments/               # Environment configs
```
## Documentation

- Architecture - System design and data flow
- DLT Pipeline Setup - Detailed setup instructions
- Production Deployment - Production deployment guide
- Data Dictionary - Schema documentation
- Constraints Reference - PK/FK constraints
## Development

Run the unit tests:

```shell
pytest tests/ -v
```

Install the pre-commit hooks:

```shell
pip install pre-commit
pre-commit install
```

This project uses Ruff for linting and formatting:

```shell
ruff check .
ruff format .
```

## Contributing

Contributions are welcome! Please read CONTRIBUTING.md for guidelines.
## License

This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments

- Federal Reserve Bank of St. Louis for the FRED API
- Databricks for Delta Live Tables