tushar-data-engineer/fabric-data-engineering-project

🚀 Microsoft Fabric API Data Pipeline

This project demonstrates an end-to-end data engineering pipeline built using Microsoft Fabric. The pipeline extracts data from a public REST API, stores raw data in the Lakehouse Bronze layer, and processes the data through Silver and Gold layers using Fabric notebooks.

The solution follows the Medallion Architecture (Bronze → Silver → Gold) to enable scalable and structured data processing for analytics.


🔄 Pipeline Overview

The pipeline consists of three main stages.


1️⃣ Extract_from_API

This stage uses a Copy Activity in Microsoft Fabric Data Pipeline to fetch data from a REST API using an HTTP GET request.

The API response is retrieved in JSON format and stored directly in the Lakehouse Bronze layer as raw data.

Raw Data Path

API_data/bronze/posts/Raw_API_DATA.json

The Bronze layer stores the unprocessed source data exactly as received, ensuring traceability and reproducibility.
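
The Copy Activity itself is configured in the Fabric pipeline UI, but the same ingestion step can be sketched in code. The snippet below is a minimal illustration only: the `fetch_posts` endpoint (JSONPlaceholder) is an assumption, since the repo does not name the exact public API, and the Bronze path matches the one shown above.

```python
import json
import urllib.request
from pathlib import Path

# Assumed source endpoint -- the README does not name the exact public API.
API_URL = "https://jsonplaceholder.typicode.com/posts"
BRONZE_PATH = Path("API_data/bronze/posts/Raw_API_DATA.json")


def fetch_posts(url: str = API_URL) -> list[dict]:
    """Issue an HTTP GET (as the Copy Activity does) and parse the JSON body."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def ingest_to_bronze(records: list[dict], path: Path = BRONZE_PATH) -> Path:
    """Persist the API response exactly as received -- raw JSON, no reshaping."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(records, indent=2))
    return path
```

In a Fabric notebook this would typically be called as `ingest_to_bronze(fetch_posts())`, with the Lakehouse file path mounted under `/lakehouse/default/Files/`.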


2️⃣ Bronze_to_Silver

Once the raw data is ingested, a Microsoft Fabric Notebook processes the Bronze data and transforms it into a structured dataset suitable for analytics.

Typical transformations include:

  • Removing invalid or duplicate records
  • Flattening nested JSON structures
  • Standardizing column names
  • Performing data type conversions
  • Applying basic data validation rules

The processed dataset is then stored in the Silver layer as structured tables.

The Silver layer contains clean and reliable datasets ready for further transformation.
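
The actual Bronze-to-Silver notebook runs on PySpark; the plain-Python sketch below mirrors the transformations listed above (de-duplication, validation, standardized snake_case column names, explicit type casts) on a JSONPlaceholder-style `posts` schema, which is an assumption rather than the repo's documented schema.

```python
def bronze_to_silver(raw_records: list[dict]) -> list[dict]:
    """Clean raw post records into a structured, validated dataset.

    Mirrors the notebook's transformations: drop invalid or duplicate rows,
    standardize column names, and perform explicit type conversions.
    """
    seen_ids: set[int] = set()
    silver = []
    for rec in raw_records:
        # Basic validation: require an id and a non-empty title.
        if rec.get("id") is None or not rec.get("title"):
            continue
        post_id = int(rec["id"])
        if post_id in seen_ids:  # de-duplicate on the primary key
            continue
        seen_ids.add(post_id)
        silver.append({
            "post_id": post_id,                 # standardized column name
            "user_id": int(rec["userId"]),      # camelCase -> snake_case
            "title": str(rec["title"]).strip(),
            "body": str(rec.get("body", "")).strip(),
        })
    return silver
```

In the real notebook the same steps would use `dropDuplicates`, `withColumnRenamed`, and `cast` on a Spark DataFrame before writing a Silver Delta table.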


3️⃣ Silver_to_Gold

In the final stage, another Fabric notebook processes the Silver layer tables to create analytics-ready datasets.

The Gold layer contains curated and aggregated tables optimized for reporting and business intelligence.

Typical operations include:

  • Data aggregation
  • Business metric calculations
  • Data enrichment
  • Preparing datasets for analytics

These datasets can be consumed by:

  • Power BI dashboards
  • Reporting tools
  • Downstream analytics systems
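
As a sketch of the Silver-to-Gold step, the function below aggregates cleaned post records into a per-user metrics table (post count and average title length). The specific metrics are illustrative assumptions, not taken from the repo; the real notebook would express the same aggregation with PySpark's `groupBy` and `agg`.

```python
from collections import defaultdict


def silver_to_gold(silver_records: list[dict]) -> list[dict]:
    """Aggregate Silver posts into per-user metrics for reporting."""
    titles_by_user: dict[int, list[str]] = defaultdict(list)
    for rec in silver_records:
        titles_by_user[rec["user_id"]].append(rec["title"])

    # One Gold row per user, ordered for deterministic output.
    return [
        {
            "user_id": user_id,
            "post_count": len(titles),
            "avg_title_length": round(sum(len(t) for t in titles) / len(titles), 2),
        }
        for user_id, titles in sorted(titles_by_user.items())
    ]
```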

🏗 Architecture Flow

API Source
      ↓
Fabric Data Pipeline (Copy Activity)
      ↓
Bronze Layer (Raw JSON Files in Lakehouse)
      ↓
Notebook Transformation (Bronze → Silver)
      ↓
Silver Layer (Clean Structured Data)
      ↓
Notebook Aggregation (Silver → Gold)
      ↓
Gold Layer (Analytics Ready Data)

🧰 Technologies Used

  • Microsoft Fabric
  • Fabric Data Pipelines
  • Lakehouse Architecture
  • REST API Data Ingestion
  • PySpark / Fabric Notebooks
  • JSON Data Processing
  • Git for Version Control

🎯 Purpose of the Project

This project demonstrates how to design and implement a scalable data pipeline using Microsoft Fabric.

It highlights key data engineering practices including:

  • Ingesting data from external APIs
  • Implementing the Medallion Architecture
  • Notebook-based data transformations
  • Layered data modeling for analytics
  • Building analytics-ready datasets for reporting and dashboards

About

Data engineering pipeline built using Microsoft Fabric that extracts data from a public API and stores it in a Lakehouse using a Medallion Architecture (Bronze, Silver, Gold). The pipeline definition is version-controlled in Git for backup and collaboration.
