Racem1000/wind-power-forecasting

Probabilistic Wind Power Forecasting

Overview

This project implements a probabilistic wind power forecasting pipeline using real turbine SCADA data. The goal is to produce reliable short-term power predictions with quantified uncertainty, a critical requirement for modern power systems integrating large shares of wind energy.

The system combines physics-informed data validation, machine learning forecasting, and distribution-free uncertainty calibration to generate prediction intervals suitable for operational decision-making.

A live dashboard visualizing forecasts and uncertainty intervals is available here: 🔗 https://racem1000.github.io/wind-power-forecasting/dashboard/


Motivation

Wind energy is inherently variable and uncertain. Accurate forecasting helps:

  • Improve grid stability and dispatch planning
  • Reduce balancing costs in electricity markets
  • Support renewable integration at scale

Many forecasting systems produce only point predictions, but grid operators and energy companies need probabilistic forecasts that quantify uncertainty. This project demonstrates a lightweight pipeline that combines physical constraints with machine learning to produce probabilistic forecasts with calibrated intervals.


Dataset

  • Source: Wind turbine SCADA data

  • Year: 2018

  • Resolution: 10-minute intervals

  • Size: ~50,000 observations

  • Variables used:

    • Wind speed
    • Power output
    • Derived aerodynamic features
    • Time-based features

Methodology

1. Physics-Based Data Cleaning

To ensure realistic training data, the pipeline applies aerodynamic and turbine operation constraints:

  • Cut-in speed filtering (removal of invalid power generation below operational threshold)
  • Cut-out speed filtering (removal of shutdown-region artifacts)
  • Betz limit validation to detect physically impossible power values

This step ensures the model learns from physically plausible turbine behavior.
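The three checks above can be sketched as a single plausibility test per sample. This is a minimal illustration, not the project's code: the rotor diameter, air density, and cut-in and cut-out speeds are hypothetical placeholders, and a real pipeline would take them from the turbine datasheet. The Betz bound caps extractable power at 16/27 of the kinetic power flowing through the rotor area.

```python
import math

# All constants below are illustrative assumptions, not values from this project.
AIR_DENSITY = 1.225        # kg/m^3, standard sea-level air
ROTOR_DIAMETER = 100.0     # m, hypothetical rotor
CUT_IN = 3.0               # m/s, hypothetical cut-in speed
CUT_OUT = 25.0             # m/s, hypothetical cut-out speed
BETZ_LIMIT = 16.0 / 27.0   # maximum extractable fraction of wind power (~0.593)


def betz_max_power_kw(wind_speed):
    """Upper bound on rotor power (kW) at a given wind speed (m/s)."""
    swept_area = math.pi * (ROTOR_DIAMETER / 2.0) ** 2
    available_w = 0.5 * AIR_DENSITY * swept_area * wind_speed ** 3
    return BETZ_LIMIT * available_w / 1000.0


def is_plausible(wind_speed, power_kw):
    """Return True if a (wind speed, power) sample passes all three checks."""
    if wind_speed < CUT_IN and power_kw > 0.0:
        return False  # generation reported below cut-in
    if wind_speed > CUT_OUT and power_kw > 0.0:
        return False  # generation reported above cut-out
    if power_kw > betz_max_power_kw(wind_speed):
        return False  # exceeds the Betz bound
    return True


# Toy samples: (wind speed in m/s, power in kW)
samples = [(2.0, 150.0), (8.0, 900.0), (8.0, 2500.0), (27.0, 400.0)]
clean = [s for s in samples if is_plausible(*s)]
```

With these placeholder constants only the second sample survives: the first reports power below cut-in, the third exceeds the Betz bound at 8 m/s, and the fourth reports power above cut-out.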


2. Feature Engineering

The forecasting model uses a feature set combining physics, temporal signals, and historical dynamics:

Base features

  • Wind speed
  • Wind power density proxy (v³, proportional to the true density ½ρv³)

Temporal encoding

  • Cyclical hour-of-day
  • Cyclical day-of-year

Historical dynamics

  • 19 lag features
  • Rolling statistics

These features capture both short-term turbulence effects and daily seasonal patterns.
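A rough sketch of how such a feature set could be assembled is shown below. The function names, the choice to lag the power signal, and the small `n_lags` and `window` defaults are assumptions made for illustration; the README does not say which signals are lagged or which rolling statistics are used.

```python
import math
from datetime import datetime, timedelta


def cyclical(value, period):
    """Map a periodic value onto the unit circle as (sin, cos)."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def build_features(timestamps, wind_speed, power, n_lags=3, window=3):
    """Build one feature dict per row that has enough history.

    n_lags and window are illustrative defaults, not the project's settings.
    """
    rows = []
    start = max(n_lags, window)
    for i in range(start, len(power)):
        ts = timestamps[i]
        hour_sin, hour_cos = cyclical(ts.hour + ts.minute / 60.0, 24.0)
        doy_sin, doy_cos = cyclical(ts.timetuple().tm_yday, 365.25)
        row = {
            "wind_speed": wind_speed[i],
            "wind_speed_cubed": wind_speed[i] ** 3,
            "hour_sin": hour_sin, "hour_cos": hour_cos,
            "doy_sin": doy_sin, "doy_cos": doy_cos,
            # Rolling mean over past values only, so the target never leaks in.
            "power_roll_mean": sum(power[i - window:i]) / window,
        }
        for lag in range(1, n_lags + 1):
            row[f"power_lag_{lag}"] = power[i - lag]
        rows.append(row)
    return rows


# Toy 10-minute series
t0 = datetime(2018, 1, 1, 0, 0)
timestamps = [t0 + timedelta(minutes=10 * k) for k in range(6)]
wind = [5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
power = [300.0, 500.0, 800.0, 1200.0, 1700.0, 2300.0]
features = build_features(timestamps, wind, power)
```

Encoding hour and day as sine and cosine pairs keeps 23:50 and 00:00 close together in feature space, which a raw hour column would not.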


3. Probabilistic Forecasting Model

The core model is LightGBM quantile regression, trained to predict multiple conditional quantiles:

  • q = 0.10
  • q = 0.50 (median forecast)
  • q = 0.90

Each quantile is trained with the pinball loss, so the model learns conditional quantiles of power directly rather than a single deterministic value. Together, the 0.10 and 0.90 quantiles bound a nominal 80% prediction interval.
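To make the loss concrete, here is a small standalone implementation of the pinball loss. It is an illustration of the objective only, not the project's training code. In LightGBM the same objective is selected with `objective="quantile"` and the quantile level is passed as `alpha`.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss for quantile level q in (0, 1)."""
    total = 0.0
    for y, y_hat in zip(y_true, y_pred):
        diff = y - y_hat
        # Under-prediction costs q per unit, over-prediction costs (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


# Toy power values in kW
y_true = [1000.0, 1200.0, 900.0]
low_forecast = [900.0, 900.0, 900.0]
high_forecast = [1300.0, 1300.0, 1300.0]

loss_low_q90 = pinball_loss(y_true, low_forecast, 0.90)
loss_high_q90 = pinball_loss(y_true, high_forecast, 0.90)
```

At q = 0.90 the high forecast scores better than the low one, because under-predicting is penalized nine times more heavily than over-predicting. That asymmetry is what pushes the fitted curve toward the upper quantile.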


4. Uncertainty Calibration

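To make the prediction intervals statistically valid, the project applies:

MAPIE Split Conformal Regression

Split conformal prediction gives distribution-free marginal coverage guarantees when calibration and test data are exchangeable, so intervals stay reliable even if the underlying model is imperfect. Time-ordered SCADA data can violate exchangeability, in which case achieved coverage may deviate from the target.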
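The idea behind split conformal regression can be sketched by hand in a few lines. The code below is the simplest absolute-residual variant written from scratch for illustration; it is not MAPIE's API and may differ from the conformity score the project actually uses. All function names and data are hypothetical.

```python
import math


def conformal_quantile(scores, alpha):
    """Finite-sample corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1.0 - alpha))  # 1-based rank in the sorted scores
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(scores)[rank - 1]


def calibrate(y_cal, pred_cal, alpha=0.20):
    """Interval half-width from absolute residuals on a held-out calibration set."""
    scores = [abs(y - p) for y, p in zip(y_cal, pred_cal)]
    return conformal_quantile(scores, alpha)


def predict_interval(pred, half_width):
    """Symmetric interval around a point forecast."""
    return pred - half_width, pred + half_width


# Toy calibration set (kW): observed power and model forecasts
y_cal = [1000.0, 1500.0, 800.0, 2000.0, 1200.0,
         600.0, 1800.0, 900.0, 1400.0, 1100.0]
pred_cal = [1040.0, 1450.0, 830.0, 1900.0, 1210.0,
            620.0, 1720.0, 960.0, 1380.0, 1170.0]

half_width = calibrate(y_cal, pred_cal, alpha=0.20)   # targets 80% coverage
lower, upper = predict_interval(1300.0, half_width)
```

The calibration set must be disjoint from the training data. The `(n + 1)` correction in the rank is what yields the finite-sample coverage guarantee under exchangeability.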


Results

Turbine rated power: 3600 kW

Metric                                  Value
Mean Absolute Error                     51 kW
Relative Error (MAE / rated capacity)   1.4%
Target Coverage                         80%
Achieved Coverage                       75.8%

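The point forecast error is small relative to rated capacity (51 kW / 3600 kW ≈ 1.4%). The intervals cover 75.8% of observations against an 80% target, an under-coverage of about 4 percentage points, so they are slightly too narrow rather than fully calibrated.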
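For reference, the reported metrics can be computed as in the following sketch. The helper names and toy arrays are illustrative assumptions. Only the rated power, the MAE, and the two coverage figures come from the table above.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observations and forecasts."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def empirical_coverage(y_true, lower, upper):
    """Fraction of observations that fall inside their prediction interval."""
    hits = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return hits / len(y_true)


# Figures reported in the results table
RATED_POWER_KW = 3600.0
relative_error_pct = 100.0 * 51.0 / RATED_POWER_KW   # about 1.4
coverage_gap_pts = 80.0 - 75.8                       # about 4.2 percentage points

# Toy example (kW) showing the two helpers in use
y_obs = [100.0, 200.0, 300.0, 400.0, 500.0]
y_hat = [110.0, 190.0, 300.0, 420.0, 500.0]
lo = [90.0, 210.0, 250.0, 390.0, 400.0]
hi = [110.0, 260.0, 350.0, 395.0, 600.0]

toy_mae = mean_absolute_error(y_obs, y_hat)
toy_coverage = empirical_coverage(y_obs, lo, hi)
```

Comparing empirical coverage with the nominal level on a held-out test period is the most direct calibration check. A persistent gap such as the one reported here suggests the intervals should be widened or recalibrated.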


Visualization Dashboard

An interactive dashboard was developed to visualize:

  • Wind speed evolution
  • Power forecasts
  • Prediction intervals
  • Forecast uncertainty

Live dashboard: https://racem1000.github.io/wind-power-forecasting/dashboard/


Technology Stack

Languages & Libraries

  • Python
  • LightGBM
  • MAPIE
  • Pandas / NumPy
  • Scikit-learn

Visualization

  • Chart.js
  • HTML
  • GitHub Pages

Development Environment

  • Jupyter Notebooks
  • Git / GitHub

Project Structure

wind-power-forecasting/
│
├── data/                # Processed SCADA dataset
├── notebooks/           # Exploratory analysis and modeling
├── models/              # Trained models
├── dashboard/           # Web visualization
├── src/                 # Data pipeline and forecasting scripts
└── README.md

Future Improvements

Potential extensions include:

  • Multi-turbine farm forecasting
  • Integration of numerical weather prediction (NWP) data
  • Deep learning models (LSTM / Temporal Transformers)
  • Advanced probabilistic scoring (CRPS, Winkler score)

Author

Racem Kamel

Renewable Energy Engineer

Focus areas:

  • Wind energy analytics
  • Probabilistic forecasting
  • AI for power systems

About

This project develops a probabilistic wind power forecasting system using real SCADA data from a utility-scale wind turbine. The pipeline combines physics-based data validation, feature engineering, and LightGBM quantile regression to generate power forecasts with uncertainty intervals.
