An MIT-licensed Python package for accessing and preprocessing data from the Australian Energy Market Operator (AEMO) for the National Electricity Market (NEM).
NEMDataTools provides a clean, efficient interface for:
- Downloading raw data from AEMO's public data sources
- Processing various AEMO data formats
- Managing time series data with appropriate timestamps
- Supporting multiple data tables and report types
- Delivering preprocessed data ready for analysis
This package is designed for researchers, analysts, and developers who need to work with AEMO data under a permissive MIT license.
Install from PyPI:

```bash
pip install nemdatatools
```

Install from TestPyPI:

```bash
pip install --index-url https://test.pypi.org/simple/ nemdatatools
```

Install from source:

```bash
# Clone the repository
git clone https://github.com/ZhipengHe/nemdatatools.git
cd nemdatatools

# Install in development mode with all dependencies
pip install -e ".[dev,docs]"

# Or install just the core package
pip install -e .
```

Requirements:

- Python 3.10 or higher
- pandas, numpy, requests, pyarrow, tqdm
```python
import nemdatatools as ndt

# Download and process dispatch price data with automatic caching
data = ndt.fetch_data(
    data_type="DISPATCHPRICE",
    start_date="2023/01/01",
    end_date="2023/01/02",
    regions=["NSW1", "VIC1"],
    cache_path="./cache",  # Enable local caching
)

# Data is already processed and standardized
print(f"Downloaded {len(data)} records")
print(data.head())

# Advanced analysis with built-in functions
stats = ndt.calculate_price_statistics(data)
resampled = ndt.resample_data(data, '1H')  # Resample to hourly
windows = ndt.create_time_windows(data, window_size='4H')  # 4-hour windows
```

- Complete Data Pipeline: Download → Extract → Process → Cache → Analyze in one API call
- Core Data Types: MMSDM dispatch data and pre-dispatch forecasts, with a framework for expansion
- Intelligent Caching: Metadata-based local caching with configurable TTL
- Advanced Processing: Data standardization, time series resampling, statistical analysis
- Time-Aware: Proper AEST timezone handling and dispatch interval management
- Region Support: All NEM regions (NSW1, VIC1, QLD1, SA1, TAS1) with filtering
- Production Ready: Robust error handling, retry logic, comprehensive testing
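The TTL-based caching idea above can be illustrated with a minimal, self-contained sketch. This is not NEMDataTools' internal implementation: the `CACHE_TTL` value, metadata file layout, and function names are all illustrative assumptions.

```python
import json
import tempfile
import time
from pathlib import Path

CACHE_TTL = 24 * 3600  # illustrative TTL: 24 hours, in seconds


def write_cache_metadata(cache_dir: Path, key: str) -> None:
    """Record when an entry was fetched so later calls can check freshness."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    meta_file = cache_dir / f"{key}.meta.json"
    meta_file.write_text(json.dumps({"fetched_at": time.time()}))


def is_cache_fresh(cache_dir: Path, key: str, ttl: int = CACHE_TTL) -> bool:
    """Return True if a cached entry exists and is younger than ttl seconds."""
    meta_file = cache_dir / f"{key}.meta.json"
    if not meta_file.exists():
        return False
    meta = json.loads(meta_file.read_text())
    return (time.time() - meta["fetched_at"]) < ttl


# Demo: record a fetch, then check freshness before re-downloading
demo_dir = Path(tempfile.mkdtemp())
write_cache_metadata(demo_dir, "DISPATCHPRICE_2023-01-01")
fresh = is_cache_fresh(demo_dir, "DISPATCHPRICE_2023-01-01")
```

A cache hit (`fresh` is true) lets a fetcher skip the network call entirely; a stale or missing entry triggers a fresh download and a metadata rewrite.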
NEMDataTools has reached production readiness with core functionality complete and thoroughly tested.
- Complete Data Pipeline
  - Multi-source data downloading (MMSDM, pre-dispatch, static)
  - ZIP file extraction and CSV processing
  - Intelligent caching with metadata management
  - End-to-end data standardization and validation
- Advanced Processing Capabilities
  - Time series resampling and statistical analysis
  - Price and demand calculation functions
  - Time window creation for analysis
  - AEST timezone and dispatch interval handling
- Production Infrastructure
  - Comprehensive error handling and retry logic
  - 79 test functions with 58% coverage
  - Pre-commit hooks with Black, Ruff, MyPy
  - GitHub Actions CI/CD pipeline
  - Type annotations throughout the codebase
- Data Type Expansion: Adding support for remaining MMSDM tables
- Documentation: API reference and advanced usage guides
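The AEST timezone handling listed above can be sketched in plain pandas. This is not NEMDataTools' code: the helper name is hypothetical, though the `SETTLEMENTDATE` and `RRP` column names do follow AEMO's DISPATCHPRICE table, and `Australia/Brisbane` is a fixed UTC+10 zone matching AEST (the NEM does not observe daylight saving).

```python
import pandas as pd

# AEMO publishes NEM data in AEST (UTC+10, no daylight saving);
# "Australia/Brisbane" is a fixed UTC+10 zone, so it is a safe stand-in.
AEST = "Australia/Brisbane"


def localize_settlement_dates(df: pd.DataFrame, column: str = "SETTLEMENTDATE") -> pd.DataFrame:
    """Parse naive AEMO timestamps and attach the AEST timezone."""
    out = df.copy()
    out[column] = pd.to_datetime(out[column]).dt.tz_localize(AEST)
    return out


# Two 5-minute dispatch intervals (AEMO timestamps are interval-ending)
raw = pd.DataFrame({
    "SETTLEMENTDATE": ["2023/01/01 00:05:00", "2023/01/01 00:10:00"],
    "RRP": [85.2, 90.1],  # sample values for illustration only
})
localized = localize_settlement_dates(raw)
```

Attaching a fixed-offset zone up front avoids the off-by-an-hour errors that appear when naive AEMO timestamps are later mixed with UTC or local-time data.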
| Data Type | Status | Description |
|---|---|---|
| DISPATCHPRICE | ✅ Fully Tested | 5-minute dispatch prices by region |
| DISPATCHREGIONSUM | ✅ Fully Tested | 5-minute regional dispatch summary |
| DISPATCH_UNIT_SCADA | ✅ Fully Tested | Generator SCADA readings |
| PREDISPATCHPRICE | ✅ Fully Tested | Pre-dispatch price forecasts |
| PRICE_AND_DEMAND | ✅ Tested | Direct CSV price and demand data |
| P5MIN_REGIONSOLUTION | Implementation complete, testing pending | 5-minute pre-dispatch |
| Static Data Types | Framework Ready | Registration lists and boundaries |
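The dispatch tables above are published at 5-minute resolution and are often aggregated to coarser intervals. A plain-pandas sketch of what hourly resampling of dispatch prices looks like (the `RRP` column name follows AEMO's DISPATCHPRICE table; the mean aggregation is illustrative, not necessarily NEMDataTools' exact behaviour):

```python
import pandas as pd

# Twelve 5-minute dispatch intervals covering one hour of prices
idx = pd.date_range("2023-01-01 00:05", periods=12, freq="5min")
prices = pd.DataFrame({"RRP": [float(i) for i in range(12)]}, index=idx)

# Resample 5-minute regional reference prices to hourly means
hourly = prices.resample("1h").mean()
```

Because AEMO timestamps are interval-ending, the 00:05–00:55 readings fall into the 00:00 hourly bin and the 01:00 reading starts the next bin, so the output here has two rows.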
Here are some documents to help you get started with developing NEMDataTools:
- Project Planning:
- Implementation Plan: Detailed plan for implementing core modules
- Project Board: Overview of the project structure and milestones
- Development Workflow:
- Quickstart with UV: Setting up the development environment with the uv package manager
- UV Integration Guide: Using UV for dependency management
- Quickstart with Pre-Commit: Setting up pre-commit hooks for code quality
- Commitizen Guide: Using Commitizen for standardized commit messages
Detailed documentation is available at Documentation (WIP).
```python
import nemdatatools as ndt

# Main data fetching function
data = ndt.fetch_data(
    data_type="DISPATCHPRICE",
    start_date="2023/01/01",
    end_date="2023/01/02",
    regions=["NSW1", "VIC1"],
    cache_path="./cache",
)

# Check available data types
available_types = ndt.get_available_data_types()

# Batch operations
ndt.download_multiple_tables(
    tables=["DISPATCHPRICE", "DISPATCHREGIONSUM"],
    start_date="2023/01/01",
    end_date="2023/01/02",
)

# Advanced analysis
stats = ndt.calculate_price_statistics(data)
resampled = ndt.resample_data(data, '1H')
windows = ndt.create_time_windows(data, window_size='4H')
```

Contributions are welcome! Please see our Contributing Guide for details.
NEMDataTools is released under the MIT License. See the LICENSE file for details.