This project provides a framework to develop, train, tune, and evaluate various deep learning models for probabilistic, multivariate wind forecasting. It is designed to work with diverse wind farm operational datasets and facilitate integration with control systems like the wind-hybrid-open-controller.
The framework supports multiple forecasting architectures and is built for execution on High-Performance Computing (HPC) clusters, leveraging Slurm for job management and Optuna for distributed hyperparameter optimization. While the examples use PostgreSQL, Optuna supports various backends (SQLite, MySQL, Journal Storage) via configuration.
The goal is to provide a flexible and scalable platform for experimenting with and deploying state-of-the-art wind forecasting models, particularly for ultra-short-term predictions relevant to wind farm control.
This framework utilizes a modern stack for deep learning and time series analysis with a modular, domain-driven architecture:

- Programming Language: Python (v3.12+)
- Deep Learning:
  - PyTorch: Primary tensor computation library.
  - PyTorch Lightning: Framework for structuring training, validation, testing, checkpointing, logging, multi-GPU/distributed training (DDP), and callbacks.
- Time Series Modeling:
  - GluonTS (Fork): Provides foundational components (`PyTorchLightningEstimator`, data structures, transformations). Note: This project uses a specific fork.
- Hyperparameter Optimization:
  - Optuna: Used for distributed hyperparameter tuning via configurable storage backends (PostgreSQL, SQLite, etc.), including pruning mechanisms.
- Distributed Computing & Scheduling:
  - Slurm: HPC workload manager for resource allocation and job execution via batch scripts (`.sh`).
- Experiment Tracking & Logging:
  - WandB (Weights & Biases): Used for logging metrics, parameters, and configurations.
  - Python `logging`: Standard library for application messages.
- Environment Management:
  - Conda / Mamba: Recommended for managing the Python environment.
- Data Handling:
  - Polars / Pandas: Efficient data manipulation.
  - Parquet: Recommended file format for storing processed time series data.
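As a minimal illustration of the standard-library `logging` usage mentioned above (the logger name and format string here are illustrative, not the project's actual configuration):

```python
import logging

# Configure the root logger once, at application startup (format is illustrative).
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(name)s: %(message)s",
)

# Each module then obtains its own named logger for application messages.
logger = logging.getLogger("wind_forecasting.preprocessing")
logger.info("Loaded %d rows of SCADA data", 1_000_000)
```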
- Modular Design: Clean separation between core functionality, tuning-specific utilities, and cross-mode components.
- Domain-Driven Organization: Hyperparameter tuning is encapsulated in the `wind_forecasting.tuning` subpackage with clear APIs.
- Flexible Configuration: YAML-based configuration system supporting multiple modes (tune/train/test) with shared utilities.
- Scalable Infrastructure: Supports both local development and distributed HPC execution with minimal configuration changes.
```
wind-forecasting/
├── config/                # YAML configurations (training, preprocessing)
│   └── training/
├── wind_forecasting/      # Core application source code
│   ├── preprocessing/     # Data loading, processing, splitting (DataModule)
│   ├── run_scripts/       # Main execution scripts (run_model.py, testing.py, etc.)
│   │   └── tune_scripts/  # Example Slurm scripts for tuning
│   ├── tuning/            # Hyperparameter optimization subpackage
│   │   ├── core.py        # Main tune_model orchestration
│   │   ├── objective.py   # MLTuningObjective class
│   │   ├── scripts/       # Standalone tuning scripts
│   │   └── utils/         # Tuning-specific utilities
│   └── utils/             # General & cross-mode utilities
│       ├── optuna_*.py    # Optuna utilities (storage, config, params) used across modes
│       └── callbacks.py   # General PyTorch Lightning callbacks
├── logs/                  # Default directory for runtime outputs (Slurm, WandB, Checkpoints)
├── optuna/                # Default directory for Optuna storage artifacts (DB data, sockets)
├── examples/              # Example scripts (data download) & input configurations
│   └── inputs/            # Example configuration files & data directory
├── install_rc/            # Environment setup scripts & YAML files
├── .gitignore
├── .gitattributes
└── README.md              # This file
```
This framework is designed to be model-agnostic. Forecasting models are implemented externally in the pytorch-transformer-ts repository and integrated here. Currently supported models include:
- Informer
- Autoformer
- Spacetimeformer
- TACTiS-2
Refer to the pytorch-transformer-ts repository for detailed model implementations and architectures. New models following the GluonTS/PyTorch Lightning Estimator pattern can be added and configured via YAML.
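The model-agnostic selection described above (a model name on the command line mapped to an externally implemented estimator class) can be sketched as follows. The registry dict, class names, and `build_estimator` helper are illustrative stand-ins for this sketch, not the project's actual API:

```python
# Illustrative stand-ins for estimator classes that would normally be
# imported from the external pytorch-transformer-ts repository.
class InformerEstimator: ...
class AutoformerEstimator: ...

# A name -> class registry lets a CLI flag like `--model informer` and the
# YAML `model.<model_name>` section select the implementation.
MODEL_REGISTRY = {
    "informer": InformerEstimator,
    "autoformer": AutoformerEstimator,
}

def build_estimator(model_name: str, model_config: dict):
    """Look up the estimator class and its model-specific config section."""
    try:
        estimator_cls = MODEL_REGISTRY[model_name]
    except KeyError:
        raise ValueError(
            f"Unknown model '{model_name}'. Available: {sorted(MODEL_REGISTRY)}"
        ) from None
    # In a real framework the config section would carry estimator kwargs.
    return estimator_cls, model_config.get(model_name, {})
```

Adding a new model then amounts to registering one more class and adding its YAML section.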
The `install_rc/` directory provides scripts to help create the necessary Python environment using Conda or Mamba.

- Navigate to the directory:

  ```bash
  cd install_rc
  ```

- Run the installation script:

  ```bash
  ./install.sh
  ```

  This script uses the provided `.yaml` files (e.g., `wind_forecasting_cuda.yaml`) to create a Conda environment with the required dependencies.
Note: On HPC environments, necessary system modules (CUDA, compilers, etc.) should be loaded before activating the Conda environment, typically within the Slurm job script.
A detailed list of dependencies can be found in the environment YAML files within install_rc/. Key requirements include:
- Python 3.12+
- PyTorch 2.x
- PyTorch Lightning 2.x
- Optuna
- GluonTS (from the specified fork)
- WandB
- Polars
- NumPy, Pandas
- PyYAML
- ... (TODO Add other dependencies)
To test the framework, you can download and prepare the public SMARTEOLE dataset from NREL's FLASC repository.
- Run the download script:

  ```bash
  python examples/download_flasc_data.py
  ```

  This downloads the data into `examples/inputs/SMARTEOLE-WFC-open-dataset/`.

- Use this data path in your preprocessing configuration.
The typical workflow involves these stages:
- Write a preprocessing configuration file similar to `wind-forecasting/examples/inputs/preprocessing_inputs_flasc.yaml`.
- Run preprocessing on a local machine with

  ```bash
  python preprocessing_main.py --config examples/inputs/preprocessing_inputs_flasc.yaml --reload_data --preprocess_data --regenerate_filters --multiprocessor cf --verbose
  ```

  or on an HPC by running `wind-forecasting/wind_forecasting/preprocessing/load_data.sh`, followed by `wind-forecasting/wind_forecasting/preprocessing/preprocess_data.sh`.
- Write a training configuration file similar to `wind-forecasting/examples/inputs/training_inputs_kestrel_flasc.yaml`.
- Run

  ```bash
  python wind-forecasting/wind_forecasting/run_scripts/load_data.py --config wind-forecasting/examples/inputs/training_inputs_aoifemac_flasc.yaml --reload
  ```

  or on an HPC by running `wind-forecasting/wind_forecasting/run_scripts/load_data_kestrel.sh`, to resample the data as needed, categorize the variables, and generate train/test/val splits.
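The train/test/val split step above can be illustrated with a minimal, dependency-free sketch. The contiguous-chronological strategy and the 70/15/15 fractions are assumptions for illustration; the framework's actual splitting lives in its preprocessing/DataModule code:

```python
def chronological_split(series, train_frac=0.7, val_frac=0.15):
    """Split an ordered time series into contiguous train/val/test blocks.

    Shuffling is deliberately avoided: for forecasting, validation and
    test data must come strictly after the training period to prevent
    temporal leakage.
    """
    n = len(series)
    train_end = int(n * train_frac)
    val_end = train_end + int(n * val_frac)
    return series[:train_end], series[train_end:val_end], series[val_end:]
```

For example, splitting a 100-point series with the defaults yields contiguous blocks of 70, 15, and 15 points, in time order.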
The framework includes a comprehensive hyperparameter tuning system using Optuna for distributed optimization. The tuning functionality is organized in the wind_forecasting/tuning/ subpackage for maintainability and modularity.
- Tune an ML model on a local machine with

  ```bash
  python wind-forecasting/wind_forecasting/run_scripts/run_model.py --config wind-forecasting/examples/inputs/training_inputs_aoifemac_flasc.yaml --mode tune --model informer
  ```

  or on an HPC by running `wind-forecasting/wind_forecasting/run_scripts/tune_model.sh`.
- Tune a statistical model on a local machine with

  ```bash
  python wind-hybrid-open-controller/whoc/wind_forecast/tuning.py --model_config wind_forecasting/examples/inputs/training_inputs_aoifemac_flasc.yaml --data_config wind_forecasting/examples/inputs/preprocessing_inputs_flasc.yaml --model svr --study_name svr_tuning --restart_tuning
  ```

  or on an HPC by running `wind-hybrid-open-controller/whoc/wind_forecast/run_tuning.sh [model] [number of models to tune]`.
- Train an ML model on a local machine with

  ```bash
  python wind-forecasting/wind_forecasting/run_scripts/run_model.py --config wind-forecasting/examples/inputs/training_inputs_aoifemac_flasc.yaml --mode train --model informer --use_tuned_parameters
  ```

  or on an HPC by running `wind-forecasting/wind_forecasting/run_scripts/train_model_kestrel.sh`.
- Test an ML model on a local machine with

  ```bash
  python wind-forecasting/wind_forecasting/run_scripts/run_model.py --config wind-forecasting/examples/inputs/training_inputs_aoifemac_flasc.yaml --mode test --model informer --checkpoint latest
  ```

  or on an HPC by running `wind-forecasting/wind_forecasting/run_scripts/test_model.sh`.
- Make predictions at given controller sampling intervals for a given SCADA dataset and prediction horizon, compute the accuracy score, and plot the results with

  ```bash
  python wind-hybrid-open-controller/whoc/wind_forecast/WindForecast.py --model_config wind_forecasting/examples/inputs/training_inputs_aoifemac_flasc.yaml --data_config wind_forecasting/examples/inputs/preprocessing_inputs_flasc.yaml --model informer
  ```
- Write a WHOC configuration file similar to `wind-hybrid-open-controller/examples/hercules_input_001.yaml`. Run a case study of a yaw controller with a trained model with

  ```bash
  python wind-hybrid-open-controller/whoc/case_studies/run_case_studies.py 15 -rs -rrs --verbose -ps -rps -ras -st auto -ns 3 -m cf -sd wind-hybrid-open-controller/examples/floris_case_studies -mcnf wind_forecasting/examples/inputs/training_inputs_aoifemac_flasc.yaml -dcnf wind_forecasting/examples/inputs/preprocessing_inputs_flasc.yaml -wcnf wind-hybrid-open-controller/examples/hercules_input_001.yaml -wf scada
  ```

  You can fine-tune parameters for a suite of cases by editing the dictionary `case_studies["baseline_controllers_preview_flasc"]` in `wind-hybrid-open-controller/whoc/case_studies/initialize_case_studies.py`, and you can edit the common default parameters in the WHOC configuration file.
- TODO add HPC version
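Probabilistic forecasts like those above are commonly scored per quantile with the pinball (quantile) loss. This is a generic stdlib sketch of that metric, not necessarily the accuracy score `WindForecast.py` computes:

```python
def pinball_loss(y_true, y_pred, q):
    """Average pinball loss of quantile-q predictions against observations.

    Under-prediction is weighted by q and over-prediction by (1 - q), so
    the loss is minimized by the true q-th quantile of the target — which
    is what makes it a proper score for probabilistic forecasts.
    """
    total = 0.0
    for obs, pred in zip(y_true, y_pred):
        diff = obs - pred
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(y_true)
```

Averaging this loss over a grid of quantile levels gives an approximation of the CRPS often reported for probabilistic wind forecasts.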
Primary configuration is via YAML files in `config/training/`.

- Example: `config/training/training_inputs_juan_flasc.yaml`
- Sections: `experiment`, `logging`, `optuna`, `dataset`, `model` (with nested `<model_name>` keys), `callbacks`, `trainer`.
- Supports basic variable substitution (e.g., `${logging.optuna_dir}`).
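The `${logging.optuna_dir}`-style substitution can be sketched with a small stdlib resolver. This is illustrative only; the framework's actual resolver may differ in syntax and edge-case handling:

```python
import re

_VAR = re.compile(r"\$\{([^}]+)\}")

def resolve(value: str, config: dict) -> str:
    """Replace ${a.b.c} references with values looked up in a nested dict."""
    def lookup(match):
        node = config
        for key in match.group(1).split("."):
            node = node[key]  # raises KeyError for unknown references
        return str(node)
    return _VAR.sub(lookup, value)
```

For example, `resolve("${logging.optuna_dir}/journal", {"logging": {"optuna_dir": "/scratch/optuna"}})` returns `"/scratch/optuna/journal"`.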
- Clone the repository and set up the Jupyter notebook collaboration as described in the setup section.
- Download the required data using the script in `examples`, or use your own data.
- Set up the appropriate environment (CUDA or ROCm) using the scripts in the `install_rc` folder.
- Preprocess the data using the script in the `wind_forecasting/preprocessing` folder.
- Train and evaluate models using the scripts in the `wind_forecasting/models` directory.
- For running jobs on HPC environments, use the Slurm scripts provided in the `rc_jobs` folder.
- Data Preprocessing Configuration YAML
- ML-Model Configuration YAML
- WHOC Configuration YAML
- Command Line Arguments for `wind-forecasting/wind_forecasting/preprocessing/preprocessing_main.py`, `wind-forecasting/wind_forecasting/run_scripts/load_data.py`, `wind-forecasting/wind_forecasting/run_scripts/run_model.py`, `wind-hybrid-open-controller/whoc/wind_forecast/tuning.py`, and `wind-hybrid-open-controller/whoc/case_studies/run_case_studies.py`.
- WHOC Case Study Suite in the `case_studies` dictionary defined at the top of `wind-hybrid-open-controller/whoc/case_studies/initialize_case_studies.py`.
- Configure: Create/edit the preprocessing YAML (e.g., `examples/inputs/preprocessing_inputs_flasc.yaml`).
- Run: Execute `wind_forecasting/preprocessing/preprocessing_main.py` with appropriate flags, or use the HPC scripts.

  Local Machine:

  ```bash
  python preprocessing_main.py --config examples/inputs/preprocessing_inputs_flasc.yaml --reload_data --preprocess_data --regenerate_filters --multiprocessor cf --verbose
  ```

  HPC System:

  ```bash
  # First load the data
  ./wind_forecasting/preprocessing/load_data.sh
  # Then preprocess the data
  ./wind_forecasting/preprocessing/preprocess_data.sh
  ```

- Data Loading: After preprocessing, load and prepare the data for model training.

  Local Machine:

  ```bash
  python wind_forecasting/run_scripts/load_data.py --config examples/inputs/training_inputs_flasc.yaml --reload
  ```

  HPC System:

  ```bash
  ./wind_forecasting/run_scripts/load_data_kestrel.sh
  ```

The framework's modular tuning system supports distributed hyperparameter optimization with a PostgreSQL backend and comprehensive monitoring.
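The configurable storage backend boils down to the URL string Optuna accepts as its `storage` argument. Below is a hedged sketch of building such a URL with plain string formatting; the framework's actual helpers live in `utils/optuna_*.py` and may differ:

```python
def storage_url(backend: str, **kw) -> str:
    """Build a SQLAlchemy-style URL accepted by optuna.create_study(storage=...)."""
    if backend == "sqlite":
        # A single local file; fine for laptops, limited for many parallel workers.
        return f"sqlite:///{kw['path']}"
    if backend == "postgresql":
        # A server backend that supports many concurrent HPC tuning workers.
        return (f"postgresql://{kw['user']}:{kw['password']}"
                f"@{kw['host']}:{kw.get('port', 5432)}/{kw['database']}")
    raise ValueError(f"Unsupported backend: {backend}")
```

Every Slurm worker pointing at the same PostgreSQL URL joins the same study, which is what makes the distributed tuning work.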
- Configure: Edit the training YAML (`config/training/`) with Optuna settings.
- Submit Job: Modify and submit the Slurm script (e.g., `tune_model_storm.sh`), ensuring the correct `--model <model_name>` is targeted:

  ```bash
  sbatch wind_forecasting/run_scripts/tune_scripts/tune_model_storm.sh
  ```

- Monitor: Use `squeue`, Slurm logs, WandB, and the Optuna dashboard.
Local Machine:

```bash
python wind_forecasting/run_scripts/run_model.py --config examples/inputs/training_inputs_flasc.yaml --mode tune --model informer
```

HPC System:

```bash
# Use the provided tuning script
./wind_forecasting/run_scripts/tune_model.sh
```

- Configure: Edit the training YAML. Set `use_tuned_parameters: true` (optional), and high `limit_train_batches` and `max_epochs` values.
- Run (or use an HPC script):

  ```bash
  python wind_forecasting/run_scripts/run_model.py \
      --config config/training/training_inputs_*.yaml \
      --mode train \
      --model <model_name> \
      [--use_tuned_parameters] \
      [--checkpoint <path | 'best' | 'latest'>]  # To resume
  ```
Local Machine:

```bash
python wind_forecasting/run_scripts/run_model.py --config examples/inputs/training_inputs_flasc.yaml --mode train --model informer --use_tuned_parameters
```

HPC System:

```bash
# Use the provided training script
./wind_forecasting/run_scripts/train_model_kestrel.sh
```

- Configure: Ensure the training YAML points to the correct dataset config.
- Run (or use an HPC script):

  ```bash
  python wind_forecasting/run_scripts/run_model.py \
      --config config/training/training_inputs_*.yaml \
      --mode test \
      --model <model_name> \
      --checkpoint <path | 'best' | 'latest'>
  ```
Local Machine:

```bash
python wind_forecasting/run_scripts/run_model.py --config examples/inputs/training_inputs_flasc.yaml --mode test --model informer --checkpoint latest
```

HPC System:

```bash
# Use the provided testing script
./wind_forecasting/run_scripts/test_model.sh
```

- Tune a statistical model on a local machine with

  ```bash
  python wind-hybrid-open-controller/whoc/wind_forecast/tuning.py --model_config wind_forecasting/examples/inputs/training_inputs_aoifemac_flasc.yaml --data_config wind_forecasting/examples/inputs/preprocessing_inputs_flasc.yaml --model svr --study_name svr_tuning --restart_tuning
  ```

  or on an HPC by running `wind-hybrid-open-controller/whoc/wind_forecast/run_tuning_kestrel.sh [model] [model_config]`.
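The `--checkpoint <path | 'best' | 'latest'>` convenience used in the train and test commands can be sketched with a stdlib resolver. The flat `*.ckpt` layout and the val-loss-in-filename convention are assumptions for illustration, not the framework's actual behavior:

```python
import re
from pathlib import Path

def resolve_checkpoint(ckpt_dir: str, spec: str) -> Path:
    """Resolve 'latest', 'best', or an explicit path to a checkpoint file."""
    if spec not in ("latest", "best"):
        return Path(spec)  # an explicit path was given on the command line
    ckpts = list(Path(ckpt_dir).glob("*.ckpt"))
    if not ckpts:
        raise FileNotFoundError(f"No checkpoints in {ckpt_dir}")
    if spec == "latest":
        # Most recently modified checkpoint wins.
        return max(ckpts, key=lambda p: p.stat().st_mtime)
    # 'best': assume filenames embed validation loss, e.g. epoch3-val_loss0.42.ckpt
    def val_loss(p):
        m = re.search(r"val_loss([\d.]+)", p.name)
        return float(m.group(1).rstrip(".")) if m else float("inf")
    return min(ckpts, key=val_loss)
```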
Contributions are welcome! Please follow standard Git practices (fork, branch, pull request).
- Authors and developers of the integrated forecasting models and underlying libraries (PyTorch, Lightning, GluonTS, Optuna, WandB, etc.).
- Compute resources provided by the University of Oldenburg HPC group, University of Colorado Boulder, and NREL.
- TACTiS: Drouin, A., Marcotte, É., & Chapados, N. (2022). TACTiS: Transformer-Attentional Copulas for Time Series. ICML. (Link)
- TACTiS-2: Ashok, A., Marcotte, É., Zantedeschi, V., Chapados, N., & Drouin, A. (2024). TACTiS-2: Better, Faster, Simpler Attentional Copulas for Multivariate Time Series. ICLR. (arXiv)
- Informer: Zhou, H., Zhang, S., Peng, J., Zhang, S., Li, J., Xiong, H., & Zhang, W. (2021). Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting. AAAI. (arXiv)
- Autoformer: Wu, H., Xu, J., Wang, J., & Long, M. (2021). Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting. NeurIPS. (arXiv)
- Spacetimeformer: Grigsby, J., Wang, Z., & Qi, Y. (2021). Long-Range Transformers for Dynamic Spatiotemporal Forecasting. (arXiv)
- GluonTS: Alexandrov, A., et al. (2020). GluonTS: Probabilistic Time Series Modeling in Python. JMLR. (Link)
- PyTorch Lightning: (Link)
- Optuna: Akiba, T., Sano, S., Yanase, T., Ohta, T., & Koyama, M. (2019). Optuna: A Next-generation Hyperparameter Optimization Framework. KDD. (Link)
- WandB: (Link)
- Related Repositories:
  - `pytorch-transformer-ts` (Model Implementations)
  - `gluonts` (Fork)
  - `wind-hybrid-open-controller` (Downstream Application)