achenry/wind-forecasting

πŸŒͺ️ Wind Forecasting Framework

πŸš€ Project Overview

This project provides a framework to develop, train, tune, and evaluate various deep learning models for probabilistic, multivariate wind forecasting. It is designed to work with diverse wind farm operational datasets and facilitate integration with control systems like the wind-hybrid-open-controller.

The framework supports multiple forecasting architectures and is built for execution on High-Performance Computing (HPC) clusters, leveraging Slurm for job management and Optuna for distributed hyperparameter optimization. While the examples use PostgreSQL, Optuna supports various backends (SQLite, MySQL, Journal Storage) via configuration.

🎯 Goal

To provide a flexible and scalable platform for experimenting with and deploying state-of-the-art wind forecasting models, particularly for ultra-short-term predictions relevant to wind farm control.

πŸ› οΈ Core Technologies

This framework utilizes a modern stack for deep learning and time series analysis with a modular, domain-driven architecture:

  • 🐍 Programming Language: Python (v3.12+)
  • 🧠 Deep Learning:
    • PyTorch: Primary tensor computation library.
    • PyTorch Lightning: Framework for structuring training, validation, testing, checkpointing, logging, multi-GPU/distributed training (DDP), and callbacks.
  • πŸ•°οΈ Time Series Modeling:
    • GluonTS (Fork): Provides foundational components (PyTorchLightningEstimator, data structures, transformations). Note: This project uses a specific fork.
  • πŸ“Š Hyperparameter Optimization:
    • Optuna: Used for distributed hyperparameter tuning via configurable storage backends (PostgreSQL, SQLite, etc.), including pruning mechanisms.
  • ☁️ Distributed Computing & Scheduling:
    • Slurm: HPC workload manager for resource allocation and job execution via batch scripts (.sh).
  • πŸ“ˆ Experiment Tracking & Logging:
    • WandB (Weights & Biases): Used for logging metrics, parameters, and configurations.
    • Python logging: Standard library for application messages.
  • πŸ“¦ Environment Management:
    • Conda / Mamba: Recommended for managing the Python environment.
  • πŸ’Ύ Data Handling:
    • Polars / Pandas: Efficient data manipulation.
    • Parquet: Recommended file format for storing processed time series data.

πŸ—οΈ Architecture Highlights

  • Modular Design: Clean separation between core functionality, tuning-specific utilities, and cross-mode components.
  • Domain-Driven Organization: Hyperparameter tuning is encapsulated in the wind_forecasting.tuning subpackage with clear APIs.
  • Flexible Configuration: YAML-based configuration system supporting multiple modes (tune/train/test) with shared utilities.
  • Scalable Infrastructure: Supports both local development and distributed HPC execution with minimal configuration changes.

πŸ“‚ Project Structure (wind-forecasting/)

wind-forecasting/
β”œβ”€β”€ πŸ“ config/             # YAML configurations (training, preprocessing)
β”‚   └── training/
β”œβ”€β”€ πŸ“ wind_forecasting/   # Core application source code
β”‚   β”œβ”€β”€ πŸ“ preprocessing/  # Data loading, processing, splitting (DataModule)
β”‚   β”œβ”€β”€ πŸ“ run_scripts/    # Main execution scripts (run_model.py, testing.py, etc.)
β”‚   β”‚   └── tune_scripts/  # Example Slurm scripts for tuning
β”‚   β”œβ”€β”€ πŸ“ tuning/         # Hyperparameter optimization subpackage
β”‚   β”‚   β”œβ”€β”€ core.py        # Main tune_model orchestration
β”‚   β”‚   β”œβ”€β”€ objective.py   # MLTuningObjective class
β”‚   β”‚   β”œβ”€β”€ scripts/       # Standalone tuning scripts
β”‚   β”‚   └── utils/         # Tuning-specific utilities
β”‚   └── πŸ“ utils/          # General & cross-mode utilities
β”‚       β”œβ”€β”€ optuna_*.py    # Optuna utilities (storage, config, params) used across modes
β”‚       └── callbacks.py   # General PyTorch Lightning callbacks
β”œβ”€β”€ πŸ“ logs/               # Default directory for runtime outputs (Slurm, WandB, Checkpoints)
β”œβ”€β”€ πŸ“ optuna/             # Default directory for Optuna storage artifacts (DB data, sockets)
β”œβ”€β”€ πŸ“ examples/           # Example scripts (data download) & input configurations
β”‚   └── inputs/           # Example configuration files & data directory
β”œβ”€β”€ πŸ“ install_rc/         # Environment setup scripts & YAML files
β”œβ”€β”€ πŸ“„ .gitignore
β”œβ”€β”€ πŸ“„ .gitattributes
└── πŸ“„ README.md           # This file

🧠 Integrated Models

This framework is designed to be model-agnostic. Forecasting models are implemented externally in the pytorch-transformer-ts repository and integrated here. Currently supported models include:

  • Informer
  • Autoformer
  • Spacetimeformer
  • TACTiS-2

Refer to the pytorch-transformer-ts repository for detailed model implementations and architectures. New models following the GluonTS/PyTorch Lightning Estimator pattern can be added and configured via YAML.
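The model-agnostic design can be pictured as a small name-to-estimator registry resolved from the YAML `model` key. The sketch below uses dummy stand-in classes; the real estimator classes live in pytorch-transformer-ts and their constructors differ:

```python
# Hypothetical sketch of a model registry; InformerEstimator etc. are
# placeholder stand-ins for the real classes in pytorch-transformer-ts.

class InformerEstimator:
    def __init__(self, **hparams):
        self.hparams = hparams

class AutoformerEstimator(InformerEstimator):
    pass

MODEL_REGISTRY = {
    "informer": InformerEstimator,
    "autoformer": AutoformerEstimator,
}

def build_estimator(model_name, model_cfg):
    """Look up the estimator class by the YAML `model` key and instantiate
    it with the nested <model_name> hyperparameter section."""
    try:
        cls = MODEL_REGISTRY[model_name]
    except KeyError:
        raise ValueError(f"Unknown model '{model_name}'; known: {sorted(MODEL_REGISTRY)}")
    return cls(**model_cfg.get(model_name, {}))

est = build_estimator("informer", {"informer": {"d_model": 64}})
```

Adding a new model then amounts to registering its estimator class and adding a matching `<model_name>` section to the training YAML.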

βš™οΈ Setup

Environment Setup

The install_rc/ directory provides scripts to help create the necessary Python environment using Conda or Mamba.

  1. Navigate to the directory:
    cd install_rc
  2. Run the installation script:
    ./install.sh
    This script uses the provided .yaml files (e.g., wind_forecasting_cuda.yaml) to create a Conda environment with the required dependencies.

Note: On HPC environments, necessary system modules (CUDA, compilers, etc.) should be loaded before activating the Conda environment, typically within the Slurm job script.

Dependencies

A detailed list of dependencies can be found in the environment YAML files within install_rc/. Key requirements include:

  • Python 3.12+
  • PyTorch 2.x
  • PyTorch Lightning 2.x
  • Optuna
  • GluonTS (from the specified fork)
  • WandB
  • Polars
  • NumPy, Pandas
  • PyYAML
  • ... (TODO Add other dependencies)

Example Data

To test the framework, you can download and prepare the public SMARTEOLE dataset from NREL's FLASC repository.

  1. Run the download script:
    python examples/download_flasc_data.py
    This downloads the data into examples/inputs/SMARTEOLE-WFC-open-dataset/.
  2. Use this data path in your preprocessing configuration.

πŸ”„ Workflow

The typical workflow involves these stages:

1. Data Preprocessing

  1. Write a preprocessing configuration file similar to wind-forecasting/examples/inputs/preprocessing_inputs_flasc.yaml.
  2. Run preprocessing on a local machine with python preprocessing_main.py --config examples/inputs/preprocessing_inputs_flasc.yaml --reload_data --preprocess_data --regenerate_filters --multiprocessor cf --verbose, or on an HPC by running wind-forecasting/wind_forecasting/preprocessing/load_data.sh followed by wind-forecasting/wind_forecasting/preprocessing/preprocess_data.sh.
  3. Write a training configuration file similar to wind-forecasting/examples/inputs/training_inputs_kestrel_flasc.yaml.
  4. Run python wind-forecasting/wind_forecasting/run_scripts/load_data.py --config wind-forecasting/examples/inputs/training_inputs_aoifemac_flasc.yaml --reload, or on an HPC by running wind-forecasting/wind_forecasting/run_scripts/load_data_kestrel.sh, to resample the data as needed, categorize the variables, and generate the train/val/test splits.
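The split generation in the last step must respect time ordering so that no future samples leak into earlier splits. A minimal sketch of a chronological split (the 80/10/10 ratios are illustrative, not the project's defaults):

```python
# Chronological train/val/test split sketch: contiguous, time-ordered
# segments, so validation and test data always lie after the training data.

def chronological_split(rows, train_frac=0.8, val_frac=0.1):
    """Split time-ordered rows into contiguous train/val/test segments."""
    n = len(rows)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = rows[:n_train]
    val = rows[n_train:n_train + n_val]
    test = rows[n_train + n_val:]
    return train, val, test

train, val, test = chronological_split(list(range(100)))
```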

2. Hyperparameter Tuning (ML Models)

The framework includes a comprehensive hyperparameter tuning system using Optuna for distributed optimization. The tuning functionality is organized in the wind_forecasting/tuning/ subpackage for maintainability and modularity.

  • Tune an ML model on a local machine with python wind-forecasting/wind_forecasting/run_scripts/run_model.py --config wind-forecasting/examples/inputs/training_inputs_aoifemac_flasc.yaml --mode tune --model informer, or on an HPC by running wind-forecasting/wind_forecasting/run_scripts/tune_model.sh.

2.2 Tuning a Statistical Model

  • Tune a statistical model on a local machine with python wind-hybrid-open-controller/whoc/wind_forecast/tuning.py --model_config wind_forecasting/examples/inputs/training_inputs_aoifemac_flasc.yaml --data_config wind_forecasting/examples/inputs/preprocessing_inputs_flasc.yaml --model svr --study_name svr_tuning --restart_tuning, or on an HPC by running wind-hybrid-open-controller/whoc/wind_forecast/run_tuning.sh [model] [number of models to tune].

3. Training a ML Model

  • Train an ML model on a local machine with python wind-forecasting/wind_forecasting/run_scripts/run_model.py --config wind-forecasting/examples/inputs/training_inputs_aoifemac_flasc.yaml --mode train --model informer --use_tuned_parameters, or on an HPC by running wind-forecasting/wind_forecasting/run_scripts/train_model_kestrel.sh.

4. Testing a ML Model

  • Test an ML model on a local machine with python wind-forecasting/wind_forecasting/run_scripts/run_model.py --config wind-forecasting/examples/inputs/training_inputs_aoifemac_flasc.yaml --mode test --model informer --checkpoint latest, or on an HPC by running wind-forecasting/wind_forecasting/run_scripts/test_model.sh.

5. Testing a WindForecaster class on Wind Farm Data

  • Make predictions at a given controller sampling interval, for a given SCADA dataset and prediction horizon, then compute the accuracy score and plot the results with python wind-hybrid-open-controller/whoc/wind_forecast/WindForecast.py --model_config wind_forecasting/examples/inputs/training_inputs_aoifemac_flasc.yaml --data_config wind_forecasting/examples/inputs/preprocessing_inputs_flasc.yaml --model informer.
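The accuracy score itself is computed inside WindForecast.py. As a rough illustration of one common choice for ultra-short-term forecasts (not necessarily the metric used there), a skill score compares the model's RMSE against a persistence baseline that simply repeats the last observation:

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def persistence_forecast(history, horizon):
    """Persistence baseline: repeat the last observed value over the horizon."""
    return [history[-1]] * horizon

def skill_score(y_true, y_pred, baseline_pred):
    """1 - RMSE_model / RMSE_baseline; positive means the model beats persistence."""
    return 1.0 - rmse(y_true, y_pred) / rmse(y_true, baseline_pred)

baseline = persistence_forecast([8.0, 9.0, 10.0], horizon=2)
skill = skill_score([11.0, 12.0], [10.5, 11.5], baseline)
```

Persistence is a strong baseline at short horizons, which is why raw RMSE alone can overstate a model's usefulness for wind farm control.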

6. Combining a Statistical or ML Model with a Wind Farm Controller

  • Write a WHOC configuration file similar to wind-hybrid-open-controller/examples/hercules_input_001.yaml. Run a case study of a yaw controller with a trained model with python wind-hybrid-open-controller/whoc/case_studies/run_case_studies.py 15 -rs -rrs --verbose -ps -rps -ras -st auto -ns 3 -m cf -sd wind-hybrid-open-controller/examples/floris_case_studies -mcnf wind_forecasting/examples/inputs/training_inputs_aoifemac_flasc.yaml -dcnf wind_forecasting/examples/inputs/preprocessing_inputs_flasc.yaml -wcnf wind-hybrid-open-controller/examples/hercules_input_001.yaml -wf scada. You can fine-tune parameters for a suite of cases by editing the dictionary case_studies["baseline_controllers_preview_flasc"] in wind-hybrid-open-controller/whoc/case_studies/initialize_case_studies.py, and you can edit the common default parameters in the WHOC configuration file.

7. HPC

  • TODO add HPC version

πŸ”§ Configuration

Primary configuration is via YAML files in config/training/.

  • Example: config/training/training_inputs_juan_flasc.yaml
  • Sections: experiment, logging, optuna, dataset, model (with nested <model_name> keys), callbacks, trainer.
  • Supports basic variable substitution (e.g., ${logging.optuna_dir}).
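The ${logging.optuna_dir}-style substitution can be implemented with a small resolver over the parsed YAML. A minimal sketch, using a plain nested dict in place of the parsed config (the framework's actual resolver may handle more cases):

```python
import re

# Matches ${a.b.c} placeholders inside string values.
_PATTERN = re.compile(r"\$\{([^}]+)\}")

def _lookup(cfg, dotted):
    """Walk a nested dict by a dotted key path like 'logging.optuna_dir'."""
    node = cfg
    for part in dotted.split("."):
        node = node[part]
    return node

def resolve(value, cfg):
    """Replace every ${a.b.c} placeholder in a string with its config value."""
    return _PATTERN.sub(lambda m: str(_lookup(cfg, m.group(1))), value)

cfg = {"logging": {"optuna_dir": "/scratch/optuna"}}
resolved = resolve("${logging.optuna_dir}/journal", cfg)
```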

πŸ“‹ Usage

  1. Clone the repository.
  2. Download the example data using the script in examples/, or use your own data.
  3. Set up the appropriate environment (CUDA or ROCm) using the scripts in the install_rc/ folder.
  4. Preprocess the data using the scripts in the wind_forecasting/preprocessing/ folder.
  5. Train and evaluate models using the scripts in the wind_forecasting/run_scripts/ directory.
  6. For running jobs on HPC environments, use the Slurm scripts provided alongside the run scripts (e.g., run_scripts/tune_scripts/).

Configuration Files

  • Data Preprocessing Configuration YAML
  • ML-Model Configuration YAML
  • WHOC Configuration YAML
  • Command Line Arguments for wind-forecasting/wind_forecasting/preprocessing/preprocessing_main.py, wind-forecasting/wind_forecasting/run_scripts/load_data.py, wind-forecasting/wind_forecasting/run_scripts/run_model.py, wind-hybrid-open-controller/whoc/wind_forecast/tuning.py, and wind-hybrid-open-controller/whoc/case_studies/run_case_studies.py.
  • WHOC Case Study Suite in the case_studies dictionary defined at the top of wind-hybrid-open-controller/whoc/case_studies/initialize_case_studies.py.

Preprocessing

  1. Configure: Create/edit preprocessing YAML (e.g., examples/inputs/preprocessing_inputs_flasc.yaml).
  2. Run: Execute wind_forecasting/preprocessing/preprocessing_main.py with appropriate flags or use HPC scripts.

Local Machine:

python preprocessing_main.py --config examples/inputs/preprocessing_inputs_flasc.yaml --reload_data --preprocess_data --regenerate_filters --multiprocessor cf --verbose

HPC System:

# First load the data
./wind_forecasting/preprocessing/load_data.sh

# Then preprocess the data
./wind_forecasting/preprocessing/preprocess_data.sh
  3. Data Loading: After preprocessing, load and prepare the data for model training.

Local Machine:

python wind_forecasting/run_scripts/load_data.py --config examples/inputs/training_inputs_flasc.yaml --reload

HPC System:

./wind_forecasting/run_scripts/load_data_kestrel.sh
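The preprocessing flags used above can be pictured with a minimal argparse sketch. This is an illustration only: the real preprocessing_main.py may declare its arguments differently, and the "mpi" choice for --multiprocessor is an assumption (only "cf" appears in the examples here):

```python
import argparse

def build_parser():
    """Hypothetical declaration of the preprocessing CLI flags shown above."""
    p = argparse.ArgumentParser(description="Preprocess wind farm SCADA data.")
    p.add_argument("--config", required=True, help="Path to the preprocessing YAML.")
    p.add_argument("--reload_data", action="store_true", help="Re-read raw input files.")
    p.add_argument("--preprocess_data", action="store_true", help="Run the filtering/resampling pipeline.")
    p.add_argument("--regenerate_filters", action="store_true", help="Rebuild cached data filters.")
    p.add_argument("--multiprocessor", choices=["cf", "mpi"], default=None,
                   help="Parallel backend, e.g. 'cf' for concurrent.futures ('mpi' is assumed here).")
    p.add_argument("--verbose", action="store_true")
    return p

args = build_parser().parse_args(
    ["--config", "examples/inputs/preprocessing_inputs_flasc.yaml",
     "--reload_data", "--preprocess_data", "--multiprocessor", "cf"]
)
```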

Tuning (HPC)

The framework's modular tuning system supports distributed hyperparameter optimization with a PostgreSQL backend and comprehensive monitoring.

  1. Configure: Edit training YAML (config/training/) with Optuna settings.
  2. Submit Job: Modify and submit Slurm script (e.g., tune_model_storm.sh), ensuring the correct --model <model_name> is targeted.
    sbatch wind_forecasting/run_scripts/tune_scripts/tune_model_storm.sh
  3. Monitor: Use squeue, Slurm logs, WandB, and Optuna dashboard.

Local Machine:

python wind_forecasting/run_scripts/run_model.py --config examples/inputs/training_inputs_flasc.yaml --mode tune --model informer

HPC System:

# Use the provided tuning script
./wind_forecasting/run_scripts/tune_model.sh

Training

  1. Configure: Edit the training YAML. Optionally set use_tuned_parameters: true, and choose appropriate limit_train_batches and max_epochs values.
  2. Run:
    python wind_forecasting/run_scripts/run_model.py \
      --config config/training/training_inputs_*.yaml \
      --mode train \
      --model <model_name> \
      [--use_tuned_parameters] \
      [--checkpoint <path | 'best' | 'latest'>] # To resume
    (Or use an HPC script)

Local Machine:

python wind_forecasting/run_scripts/run_model.py --config examples/inputs/training_inputs_flasc.yaml --mode train --model informer --use_tuned_parameters

HPC System:

# Use the provided training script
./wind_forecasting/run_scripts/train_model_kestrel.sh

Testing

  1. Configure: Ensure training YAML points to the correct dataset config.
  2. Run:
    python wind_forecasting/run_scripts/run_model.py \
      --config config/training/training_inputs_*.yaml \
      --mode test \
      --model <model_name> \
      --checkpoint <path | 'best' | 'latest'>
    (Or use an HPC script)

Local Machine:

python wind_forecasting/run_scripts/run_model.py --config examples/inputs/training_inputs_flasc.yaml --mode test --model informer --checkpoint latest

HPC System:

# Use the provided testing script
./wind_forecasting/run_scripts/test_model.sh

🀝 Contributing

Tuning & Training the Benchmark Models

  1. Tune a statistical model on a local machine with python wind-hybrid-open-controller/whoc/wind_forecast/tuning.py --model_config wind_forecasting/examples/inputs/training_inputs_aoifemac_flasc.yaml --data_config wind_forecasting/examples/inputs/preprocessing_inputs_flasc.yaml --model svr --study_name svr_tuning --restart_tuning, or on an HPC by running wind-hybrid-open-controller/whoc/wind_forecast/run_tuning_kestrel.sh [model] [model_config].

Contributions are welcome! Please follow standard Git practices (fork, branch, pull request).

πŸ™ Acknowledgements

πŸ“š References

  • TACTiS: Drouin, A., Marcotte, Γ‰., & Chapados, N. (2022). TACTiS: Transformer-Attentional Copulas for Time Series. ICML. (Link)
  • TACTiS-2: Ashok, A., Marcotte, Γ‰., Zantedeschi, V., Chapados, N., & Drouin, A. (2024). TACTIS-2: Better, Faster, Simpler Attentional Copulas for Multivariate Time Series. ICLR. (arXiv)
  • Informer: Zhou, H., Zhang, S., Peng, J., Zhang, S., Li, J., Xiong, H., & Zhang, W. (2021). Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting. AAAI. (arXiv)
  • Autoformer: Wu, H., Xu, J., Wang, J., & Long, M. (2021). Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting. NeurIPS. (arXiv)
  • Spacetimeformer: Grigsby, J., Wang, Z., & Qi, Y. (2021). Long-Range Transformers for Dynamic Spatiotemporal Forecasting. (arXiv)
  • GluonTS: Alexandrov, A., et al. (2020). GluonTS: Probabilistic Time Series Modeling in Python. JMLR. (Link)
  • PyTorch Lightning: (Link)
  • Optuna: Akiba, T., Sano, S., Yanase, T., Ohta, T., & Koyama, M. (2019). Optuna: A Next-generation Hyperparameter Optimization Framework. KDD. (Link)
  • WandB: (Link)
  • Related Repositories:

License: MIT. Made with ❀️ by achenry and boujuan.
