WindTurbineOutputPrediction

This repository contains the Python and R Jupyter notebooks I used for H2O's Open Tour NYC Hackathon on July 19 and 20, 2016, and for follow-up work afterwards. See the blog post at http://lucdemortier.github.io/articles/17/WindPower for a description of the results.

Contents

  • 1_data_preparation.ipynb: Reads the hackathon input CSV files (for training and testing), creates data frames, and saves them as pickle files for the Python notebooks or as feather files for the R notebooks.
  • 2_exploratory_visuals.ipynb: Generates various plots to explore the data prior to modeling.
  • 3_random_forest_regressor.ipynb: A random forest regression model that treats all ten turbines as a single turbine with a "zone id" setting.
  • 4_random_forest_regressor.ipynb: A random forest regression model that models each of the ten turbines separately, using wind velocity measurements from all zones.
  • 5_xgboost_regressor.ipynb: An XGBoost regression model.
  • 6_xgboost_classifier_plus_regressor.ipynb: A combination of an XGBoost classifier and regressor. The classifier predicts which turbine outputs are zero; the regressor predicts the values of the non-zero outputs.
  • 7_gamlss_R.ipynb: A generalized linear model. This notebook runs an R kernel and uses the R package GAMLSS.
  • 8_check_solution.ipynb: Uses the CSV files with predictions created by the other notebooks to compute the RMSE for the hackathon's public and private leaderboards.
  • summarynoprint.R and wp_withdata.R are routines from the GAMLSS package that I had to modify slightly for the R notebook.
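
The classifier-plus-regressor idea from notebook 6 can be sketched as follows. This is a minimal illustration, not the notebook's code: the class name `TwoStageRegressor` is hypothetical, the two models can be any objects with scikit-learn-style `fit` and `predict` methods, and clipping the result to [0, 1] is an added assumption based on the target being a fraction of maximum capacity.

```python
class TwoStageRegressor:
    """Hypothetical wrapper: a zero/non-zero classifier plus a regressor."""

    def __init__(self, classifier, regressor):
        # Both models are assumed to expose fit(X, y) and predict(X).
        self.classifier = classifier
        self.regressor = regressor

    def fit(self, X, y):
        # Stage 1: learn which rows have zero output (label 1 means zero).
        is_zero = [1 if target == 0 else 0 for target in y]
        self.classifier.fit(X, is_zero)
        # Stage 2: fit the regressor on the rows with non-zero output only.
        X_nonzero = [row for row, target in zip(X, y) if target != 0]
        y_nonzero = [target for target in y if target != 0]
        self.regressor.fit(X_nonzero, y_nonzero)
        return self

    def predict(self, X):
        zero_flags = self.classifier.predict(X)
        values = self.regressor.predict(X)
        # Rows flagged as zero get 0.0; the others keep the regressor value,
        # clipped to [0, 1] (assumption: output is a fraction of capacity).
        return [
            0.0 if flag == 1 else min(max(value, 0.0), 1.0)
            for flag, value in zip(zero_flags, values)
        ]
```

Training the regressor only on non-zero rows keeps the many zero outputs from pulling its predictions toward zero, while the classifier handles the zero cases on its own.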

Problem Statement

Given daily wind forecasts issued 24 hours at a time, predict the nominal wind turbine output for 10 turbines. The provided data are the turbine number, the timestamp of the forecast, and the forecasted zonal and meridional wind velocity components at 10 meters and 100 meters above ground. The wind data were taken in 2012 and 2013. The training data consist of the first 19 months, and the test set consists of the following five months (the last month has only ten records). The public leaderboard is based on the first two months of the test dataset (Aug-2013 and Sep-2013), while the rest of the test dataset is used for the private leaderboard.
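
The leaderboard scoring described above can be sketched as below. This is an illustrative reconstruction, not the code in 8_check_solution.ipynb: the function names, the record layout, and the exact cutoff timestamp (midnight on 1 October 2013) are assumptions.

```python
import math
from datetime import datetime

# Assumed cutoff: test records before 1 October 2013 (Aug-2013 and Sep-2013)
# count toward the public leaderboard, later ones toward the private one.
PUBLIC_END = datetime(2013, 10, 1)


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def leaderboard_rmse(records):
    """Score (timestamp, actual, predicted) tuples from the test set.

    Returns a dict with a "public" and/or "private" RMSE, depending on
    which periods are present in the records.
    """
    records = list(records)  # allow generators; we iterate twice
    public = [(a, p) for t, a, p in records if t < PUBLIC_END]
    private = [(a, p) for t, a, p in records if t >= PUBLIC_END]
    scores = {}
    for name, rows in (("public", public), ("private", private)):
        if rows:
            actual, predicted = zip(*rows)
            scores[name] = rmse(actual, predicted)
    return scores
```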

Data:

Variable     Definition
ID           Unique ID of observation
ZONEID       Zone (turbine) ID
TIMESTAMP    Date and time of observation
U10          Zonal wind velocity at 10 m above ground
V10          Meridional wind velocity at 10 m above ground
U100         Zonal wind velocity at 100 m above ground
V100         Meridional wind velocity at 100 m above ground
TARGETVAR    Output of wind turbine, as a fraction of maximum capacity

U is the zonal (west-to-east) component of the wind velocity, and V is the meridional (south-to-north) component.
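
Wind speed and direction can be derived from the U and V components. The sketch below assumes the usual meteorological convention (U positive toward the east, V positive toward the north, direction reported as the compass bearing the wind blows from); the function names are illustrative and not taken from the notebooks.

```python
import math


def wind_speed(u, v):
    """Magnitude of the horizontal wind vector, in the same units as u and v."""
    return math.hypot(u, v)


def wind_direction_deg(u, v):
    """Direction the wind blows FROM, in degrees clockwise from north.

    Assumes u is positive toward the east and v is positive toward the north.
    """
    return (270.0 - math.degrees(math.atan2(v, u))) % 360.0
```

For example, `wind_speed(row["U100"], row["V100"])` would give the forecast wind speed at 100 m for a record `row` holding the columns listed in the table above (`row` is a hypothetical dictionary here).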

The full data set (including the target variable values for the test subset used for the public and private leaderboards) is available from Dr. Tao Hong's Energy Forecasting website, under "GEFCom2014".
