

Defi-IA-2022

Défi IA 2022 Kaggle: local prediction of rainfall using data measured at ground stations and the output of MétéoFrance predictive models (Arpège_2D).

This code achieved a MAPE of 26.57 (16th of 84) on the public leaderboard. The final competition result, on the private leaderboard, was a MAPE of 30.13 (27th place).
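
For readers unfamiliar with the metric, here is a minimal sketch of the textbook mean absolute percentage error (MAPE), expressed in percent. It is not taken from this repository, and the competition's exact formula may differ. The `offset` parameter is a hypothetical addition that keeps the denominator non-zero when the observed rainfall is zero.

```python
def mape(y_true, y_pred, offset=0.0):
    """Mean absolute percentage error, in percent.

    `offset` is a hypothetical knob (not from the source) added to the
    denominator so that zero observations do not cause a division by zero.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must not be empty")
    total = 0.0
    for observed, predicted in zip(y_true, y_pred):
        total += abs(observed - predicted) / (abs(observed) + offset)
    return 100.0 * total / len(y_true)


# Two observations with relative errors of 50% and 25%.
score = mape([2.0, 4.0], [1.0, 5.0])
print(score)
```

With the default `offset=0.0`, any observation equal to zero raises `ZeroDivisionError`, which is why some offset or filtering is usually needed for rainfall data.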

Instructions to download the data:

  • Create your account on Kaggle
  • Get your API credentials

"To use the Kaggle API, sign up for a Kaggle account at https://www.kaggle.com. Then go to the 'Account' tab of your user profile (https://www.kaggle.com/<username>/account) and select 'Create API Token'. This will trigger the download of kaggle.json, a file containing your API credentials. Place this file in the location ~/.kaggle/kaggle.json."

  • Make sure the kaggle command is installed (pip install kaggle). If the kaggle command cannot be found, add ~/.local/bin to your PATH.
  • Run dowload_data.sh in your working directory. A ./DATA_RAINFALL/ directory will be created containing all the data from ground stations and Arpege.
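
Before running the download script, it can help to verify the setup described above. The following is a minimal sketch, not part of this repository. The function names `check_token` and `kaggle_on_path` are hypothetical. It assumes the token file is a JSON object with `username` and `key` fields, and that it should not be readable by other users (the kaggle command prints a warning otherwise).

```python
import json
import shutil
import stat
from pathlib import Path


def kaggle_on_path():
    """Return True if the `kaggle` executable can be found on PATH."""
    return shutil.which("kaggle") is not None


def check_token(token_path=None):
    """Return a list of problems with the token file (empty if none found)."""
    if token_path is None:
        token_path = Path.home() / ".kaggle" / "kaggle.json"
    token_path = Path(token_path)
    if not token_path.is_file():
        return [f"{token_path} not found"]
    try:
        token = json.loads(token_path.read_text())
    except json.JSONDecodeError:
        return [f"{token_path} is not valid JSON"]
    if not isinstance(token, dict):
        return [f"{token_path} does not contain a JSON object"]
    problems = []
    for field in ("username", "key"):
        if not token.get(field):
            problems.append(f"missing field '{field}'")
    # On POSIX systems, the file should be private to the current user.
    mode = token_path.stat().st_mode
    if mode & (stat.S_IRGRP | stat.S_IROTH):
        problems.append("file is readable by other users; run chmod 600 on it")
    return problems


if __name__ == "__main__":
    if not kaggle_on_path():
        print("kaggle command not found; is ~/.local/bin on your PATH?")
    for problem in check_token():
        print(problem)
```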

Preprocessing the data:

  • Install the xarray Python library (plus netcdf4 and h5netcdf if necessary) to read the Arpege_2D data.
  • Open a terminal in your working directory and run: python preprocess_train.py <your_working_directory_path>. The script fills NaN values in X_station_train and Y_train, merges X_station_train with 2D_arpege_train, and then reshapes the training features into hourly features (all features at each hour). Two files, full_X_train.csv and full_Y_train.csv, are created in ./DATA_RAINFALL/Train/Train/. Preprocessing the training set is time-consuming (approximately 6 hours).
  • Open a terminal in your working directory and run: python preprocess_test.py <your_working_directory_path>. The script fills NaN values in X_station_test, merges X_station_test with 2D_arpege_test, and then reshapes the features into hourly features. One file, full_X_test.csv, is created in ./DATA_RAINFALL/Test/Test/. Preprocessing the test set takes around 30 minutes. Note that Y_test is not provided by MétéoFrance.
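
The three preprocessing steps (fill NaN values, merge station and Arpege data, reshape to hourly features) can be illustrated on a toy example. This is only a sketch with pandas, not the code of preprocess_train.py. The column names (`station`, `day`, `hour`, `temp`, `t2m`), the helper names, and the choice of filling gaps with the per-station mean are all assumptions made for illustration. It also assumes the Arpege values have already been matched to each station.

```python
import pandas as pd

KEYS = ["station", "day", "hour"]  # hypothetical key columns


def fill_station_means(df, value_cols):
    """Fill NaN values with the mean of the same station (assumed strategy)."""
    out = df.copy()
    for col in value_cols:
        station_mean = out.groupby("station")[col].transform("mean")
        out[col] = out[col].fillna(station_mean)
    return out


def to_hourly_features(station_df, arpege_df):
    """Merge both sources, then turn each (station, day) into a single row
    with one column per feature and hour, e.g. temp_h0, temp_h1, ..."""
    merged = station_df.merge(arpege_df, on=KEYS, how="left")
    wide = merged.set_index(KEYS).unstack("hour")
    wide.columns = [f"{feature}_h{hour}" for feature, hour in wide.columns]
    return wide.reset_index()


# Toy data: two stations, one day, two hours, one missing measurement.
station = pd.DataFrame({
    "station": ["A", "A", "B", "B"],
    "day": [0, 0, 0, 0],
    "hour": [0, 1, 0, 1],
    "temp": [10.0, None, 7.0, 9.0],
})
arpege = pd.DataFrame({
    "station": ["A", "A", "B", "B"],
    "day": [0, 0, 0, 0],
    "hour": [0, 1, 0, 1],
    "t2m": [11.0, 12.0, 8.0, 8.5],
})

full_X = to_hourly_features(fill_station_means(station, ["temp"]), arpege)
print(full_X)
```

The result has one row per station and day, which is the "all features at each hour" layout described above. On the real data, a more careful gap-filling strategy (for example interpolation in time) may be preferable to a plain mean.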

Training models and making predictions:

  • Download the "DATA_RAINFALL.zip" file from https://drive.google.com/file/d/10xF6B2JB-cEftuSWBBWLvF_FETbXBVpc/view?usp=sharing, or run the preprocessing steps above, which create the same data. Unzip the file and make sure the resulting folder sits next to main_model.py. This folder contains the preprocessed data.
  • Run main_model.py. This script performs the feature engineering, trains the MLP and LGBM models, and then writes the predictions to .csv files.

Contributors:

  • anthonygofin
  • achedozeau
