License: MIT

MassBalanceMachine

A bridge between mass balance modelling and observations. Global machine learning glacier mass balance modelling that assimilates all glaciological and remote sensing data sources.

  • πŸ”οΈ MassBalanceMachine takes meteorological, topographical and/or other features to predict the surface mass balance of glaciers for a region of interest.
  • ❄️ MassBalanceMachine uses glaciological (stake) and geodetic mass balance data as targets.
  • πŸ“… MassBalanceMachine can make predictions or fill data gaps on an annual, seasonal (summer and winter), and monthly temporal scale for any spatial resolution.

This project is in ongoing development, and new features will be added over the coming months. Please see the contribution guidelines for more information on contributing to this project.

Requirements

You can run the MassBalanceMachine core scripts and notebooks with the following software installed:

Installation (for all users)

To run the Jupyter Notebooks, you'll need to set up a Conda environment. Within this environment, Poetry will handle the installation of all necessary packages and dependencies. Follow these steps to create a new Conda environment named MassBalanceMachine:

conda env create -f environment.yml

Activate the MassBalanceMachine environment:

conda activate MassBalanceMachine # for linux and unix users alternatively: source activate MassBalanceMachine

Install all required packages and dependencies needed in the environment via poetry:

poetry install

All packages and dependencies should now be installed correctly, and you are ready to use the MassBalanceMachine core (massbalancemachine), for example by importing the package in a Jupyter Notebook: import massbalancemachine as mbm. Before doing so, make sure you have selected the right interpreter or kernel in your editor of choice.
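As a quick sanity check after `poetry install`, the following minimal sketch reports which interpreter is active and whether the package can be located from it. It uses only the Python standard library; the helper name `is_importable` is made up for this example and is not part of massbalancemachine.

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if the current interpreter can locate a top-level module."""
    return importlib.util.find_spec(module_name) is not None


# The interpreter path should point inside the MassBalanceMachine environment.
print("Interpreter:", sys.executable)
print("massbalancemachine found:", is_importable("massbalancemachine"))
```

If the second line prints `False`, the notebook or editor is most likely using a different interpreter than the one the packages were installed into.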

Note: If you are working on a remote server running JupyterLab or Jupyter Notebook (e.g. Binder) instead of locally, the virtual environment of the notebook will be different from the Conda environment. As an additional step, you need to create a new kernel that includes the Conda environment in Jupyter Notebook. Here's how you can do it:

poetry run ipython kernel install --user --name=mbm_env

Finally, make sure that your Jupyter kernel, which you select in the top right corner of the notebook or via the Launcher (this sometimes requires a refresh), is set to the 'mbm_env' kernel created above. You should now be ready to use massbalancemachine in your Jupyter Notebooks.
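If the kernel does not show up or points at the wrong environment, it can help to look at the kernel spec itself. The sketch below reads a `kernel.json` file and returns the interpreter it launches. The helper `kernel_interpreter` is hypothetical, and the directory used in the example is the usual user-level location on Linux, which is an assumption here; running `jupyter kernelspec list` prints the actual locations on your system.

```python
import json
from pathlib import Path
from typing import Optional


def kernel_interpreter(kernels_dir: Path, kernel_name: str) -> Optional[str]:
    """Return the interpreter a kernel spec launches, or None if it is not registered."""
    spec_file = kernels_dir / kernel_name / "kernel.json"
    if not spec_file.is_file():
        return None
    spec = json.loads(spec_file.read_text(encoding="utf-8"))
    argv = spec.get("argv", [])
    # The first argv entry is the executable used to start the kernel.
    return argv[0] if argv else None


# Assumed location of user-level kernel specs on Linux; adjust for your system.
user_kernels = Path.home() / ".local" / "share" / "jupyter" / "kernels"
print(kernel_interpreter(user_kernels, "mbm_env"))
```

The printed path should be the Python executable of the MassBalanceMachine environment; `None` means no kernel named `mbm_env` was found in that directory.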

Known Installation Issues

  • Poetry simplifies dependency and version control in Python projects, ensuring seamless integration of libraries and packages. However, it occasionally flags duplicate package folders. This can typically be resolved by locating and removing the unwanted package versions from your Conda environment folder.
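One way to locate such duplicates is to list every installed distribution and look for names that appear with more than one version. The following is a minimal sketch using the standard-library `importlib.metadata`; the helper names are made up for this example, and it only reports candidates, it does not remove anything.

```python
import re
from collections import defaultdict
from importlib import metadata
from typing import Dict, Iterable, List, Set, Tuple


def normalize(name: str) -> str:
    """Normalize a distribution name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def find_duplicates(pairs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each package name seen with more than one version to those versions."""
    versions: Dict[str, Set[str]] = defaultdict(set)
    for name, version in pairs:
        versions[normalize(name)].add(version)
    return {n: sorted(v) for n, v in versions.items() if len(v) > 1}


def installed_pairs() -> List[Tuple[str, str]]:
    """Collect (name, version) for every distribution visible to this interpreter."""
    pairs = []
    for dist in metadata.distributions():
        try:
            name = dist.metadata.get("Name")
            version = dist.version
        except Exception:
            # Skip folders whose metadata cannot be read at all.
            continue
        if name and version:
            pairs.append((name, version))
    return pairs


# Run inside the activated environment; an empty dict means no duplicates were found.
print(find_duplicates(installed_pairs()))
```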

Additional Installation for Windows Users

Note: Topographical features are retrieved using OGGM in the data processing stage, which for now requires a Unix environment. Model training and evaluation, however, do not need to run in a remote environment. Windows users can either work with MassBalanceMachine in a Unix environment for the entire project, or use the Unix environment only for the data processing part (this requires installing the Conda environment twice).

If you haven't already, please consult How to install Linux on Windows with WSL. A list of steps is provided for Windows users to run this code on their local machine in a remote environment:

  1. Please see one of the following links, depending on your editor of choice, for how to connect WSL as a remote environment:
    1. Visual Studio
    2. PyCharm
    3. Jupyter Notebook
  2. Installing Anaconda on Linux:
    1. Anaconda Docs, or
    2. Steps to Install Anaconda on Windows Ubuntu Terminal
  3. Follow the steps as specified in the section: Installation.
  4. Access the remote environment in the terminal, select the right kernel or interpreter and run the Jupyter Notebook or Python scripts.

Usage

After installing the massbalancemachine package and setting up the Conda environment successfully, you can start exploring the example notebooks found in the notebooks directory. These notebooks are designed to walk you through using MassBalanceMachine with WGMS data, focusing initially on extracting data from the Open Global Glacier Model (OGGM). This data includes comprehensive topographical information for nearly all glaciers worldwide.

Specifically, the example notebooks concentrate on glaciers documented in the WGMS database, particularly those in Iceland. They cover various topics, including:

  1. Data Pre-processing 🌍: Users have two options for preparing their data. They can choose to follow a notebook that converts their data into the WGMS format (available here), or they can start with their data already formatted in the WGMS standard (found here). In both workflows, topographical and climate data are fetched and aligned with the stake measurements. Subsequently, the data is aggregated to a monthly resolution, preparing it for use as training data for the model.
    • Note: If the OGGM cluster is shut down, users will be unable to retrieve topographical features for their region of interest. If you encounter a 403 error in your notebook while trying to retrieve these features, it likely means that the OGGM cluster is down. You can check the status of the cluster on their Slack channel.
  2. Data Exploration 🔍: Users can gain deeper insights into their data by visualizing time series of the available stake measurements, which are related to either the region-wide surface mass balance or the point surface mass balance. The example is available here.
  3. Model Training 🚀 & Testing 🎯: Users can choose from two models. One option is the XGBoost model, with an example available in this notebook. The other option is a neural network, which will be released in the future. Both models are customized to handle the monthly resolution of the data. In the notebooks, the models will be trained using the data obtained earlier.
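To picture what monthly resolution means for a single stake measurement, the sketch below lists the calendar months spanned by a measurement period. It is an illustration only and not the package's implementation; the function `months_in_period` and the example dates are hypothetical.

```python
from datetime import date
from typing import List, Tuple


def months_in_period(start: date, end: date) -> List[Tuple[int, int]]:
    """Return every (year, month) touched by the period from start to end, inclusive."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    months = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        months.append((year, month))
        month += 1
        if month > 12:  # roll over into the next year
            month = 1
            year += 1
    return months


# Hypothetical winter measurement period, from autumn to spring.
for year, month in months_in_period(date(2019, 10, 1), date(2020, 4, 30)):
    print(f"{year}-{month:02d}")
```

Each listed month would then correspond to one row of monthly climate and topographical features associated with that measurement.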

Project Structure

  • The massbalancemachine package contains the core components of MassBalanceMachine, including scripts, classes, and example Jupyter Notebooks that are essential for new users to start a MassBalanceMachine project. This core package, named massbalancemachine, can be imported into scripts and Jupyter Notebooks as needed.
  • regions contains additional scripts, classes, and Jupyter Notebooks that are tailored to MassBalanceMachine instances operating in different regions of the world. If the region you are interested in is not on this list, you can add it to the repository with a pull request. Please make sure you do not upload any confidential or unpublished data. Regions that are covered so far:
    • [WIP] Iceland
    • [WIP] Switzerland
    • [COMING SOON] Norway
    • [ADD YOUR OWN REGION]. PRs welcome! Message us if you have questions 🙂

Project Roadmap

The following features are on the roadmap to be implemented in the coming months:

  • πŸ›°οΈ MassBalanceMachine uses geodetic mass balance data as an extra target variable on top of glaciological data. This will help calibrate the bias/trend in long simulations where the cumulative mass balance matters.
  • πŸ”„ MassBalanceMachine can do transfer learning for new regions, reducing the training time and making more accurate predictions.
  • πŸ“Š MassBalanceMachine can incorporate physical constraints, in order to merge physical knowledge with data-driven discovery.

Contributors

  • Julian 💻 📖 🚧 🔣 🔬
  • khsjursen 🔬 💻 🤔 🔣
  • Jordi Bolibar 🔬 📆 💵 🤔 🧑‍🏫
  • Marijn 🤔 🔣 🔬 💻
  • zekollari 🔬 💵 🤔 🧑‍🏫

Contribution Guidelines

  • The MassBalanceMachine project is an open-source community initiative that welcomes new users to fork the repository, add new regions, or modify the existing code and submit a pull request.
  • Currently, uploading data is not allowed unless it is accompanied by a license that explicitly permits open access, allowing it to be shared and used by others. Pull requests containing data will be rejected. In the future, data sharing will be supported.
  • If you have any questions, please contact one of the contributors listed above. You can also create new Git issues via the issue tracker to propose new features, changes to existing ones, or report bugs.
