hagelslag's Introduction

Hagelslag

Storm tracking, machine learning, and probabilistic evaluation

NSF-1261776

Hagelslag is an object-based severe storm forecasting system that uses image processing and machine learning tools to derive calibrated probabilities of severe hazards from convection-allowing numerical weather prediction model output. The package contains modules for storm identification and tracking, spatio-temporal data extraction, and machine learning model training to predict hazard intensity as well as space and time translations.

Citation

If you employ hagelslag in your research, please acknowledge its use with the following citations:

Gagne, D. J., A. McGovern, S. E. Haupt, R. A. Sobash, J. K. Williams, M. Xue, 2017: Storm-Based Probabilistic Hail
Forecasting with Machine Learning Applied to Convection-Allowing Ensembles, Wea. Forecasting, 32, 1819-1840. 
https://doi.org/10.1175/WAF-D-17-0010.1. 

Gagne II, D. J., A. McGovern, N. Snook, R. Sobash, J. Labriola, J. K. Williams, S. E. Haupt, and M. Xue, 2016: 
Hagelslag: Scalable object-based severe weather analysis and forecasting. Proceedings of the Sixth Symposium on 
Advances in Modeling and Analysis Using Python, New Orleans, LA, Amer. Meteor. Soc., 447.

If you discover any issues, please post them to the GitHub issue tracker page. Questions and comments should be sent to djgagne at ou dot edu.

Requirements

Hagelslag is compatible with Python 3.6 or newer. Hagelslag is easiest to install with the help of the Miniconda Python Distribution, but it should work with other Python setups as well. Hagelslag requires the following packages and recommends the following versions:

  • numpy >= 1.10
  • scipy >= 0.15
  • matplotlib >= 1.4
  • scikit-learn >= 0.16
  • pandas >= 0.15
  • arrow >= 0.8.0
  • pyproj
  • netCDF4-python
  • xarray
  • jupyter
  • ncepgrib2
  • pygrib
  • cython
  • pip
  • sphinx
  • mock

Install dependencies with the following commands:

git clone https://github.com/djgagne/hagelslag.git
cd hagelslag
conda env create -f environment.yml
conda activate hagelslag

Installation

Install the latest version of hagelslag with the following command from the top-level hagelslag directory (where setup.py is): pip install .

Hagelslag will install the libraries in site-packages and will also install 3 applications into the bin directory of your Python installation.

Use

A Jupyter notebook is located in the demos directory that showcases the functionality of the package. For larger scale use, 3 scripts are provided in the bin directory.

  • hsdata performs object tracking and matching as well as data processing.
  • hsfore trains and applies machine learning models.
  • hseval performs forecast verification.

All scripts take input from a config file. The config file should be valid Python code and contain a dictionary called config. Custom machine learning models and parameters should be contained within the config files. Examples of them can be found in the config directory.
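As a rough illustration of the config convention described above, the sketch below writes a small Python config file and reads its `config` dictionary back. The keys (`start_date`, `end_date`, `model_names`) and the `load_config` helper are hypothetical and are not taken from hagelslag; the real keys are shown in the examples in the config directory, and the scripts may load the file differently.

```python
import os
import tempfile

# Hypothetical config file contents. The keys are illustrative only;
# see the config directory for the keys the scripts actually expect.
CONFIG_TEXT = """
import datetime

config = dict(
    start_date=datetime.datetime(2016, 5, 1),
    end_date=datetime.datetime(2016, 5, 2),
    model_names=["Random Forest"],
)
"""


def load_config(path):
    """Execute a Python config file and return its `config` dictionary.

    Hypothetical helper: the file is run as ordinary Python code, so it
    should only be used with config files you trust.
    """
    namespace = {}
    with open(path) as config_file:
        exec(compile(config_file.read(), path, "exec"), namespace)
    config = namespace.get("config")
    if not isinstance(config, dict):
        raise ValueError("config file must define a dictionary named 'config'")
    return config


with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = os.path.join(tmp_dir, "example_config.py")
    with open(config_path, "w") as out_file:
        out_file.write(CONFIG_TEXT)
    config = load_config(config_path)

print(sorted(config))
```

Because the config file is plain Python, it can import modules and construct objects (for example, machine learning models) directly inside the dictionary.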

Documentation

API Documentation is available here.

hagelslag's People

Contributors

ahijevyc, alburke, charlie-becker, djgagne, lmadaus, mariajmolina, thomasmgeo

hagelslag's Issues

Data format

Good day!

I have been reviewing the hagelslag code and would like to ask what input data format it requires. Thank you.

different centroid in track_step csv and patches nc

The centroid in the csv track_step file is different than the centroid in the patches netCDF file.
When I say different, I don't mean more than 0.2 degrees different, but definitely more than machine roundoff error.

For example:
object: d01_REFL_1KM_AGL_20110427-0000_24_25_622_01
csv longitude: -87.70007
patches longitude: -87.73282

Issue with Probability Evaluation Function

The ampersand (&) operator in the update() function of the DistributedROC class in /hagelslag/evaluation/ProbabilityMetrics.py did not produce the behavior I expected. My interpretation is that np.count_nonzero((forecasts >= threshold) & (observations >= self.obs_threshold)) should yield the number of elements where both boolean arrays are true, but for me it instead yields the number of elements where either boolean array is true. Wrapping the conditions in np.logical_and fixed the issue for me. Please let me know if you can replicate this issue, and if so, I can submit a pull request with my fix.

    def update(self, forecasts, observations):
        """
        Update the ROC curve with a set of forecasts and observations
        Args:
            forecasts: 1D array of forecast values
            observations: 1D array of observation values.
        """
        for t, threshold in enumerate(self.thresholds):
            tp = np.count_nonzero((forecasts >= threshold) & (observations >= self.obs_threshold))
            fp = np.count_nonzero((forecasts >= threshold) &
                                  (observations < self.obs_threshold))
            fn = np.count_nonzero((forecasts < threshold) &
                                  (observations >= self.obs_threshold))
            tn = np.count_nonzero((forecasts < threshold) &
                                  (observations < self.obs_threshold))
            self.contingency_tables.iloc[t] += [tp, fp, fn, tn]
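A minimal, standalone way to check the counting logic above on small inputs is sketched below. The `contingency_counts` function and the sample arrays are hypothetical and are not part of hagelslag. For plain NumPy boolean arrays, `&` and `np.logical_and` give the same elementwise result, so the sketch compares both; it assumes the inputs are plain arrays and does not cover other input types.

```python
import numpy as np


def contingency_counts(forecasts, observations, threshold, obs_threshold):
    """Count hits, false alarms, misses, and correct negatives.

    Hypothetical helper that mirrors the four comparisons in update(),
    assuming both inputs can be converted to plain 1D NumPy arrays.
    """
    forecasts = np.asarray(forecasts)
    observations = np.asarray(observations)
    tp = np.count_nonzero((forecasts >= threshold) & (observations >= obs_threshold))
    fp = np.count_nonzero((forecasts >= threshold) & (observations < obs_threshold))
    fn = np.count_nonzero((forecasts < threshold) & (observations >= obs_threshold))
    tn = np.count_nonzero((forecasts < threshold) & (observations < obs_threshold))
    return tp, fp, fn, tn


# Made-up sample data for illustration only.
forecasts = np.array([0.9, 0.8, 0.2, 0.1, 0.6])
observations = np.array([1, 0, 1, 0, 1])

counts = contingency_counts(forecasts, observations, threshold=0.5, obs_threshold=1)

# Same "both true" count computed with np.logical_and for comparison.
tp_logical_and = np.count_nonzero(
    np.logical_and(forecasts >= 0.5, observations >= 1))

print(counts, tp_logical_and)
```

When no values are missing, the four counts should sum to the number of forecast-observation pairs, which is a quick sanity check on any replacement implementation.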

skimage regionprops deprecating xy coordinates

Eventually we need to accept the objects' region properties in row-column coordinates, as opposed to xy.

skimage.measure regionprops function will stop using xy coordinates and use row-column instead.
STObject.py uses it for shape properties.

We could specify regionprops(..., coordinates='xy') for now to avoid the warning, but this will break once the xy option is removed.

See https://scikit-image.org/docs/0.14.x/release_notes_and_installation.html#deprecations for details on how to avoid this message.
warn(XY_TO_RC_DEPRECATION_MESSAGE)

STObject time tracking

STObject was built under the assumption of hourly integer time steps originally, and some of the derived functions have not been updated to enable arbitrary time differencing when tracking or calculating trajectories and centroids. This functionality needs to be fixed so that arbitrary times can be used rather than integer times with a step of 1.
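The kind of time differencing described above can be sketched independently of STObject. The `centroid_velocities` function and the sample values below are hypothetical; the sketch only assumes that each time step has one centroid position and a numeric time, and it divides by the actual elapsed time rather than an implied step of 1.

```python
import numpy as np


def centroid_velocities(times, x, y):
    """Estimate centroid velocity components between consecutive steps.

    Hypothetical helper, not the STObject implementation. Times may be
    unevenly spaced but must be strictly increasing.
    """
    times = np.asarray(times, dtype=float)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dt = np.diff(times)
    if np.any(dt <= 0):
        raise ValueError("times must be strictly increasing")
    # Divide each displacement by its own elapsed time.
    return np.diff(x) / dt, np.diff(y) / dt


# Made-up track with uneven time spacing (units are arbitrary).
times = [0.0, 1.0, 3.0, 3.5]
x = [0.0, 10.0, 30.0, 35.0]
y = [0.0, 5.0, 5.0, 10.0]

u, v = centroid_velocities(times, x, y)
print(u, v)
```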

Patch Centroid lon / lat

The centroid lon / lat data in the netCDF patch data represent the longitude / latitude values at the center of the patch, not the center of the object, so the centroid lon / lat values in the CSV files and the netCDF files do not match for the same storm objects.

hagelslag/bin/hsdata

Lines 642 to 645 in 57d1051

for c_var in ["lon", "lat"]:
    out_file.variables["centroid_" + c_var][:] = np.concatenate(
        [np.array(f_track.attributes[c_var])[:, patch_radius, patch_radius]
         for f_track in forecast_tracks])

Is there any reason to keep the patch-center derived lons / lats? If not, the values could be replaced using the same process to generate the centroids for the CSV file.

hagelslag/bin/hsdata

Lines 454 to 455 in 57d1051

centroid_x, centroid_y = forecast_track.center_of_mass(step)
centroid_lon, centroid_lat = proj(centroid_x, centroid_y, inverse=True)
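The difference between the two kinds of centroid can be illustrated with a toy patch. Everything in the sketch below is made up for illustration: the grid spacing, the intensity values, and the variable names. As a simplification, it averages longitude and latitude directly with intensity weights, whereas the hsdata lines above compute the center of mass in projected x / y coordinates and then apply the inverse projection.

```python
import numpy as np

patch_radius = 2
size = 2 * patch_radius + 1

# Made-up regular lon / lat grid covering one patch.
lon_1d = -88.0 + 0.03 * np.arange(size)
lat_1d = 34.0 + 0.03 * np.arange(size)
lon, lat = np.meshgrid(lon_1d, lat_1d)

# Made-up object intensities; zero means the grid cell is outside the object.
intensity = np.zeros((size, size))
intensity[2, 2] = 40.0
intensity[2, 3] = 55.0
intensity[1, 3] = 50.0

# Patch-center value: the grid point in the middle of the patch.
patch_center_lon = lon[patch_radius, patch_radius]
patch_center_lat = lat[patch_radius, patch_radius]

# Intensity-weighted centroid of the object itself.
weights = intensity / intensity.sum()
centroid_lon = (weights * lon).sum()
centroid_lat = (weights * lat).sum()

print(patch_center_lon, centroid_lon)
print(patch_center_lat, centroid_lat)
```

Here the two values differ by less than one grid spacing, which is the same order of magnitude as the discrepancy reported between the CSV and netCDF files.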
