
metloom's Introduction

metloom


Location Oriented Observed Meteorology

metloom is a python library created with the goal of consistent, simple sampling of meteorology and snow related point measurements from a variety of datasources. It is developed by M3 Works as a tool for validating computational hydrology model results. Contributions welcome!

Warning - This software is provided as is (see the license), so use at your own risk. This is an open source package with the goal of making data wrangling easier. We make no guarantees about the quality or accuracy of the data, and any interpretation of the meaning of the data is up to you.

  • Free software: BSD license

Features

Requirements

python >= 3.7

Install

python3 -m pip install metloom
  • Common install issues:
    • Macbook M1 and M2 chips: some python packages run into issues with the new M chips
      • Error: from lxml import etree in utils.py fails with (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64'))

        The solution is the following:

        pip uninstall lxml
        pip install --no-binary lxml lxml

Local install for dev

The recommendation is to use virtualenv, but other local python environment isolation tools will work (pipenv, conda)

python3 -m pip install --upgrade pip
python3 -m pip install -r requirements_dev.txt
python3 -m pip install .

Testing

pytest

If contributing to the codebase, code coverage should not decrease from the contributions. Make sure to check code coverage before opening a pull request.

pytest --cov=metloom

Documentation

readthedocs coming soon

https://metloom.readthedocs.io.

Usage

See usage documentation https://metloom.readthedocs.io/en/latest/usage.html

NOTE: PointData methods that get point data return a GeoDataFrame indexed on both datetime and station code. To reset the index, simply run df.reset_index(inplace=True).
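As a rough illustration of what resetting that index does, here is a minimal sketch using a plain pandas DataFrame in place of the GeoDataFrame. The index level names ("datetime", "site") and the SWE values are assumptions made up for the example, not output from metloom.

```python
import pandas as pd

# Stand-in for a metloom result: a frame indexed on (datetime, station code).
# The level names and the SWE values below are illustrative assumptions.
index = pd.MultiIndex.from_tuples(
    [
        (pd.Timestamp("2020-01-02"), "713:CO:SNTL"),
        (pd.Timestamp("2020-01-03"), "713:CO:SNTL"),
    ],
    names=["datetime", "site"],
)
df = pd.DataFrame({"SWE": [10.1, 10.3]}, index=index)

# reset_index moves both index levels back into ordinary columns.
flat = df.reset_index()
print(list(flat.columns))
```

After the reset, the datetime and station code can be filtered or grouped like any other column.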

Simple usage examples are provided in this readme and in the docs. See our examples for code walkthroughs and more complicated use cases.

Usage Examples

Use metloom to find data for a station

from datetime import datetime
from metloom.pointdata import SnotelPointData

snotel_point = SnotelPointData("713:CO:SNTL", "MyStation")
df = snotel_point.get_daily_data(
    datetime(2020, 1, 2), datetime(2020, 1, 20),
    [snotel_point.ALLOWED_VARIABLES.SWE]
)
print(df)

Use metloom to find snow courses within a geometry

from metloom.pointdata import CDECPointData
from metloom.variables import CdecStationVariables

import geopandas as gpd

fp = "<path to shape file>"
obj = gpd.read_file(fp)

vrs = [
    CdecStationVariables.SWE,
    CdecStationVariables.SNOWDEPTH
]
points = CDECPointData.points_from_geometry(obj, vrs, snow_courses=True)
df = points.to_dataframe()
print(df)

Tutorials

In the Examples folder, there are multiple Jupyter notebook based tutorials. You can edit and run these notebooks by running Jupyter Lab from the command line:

pip install jupyterlab
jupyter lab

This will open a Jupyter Lab session in your default browser.

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

metloom's People

Contributors

micah-prime, robertson-mark, micahjohnson150, dependabot[bot], zachhoppinen, bessoh2, kehangit


metloom's Issues

Publish to pypi


  • Setup pypi for M3Works
  • Setup actions to publish to PyPi after successful build of main branch if the tag changed (or on a tag creation?)

timezone issues with mesowest

  • metloom version: 0.2.0
  • Python version: 3.8
  • Operating System: mac

Description

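Sampling snotel station 'CESC1' around 2020-03-08 results in a timezone error.
sensor_df["datetime"].apply(self._handle_df_tz) raises a NonExistentTimeError for 2020-03-08 02:02:00.
The tzinfo is 'America/Los_Angeles'. That timestamp falls in the hour skipped at the start of daylight saving time (02:00 to 03:00 local time on 2020-03-08), so it does not exist in that timezone.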
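One possible way to handle such timestamps is to detect wall-clock times that fall in a daylight saving gap and shift them forward. The sketch below uses the standard library zoneinfo module (Python 3.9+, newer than the minimum version listed above); normalize_local is a hypothetical helper and not part of metloom.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def normalize_local(naive, tz):
    """Return (aware_datetime, was_nonexistent) for a naive local timestamp.

    Hypothetical helper, not metloom code.
    """
    local = naive.replace(tzinfo=tz)
    # Round-trip through UTC: a wall time inside a DST gap comes back shifted.
    round_trip = local.astimezone(timezone.utc).astimezone(tz)
    was_nonexistent = round_trip.replace(tzinfo=None) != naive
    return round_trip, was_nonexistent


tz = ZoneInfo("America/Los_Angeles")
# The timestamp reported in this issue.
fixed, skipped = normalize_local(datetime(2020, 3, 8, 2, 2), tz)
print(fixed.isoformat(), skipped)
```

Whether shifting forward, dropping the record, or treating the source timestamps as UTC is the right choice depends on how the upstream service reports time, which this sketch does not settle.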

MesowestPointData.points_from_geometry does not consider zero stations being returned

  • metloom version: 0.2.1
  • Python version: 3.9
  • Operating System: MacOS

Description

I used a small bounding box to grab mesowest stations and received an error.

What I Did

Running this I receive this traceback

File "/home/user/envs/loomenv/lib/python3.9/site-packages/metloom/pointdata/mesowest.py", line 285, in points_from_geometry
  data = jdata['STATION']
KeyError: 'STATION'

It would appear mesowest doesn't consider receiving no stations back. The solution here is to return an empty list instead of trying to index on it.
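A minimal sketch of the proposed fix, assuming the empty response has the shape reported in the related single-station issue (a SUMMARY block and no STATION key). extract_stations is a hypothetical helper name, not the actual metloom code.

```python
def extract_stations(jdata):
    """Return the station list from a response, or [] when none came back.

    Hypothetical helper; the key name 'STATION' is taken from the traceback.
    """
    return jdata.get("STATION", [])


# Response shape reported when no stations match the request.
empty_response = {
    "SUMMARY": {
        "NUMBER_OF_OBJECTS": 0,
        "RESPONSE_CODE": 2,
        "RESPONSE_MESSAGE": "No stations found for this request.",
    }
}

print(extract_stations(empty_response))
```

Callers can then treat an empty list as "no points found" rather than catching a KeyError.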

CDEC fall back on finer timescale data if desired increment is not found

  • metloom version: 0.2.2
  • Python version: 3.8
  • Operating System: OSX

Description

CDEC sometimes makes 15 minute data available when daily or hourly is not. We can resample to get the data we want. Example:

https://cdec.water.ca.gov/dynamicapp/req/JSONDataServlet?Stations=FLV&SensorNums=18&dur_code=E&start_date=2021-11-01&end_date=2021-11-04

https://cdec.water.ca.gov/dynamicapp/req/JSONDataServlet?Stations=FLV&SensorNums=18&dur_code=D&start_date=2021-11-01&end_date=2021-11-04

This would broaden the list of sensors that we can actually retrieve data for
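The resampling fallback could look roughly like the sketch below, which averages finer-timescale records into daily values using only the standard library. The record layout (datetime, value) and the sample values are assumptions for illustration, and a mean is only one possible aggregation; some sensors may call for a sum or a single daily reading instead.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def resample_daily_mean(records):
    """Average (datetime, value) records per calendar day, skipping missing values."""
    buckets = defaultdict(list)
    for stamp, value in records:
        if value is not None:
            buckets[stamp.date()].append(value)
    return {day: mean(values) for day, values in sorted(buckets.items())}


# Made-up 15 minute records, including one missing value.
records = [
    (datetime(2021, 11, 1, 0, 0), 10.0),
    (datetime(2021, 11, 1, 0, 15), 12.0),
    (datetime(2021, 11, 2, 0, 0), 20.0),
    (datetime(2021, 11, 2, 0, 15), None),
]
daily = resample_daily_mean(records)
print(daily)
```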

Adding Irwin SP

Creating this to continue a convo with Joe.

Would be cool to add in the csv data from the Irwin Study plot to be used from metloom.

It doesn't sound like there is a publicly available API or link, but we could still make a class to read the csv that is often emailed around when updated. We could also add a variables class to map variables. Here is what I am picturing.

from metloom.pointdata import IrwinPointData 
from metloom.variables import IrwinVariables

pnt = IrwinPointData(csv=file)
df = pnt.get_daily_data(start, end, [IrwinVariables.SNOWDEPTH])

The variables are a straightforward mapping of their names to a more useful or readable name. e.g.

in variables.py

class IrwinVariables(VariableBase):
    SNOWDEPTH = InstrumentDescription("Sno_Height_M", "SNOWDEPTH")

While this effort wouldn't solve much in the way of getting the data, it would help standardize the use of individual energy balance sites that all seem to do things a skosh differently.
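To make the column mapping idea concrete, here is a small standard library sketch that reads a csv and renames columns to standardized names. Only the Sno_Height_M to SNOWDEPTH pair comes from the proposal above; the TIMESTAMP column, the sample row, and the read_mapped_csv helper are hypothetical.

```python
import csv
import io

# Mapping from source column names to standardized names.
IRWIN_COLUMN_MAP = {"Sno_Height_M": "SNOWDEPTH"}


def read_mapped_csv(handle, column_map):
    """Read csv rows as dicts, renaming any columns found in column_map."""
    reader = csv.DictReader(handle)
    return [
        {column_map.get(key, key): value for key, value in row.items()}
        for row in reader
    ]


# Made-up file contents standing in for the emailed csv.
sample = io.StringIO("TIMESTAMP,Sno_Height_M\n2021-01-01 00:00,1.25\n")
rows = read_mapped_csv(sample, IRWIN_COLUMN_MAP)
print(rows)
```

Values stay as strings here; a real reader would also parse timestamps and numeric types.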

NRCS SNOLITE data

NRCS has depth and temp stations available via the Snotel Lite (SNOTLITE) network. This is not documented in the AWDB docs, but is available via the "SNTLT" network code. Add this as a network we search by default.

CDEC removed Metadata URL

  • metloom version: 0.3.2
  • Python version: 3.8.10
  • Operating System: Ubuntu 20.04

Description

Pulling any CDEC station metadata currently returns a 404. After a long, arduous discussion with the powers that be, I learned that the link currently in use was deprecated two years ago and redirected elsewhere. Two days ago they removed that redirect for security reasons. When I asked which web service they would like folks to use for metadata, they said there isn't one and offered their web interface as a solution.

What I Did

from metloom.pointdata import CDECPointData
pnt = CDECPointData('TUM', 'Tuolumne Meadows')
print(pnt.metadata)

yields an exception

requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http://cdec.water.ca.gov/cdecstation2/CDecServlet/getStationInfo?stationID=TUM

The recommended url to use now is https://cdec.water.ca.gov/dynamicapp/staMeta?station_id=TUM .

Mesowest Point Data throws exception when no station is found

  • metloom version: 0.2.3
  • Python version: 3.8
  • Operating System: Ubuntu

Description

I attempted to pull a mesowest station and misspelled the station id. As a result, 'STATION' is not in the response data, which throws an exception.

What I Did

from metloom.pointdata import MesowestPointData
from dateparser import parse

pnt = MesowestPointData('NOT_REAL_CODE', 'test')
pnt.get_daily_data(parse('2 days ago'), parse('today'), [pnt.ALLOWED_VARIABLES.TEMP])

This throws

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/micah/projects/m3works/metloom/metloom/pointdata/mesowest.py", line 280, in get_daily_data
    df = self._get_data(start_date, end_date, variables, interval='D')
  File "/home/micah/projects/m3works/metloom/metloom/pointdata/mesowest.py", line 119, in _get_data
    sensor_df = self._sensor_response_to_df(
  File "/home/micah/projects/m3works/metloom/metloom/pointdata/mesowest.py", line 183, in _sensor_response_to_df
    station_response = response_data['STATION']
KeyError: 'STATION'

Note the response data is:

{'SUMMARY': {'NUMBER_OF_OBJECTS': 0, 'RESPONSE_CODE': 2, 'VERSION': 'v2.13.8', 'RESPONSE_MESSAGE': 'No stations found for this request.', 'RESPONSE_TIME': 0}}

USGS - Snotel coordinate system

Question on where the NAD-83 EPSG code information came from:

data = data.set_crs("EPSG:4269").to_crs("EPSG:4326")

I needed a refresher myself today and came across this in the FAQ:

SNOTEL site coordinates in the WGS84 coordinate system (Latitude/Longitude) are available on our website

Comparing the values for 637:ID:SNTL, it looks like this is also what the SOAP API metadata returns.

So is this conversion necessary?

Resample dataframe oddity

  • metloom version: 0.2.1
  • Python version: 3.9
  • Operating System: Mac

Description

Resample dataframe returns an empty dataframe when the original index does not contain values for the resampled index. This shows up when a sensor returns data at odd increments (none of which line up with an hour or day).

What I Did

from datetime import datetime

from metloom.pointdata import MesowestPointData


def test_frequency():
    stat = MesowestPointData("KTRK", "Truckee-Tahoe")
    # returning nothing
    data = stat.get_hourly_data(
        datetime(2021, 11, 4), datetime(2021, 11, 8),
        [stat.ALLOWED_VARIABLES.TEMP]
    )
    print(data)
    assert data is not None

NRCS duplicate data

I ran into an issue where 585:WY:SNTL returns duplicate entries of the exact same value for hourly air temp. This should be simple to handle on the metloom side of things
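One way to handle this could be to drop exact repeats while keeping the original order, as in the sketch below. The (timestamp, value) record layout and the sample values are assumptions for illustration, and drop_duplicates here is a hypothetical helper rather than metloom code.

```python
def drop_duplicates(records):
    """Return records with exact repeats removed, preserving first-seen order."""
    seen = set()
    unique = []
    for record in records:
        if record not in seen:
            seen.add(record)
            unique.append(record)
    return unique


# Made-up hourly air temperature records with one exact repeat.
records = [
    ("2021-01-01 00:00", -5.2),
    ("2021-01-01 00:00", -5.2),
    ("2021-01-01 01:00", -5.6),
]
print(drop_duplicates(records))
```

This only removes identical entries; two different values for the same timestamp would both be kept and would need a separate decision.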

Metloom architecture - One class per file

As a newcomer to metloom, I find it quite challenging to locate specific classes or tests when trying to figure out execution logic. You can use your IDE (or preferred environment) to do a search, and it still leaves my head spinning when trying to navigate. Files also get quite long right now.

I am proposing three architectural changes/conventions:

  • File names to match their class
  • One class per file
  • Each class test file name corresponds with "test_my_awesome_class"

I have followed this pattern in the past and it makes code navigation much simpler (especially for newcomers). It also makes testing each class simpler and organizes your test suite in the same manner as your sources.
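As an illustration of the naming convention, the sketch below derives a module name and a test file name from a class name. Both helper functions are hypothetical and exist only to show the mapping; the exact file layout would be decided in the PR.

```python
import re


def class_to_module(class_name):
    """Convert a CamelCase class name to a snake_case module name."""
    partial = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", class_name)
    return re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", partial).lower()


def class_to_test_file(class_name):
    """Name of the test file that would cover the given class."""
    return "test_" + class_to_module(class_name) + ".py"


print(class_to_module("CDECPointData"))
print(class_to_test_file("MyAwesomeClass"))
```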

What do you guys think?

I am happy to prototype this in a PR

Buffer station search

Add a buffer parameter (assume same units as shapefile) when searching for stations to allow searches expanding past the shapefile bounding box
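A minimal sketch of the idea, assuming the search works from a (min_x, min_y, max_x, max_y) bounding box such as the one geopandas exposes as total_bounds. The buffered_bounds helper and the sample coordinates are hypothetical.

```python
def buffered_bounds(bounds, buffer=0.0):
    """Expand a (min_x, min_y, max_x, max_y) box by buffer on every side.

    The buffer is assumed to be in the same units as the bounds.
    """
    min_x, min_y, max_x, max_y = bounds
    return (min_x - buffer, min_y - buffer, max_x + buffer, max_y + buffer)


# Made-up bounding box in degrees, expanded by half a degree.
print(buffered_bounds((-120.0, 38.0, -119.0, 39.0), buffer=0.5))
```

With a default buffer of zero, existing searches would behave as they do now.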

Ground Temp fails on get_daily_data

  • metloom version: 0.2.11
  • Python version: 3.8.10
  • Operating System: Ubuntu 20.04

Description

I pulled GroundTemperature from the snotel network using get_daily_data and it errors out.

What I Did

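from datetime import datetime

from metloom.pointdata import SnotelPointData
from metloom.variables import SnotelVariables

pnt = SnotelPointData('505:CO:SNTL')
df = pnt.get_daily_data(
    datetime(2020, 1, 1), datetime(2020, 1, 2),
    [SnotelVariables.TEMPGROUND8IN]
)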
  File "/home/micah/projects/m3works/metloom/metloom/pointdata/snotel.py", line 120, in _fetch_data_for_variables
    params = extra_params[variable.name]
KeyError: 'GROUND TEMPERATURE -8IN'

Process finished with exit code 1

Add CSV file reader

Looking to add a csv reader for the SnowEx Stations and the Senator Beck stations.

Description

It would be nice to have the same mapped features for using flat files that we encounter with SnowEx and with Senator Beck.

Don't rely on climata

  • metloom version: 0.1.0
  • Python version: 3
  • Operating System: mac, ubuntu

Description

When installing metloom with the latest setuptools, climata fails to build. Climata depends on suds-jurko which uses deprecated setuptools methods. Suds-jurko appears unmaintained. Since we only have limited dependencies on Climata, the best course of action would be to abandon the dependency and rewrite the parts we need with a more current SOAP client

Implement get hourly data in USGS

We should:

  • Implement hourly data function in the USGS loader
  • Improve the search to be for daily and instantaneous variables
  • Add resampling fallbacks to all methods

NWS Forecast API

It would be very useful to have a class that handles 7 day forecasts from the NWS.

Here is the API https://www.weather.gov/documentation/services-web-api

We can call the points api to get the URL for the forecast https://api.weather.gov/points/42,-119

In this example, 3 forecast URLs are available

"forecast": "https://api.weather.gov/gridpoints/BOI/28,28/forecast",
"forecastHourly": "https://api.weather.gov/gridpoints/BOI/28,28/forecast/hourly",
"forecastGridData": "https://api.weather.gov/gridpoints/BOI/28,28",

forecast can be used to return the 12 hour increments

forecastHourly returns hourly data (hourly in local tz)

forecastGridData returns the 'raw' grid data (hourly UTC). This is likely the easiest version for us to use in the API. We can aggregate to daily as needed.
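The lookup step could be sketched as follows: given the JSON returned by the points endpoint, pull out the three forecast URLs. The sample payload is a trimmed stand-in built from the URLs quoted above (no network call is made), and forecast_urls is a hypothetical helper, not an existing metloom function.

```python
import json

FORECAST_KEYS = ("forecast", "forecastHourly", "forecastGridData")


def forecast_urls(points_payload):
    """Return the forecast URLs found under 'properties' in a points response."""
    properties = points_payload.get("properties", {})
    return {key: properties[key] for key in FORECAST_KEYS if key in properties}


# Trimmed sample of a points response, using the URLs listed in this issue.
sample = json.loads("""
{
  "properties": {
    "forecast": "https://api.weather.gov/gridpoints/BOI/28,28/forecast",
    "forecastHourly": "https://api.weather.gov/gridpoints/BOI/28,28/forecast/hourly",
    "forecastGridData": "https://api.weather.gov/gridpoints/BOI/28,28"
  }
}
""")

urls = forecast_urls(sample)
print(urls["forecastGridData"])
```

A forecast class could call this once per point and then request forecastGridData for the raw hourly values.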
