Comments (21)

pmav99 avatar pmav99 commented on July 30, 2024 1

We don't have a "template". I added some thoughts of how the API could/should be in the wiki: https://github.com/oceanmodeling/searvey/wiki/API-design
but feel free to open a new ticket to further discuss this.

from searvey.

brey avatar brey commented on July 30, 2024 1

I understand the point, but I wonder if we should bring this to the attention of Timothy first (with an issue on dataretrieval) and see what he has to say. Having said that, I leave it up to you guys.

SorooshMani-NOAA avatar SorooshMani-NOAA commented on July 30, 2024 1

I think it would be better to do what you suggest. I already created an issue here: DOI-USGS/dataretrieval-python#59. In the last meeting only two of us were present, so I just wanted to relay what was discussed. I haven't yet implemented anything for USGS.

cheginit avatar cheginit commented on July 30, 2024 1

@mroberge, Thanks for mentioning HyRiver. As Martin said, PyGeoHydro includes a class called NWIS that provides access to several NWIS endpoints (you can check out this example notebook). Also, I developed robust and performant engines for working with web services (AsyncRetriever and PyGeoOGC), so feel free to explore them and let me know if you need any help.

SorooshMani-NOAA avatar SorooshMani-NOAA commented on July 30, 2024 1

@cheginit I learned about your toolset a couple of weeks ago when working on a different project. Your software stack is very impressive and useful; however, since searvey is focused on giving access to the original data from the source at the lowest level, it makes more sense to use minimal packages like dataretrieval. That being said, I'm looking forward to using your software stack in other projects.

 avatar commented on July 30, 2024

thanks!

saeed-moghimi-noaa avatar saeed-moghimi-noaa commented on July 30, 2024

Thanks to @flackdl, who just shared the location of the latest file:
https://github.com/flackdl/cwwed/blob/ad39f0e9bea6a0a3bdbc937fea41994f4ed359ba/scripts/usgs.py

 avatar commented on July 30, 2024

great, I've made a first draft of the implementation of this here:
https://github.com/oceanmodeling/StormEvents/blob/7054095b4cb54ac733ea40091a5a2ffa1210c50b/stormevents/usgs/events.py#L313-L375

saeed-moghimi-noaa avatar saeed-moghimi-noaa commented on July 30, 2024

Thanks @zacharyburnettNOAA. See the email I just sent to Danny.

brey avatar brey commented on July 30, 2024

@SorooshMani-NOAA provided some input via email. I am reposting it here for completeness:

Today I noticed this package on GitHub: https://github.com/USGS-python/dataretrieval

I was wondering if this retrieves the same data that you were interested in or if there's another USGS database that you'd like to query?

This one seems to have the following data available for retrieval:
instantaneous values (iv)
daily values (dv)
statistics (stat)
site info (site)
discharge peaks (peaks)
discharge measurements (measurements)
water quality samples (qwdata)

which seems to be what the water services REST API provides:
https://waterservices.usgs.gov/rest/

George, if this is the same database that Jack is interested in, does it make sense to add a "normalization" wrapper on top of the dataretrieval package, or should searvey use the REST API directly?
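
For context on the "use the REST API directly" option, here is a minimal sketch of how a request URL for one of the services listed above could be assembled. The helper name `build_nwis_url` is hypothetical, and the `/nwis/<service>/` path and the query parameters (`format`, `sites`, `parameterCd`) are assumptions that should be checked against the water services documentation. Nothing is sent over the network here.

```python
from urllib.parse import urlencode

# Assumed base path of the USGS water services REST API.
BASE_URL = "https://waterservices.usgs.gov/nwis"


def build_nwis_url(service: str, site: str, parameter_cd: str) -> str:
    """Return a request URL for one site and one parameter code.

    `service` is one of the short service names from the list above,
    e.g. "iv" for instantaneous values or "dv" for daily values.
    """
    query = urlencode(
        {"format": "json", "sites": site, "parameterCd": parameter_cd}
    )
    return f"{BASE_URL}/{service}/?{query}"


# Example site number and parameter code, used only for illustration.
url = build_nwis_url("iv", "0148472405", "00035")
print(url)
```

A normalization wrapper on top of dataretrieval would hide this step entirely, which is part of the trade-off being asked about.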

brey avatar brey commented on July 30, 2024

I looked a bit into dataretrieval and it looks good. It already has users, the developers are considering a conda package (see issue 44 therein), and the lead developer works for USGS, which is beneficial for updates and access.

If it exposes all the data, then we can make a wrapper and use it as upstream dependency.

We can also invite Timothy Hodson to a meeting and discuss it.

SorooshMani-NOAA avatar SorooshMani-NOAA commented on July 30, 2024

Documenting relevant email between me and @Rjialceky (slightly modified):

[...] CSDL [...] is interested in the following observations in support of the coastal application teams' modeling work for NOAA products and services:

  • Surface water level
  • Water level datums, relative and geodetic observations
  • Water temperature
  • Water salinity
  • Water currents

I am primarily interested in datum points in support of navigation products and services; and, where unavailable, interested in the surface water levels to formulate new datums. The challenge of course is to have searvey assemble available observations sourced from NOAA, IOC, USGS, etc. into the normalized categories above. In the case of USGS, the number of [potentially] available parameters to sort out from their observation sites looks especially large—so any software API / wrapper that makes that easier, maintainable, etc. should be leveraged:

https://waterdata.usgs.gov/nwis/uv?referred_module=sw&search_criteria=multiple_site_no&submitted_form=introduction

@brey @pmav99 I can't find the other ticket where we discussed normalization and/or standardization of the outputs. Given the quoted email above, how would you approach adding getter functions? Do we have a template to follow?

SorooshMani-NOAA avatar SorooshMani-NOAA commented on July 30, 2024

So does that mean that if we want to add USGS data we (for now) just need to return the raw output we get from their API? In that case, is it really meaningful to have a wrapper around the USGS dataretrieval package, given that it already returns a dataframe?

SorooshMani-NOAA avatar SorooshMani-NOAA commented on July 30, 2024

Today I was exploring the dataretrieval package for obtaining USGS datasets. It seems that dataretrieval removes a lot of metadata from the NWIS response when it creates the data tables. For example, when getting the "instantaneous value" record for a station, we might get something like the following as the response from the web API:

{
    "name": "USGS:0148472405:00035:00000",
    "sourceInfo": {
        "geoLocation": {
            "geogLocation": {
                "latitude": 38.1389722,
                "longitude": -75.18363889,
                "srs": "EPSG:4326"
            },
            "localSiteXY": []
        },
        "note": [],
        "siteCode": [
            {
                "agencyCode": "USGS",
                "network": "NWIS",
                "value": "0148472405"
            }
        ],
        "siteName": "BUNTINGS GUT NEAR CEDARTOWN, MD",
        "siteProperty": [
            {
                "name": "siteTypeCd",
                "value": "ST-TS"
            },
            {
                "name": "hucCd",
                "value": "02040303"
            },
            {
                "name": "stateCd",
                "value": "24"
            },
            {
                "name": "countyCd",
                "value": "24047"
            }
        ],
        "siteType": [],
        "timeZoneInfo": {
            "daylightSavingsTimeZone": {
                "zoneAbbreviation": "EDT",
                "zoneOffset": "-04:00"
            },
            "defaultTimeZone": {
                "zoneAbbreviation": "EST",
                "zoneOffset": "-05:00"
            },
            "siteUsesDaylightSavingsTime": true
        }
    },
    "values": [
        {
            "censorCode": [],
            "method": [
                {
                    "methodDescription": "",
                    "methodID": 234506
                }
            ],
            "offset": [],
            "qualifier": [
                {
                    "network": "NWIS",
                    "qualifierCode": "P",
                    "qualifierDescription": "Provisional data subject to revision.",
                    "qualifierID": 0,
                    "vocabulary": "uv_rmk_cd"
                }
            ],
            "qualityControlLevel": [],
            "sample": [],
            "source": [],
            "value": [
                {
                    "dateTime": "2022-12-06T12:00:00.000-05:00",
                    "qualifiers": [
                        "P"
                    ],
                    "value": "1.2"
                }
            ]
        }
    ],
    "variable": {
        "noDataValue": -999999.0,
        "note": [],
        "oid": "45807109",
        "options": {
            "option": [
                {
                    "name": "Statistic",
                    "optionCode": "00000"
                }
            ]
        },
        "unit": {
            "unitCode": "mph"
        },
        "valueType": "Derived Value",
        "variableCode": [
            {
                "default": true,
                "network": "NWIS",
                "value": "00035",
                "variableID": 45807109,
                "vocabulary": "NWIS:UnitValues"
            }
        ],
        "variableDescription": "Wind speed, miles per hour",
        "variableName": "Wind speed, mph",
        "variableProperty": []
    }
}

But the resulting table only contains the following (example not from the same station!):

                           00060 00060_cd     site_no  00065 00065_cd
datetime
2022-12-06 08:45:00-05:00   4.48        P  0148471320   3.72        P

Does it make sense, then, to use the web API directly instead (going back to the original question!)? In any case we need to create tables of constants, such as parameter codes, quality codes, etc., so dataretrieval may not take much of the heavy lifting off searvey development in the end.

There's also the delay in getting issues fixed in dataretrieval and then waiting for the fix to reach conda before searvey can depend on it. Right now, for example, retrieving data from stations in different time zones results in an exception.
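
To make the "tables of constants" point concrete, here is a rough sketch of what such lookup tables might look like, whichever route is taken. The names `PARAMETER_CODES`, `QUALIFIER_CODES`, and `describe_column` are hypothetical, and only a handful of codes are listed: 00035 and "P" are taken from the response above, while the descriptions for 00060 and 00065 are the commonly used NWIS ones and should be verified against the official code tables.

```python
# Hypothetical lookup tables; a real version would cover far more codes.
PARAMETER_CODES = {
    "00035": "Wind speed, miles per hour",
    "00060": "Discharge, cubic feet per second",
    "00065": "Gage height, feet",
}

QUALIFIER_CODES = {
    "P": "Provisional data subject to revision.",
}


def describe_column(column: str) -> str:
    """Describe a column of the wide table, e.g. '00060' or '00060_cd'."""
    code, _, suffix = column.partition("_")
    if code not in PARAMETER_CODES:
        # Not a parameter column (e.g. 'site_no'): leave it unchanged.
        return column
    label = PARAMETER_CODES[code]
    return f"{label} (qualifier)" if suffix == "cd" else label


for name in ("00060", "00060_cd", "site_no"):
    print(name, "->", describe_column(name))
```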

SorooshMani-NOAA avatar SorooshMani-NOAA commented on July 30, 2024

After discussing the comment above with @pmav99 during the data retrieval meeting, we decided it makes more sense to start by calling the NWIS API directly and to use our own mapping of responses to data frames.
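
A minimal sketch of what "our own mapping" could look like, assuming each time series entry has the shape of the JSON shown above. The function name `timeseries_to_records` and the output field names are hypothetical, and the sample is a trimmed copy of that response. The result is a list of plain dicts, which could then be handed to `pandas.DataFrame` without losing the site and variable metadata.

```python
from datetime import datetime


def timeseries_to_records(ts: dict) -> list:
    """Flatten one time series entry into one record per observation."""
    source = ts["sourceInfo"]
    variable = ts["variable"]
    location = source["geoLocation"]["geogLocation"]
    no_data = variable["noDataValue"]
    records = []
    for block in ts["values"]:
        for item in block["value"]:
            value = float(item["value"])
            records.append({
                "site_no": source["siteCode"][0]["value"],
                "site_name": source["siteName"],
                "latitude": location["latitude"],
                "longitude": location["longitude"],
                "parameter_cd": variable["variableCode"][0]["value"],
                "unit": variable["unit"]["unitCode"],
                # The offset in the timestamp is kept as tz information.
                "datetime": datetime.fromisoformat(item["dateTime"]),
                # Map the service's no-data sentinel to None.
                "value": None if value == no_data else value,
                "qualifiers": ",".join(item["qualifiers"]),
            })
    return records


# Trimmed copy of the response shown earlier in the thread.
sample = {
    "sourceInfo": {
        "geoLocation": {"geogLocation": {
            "latitude": 38.1389722, "longitude": -75.18363889}},
        "siteCode": [{"agencyCode": "USGS", "value": "0148472405"}],
        "siteName": "BUNTINGS GUT NEAR CEDARTOWN, MD",
    },
    "values": [{"value": [{
        "dateTime": "2022-12-06T12:00:00.000-05:00",
        "qualifiers": ["P"],
        "value": "1.2",
    }]}],
    "variable": {
        "noDataValue": -999999.0,
        "unit": {"unitCode": "mph"},
        "variableCode": [{"value": "00035"}],
    },
}

records = timeseries_to_records(sample)
print(records[0])
```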

mroberge avatar mroberge commented on July 30, 2024

There are a variety of Python packages that use the USGS API. I set up a discussion among the authors here: mroberge/hydrofunctions#79

  • Taher Chegini @cheginit just added some elegant code to his HyRiver package that deals with timezone information from the USGS metadata.
  • My hydrofunctions package requests data, stores the original response, and formats it into dataframes upon request. My plan is to offer more ways to organize the dataframe in the future: a 'tidy' format, wide, and multiindex.
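
To illustrate the difference between the 'wide' and 'tidy' layouts mentioned above, here is a small sketch that reshapes the wide row shown earlier in the thread into one record per parameter. This is not how hydrofunctions or any of the other packages implement it; the function name `wide_to_tidy` and the output field names are hypothetical.

```python
def wide_to_tidy(row: dict) -> list:
    """Split one wide row into one record per parameter code."""
    # Columns that identify the observation rather than hold a value.
    fixed = {key: row[key] for key in ("datetime", "site_no")}
    tidy = []
    for column, value in row.items():
        if column in fixed or column.endswith("_cd"):
            continue  # skip identifier and qualifier columns
        tidy.append({
            **fixed,
            "parameter_cd": column,
            "value": value,
            "qualifier": row.get(f"{column}_cd"),
        })
    return tidy


# The wide row from the dataretrieval output quoted earlier.
wide_row = {
    "datetime": "2022-12-06 08:45:00-05:00",
    "00060": 4.48,
    "00060_cd": "P",
    "site_no": "0148471320",
    "00065": 3.72,
    "00065_cd": "P",
}

tidy_rows = wide_to_tidy(wide_row)
for record in tidy_rows:
    print(record)
```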

SorooshMani-NOAA avatar SorooshMani-NOAA commented on July 30, 2024

Thank you @mroberge, this information is very helpful.

SorooshMani-NOAA avatar SorooshMani-NOAA commented on July 30, 2024

I just realized that the metadata item in the tuple returned by get_iv can include information about the parameter code or site. I thought that the metadata only included header or URL information, but if the right arguments are passed, more information is extracted and included. I think the main question now is: how much of the data from the REST API do we want to keep untouched?

For IOC and COOPS stations we pretty much return whatever is provided by the web services, but for USGS NWIS we have to do some transformation either way. Can we then just take the output of dataretrieval (or even one of the other packages from #14 (comment)) as the main source of data and return it with minimal changes to fit searvey API conventions?
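
As a rough sketch of the "minimal changes" idea, a searvey-side function could simply forward the call and pass the returned tuple through. The name `get_usgs_iv` and its signature are hypothetical and not part of searvey; the keyword arguments passed to `get_iv` (`sites`, `parameterCd`, `start`, `end`) are assumed to match the installed dataretrieval version. A stand-in fetcher with made-up data is used below so the example runs without network access.

```python
def get_usgs_iv(site, parameter_cd, start, end, fetcher=None):
    """Return (data, metadata) for one site, with no reshaping of the data."""
    if fetcher is None:
        # Imported lazily so the sketch runs without the package installed.
        from dataretrieval import nwis
        fetcher = nwis.get_iv
    data, metadata = fetcher(
        sites=site, parameterCd=parameter_cd, start=start, end=end
    )
    return data, metadata


def fake_fetcher(**kwargs):
    """Stand-in for nwis.get_iv that returns made-up data for the demo."""
    table = [{"site_no": kwargs["sites"], kwargs["parameterCd"]: 1.2}]
    return table, {"query": kwargs}


data, metadata = get_usgs_iv(
    "0148472405", "00035", "2022-12-06", "2022-12-06", fetcher=fake_fetcher
)
print(data)
print(metadata)
```

Any renaming needed to fit searvey conventions would go between the fetch and the return, which keeps the amount of searvey-specific code small.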

SorooshMani-NOAA avatar SorooshMani-NOAA commented on July 30, 2024

@brey, @pmav99, @saeed-moghimi-noaa, if you haven't already, I highly recommend reading this summary by @mroberge: mroberge/hydrofunctions#79. (mentioned in #14 (comment))

After that I'd like us to re-evaluate why we want to add USGS support within searvey. My take is:

  • searvey is a one-stop shop for [original] measurement data used for validating coastal ocean models
  • dataretrieval returns the data in a form very close to original source (NWIS REST API)
  • We don't want to reinvent the wheel

I'm just thinking out loud, but given the above (as opposed to what I said to @pmav99 the other day), maybe it makes more sense to follow the original plan of using the dataretrieval package and just assume the return values are the original data from the source.

What do you think?

saeed-moghimi-noaa avatar saeed-moghimi-noaa commented on July 30, 2024

@SorooshMani-NOAA

What you suggested makes sense. I am fine with that. However, I will let @brey and @pmav99, as the lead developers of searvey, have the final say.

Thanks,

brey avatar brey commented on July 30, 2024

After the discussion with @SorooshMani-NOAA a few days back and seeing his progress (!) using dataretrieval, let's go with that. Thanks Soroosh.

I will close this issue and we can open more specific ones if needed during the implementation.
