
iwls_pygeoapi's People

Contributors

glmcr, maximecarre1, princessmittens


iwls_pygeoapi's Issues

Implement code for the calculation of error statistics for the WL tidal predictions (and eventually model forecasts).

1). We will begin with simple error statistics (arithmetic mean, variance, median?) of the tidal predictions for some of the most important TG stations (keeping in mind that we could do these calculations for all TG stations that have WL obs data).

2). We will use those error statistics for the "uncertainties" attributes in the S111 and S104 DCF8 (and also S111 and S104 DCF2 and DCF3) products.

3). We will also use this code to produce WL forecast error stats for both the H2D2 and Spine-OneD model results, and we will compare those forecast errors to show that H2D2 forecasted WLs (adjusted using the last observed WLs, a la Spine-OneD) provide forecast quality that is better than or similar to the Spine-OneD model.

4). We will then add error covariance calculations between the TGs afterwards.
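The simple statistics in (1) could be sketched generically, so the same function serves WLs or currents; the function and argument names below are hypothetical:

```python
import numpy as np

def error_stats(observed, predicted):
    """Basic error statistics between two aligned series (data-agnostic)."""
    err = np.asarray(predicted, dtype=float) - np.asarray(observed, dtype=float)
    return {
        "mean": float(np.mean(err)),             # arithmetic mean of the errors
        "variance": float(np.var(err, ddof=1)),  # sample variance
        "median": float(np.median(err)),
    }
```

Because it only consumes two aligned arrays, the same function would work for WL predictions at a TG or for currents at an HADCP bin.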

Things to keep in mind:

  • The error-statistics code should be as generic (data-agnostic) as possible, since we will probably also use it to compare HADCP measurements with model-forecasted currents.

  • We will need to implement code that can extract data (WLs, currents) from model results (after conversion to S104, S111 (and S4**)) at the locations of the TGs and HADCP. It should be possible to reuse our already developed code (on the science network) that assigns model grid points to tiles.

Things to discuss:

  • We need to keep the error stats data on disk, probably using a SQL DB?

Default behaviour of variable length strings

After more reading: by default, variable-length strings in h5py datasets are encoded as byte strings. This is something we can look into further, but I am not completely sure why byte strings are the default (compressibility, maybe?). Given that it is the default, perhaps we should keep it as is and use dataset.asstr() (see the docs excerpt below) to parse it. I have tested this and it works for the compound types as well.

From the h5py docs on strings:

String data in HDF5 datasets is read as bytes by default: bytes objects for variable-length strings, or numpy bytes arrays ('S' dtypes) for fixed-length strings. Use Dataset.asstr() to retrieve str objects.
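A minimal illustration of that default, using an in-memory file (the dataset name is arbitrary); this assumes h5py 3.x behaviour, where variable-length strings read back as bytes:

```python
import io
import h5py

buf = io.BytesIO()
with h5py.File(buf, "w") as f:
    # a list of str becomes a variable-length string dataset
    dset = f.create_dataset("station_name", data=["Woodward's Landing"])
    raw = dset[0]            # read back as bytes by default
    text = dset.asstr()[0]   # decoded to str via Dataset.asstr()
```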

Update S100 file creation code and production templates for S-111 1.2.1 and S-104 1.1.0

The following changes in the latest version of the draft specs will need to be implemented:

-typeOfWaterLevelData and typeOfSurfaceCurrentData are now named dataDynamicity, and the attribute is common between the two specs.
-verticalCoordinateBase is being removed from S111; seaSurface can now be set with verticalDatum.
-The verticalDatum enum has new possible values (seaSurface and seaFloor).
-New format for the horizontalCRS.
-New optional metadata values for the stations group (we might be able to populate some of them from the IWLS metadata):
• Location Maritime Resource Name
• URL to station or data portal
• Type of Tide
• Station type

We will need to wait for the final version of the specs to implement changes.
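The renames could be handled with a small helper that works on any dict-like attribute store (including h5py's `attrs` object); the mapping below is a sketch covering only the rename listed above, and the names are taken directly from the draft specs:

```python
# old attribute name -> new attribute name in the draft specs
ATTR_RENAMES = {
    "typeOfWaterLevelData": "dataDynamicity",      # S-104 1.1.0
    "typeOfSurfaceCurrentData": "dataDynamicity",  # S-111 1.2.1
}

def migrate_attrs(attrs):
    """Rename deprecated attributes in place; attrs is any MutableMapping."""
    for old, new in ATTR_RENAMES.items():
        if old in attrs:
            attrs[new] = attrs[old]
            del attrs[old]
    return attrs
```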

Organize repo into modules

The current structure of the repo is flat, with all files in a single folder. Moving classes into modules where possible will organize the repo and improve readability.

Setup CI pipeline

Implementing tests requires a CI pipeline, likely CircleCI or Jenkins, so that test runs are automated.

Currently, I only have developer permissions on the repo and will need something higher (likely admin) to set up the repo with CircleCI or Jenkins.
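For reference, a minimal CircleCI config could look like the sketch below; the job name, Python version, and pytest entry point are assumptions, not decisions already made for this repo:

```yaml
# .circleci/config.yml (sketch)
version: 2.1
jobs:
  test:
    docker:
      - image: cimg/python:3.10
    steps:
      - checkout
      - run: pip install -r requirements.txt
      - run: pytest
workflows:
  main:
    jobs:
      - test
```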

Test if process_metadata can be moved to an external template file

Currently it is a big dictionary hard-coded in process_iwls. It was done this way to match the example in the pygeoapi external-process documentation. Since the metadata for the process can't be changed without changing the code, it would make more sense to keep it in an external JSON file in the template directory.


Add the define-once-and-for-all principle to the coding guidelines.

Just one example for this: we now have the "properties" string defined in 11 places across three source files, so if this string is ever changed to another string (for whatever reason) we would have to change it 11 times.

It would be better to define it just once, in a variable visible from all the source code, so that only the content of this variable needs to change if the related string changes.
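A minimal sketch of the principle; the module and constant names are hypothetical:

```python
# s100_constants.py (hypothetical module): define shared strings once,
# import them everywhere instead of repeating the literal
PROPERTIES_KEY = "properties"

def get_properties(feature):
    """Look up the properties of a GeoJSON-like feature via the shared key."""
    return feature[PROPERTIES_KEY]
```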

Test new IWLS api features

The IWLS public API has a new parameter for data requests (temporal resolution); the default is the highest possible resolution (one minute).

-Run the server in test mode and check whether the default value of the new parameter breaks the existing code.

Enforce Float and Integer precisions in metadata fields

Change templates and update methods to use the Float and Integer (including enum) precisions described in S-111 1.2.0 and S-104 1.1.0.
Currently all floats are signed 64-bit and all integers are signed 32-bit.

S-104 1.1.0: page 93, tables 12.1, 12.2, 12.3, 12.4
S-111 1.2.0: page 95, tables 12.1, 12.2, 12.3, 12.4

Use the following numpy types for encoding:
Float 64 = numpy.double
Float 32 = numpy.single
Int 32 = numpy.intc
Int 32 unsigned = numpy.uintc
Int 16 = numpy.short
Int 8 unsigned (for enumerations) = numpy.ubyte
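The mapping above could be centralized in one dictionary (its name is hypothetical) so every dataset and attribute creation pulls the precision from a single place:

```python
import numpy as np

# one place to look up the S-100 storage precision for each logical type
S100_DTYPES = {
    "float64": np.double,  # 8 bytes
    "float32": np.single,  # 4 bytes
    "int32":   np.intc,    # 4 bytes, signed
    "uint32":  np.uintc,   # 4 bytes, unsigned
    "int16":   np.short,   # 2 bytes
    "uint8":   np.ubyte,   # 1 byte, for enumerations
}

# e.g., instead of letting numpy default to 64-bit values:
num_instances = np.intc(2)
uncertainty = np.single(0.05)
```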

Add Support for Hi-Lo S-104 files

  • Request Hi Lo data from IWLS with iwls_api_connector_waterlevels.py
  • Modify process_iwls.py to add option to request hi-lo in the process
  • Modify s104.py to allow correct trend calculation for hi-lo datasets (hardcode to steady or unknown?)

Update repo to follow PEP8 guidelines

In a recent MR for refactoring code, some of the previous work (that I wrote) does not comply with PEP8 style/formatting. The goal is to fix this, applying PEP8 style and ensuring that it is consistent across the repository.

Add support for VTG forecasts

The IWLS team is starting tests to integrate VTG station forecasts into the database; support for the new data source (wlf-vtg) will need to be added to IwlsApiConnectorWaterLevels().

Attribute Metadata bug fixes

As discussed previously, S104/S111 should have similar group metadata, as outlined below by @glmcr.

In addition to this there are redundant attributes:
northBoundLongitude - to be removed
southBoundLongitude - to be removed
numInstance - redundant (should be numInstances; the missing 's' was likely a typo, so a second attribute was created)

Use the same structure as the S104Dcf8:
---------------------------------------

    If one HADCP has more than one measurement point ("bin") associated with it, then
    it can be considered a "station" and should have its own "Group_nnn" structure.

    e.g. Woodward's Landing would have 3 bins (three measurement points with different
    coordinates, in double precision, so that they can be displayed as distinct points
    in a GUI viewer app at full zoom)

    Note that the "Hydrodynamic model forecast" current type would be interpolated
    at the station "bin" point coordinates (when it is possible to do so)

    ...


    ATTRIBUTE "numInstances" {
        DATATYPE  H5T_STD_I32LE
        DATASPACE  SCALAR
        DATA {
        (0): 2
        }
    }
    ...

    GROUP "SurfaceCurrent.01" {
       ...

       ATTRIBUTE "typeOfCurrentData" {

           ...
           DATASPACE  SCALAR
           DATA {
           (0): Real-time observation
          }
       }
       ...

       ATTRIBUTE "numberOfStations" {
            DATATYPE  H5T_STD_I64LE
            DATASPACE  SCALAR
            DATA {
            (0): 3
            }
       }
       ...

       GROUP Group_001 {
           ...

           ATTRIBUTE "stationName" {
              ...

              DATASPACE  SCALAR
              DATA {
               (0): "Woodward's Landing: bin.001"
              }       
           }
           ...
        }
        GROUP Group_002 {
            ...

            ATTRIBUTE "stationName" {
                ...

               DATASPACE  SCALAR
               DATA {
               (0): "Woodward's Landing: bin.002"
               }
            }
            ...
        }
        GROUP Group_003 {
            ...

            ATTRIBUTE "stationName" {
                ...

               DATASPACE  SCALAR
               DATA {
               (0): "Woodward's Landing: bin.003"
               }
            }
            ...
        }
        GROUP "Positioning" {
          DATASET "geometryValues" {
             DATATYPE  H5T_COMPOUND {
                H5T_IEEE_F64LE "latitude";
                H5T_IEEE_F64LE "longitude";
             }
             DATASPACE  SIMPLE { ( 3 ) / ( 3 ) }
          ...         
        }
    } 
    GROUP "SurfaceCurrent.02" {
       ...

       ATTRIBUTE "typeOfCurrentData" {

           ...
           DATASPACE  SCALAR
           DATA {
           (0): Hydrodynamic model forecast
          }
       }
       ...

       ATTRIBUTE "numberOfStations" {
            DATATYPE  H5T_STD_I64LE
            DATASPACE  SCALAR
            DATA {
            (0): 3
            }
       }
       ...

       GROUP Group_001 {
           ...

           ATTRIBUTE "stationName" {
              ...

              DATASPACE  SCALAR
              DATA {
               (0): "Woodward's Landing: bin.001"
              }
           }
           ...
        }
        GROUP Group_002 {
            ...

            ATTRIBUTE "stationName" {
                ...

               DATASPACE  SCALAR
               DATA {
               (0): "Woodward's Landing: bin.002"
               }
            }
            ...
        }
        GROUP Group_003 {
            ...

            ATTRIBUTE "stationName" {
                ...

               DATASPACE  SCALAR
               DATA {
               (0): "Woodward's Landing: bin.003"
               }
            }
            ...
        }
        GROUP "Positioning" {
          DATASET "geometryValues" {
             DATATYPE  H5T_COMPOUND {
                H5T_IEEE_F64LE "latitude";
                H5T_IEEE_F64LE "longitude";
             }
             DATASPACE  SIMPLE { ( 3 ) / ( 3 ) }
          ...         
       }
    } 
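The redundant-attribute fixes listed at the top of this issue could be sketched as a small cleanup pass over any dict-like attribute store, such as h5py's `attrs`; the function name is hypothetical:

```python
# attributes flagged above as redundant in the S104/S111 group metadata
REDUNDANT_ATTRS = ("northBoundLongitude", "southBoundLongitude")

def fix_feature_attrs(attrs):
    """Drop redundant attributes and repair the numInstance typo in place."""
    for name in REDUNDANT_ATTRS:
        if name in attrs:
            del attrs[name]
    if "numInstance" in attrs:
        if "numInstances" not in attrs:  # keep an existing correct value
            attrs["numInstances"] = attrs["numInstance"]
        del attrs["numInstance"]
    return attrs
```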

Add additional error handling

Tying up loose ends in the repo:

  • Currently, when an error occurs, the API returns a file with a .zip extension that is actually a JSON file. The returned extension should reflect the actual file type: .zip for a zip archive, .json for JSON.
  • Errors from requests to pygeoapi are currently not handled; these calls should be wrapped in a try/except block.
  • Remove unnecessary generated files created from a request.
  • General refactoring where necessary.
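The try/except wrapping could look like the sketch below; the function name and the choice of error type are assumptions, and `session` is injectable so the behaviour can be exercised without the network:

```python
import requests

def fetch_json(url, params=None, session=None, timeout=30):
    """GET an endpoint, converting network failures into one app-level error."""
    http = session or requests
    try:
        resp = http.get(url, params=params, timeout=timeout)
        resp.raise_for_status()  # turn HTTP 4xx/5xx into exceptions too
        return resp.json()
    except requests.exceptions.RequestException as exc:
        raise RuntimeError(f"request to {url} failed: {exc}") from exc
```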

Implement extraction of 2D models forecast data at CHS tides gauges and HADCP locations

  • Directly related to issue #28.
  • We will use the tiled S104 DCF2 and S111 DCF2, DCF3 that will end up on https://dd.alpha.meteo.gc.ca (non-operational) and eventually on https://dd.weather.gc.ca (operational) as input data.
  • The CHS tide gauges are normally inside one tile pixel for the HR DFO ports models and the H2D2 models.
  • Otherwise, and for coarser-resolution models (the CIOPSes), use the nearest DCF2 wet pixel.
  • Otherwise, for H2D2 model currents, use the nearest DCF3 unstructured grid point.
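The nearest-wet-pixel lookup could be sketched with plain numpy; all names are hypothetical, and squared lat/lon distance is only a rough proxy at station scale (a proper geodesic metric may be preferable in production):

```python
import numpy as np

def nearest_wet_index(grid_lat, grid_lon, wet_mask, stn_lat, stn_lon):
    """Flat index of the wet grid point closest to a station."""
    lat = np.asarray(grid_lat, dtype=float).ravel()
    lon = np.asarray(grid_lon, dtype=float).ravel()
    wet = np.asarray(wet_mask, dtype=bool).ravel()
    d2 = (lat - stn_lat) ** 2 + (lon - stn_lon) ** 2
    d2[~wet] = np.inf  # exclude dry pixels from the search
    return int(np.argmin(d2))
```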
