pygeoapi plugins to access and process water level and surface currents from the IWLS public API

Python 98.97% Shell 1.03%

iwls_pygeoapi's People

Contributors

Stargazers

Watchers

iwls_pygeoapi's Issues

Experiments on using non-constant time intervals for DCF8

1). Experiment on using variable time intervals even for constant time intervals and dropping use of missing data flags.
2). Experiment on files sizes and processing performance compared to the constant time intervals type.

Implement code for the calculation of the WLs tidal predictions (& models forecasts eventually) error statistics.

1). We will begin with simple errors statistics (arithmetic mean, variance, median(?)) for tidal predictions for some of the most important TGs stations (keeping in mind that we could do those calculations for all TGs stations having WL obs data) .

2). We will use those errors statistics for the "uncertainties" attributes in the S111 and S104 DCF8 (and also S111,S104 DCF2 and DCF3) products.

3). We will also use this code to produce WLs forecast errors stats for both H2D2 and Spine-OneD models results and we will compare those forecast errors to prove that H2D2 forecasted WLs (adjusted using the last obs WLs a-la-Spine-OneD) provides better or similar forecast quality than Spine-OneD model.

4). We will then add error covariances calculations between the TGs afterwards.

Things to keep in mind:

The errors statistics code should be generic (data agnostic) as much as we can do since we will probably also use it for comparing HADCP measurements with models forecasted currents.
We will need to implement code that can extract data (WLs, currents) from models results (after conversion to S104, S111 (and S4**)) at the locations of the TGs and HADCP. It will be possible to use our already developped code (on the science network) that assign models grid points to tiles

Things to discuss:

We need to keep the errors stats data on disc. probably using a SQL DB?

Transfer 2D conversion code from the gitlab.science.gc.ca

Create another repo for that in https://github.com/DFO-CHS-Dynamic-Hydrographic-Products alongside this one.
Need to re-create (copy-paste) on-going issues on the gitlab.science.gc.ca in this new repo.
We will eventually merge both https://github.com/DFO-CHS-Dynamic-Hydrographic-Products/IWLS_pygeoapi and the transferred 2D code in order to not have redundant code.

Default behaviour of variable length strings

After more reading, default behavior of variable length strings in datasets for h5py are specifically encoded as byte strings. This is something we can look into further but I am not completely sure why byte strings are the default (compressibility maybe?). Given that is the default, perhaps we should consider keeping it as is and using the dataset.asstr() below to parse it. I have tested this and it works for the compound types as well.

From the h5py docs on strings:

String data in HDF5 datasets is read as bytes by default: bytes objects for variable-length strings, or numpy bytes arrays ('S' dtypes) for fixed-length strings. Use Dataset.asstr() to retrieve str objects.

Test formatter plugins for S-104/S-111

Test S-104/S-111 file creation using custom formatter plugins instead of pygeoapi processes

Update S100 file creation code and production templates for S-111 1.2.1 and S-104 1.1.0

The following changes in the latest version of the draft specs will need to be implemented:

-typeOfWaterLevelData and typeOfSurfaceCurrentData are now named dataDynamicity and the attribute is common between the two specs.
-verticalCoordinateBase is being removed from S111, seaSurface can now be set with verticalDatum
-verticalDatum enum as new possible values (seaSurface and seaFloor)
-new format for the horizontalCRS
-new optional metadata values for stations group (we might be able to populate some of them from the IWLS metadata):
• Location Maritime Resource Name
• URL to station or data portal
• Type of Tide
• Station type

We will need to wait for the final version of the specs to implement changes.

Organize repo into modules

Current stuctures of the repo is unsorted with all files in a central folder. Moving classes into modules where possible will organize the structure of the repo to improve readability.

Refactor the code to replace all "WDS" strings by "Spine"

Fix typos S-111 template (Group_F)

Fix typos S-111 template Group_F

Clean duplicate file in utils folder

-Remove duplicate S100 conversion script in utils folder.

Create new branch and tag for version 0.0.1

Test plugins with version 0.14 of pygeoapi

Test code for compatibility with the latest release of pygeoapi (we are currently using 0.12.0)
0.14.0 improved generate openapi script, this might help resolve #11.

Resolve issue with empty enums in the openapi yml config file

Running pygeoapi openapi generate on the current build results in a yml config that doesn't pass validation checks due to the presence of an empty enum field in the description of the get methods for each collection.

add export to csv method for station data in IwlsApiConnector()

Setup CI pipeline

Implementation of tests requires a CI pipeline, likely CircleCI or Jenkins which need to be automated.

Currently, I only have developer permissions on the repo and require something higher (admin likely) to setup the repo to the CircleCI or Jenkins.

Test if process_metadata can be moved to an external template file

Currently it is a big dictionary hard coded in process_iwls. It was done this way to match the example in the external process documentation for pygeoapi, metadata for the process can't be changed without changing the code, it would make more sense to keep them in a external Json file in the template directory.

Fix S111 provider to filter out empty returns in the web GUI map

Create utility script to export S-111/ S-104 to csv

Create utility script to export S-111/ S-104 generated by the API to csv

Test if PROCESS_METADATA can be move to a template file

Currently it is a big dictionary hard coded in process_iwls. It was done this way to match the example in the external process documentation for pygeoapi, Metadata for the process can't be changed without changing the code, it would make more sense to keep them in a external Json file in the template directory.

Add support for spine forecast

IwlsApiConnector request code needs to be modified to fetch spine forecasts when available.

Add the define-once-and-for-all principle in the coding guidelines.

Just one example for this: We now have the "properties" string defined at 11 places in three different source code files so if this "properties" string is changed for another string (for whatever reason) then we would have to change it 11 times.

Then it would be a good idea to define it just once in a variable that would be visible from all the source code and we would have to change only the content of this variable if the related string is changed.

Discuss spec attribute metadata fields in relation to S104/S111 and S100 spec standards

As per the discussion in PR #15 there is some redundancy between S104 and S100 spec. Further discussion is required to decide whether or not drop the horizontalDatumReference given that the horizontalDatum references the same thing which can lead to some confusion in addition to the possibility of both containing contrasting values (i.e. EPSG vs S100).

S<104,111>-DCF8: S<104,111>_TypeOf<WaterLevel,Current>Data: Implement the observedMinusPredicted and observedMinusForecast data production.

Create new install scripts for version 0.0.2

add S111 support to precompile python wheel.

Test new IWLS api features

IWLS public API has a new parameter for data request (temporal resolution), default is set at the highest possible resolution (one minute).

-Run server in test mode and check if the default value for the new parameter breaks the existing code.

Enforce Float and Integer precisions in metadata fields

Change templates and update methods to use Float and Integer (including enums) precisions described in S-111 1.2.0 and S-104 1.1.0.
Currently all floats are signed 64bit and all integer are signed 32bit.

S-104 1.1.0= page 93, tables 12.1, 12.2 ,12.3, 12,4
S-111 1.2.0 = page 95, tables 12.1, 12.2 ,12.3, 12,4

Use the following numpy class for encoding:
Float 64 = numpy.double
Float 32 = numpy.single
Int 32 = numpy.intc
Int 32 unsigned = numpy.uintc
Int 16 = numpy.short
Int 8 unsigned (for enumerations) = numpy.ubyte

Add Support for Hi-Lo S-104 files

Request Hi Lo data from IWLS with iwls_api_connector_waterlevels.py
Modify process_iwls.py to add option to request hi-lo in the process
Modify s104.py to allow correct trend calculation for hi-lo datasets (hardcode to steady or unknow?)

Use the same structure as the S104Dcf8:
---------------------------------------

    If one HADCP as more than one measurement point ("bin") associated with it then
    it can be considered as a "station" and should have its own "Group_nnn" structure.

    e.g. Woodward's Landing would have 3 bins (three measurements points with different coordinates
    in double precision in order that they can be displayed as two distinct points in a GUI
    viz app. in full zoom)

    Note that the "Hydrodynamic model forecast" current type would be interpolated
    at the stations "bins" points coordinates (when it is possible to do so)

    ...


    ATTRIBUTE "numInstances" {
        DATATYPE  H5T_STD_I32LE
        DATASPACE  SCALAR
        DATA {
        (0): 2
        }
    }
    ...

    GROUP "SurfaceCurrent.01" {
       ...

       ATTRIBUTE "typeOfCurrentData" {

           ...
           DATASPACE  SCALAR
           DATA {
           (0): Real-time observation
          }
       }
       ...

       ATTRIBUTE "numberOfStations" {
            DATATYPE  H5T_STD_I64LE
            DATASPACE  SCALAR
            DATA {
            (0): 3
            }
       }
       ...

       GROUP Group_001 {
           ...

           ATTRIBUTE "stationName" {
              ...

              DATASPACE  SCALAR
              DATA {
               (0): "Woodward's Landing: bin.001"
              }       
           }
           ...
        }
        GROUP Group_002 {
            ...

            ATTRIBUTE "stationName" {
                ...

               DATASPACE  SCALAR
               DATA {
               (0): "Woodward's Landing: bin.002"
               }
            }
            ...
        }
        GROUP Group_003 {
            ...

            ATTRIBUTE "stationName" {
                ...

               DATASPACE  SCALAR
               DATA {
               (0): "Woodward's Landing: bin.003"
               }
            }
            ...
        }
        GROUP "Positioning" {
          DATASET "geometryValues" {
             DATATYPE  H5T_COMPOUND {
                H5T_IEEE_F64LE "latitude";
                H5T_IEEE_F64LE "longitude";
             }
             DATASPACE  SIMPLE { ( 3 ) / ( 3 ) }
          ...         
        }
    } 
    GROUP "SurfaceCurrent.02" {
       ...

       ATTRIBUTE "typeOfCurrentData" {

           ...
           DATASPACE  SCALAR
           DATA {
           (0): Hydrodynamic model forecast
          }
       }
       ...

       ATTRIBUTE "numberOfStations" {
            DATATYPE  H5T_STD_I64LE
            DATASPACE  SCALAR
            DATA {
            (0): 3
            }
       }
       ...

       GROUP Group_001 {
           ...

           ATTRIBUTE "stationName" {
              ...

              DATASPACE  SCALAR
              DATA {
               (0): "Woodward's Landing: bin.001"
              }
           }
           ...
        }
        GROUP Group_002 {
            ...

            ATTRIBUTE "stationName" {
                ...

               DATASPACE  SCALAR
               DATA {
               (0): "Woodward's Landing: bin.002"
               }
            }
            ...
        }
        GROUP Group_003 {
            ...

            ATTRIBUTE "stationName" {
                ...

               DATASPACE  SCALAR
               DATA {
               (0): "Woodward's Landing: bin.003"
               }
            }
            ...
        }
        GROUP "Positioning" {
          DATASET "geometryValues" {
             DATATYPE  H5T_COMPOUND {
                H5T_IEEE_F64LE "latitude";
                H5T_IEEE_F64LE "longitude";
             }
             DATASPACE  SIMPLE { ( 3 ) / ( 3 ) }
          ...         
       }
    }

Add additional error handling

Tying up loose ends in the repo:

Currently the api returns a zip file which is actually a json file when the error. What the repo returns should reflect the type of file. I.e. if returning a zip, it should be a .zip extension and vice versa for a json.
Requests to pygeoapi are currently not handled, these should be wrapped in a try/except block.
Remove unnecessary generated files created from a request.
General refactoring where necessary.

Implement extraction of 2D models forecast data at CHS tides gauges and HADCP locations

Directly telated to issue #28.
We will use the tiled S104 DCF2 and S111 DCF2, DCF3 that will end up on https://dd.alpha.meteo.gc.ca (non-operational) and eventually on https://dd.weather.gc.ca (operational) as input data.
The CHS tide gauges are normally inside one tile pixel for HR DFO ports models and H2D2 models..
Nearest DCF2 wet pixel otherwise or-and for coarser res. models (CIOPSes).
Nearest DCF3 unstructured grid point otherwise for H2D2 models currents

Position compounds are created in reverse order (lat-lon instead of lon-lat)

GeometryValues arrays compounds are being created in reverse order (lat-lon instead of lon-lat).

Templates need to be fixed
Values in s100._create_positioning_group needs to be reversed
values in AxisName in templates are in correct order

dfo-chs-dynamic-hydrographic-products / iwls_pygeoapi Goto Github PK

iwls_pygeoapi's People

Contributors

Stargazers

Watchers

iwls_pygeoapi's Issues

Recommend Projects

Recommend Topics

Recommend Org