dfo-chs-dynamic-hydrographic-products / iwls_pygeoapi
pygeoapi plugins to access and process water level and surface currents from the IWLS public API
1). Experiment with using the variable time interval encoding even for constant-interval data, and with dropping the use of missing-data flags.
2). Compare file sizes and processing performance against the constant time interval type.
1). We will begin with simple error statistics (arithmetic mean, variance, median(?)) of tidal predictions for some of the most important TG stations (keeping in mind that we could do those calculations for all TG stations having WL obs data).
2). We will use those error statistics for the "uncertainties" attributes in the S111 and S104 DCF8 (and also S111 and S104 DCF2 and DCF3) products.
3). We will also use this code to produce WL forecast error stats for both the H2D2 and Spine-OneD model results, and we will compare those forecast errors to show that H2D2 forecasted WLs (adjusted using the last observed WLs, a la Spine-OneD) provide similar or better forecast quality than the Spine-OneD model.
4). We will then add error covariance calculations between the TGs afterwards.
Things to keep in mind:
The error statistics code should be as generic (data agnostic) as we can make it, since we will probably also use it for comparing HADCP measurements with model-forecasted currents.
We will need to implement code that can extract data (WLs, currents) from model results (after conversion to S104, S111 (and S4**)) at the locations of the TGs and HADCPs. It will be possible to reuse our already developed code (on the science network) that assigns model grid points to tiles.
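A minimal, data-agnostic sketch of the basic error statistics described above, assuming the observed and predicted series are already aligned in time (the function and variable names are illustrative, not existing repo code):

```python
import numpy as np

def error_stats(observed: np.ndarray, predicted: np.ndarray) -> dict:
    """Basic error statistics (mean, variance, median) between two aligned
    series, e.g. WL observations vs tidal predictions at one TG station."""
    errors = predicted - observed
    return {
        "mean": float(np.mean(errors)),
        "variance": float(np.var(errors)),
        "median": float(np.median(errors)),
    }
```

Because it only takes two aligned arrays, the same function could later be applied to HADCP currents vs model-forecasted currents without change.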
Things to discuss:
After more reading: by default, h5py encodes variable-length strings in datasets as byte strings. This is something we can look into further, but I am not completely sure why byte strings are the default (compressibility, maybe?). Given that it is the default, perhaps we should consider keeping it as is and using dataset.asstr() to parse it, as described below. I have tested this and it works for the compound types as well.
From the h5py docs on strings:
String data in HDF5 datasets is read as bytes by default: bytes objects for variable-length strings, or numpy bytes arrays ('S' dtypes) for fixed-length strings. Use Dataset.asstr() to retrieve str objects.
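A small sketch of the behaviour the docs describe, using an in-memory file (the file and dataset names are illustrative):

```python
import h5py

# Variable-length strings are stored as bytes by default;
# Dataset.asstr() wraps the dataset so reads decode to str.
with h5py.File("demo.h5", "w", driver="core", backing_store=False) as f:
    dt = h5py.string_dtype()  # variable-length UTF-8 string dtype
    ds = f.create_dataset("stationName", data=["Woodward's Landing"], dtype=dt)
    raw = ds[0]           # bytes object
    text = ds.asstr()[0]  # decoded str
```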
Test S-104/S-111 file creation using custom formatter plugins instead of pygeoapi processes
The following changes in the latest version of the draft specs will need to be implemented:
-typeOfWaterLevelData and typeOfSurfaceCurrentData are now named dataDynamicity, and the attribute is common between the two specs.
-verticalCoordinateBase is being removed from S111; seaSurface can now be set with verticalDatum.
-verticalDatum enum has new possible values (seaSurface and seaFloor).
-new format for the horizontalCRS.
-new optional metadata values for stations group (we might be able to populate some of them from the IWLS metadata):
• Location Maritime Resource Name
• URL to station or data portal
• Type of Tide
• Station type
We will need to wait for the final version of the specs to implement changes.
The current structure of the repo is unsorted, with all files in a central folder. Moving classes into modules where possible will organize the repo and improve readability.
Fix typos in the S-111 template Group_F.
-Remove duplicate S100 conversion script in utils folder.
Create new branch and tag for version 0.0.1
Running pygeoapi openapi generate on the current build results in a YAML config that doesn't pass validation checks, due to the presence of an empty enum field in the description of the GET methods for each collection.
Add an export-to-CSV method for station data in IwlsApiConnector().
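A hedged sketch of what such a helper could look like; the GeoJSON-like shape of the station data (an `id` plus a `properties` dict per feature) is an assumption, and the function name is hypothetical:

```python
import csv
import io

def station_data_to_csv(features: list) -> str:
    """Flatten station features (id + properties) into CSV text."""
    buf = io.StringIO()
    writer = None
    for feat in features:
        row = {"id": feat.get("id"), **feat.get("properties", {})}
        if writer is None:
            # Take the column order from the first feature.
            writer = csv.DictWriter(buf, fieldnames=list(row))
            writer.writeheader()
        writer.writerow(row)
    return buf.getvalue()
```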
Implementation of tests requires a CI pipeline, likely CircleCI or Jenkins, which needs to be set up.
Currently I only have developer permissions on the repo and require something higher (likely admin) to connect the repo to CircleCI or Jenkins.
Currently the process metadata is a big dictionary hard-coded in process_iwls. It was done this way to match the example in the pygeoapi external process documentation, but metadata for the process can't be changed without changing the code; it would make more sense to keep it in an external JSON file in the template directory.
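A minimal sketch of loading the metadata from such a file instead; the file name and template-directory path are assumptions, not existing repo layout:

```python
import json

# Hypothetical location inside the template directory.
METADATA_PATH = "templates/process_metadata.json"

def load_process_metadata(path: str = METADATA_PATH) -> dict:
    """Load the pygeoapi process metadata from an external JSON file,
    so it can be edited without touching the code."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```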
Create utility script to export S-111/ S-104 generated by the API to csv
IwlsApiConnector request code needs to be modified to fetch spine forecasts when available.
Just one example: the "properties" string is currently defined in 11 places across three different source files, so if this string ever has to change (for whatever reason) we would have to change it 11 times.
It would be better to define it just once, in a variable visible from all the source code, so that only the content of this variable needs to change if the string does.
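A sketch of the suggested single definition; the shared module name (iwls_constants.py) is hypothetical:

```python
# iwls_constants.py (hypothetical shared module):
# define the key once, import it wherever it is needed.
PROPERTIES_KEY = "properties"

# Usage elsewhere (illustrative):
# from iwls_constants import PROPERTIES_KEY
feature = {"properties": {"value": 1.2}}
props = feature[PROPERTIES_KEY]
```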
As per the discussion in PR #15, there is some redundancy between the S104 and S100 specs. Further discussion is required to decide whether or not to drop horizontalDatumReference, given that horizontalDatum references the same thing; this can lead to confusion, in addition to the possibility of the two containing contrasting values (i.e. EPSG vs S100).
Add S111 support to the precompiled Python wheel.
The IWLS public API has a new parameter for data requests (temporal resolution); the default is set at the highest possible resolution (one minute).
-Run the server in test mode and check whether the default value for the new parameter breaks the existing code.
Change templates and update methods to use the Float and Integer (including enum) precisions described in S-111 1.2.0 and S-104 1.1.0.
Currently all floats are signed 64-bit and all integers are signed 32-bit.
S-104 1.1.0: page 93, tables 12.1, 12.2, 12.3, 12.4
S-111 1.2.0: page 95, tables 12.1, 12.2, 12.3, 12.4
Use the following numpy class for encoding:
Float 64 = numpy.double
Float 32 = numpy.single
Int 32 = numpy.intc
Int 32 unsigned = numpy.uintc
Int 16 = numpy.short
Int 8 unsigned (for enumerations) = numpy.ubyte
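The mapping above could be centralized in one dictionary so the templates and update methods draw from a single source; the dictionary name and keys are illustrative:

```python
import numpy as np

# S-100 value type -> numpy scalar type, as listed above.
S100_DTYPES = {
    "float64": np.double,  # Float 64
    "float32": np.single,  # Float 32
    "int32":   np.intc,    # Int 32
    "uint32":  np.uintc,   # Int 32 unsigned
    "int16":   np.short,   # Int 16
    "uint8":   np.ubyte,   # Int 8 unsigned (for enumerations)
}

# e.g. encode water level values at the precision the spec asks for:
values = np.array([1.234, 5.678], dtype=S100_DTYPES["float32"])
```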
Implement S-111 conversion process plugin
In a recent MR for refactoring code, some of the previous work (that I wrote) does not comply with PEP8 style/formatting. The goal is to fix this, applying PEP8 style and ensuring it is consistent across the repository.
Update README to contain coding procedures for the repo.
The IWLS team is starting tests to integrate VTG station forecasts into the database; support for the new data source (wlf-vtg) will need to be added to IwlsApiConnectorWaterLevels().
As discussed previously, the S104/S111 should have similar group metadata outlined below by @glmcr
In addition to this there are redundant attributes:
northBoundLongitude
- to be removed
southBoundLongitude
- to be removed
numInstance
- redundant (the attribute should be numInstances; the missing 's' was likely a typo, so a new attribute ended up being created)
Use the same structure as the S104Dcf8:
---------------------------------------
If one HADCP has more than one measurement point ("bin") associated with it, then
it can be considered a "station" and should have its own "Group_nnn" structure.
e.g. Woodward's Landing would have 3 bins (three measurement points with different coordinates
in double precision, so that they can be displayed as distinct points in a GUI
viz app at full zoom)
Note that the "Hydrodynamic model forecast" current type would be interpolated
at the stations' "bin" point coordinates (when it is possible to do so)
...
ATTRIBUTE "numInstances" {
DATATYPE H5T_STD_I32LE
DATASPACE SCALAR
DATA {
(0): 2
}
}
...
GROUP "SurfaceCurrent.01" {
...
ATTRIBUTE "typeOfCurrentData" {
...
DATASPACE SCALAR
DATA {
(0): Real-time observation
}
}
...
ATTRIBUTE "numberOfStations" {
DATATYPE H5T_STD_I64LE
DATASPACE SCALAR
DATA {
(0): 3
}
}
...
GROUP Group_001 {
...
ATTRIBUTE "stationName" {
...
DATASPACE SCALAR
DATA {
(0): "Woodward's Landing: bin.001"
}
}
...
}
GROUP Group_002 {
...
ATTRIBUTE "stationName" {
...
DATASPACE SCALAR
DATA {
(0): "Woodward's Landing: bin.002"
}
}
...
}
GROUP Group_003 {
...
ATTRIBUTE "stationName" {
...
DATASPACE SCALAR
DATA {
(0): "Woodward's Landing: bin.003"
}
}
...
}
GROUP "Positioning" {
DATASET "geometryValues" {
DATATYPE H5T_COMPOUND {
H5T_IEEE_F64LE "latitude";
H5T_IEEE_F64LE "longitude";
}
DATASPACE SIMPLE { ( 3 ) / ( 3 ) }
...
}
}
GROUP "SurfaceCurrent.02" {
...
ATTRIBUTE "typeOfCurrentData" {
...
DATASPACE SCALAR
DATA {
(0): Hydrodynamic model forecast
}
}
...
ATTRIBUTE "numberOfStations" {
DATATYPE H5T_STD_I64LE
DATASPACE SCALAR
DATA {
(0): 3
}
}
...
GROUP Group_001 {
...
ATTRIBUTE "stationName" {
...
DATASPACE SCALAR
DATA {
(0): "Woodward's Landing: bin.001"
}
}
...
}
GROUP Group_002 {
...
ATTRIBUTE "stationName" {
...
DATASPACE SCALAR
DATA {
(0): "Woodward's Landing: bin.002"
}
}
...
}
GROUP Group_003 {
...
ATTRIBUTE "stationName" {
...
DATASPACE SCALAR
DATA {
(0): "Woodward's Landing: bin.003"
}
}
...
}
GROUP "Positioning" {
DATASET "geometryValues" {
DATATYPE H5T_COMPOUND {
H5T_IEEE_F64LE "latitude";
H5T_IEEE_F64LE "longitude";
}
DATASPACE SIMPLE { ( 3 ) / ( 3 ) }
...
}
}
Tying up loose ends in the repo:
The geometryValues compound arrays are being created in reverse order (lat-lon instead of lon-lat).
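One way to make the field order explicit rather than an accident of construction is to declare the compound dtype up front; the field order and coordinates below are illustrative, not a statement of which order the spec requires:

```python
import h5py
import numpy as np

# Explicit compound dtype: the declared field order is what ends up on disk.
geometry_dtype = np.dtype([("longitude", np.double),
                           ("latitude", np.double)])

coords = np.array([(-123.07, 49.12),
                   (-123.08, 49.13),
                   (-123.09, 49.14)], dtype=geometry_dtype)

# Write to an in-memory file under a Positioning group, as in the dump above.
with h5py.File("pos.h5", "w", driver="core", backing_store=False) as f:
    grp = f.create_group("Positioning")
    grp.create_dataset("geometryValues", data=coords)
```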