Distributed Scalable Geospatial Data Server
https://github.com/nci/gsky/blob/master/processor/tile_grpc.go#L27 sets the output channel to hold up to 100 rasters without blocking. In most WMS settings, due to the small requested polygon size, there is virtually no chance that anywhere near 100 rasters are returned from the GDAL workers. But in WCS settings, users often want to study a large region of interest. For example, @juan-guerschman sent a WCS request to study Australia as a whole in full resolution, which is a fairly common scenario: http://gsky ip/ows?SERVICE=WCS&service=WCS&crs=EPSG:4326&format=GeoTIFF&request=GetCoverage&height=7451&width=9580&version=1.0.0&bbox=110,-45,155,-10&coverage=global:c6:monthly_frac_cover&time=2018-03-01T00:00:00.000Z
In this scenario, GSKY fills the channel irrespective of the physical memory size. If the server doesn't have enough memory, the ows process gets hit by the OOM killer.
A quick fix: we know the requested width and height as inputs, so we can size the channel as a function of total free memory divided by (width x height). If memory is too low, we reject the request.
This fix is a greedy heuristic that will not find the optimum trade-off between concurrent processing and memory allocation, but it might be good enough in practice. The underlying problem is essentially a packing problem (https://en.wikipedia.org/wiki/Bin_packing_problem), which can be NP-hard. A good long-term solution is left to future work.
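The quick fix above could be sketched as follows. This is a minimal sketch, not GSKY code: `maxRastersByMemory` is a hypothetical helper, the free-memory figure would come from e.g. /proc/meminfo in practice, and 4 bytes per pixel (float32 rasters) is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

const bytesPerPixel = 4 // assumption: float32 single-band rasters

// maxRastersByMemory sizes the output channel as a function of free
// memory and the requested raster dimensions. It returns an error when
// even one raster would exceed the memory budget, so the request can
// be rejected up front instead of triggering the OOM killer.
func maxRastersByMemory(freeBytes, width, height int64) (int, error) {
	rasterBytes := width * height * bytesPerPixel
	if rasterBytes <= 0 {
		return 0, errors.New("invalid raster size")
	}
	// Only budget half of the free memory for in-flight rasters.
	n := freeBytes / 2 / rasterBytes
	if n < 1 {
		return 0, errors.New("not enough memory for this request")
	}
	if n > 100 {
		n = 100 // keep the current upper bound of 100 rasters
	}
	return int(n), nil
}

func main() {
	// e.g. 8GB free, the full-resolution Australia WCS request above
	n, err := maxRastersByMemory(8<<30, 9580, 7451)
	fmt.Println(n, err)
}
```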
The WMS 1.3.0 GetCapabilities document advertises that the GetMap request can return images in the image/png, image/jpeg and image/gif formats. However, GetMap requests with FORMAT=image/jpeg or FORMAT=image/gif return images in PNG format (and Content-Type: image/png in the response header).
There is room for improvement in the current GSKY gRPC server and workers in multiple places, both for high availability and for performance.
1. If a gdal-process dies, the gRPC server currently does not restart it, and there is no way to restart the dead process other than restarting the entire gRPC server, because the gRPC server establishes a unix domain socket when it starts the gdal-process.
2. There has been a proposal for a C++-based implementation of the workers to boost performance. I studied the source of the C++ implementation. Although it was long thought that the overhead of the Go workers was due to thread context switches when calling GDAL C functions from Go, I found that the performance gap between the C++ and Go workers may be due to an architectural difference, at least in theory. Let's compare and contrast the main architectural difference between the current Go workers and the proposed C++ workers:
Current go workers - FastCGI style
Essentially the gRPC server acts as a process supervisor and forwards gRPC requests to a pool of gdal processes via unix domain sockets. This is highly similar to FastCGI in the web server world. The drawback is that we have to marshal and unmarshal protocol buffer messages containing large quantities of image data between the main server and the gdal processes via the unix domain sockets, hence the overhead.
Proposed c++ workers - share listener fd then fork to share nothing
The main server creates a server socket listener and then forks (i.e. linux fork()) a pool of child processes sharing the listener fd. The gRPC server in each child process then accepts connections on the shared listener fd and directly processes gRPC requests and sends back responses. This essentially eliminates the need for an extra CGI layer, hence no marshaling/unmarshaling overhead for large amounts of image data. The drawback is that we won't be able to easily plug in third-party processes as in the FastCGI style. In the long run this will be a problem, because we want to execute arbitrary client processes, but in the short term it shouldn't be a concern.
Having said that, the C++ workers are still in their infancy and their algorithmic correctness needs to be verified. For the foreseeable short run, it is possible to enhance the current Go workers to minimise or even close the performance gap using socket sharding via the SO_REUSEPORT socket option. This feature allows load balancing TCP servers binding to the same listening address and port in kernel space. More importantly, one can launch identical gRPC servers, each directly bundled with a gdal process, thereby eliminating the need for IPC between the main supervisory server and the gdal processes. This is the essence of the architecture of the C++ workers. Concretely, we set SO_REUSEPORT and fork a pool of processes; later, when each gRPC server starts, socket sharding automatically happens behind the scenes in kernel space. The socket sharding approach provides both high availability and a performance boost due to the elimination of IPC between the gRPC server and the gdal processes. It also scales horizontally with the number of cores/machines. At this stage, we consider this Go implementation a reference implementation. Later, when we actually move to C++ workers, we will have a solid, strong baseline to compare against.
PostgreSQL OVERLAPS bounds are non-inclusive at the top end: start <= time < end. The po_stamp_min and po_stamp_max columns form a range of [t_min, t_max) in the MAS database. Let [t0, t1) denote the requested time range from GSKY. We have two possible edge cases for time range overlaps:
Currently GSKY uses hard-coded time generators to generate timestamps for the WMS and WCS GetCapabilities http operations. This requires hard coding the generation rules for each dataset as different datasets have different timesteps. However, the timestamps are stored in MAS by the crawler. We can pull these timestamps from MAS instead of generating them by hand.
@juan-guerschman discovered this performance issue when a polygon is defined by a large number of points. Note that this should not be considered a corner case, because it can happen frequently wherever region boundaries are concerned. For example, the boundary of a small town in Australia can be defined by thousands of polygon points; a typical town boundary can reach 55,000+ points.
Therefore, exact polygon intersection is infeasible in such cases. Instead, one should use approximate algorithms to simplify the polygon first, before intersecting all the data files against it. A classic polygon simplification algorithm is the Douglas-Peucker algorithm. PostGIS has a corresponding function, ST_Simplify(), that implements it. ST_Simplify() itself can be a performance bottleneck if there are too many polygon points, so one should use ST_RemoveRepeatedPoints() to remove points within a tolerance distance before calling ST_Simplify(). An extensive discussion of this matter is here: https://www.standardco.de/squeezing-performance-out-maps-with-lots-of-polygons-leaflet-postgis
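For intuition, the Douglas-Peucker algorithm that ST_Simplify() implements can be sketched in a few lines of Go. This is a self-contained illustration, not GSKY or PostGIS code; in production the simplification would stay server-side in PostGIS:

```go
package main

import (
	"fmt"
	"math"
)

type point struct{ x, y float64 }

// perpDist is the perpendicular distance from p to the line through a and b.
func perpDist(p, a, b point) float64 {
	dx, dy := b.x-a.x, b.y-a.y
	l := math.Hypot(dx, dy)
	if l == 0 {
		return math.Hypot(p.x-a.x, p.y-a.y)
	}
	return math.Abs(dx*(a.y-p.y)-(a.x-p.x)*dy) / l
}

// simplify implements Douglas-Peucker: drop every interior point closer
// than tol to the line between the endpoints, recursing on the farthest
// point whenever it exceeds tol.
func simplify(pts []point, tol float64) []point {
	if len(pts) < 3 {
		return pts
	}
	maxD, maxI := 0.0, 0
	for i := 1; i < len(pts)-1; i++ {
		if d := perpDist(pts[i], pts[0], pts[len(pts)-1]); d > maxD {
			maxD, maxI = d, i
		}
	}
	if maxD <= tol {
		return []point{pts[0], pts[len(pts)-1]}
	}
	left := simplify(pts[:maxI+1], tol)
	right := simplify(pts[maxI:], tol)
	// Copy into a fresh slice so the recursive results don't alias pts.
	out := append([]point{}, left[:len(left)-1]...)
	return append(out, right...)
}

func main() {
	pts := []point{{0, 0}, {1, 0.01}, {2, 0}, {3, 0.01}, {4, 0}}
	fmt.Println(simplify(pts, 0.1)) // collapses to the two endpoints
}
```

The tolerance parameter plays the same role as the per-zoom-level tolerances in the table below.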
Note:
zoom_level | tolerance
------------|------------------
0 | 78271.516953125
1 | 39135.7584765625
2 | 19567.8792382812
3 | 9783.93961914062
4 | 4891.96980957031
5 | 2445.98490478516
6 | 1222.99245239258
7 | 611.496226196289
8 | 305.748113098145
9 | 152.874056549072
10 | 76.4370282745361
11 | 38.2185141372681
12 | 19.109257068634
13 | 9.55462853431702
14 | 4.77731426715851
Currently shard_create.sh in mas/db doesn't check for the existence of the shard before creating it. This causes https://github.com/nci/gsky/blob/master/mas/db/shard_create.sh#L19 to wipe out everything in the existing schema by dropping all the existing tables. It would be desirable to check for the existence of the shard so that multiple runs of shard_create.sh are safe.
Apparently tile_grpc has no timeout in place. Clients can request large polygons via either WMS or WCS, concurrently and repeatedly, to overwhelm the gRPC workers. If a client is malicious, it won't take long to DDoS the gRPC workers.
Perhaps consider including the “current” attribute in the Dimension element for each layer, set to True.
Each layer in the WMS 1.3.0 GetCapabilities document has a Dimension element for time, with the “default” attribute set to “current” to indicate that the service should return the most recent data if the TIME parameter is not provided in GetMap requests (note that the spec is not explicit as to whether the “default” attribute must equal a literal time corresponding to a time value in the Dimension element text or whether the keyword “current” can be used to indicate the latest time in a list of times contained in the Dimension element text). The WMS 1.3.0 spec states that the “current” attribute in the Dimension element should be set to True or “1” to indicate that requests can be made for the most current data, i.e. TIME=current in the GetMap request. Given that omitting the TIME parameter in GetMap requests for this service is equivalent to including TIME=current, it would be a more complete GetCapabilities document if the “current” attribute set to True is included in the Dimension elements.
Currently the crawling and MAS ingestion process requires manual, ad-hoc steps. This is error prone for non-technical users. It would be good to streamline crawling such that non-technical users need only run a main script to recursively crawl a directory for certain types of files. It would also be good to streamline MAS ingestion such that non-technical users need only supply a shard name and the crawler output files.
In addition, non-technical users would benefit from documentation of the crawler and MAS for usage guidance.
https://github.com/nci/gsky/blob/master/processor/tile_indexer.go#L73 hits the MAS endpoint once per namespace. This is unnecessary, as the MAS endpoint accepts multiple namespaces packed into a single request: https://github.com/nci/gsky/blob/master/mas/api/api.go#L72
There have been dynamically generated dates, such as dates pulled from MAS, but GSKY right now only loads the dates when loading the config files. Thus the WMS/WCS GetCapabilities will not return up-to-date dates.
The MAS timestamps feature currently has two bugs. In utils/config.go, the MAS time generator uses the since http parameter to indicate the end date, but the MAS API expects until.
Neighbouring tiles often share a subset of underlying data files. The current GSKY code base treats each tile request independently and therefore doesn't cache the gRPC outputs corresponding to the shared underlying data files. This is a pronounced performance problem in Terria, as Terria virtually always requests a large region of neighbouring tiles. The problem is worsened for high-resolution data and zoom operations, as these span a fairly large number of shared data files.
To fix this, one can compute a hash for several request parameters that uniquely identify the request in question before calling the gRPC worker (i.e. https://github.com/nci/gsky/blob/master/processor/tile_grpc.go#L188) and cache/return the gRPC outputs accordingly. For example, the key can be calculated like the following using a few request objects right before line 188.
h := fnv.New32a()
h.Write([]byte(BBox2WKT(geot)))
geoStamp := fmt.Sprintf("%s_%v_%v", g.Path, g.TimeStamp.UnixNano(), int64(h.Sum32()))
The actual cache solution is a strategic decision to make. We can implement:
@bje- Any suggestions on caching strategies?
Currently the shard_x.sh scripts have "runuser postgres" statements in them. runuser requires setuid privilege, which often means root. This is undesirable if the scripts are run by data collection/management teams, as they won't have root like sysadmins do.
The current WCS functionality does not scale beyond a single ows node. Clients who use WCS often want high or even full resolution data, which can result in very large output images beyond the memory limit of a single node. Therefore, we need to build a facility that coordinates a cluster of ows nodes to compute large WCS requests. One viable implementation is as follows:
1. On the master ows node, we split the requested bounding box into a series of smaller bounding boxes. Doing so will both reduce memory pressure and boost performance due to concurrent processing among the splits on the gRPC nodes.
2. We distribute the splits among the worker ows nodes.
3. The master ows node waits for the completion of the worker nodes and merges the results.
The following experimental results demonstrate the effectiveness of the proposed strategy. The environment is a single node with 8 physical cores and 16GB memory. The ows cluster consists of 4 ows processes running on the same node. The same node also hosts 8 gRPC workers. Our baseline ows is built in the same environment so that all experimental conditions are identical for a fair comparison. The test data is the Geoglam monthly fractional cover. The WCS request is as follows:
http://<gsky server ip>/ows/?SERVICE=WCS&service=WCS&crs=EPSG:4326&format=geotiff&request=GetCoverage&height=<height>&width=<width>&version=1.0.0&bbox=-179,-80,180,80&coverage=global:c6:monthly_frac_cover&time=2018-01-01T00:00:00.000Z
The bounding box -179,-80,180,80 virtually covers the entire planet. The resultant number of data files for such a bounding box is 816.
The following table shows the processing time between the baseline and the proposed solution. Processing time is the round-trip time of the WCS request.
Width | Height | Baseline (secs) | Proposed (secs) |
---|---|---|---|
2000 | 2000 | 12.191 | 10.822 |
5000 | 5000 | 44.216 | 13.364 |
10000 | 10000 | OOM | 19.650 |
20000 | 20000 | OOM | 38.755 |
40000 | 40000 | OOM | 104.930 |
80000 | 80000 | OOM | 336.439 |
121717 | 54247 | OOM | 345.774 |
Note: given the bbox, 121717x54247 is the full resolution of the dataset, i.e. the theoretical maximum width and height.

It would be useful to include the nodata value during the crawling process. For example, we could skip the gRPC calls entirely for data files full of nodata. This would dramatically boost performance for datasets where ocean areas are marked as nodata.
WCS DescribeCoverage doesn't list the NetCDF format option even though it is accepted as a valid input option.
E.g.,
http://130.56.242.16/ows?service=WCS&version=1.0.0&coverage=LS5:NBAR:TRUE&request=DescribeCoverage
<supportedFormats>
<formats>GeoTIFF</formats>
</supportedFormats>
Reading the entire temp geotiff/netcdf file into memory (https://github.com/nci/gsky/blob/master/utils/output_encoders.go#L290) and then sending the file contents over the http connection is inefficient in terms of both speed and memory. Instead, one should use io.Copy() to copy the fd of the temp file into the network socket. The implementation of io.Copy uses the sendfile() syscall where possible (golang/go#10948), which is far more efficient than manually reading the file and sending it over the network.
The combination of shard code and gpath in the public.shards table must be unique. Now consider this scenario: a user calls shard_create.sh with an existing shard code, but the user-supplied gpath is different from the gpath associated with the existing shard code in the MAS database. shard_create.sh assumes the shard exists and skips shard creation. If the user then proceeds with data ingestion, the data will be ingested into the same shard but with a different gpath. Therefore, the newly ingested data will never be retrieved later.
It is unclear whether WMS 1.1.1 is supported. Issuing a GetCapabilities request without a VERSION parameter returns the following error message, suggesting 1.1.1 is supported:
This server can only accept WMS requests compliant with version 1.1.1 and 1.3.0: /ows?request=GetCapabilities&service=WMS
Submitting a GetCapabilities request with VERSION=1.1.1 returns a 1.3.0 statement.
For GSKY service endpoint http://130.56.242.16/ows, WCS GetCoverage requests with CRS=OGC:CRS84 (a supported CRS in the DescribeCoverage statement) fail with (abridged) error message:
Request ..... should contain a valid ISO 'crs/srs' parameter.
Note that successful GetCoverage requests can be made using the EPSG:4326 CRS, provided in the /CoverageDescription/CoverageOffering/domainSet/spatialDomain/gml:RectifiedGrid
element of the DescribeCoverage statement, although this CRS is not stated as supported.
Example request with CRS=OGC:CRS84 (fail):
http://130.56.242.16/ows?SERVICE=WCS&VERSION=1.0.0&REQUEST=GetCoverage&COVERAGE=global:c5:frac_cover&CRS=OGC:CRS84&BBOX=-1.8,-0.9,1.8,0.9&FORMAT=GeoTIFF&WIDTH=10&HEIGHT=10
Same request with CRS=EPSG:4326 (success):
http://130.56.242.16/ows?SERVICE=WCS&VERSION=1.0.0&REQUEST=GetCoverage&COVERAGE=global:c5:frac_cover&CRS=EPSG:4326&BBOX=-1.8,-0.9,1.8,0.9&FORMAT=GeoTIFF&WIDTH=10&HEIGHT=10
The WMS 1.3.0 GetCapabilities document has the following XML Schema validation errors:
1. The /WMS_Capabilities/Service/ContactInformation/ContactPersonPrimary/ContactPerson element is missing
2. The /WMS_Capabilities/Service/ContactInformation/ContactAddress/PostCode element is missing
WMS 1.3.0 GetCapabilities request using the POST method with the URL provided in the /WMS_Capabilities/Capability/Request/GetCapabilities/DCPType/HTTP/Post/OnlineResource
element of the GetCaps document fails with message:
Not a valid OWS request. URL /ows does not contain a valid 'request' parameter.
POST body:
<?xml version="1.0" ?>
<ows:GetCapabilities xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ows="http://www.opengis.net/ows"
xsi:schemaLocation="http://www.opengis.net/ows http://schemas.opengis.net/ows/1.0.0/owsGetCapabilities.xsd">
<ows:AcceptVersions>
<ows:Version>1.3.0</ows:Version>
</ows:AcceptVersions>
</ows:GetCapabilities>
Does the POST request for the GetCapabilities operation need to be supported? If not, I suggest removing the POST OnlineResource element from the GetCaps statement.
If two copies of gsky are started on the same port number, the server exits silently with no explanation of the problem. Furthermore, it exits with an exit code of 0.
Apparently, the WPS polygon drill facility is hard-coded to work only with the Geoglam monthly fractional cover and CHIRPS 2.0 datasets, along with hard-coded csv output templates. As discussed with clients over multiple meetings, polygon drill needs to be generalised to any dataset GSKY can serve. This requires the following changes to the current polygon drill facility:
1. Generalise WPS execute() in ows such that it can work with an arbitrary number of data sources with arbitrary start/end dates and number of bands (i.e. namespaces). Essentially this is a data layer with an arbitrary number of bands; in contrast, a rendering data layer can only have 1 or 3 bands in order to render a colour image such as a png.
2. Generalise drill_merger such that it can work with an arbitrary number of bands (i.e. namespaces) instead of the currently hard-coded 4 bands.
3. Generalise drill_merger such that it can evaluate one or more mathematical expressions (e.g. a golang expression) over the bands on the fly. Currently geoglam computes total cover on the fly by summing two bands. If we generalise beyond geoglam, we will need to evaluate general expressions. For example, a band sum expression phot_veg + nphot_veg would be parsed and evaluated by GSKY on the fly to compute geoglam total cover once this feature is implemented.
4. Generalise the drill.go worker such that it can handle any data types that GSKY ows can handle, and seek approximate algorithms for computing statistics over large datasets.
5. Generalise ows.go such that users can specify start datetime and end datetime
for the time period of the polygon drill.

When adding a WMS layer in QGIS using a full GetMap request (http://gsky.nci.org.au/ows?layers=LS7%3ANBAR%3ATRUE&styles=&crs=EPSG%3A4326&bgcolor=0xFFFFFF&height=256&width=256&version=1.3.0&bbox=-35.3075%2C149.12441666666666%2C-35.1%2C151&time=1999-08-05T00%3A00%3A00.000Z&transparent=FALSE), an error message is displayed when trying to add any available layer. It seems to be making a GetLegend request, based on the visible part of the error message (the message gets cut off; I will try to add a screenshot if able):
Error downloading http://gsky.nci.org.au/ows?service=WMS&request=GetLeg
Currently the MAS API doesn't properly check the actual connection state, which causes issues for downstream code such as gsky tile_indexer.go. Essentially, sql.Open() is lazy in the sense that the database connection isn't opened until the first sql query. Thus, to test whether the connection is okay, we have to actually issue a simple query (or a Ping) to assert the connection state.
GSKY ships with fully automated build scripts (./configure && make && make install), but building and installing GSKY's dependencies in a fresh environment can still be a daunting task. End users who want to quickly check out GSKY need a docker image, as many other open source projects provide. For our developers, it would also be beneficial to containerize GSKY for a variety of purposes such as automated testing, quick dev environment setup, and easy deployment.
Apart from the technical work of creating the docker image, we also need to be aware of what NCI requires: coordination with the ops team and a few human-involved approval processes, and a v0 image bundled with the sample data.

Blank images are returned if the TIME parameter is not provided in GetMap requests for the twelve Landsat 5, 7 and 8 layers (at http://gsky.nci.org.au/ows). The "default" attribute in each layer's Dimension element in the GetCapabilities statement has the value "current", indicating the service should return the most recent layer data if the TIME parameter is omitted.
The load balancer https://github.com/nci/gsky/blob/master/ows.go#L224 randomly picks a worker node during the initialization of the WMS or WCS pipeline. Thus the pipeline sticks to the same worker node during its lifetime. If a large volume of intersected files is returned from tile_indexer for a large requested polygon, https://github.com/nci/gsky/blob/master/processor/tile_grpc.go#L46 becomes a bottleneck, as we only have a connection to a single worker node here. A large volume of intersected files can come from either a large requested polygon or aggregation over a long period of time (e.g. cloud removal using the past 3 months of data).
In order to scale beyond a single worker node, we need the load balancer inside tile_grpc.go, before line 46.
An extreme example of a large polygon request is -179,-80,180,80, which covers the entire world:
http://gsky ip address/ows/geoglam?SERVICE=WCS&service=WCS&crs=EPSG:4326&format=GeoTIFF&request=GetCoverage&height=500&width=1000&version=1.0.0&bbox=-179,-80,180,80&coverage=global:c6:monthly_frac_cover&time=2018-03-01T00:00:00.000Z
The above query results in processing 271 fractional cover files with 3 bands each. The experiment uses two worker nodes, each running identical worker code and an identical number of worker processes. With some quick-n-dirty experimentation, I obtained the following performance benchmark data:
5 runs with the original code (i.e. baseline in seconds):
7.281
7.235
7.179
7.201
7.212
5 runs with the proposed load balancer within tile_grpc.go before line 46:
We load balance two worker nodes in round-robin fashion
3.986
3.862
3.845
3.887
3.862
We can see the response time roughly halve, i.e. near-linear scaling across the two worker nodes, which is to be expected. The code for this experiment contains hard-coded worker addresses and hacks for a quick proof of concept; a proper implementation is required.
This issue was uncovered by @juan-guerschman. In typical WCS usage, users usually want to study a large region of interest. If the corresponding polygon is large enough, mas_intersects() fails to fully intersect all the files in question, which causes incorrect rendering on the GSKY side. The reason mas_intersects() fails is that this function first computes a segmentation of the requested polygon before computing its convex hull, in order to get a good intersection. The segmentation can essentially be viewed as an interpolation method that smooths out the original polygon to facilitate the convex hull computation; the smoothness is thus a function of the number of segments. By default it segments the polygon with precision up to (xmax - xmin) / 2 (i.e. two segments). This works for small polygons but doesn't have enough precision for large ones. Therefore, we need the number of segments to be configurable on a per-layer basis.
For example, the following request gives incorrect rendering:
http://gsky ip/ows/?service=WCS&crs=EPSG:4326&format=GeoTIFF&request=GetCoverage&height=500&width=1000&version=1.0.0&bbox=-10,-50,180,50&coverage=global:c6:monthly_frac_cover&time=2018-03-01T00:00:00.000Z
Typical request for WCS from the geoglam instance is something like:
"I need a subset of the data for my study area in the native resolution (500 meters)"
The "study area" can be as large as a whole continent, or a whole country.
The current limit for WCS requests (~4MB) is too restrictive, and I doubt it will be of any practical use except for very small subsets.
As an example, the whole of Australia would be an output of height=7451 & width=9580 (~70MB)
The WMS GetCapabilities statement provides a LegendURL for all layers (at http://gsky.nci.org.au/ows), although GetLegendGraphic requests fail with a 500 status response for all layers except the two DEA Intertidal Extents layers. Error message is:
open : no such file or directory
If a user requests http:///ows, the golang http server automatically sends a 302 redirect to the client, redirecting to http:///ows/ (note the trailing slash). This causes incompatibility with URLs configured for current terriajs servers. This behaviour is documented here: https://golang.org/pkg/net/http/ (type ServeMux section)
There have been several issues/bugs in the gRPC workers since the last major update.
1. Worker selection in tile_grpc.go and drill_grpc.go is done in a round-robin fashion. Round robin can be problematic if each http request makes only a small number of gRPC calls: with round robin always starting from the beginning, all of the calls go to the first few gRPC workers. To fix this, we need to use a random integer as the starting point (i.e. the index of the first worker) and round robin from there.
2. Currently, drill_grpc.go uses the default gRPC buffer size, which is 4MB, and the concurrency limit over the worker node pool is fixed at 16. The combination of the two is a performance bottleneck, as drilling over long time series of high-dimensional data is a very compute-intensive task.
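The random-starting-point round robin described above can be sketched as follows (a minimal illustration with hypothetical names, not the actual tile_grpc.go code):

```go
package main

import (
	"fmt"
	"math/rand"
)

// pool hands out workers round-robin from a random starting index, so
// that http requests issuing only a few gRPC calls do not all pile
// onto the first few workers.
type pool struct {
	workers []string
	next    int
}

func newPool(workers []string) *pool {
	return &pool{workers: workers, next: rand.Intn(len(workers))}
}

func (p *pool) pick() string {
	w := p.workers[p.next%len(p.workers)]
	p.next++
	return w
}

func main() {
	p := newPool([]string{"w0", "w1", "w2", "w3"})
	for i := 0; i < 5; i++ {
		fmt.Println(p.pick())
	}
}
```

A production version would need to guard `next` with a mutex or use an atomic counter, since multiple pipelines pick workers concurrently.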
For the four Low Tide and High Tide layers, and the twelve Landsat 5, 7 and 8 layers (at http://gsky.nci.org.au/ows), GetMap requests are only successful when WIDTH and HEIGHT values are set to 256. Other values result in highly distorted images.
Following is an example of a successful GetMap request for the "DEA Low Tide Composite 25m v2.0 true colour" layer, WIDTH and HEIGHT set to 256:
Same request, with WIDTH and HEIGHT set to 500, image is corrupted:
Note that the two Intertidal Extents Model layers do not have this problem.
GetFeatureInfo requests fail for all layers (at http://gsky.nci.org.au/ows), returning a 502 status response with error message:
The proxy server received an invalid response from an upstream server
Following is an example GetFeatureInfo request that fails, generated from QGIS:
GSKY documentation needs updates to reflect the current state of development. The updates include
Saves space and internet transfer, maybe even make default?
Currently GSKY processes an http request (i.e. WMS or WCS) by first storing all the tiles from the gRPC calls and then proceeding to the downstream pipeline stages such as tile_merger.go. This is a batch processing algorithm whose memory requirements will not scale beyond the physical server memory.
Observe that we can first group the tile requests by their corresponding regions (i.e. polygon coordinates) and then divide the inputs into shards keyed by region. We then send the shards of tile requests to the gRPC workers. Once we obtain all the gRPC output results for a particular shard, the shard is sent asynchronously to the merger algorithm and then becomes eligible for GC.
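The grouping-and-streaming idea can be sketched with goroutines and channels. The types here are toy stand-ins (real GSKY shards carry rasters from the gRPC workers), and the names are hypothetical:

```go
package main

import (
	"fmt"
	"sync"
)

type tileReq struct {
	region string // shard key, e.g. polygon coordinates
	id     int
}

// streamShards groups tile requests by region, processes each shard
// concurrently, and hands every completed shard to the merger as soon
// as it is ready, so finished shards become eligible for GC instead of
// accumulating until the whole request is done.
func streamShards(reqs []tileReq, process func(tileReq) int,
	merge func(region string, results []int)) {

	shards := map[string][]tileReq{}
	for _, r := range reqs {
		shards[r.region] = append(shards[r.region], r)
	}

	type shardResult struct {
		region  string
		results []int
	}
	merged := make(chan shardResult)
	done := make(chan struct{})
	go func() { // single merger goroutine, shards arrive as they finish
		for m := range merged {
			merge(m.region, m.results)
		}
		close(done)
	}()

	var wg sync.WaitGroup
	for region, rs := range shards {
		wg.Add(1)
		go func(region string, rs []tileReq) {
			defer wg.Done()
			results := make([]int, len(rs))
			for i, r := range rs {
				results[i] = process(r) // stand-in for the gRPC call
			}
			merged <- shardResult{region, results}
		}(region, rs)
	}
	wg.Wait()
	close(merged)
	<-done
}

func main() {
	reqs := []tileReq{{"A", 1}, {"B", 2}, {"A", 3}}
	streamShards(reqs,
		func(r tileReq) int { return r.id * 10 },
		func(region string, results []int) { fmt.Println(region, results) })
}
```

A real implementation would also bound the number of in-flight shards (e.g. a buffered semaphore channel) to cap peak memory.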
This algorithm is a streaming tile processing model, in contrast to the current batch processing model. The streaming model has the following advantages:
I also set up the following experiment to benchmark the streaming processing against the batch processing baseline.
There are three gRPC servers with identical hardware: 16GB memory and 8 CPUs each. The GSKY ows server has 8GB memory and 4 CPUs. The baseline codebase is #91, which includes the first iteration of performance improvements.
The WCS query has a lat/lon bbox of -179,-80,180,80, which covers the entire world. We vary height and width to simulate increasing memory and cpu load. One example query used in the experiment is as follows:
http://<gsky server>/ows/geoglam?SERVICE=WCS&service=WCS&crs=EPSG:4326&format=GeoTIFF&request=GetCoverage&height=6000&width=6000&version=1.0.0&bbox=-179,-80,180,80&coverage=global:c6:monthly_frac_cover&time=2018-03-01T00:00:00.000Z
Experimental results:
Baseline (batch tile processing model):

Height x Width | Response time (secs) |
---|---|
500x1000 | 3.187 |
2000x2000 | 6.181 |
4000x4000 | OOM |
6000x6000 | OOM |
8000x8000 | OOM |
10000x10000 | OOM |

Streaming tile processing model:

Height x Width | Response time (secs) |
---|---|
500x1000 | 2.814 |
2000x2000 | 4.841 |
4000x4000 | 15.773 |
6000x6000 | 33.824 |
8000x8000 | 57.941 |
10000x10000 | gRPC timed out |
The experimental results clearly demonstrate the strength of the streaming tile processing model given an already strong baseline.
The CRS code EPSG:WGS84(DD) specified by the GetCapabilities document in the /WMS_Capabilities/Capability/Layer/CRS element is not supported by GetMap requests. The following error is returned in the response with a 400 status:
…should contain a valid ISO 'crs/srs' parameter
This CRS code appears to have been adopted from GeoServer and is a throwback to an early version of WCS. The code does not conform to the OGC WMS 1.3.0 spec for the EPSG namespace, which states in section 6.7.3.3:
“An “EPSG” CRS label comprises the “EPSG” prefix, the colon, and a numeric code.”
This bug was uncovered by @juan-guerschman.
Apparently raster_scaler.go doesn't clip negative values. This creates problems when converting to uint8 and therefore renders colour images incorrectly.
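The fix amounts to clamping before the cast: a raw Go conversion of a negative float to uint8 wraps around rather than saturating. A minimal sketch (illustrative, not the actual raster_scaler.go code):

```go
package main

import "fmt"

// scaleToUint8 linearly scales values to [0, 255], clipping anything
// outside [lo, hi] (e.g. negative nodata artefacts) instead of letting
// the uint8 conversion wrap around and corrupt the colour image.
func scaleToUint8(vals []float32, lo, hi float32) []uint8 {
	out := make([]uint8, len(vals))
	scale := 255 / (hi - lo)
	for i, v := range vals {
		if v < lo {
			v = lo // clip: without this, negatives wrap when cast
		}
		if v > hi {
			v = hi
		}
		out[i] = uint8((v - lo) * scale)
	}
	return out
}

func main() {
	fmt.Println(scaleToUint8([]float32{-10, 0, 50, 100, 120}, 0, 100))
	// [0 0 127 255 255]
}
```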
There are two potential improvements to the GSKY crawler:
1. The crawler runs gdalinfo on each subdataset in a data file in sequence. This is an IO bottleneck for data files with many subdatasets. We could use goroutines to run gdalinfo on the subdatasets concurrently.
2. When gdalinfo fails
on a subdataset in a data file, the error message is suppressed. This is bad for troubleshooting the crawler results. We could write the error messages to stderr so that we can log errors without conflicting with the crawling output on stdout.

The ServiceProvider details in https://github.com/nci/gsky/blob/master/templates/WPS_GetCapabilities.tpl need to be updated.
Apparently the legend graphics are stored under static/legend folders. But these legends are part of the published datasets and should not be considered part of the GSKY code base. Therefore, it should be the data vendor's responsibility to maintain these legends, and probably the GSKY admin's responsibility to maintain a location (i.e. a directory) on the server that stores these legends during the deployment process.
There are a few small utility packages used by gsky. These repos do not have large numbers of maintainers compared to big github projects. If the repos of these utility packages disappear, our build process will be broken. For code availability/security, therefore, we should fork them under nci. A good example is https://github.com/nci/go.procmeminfo
Currently WPS doesn't have a wps_exception.tpl template file, so https://github.com/nci/gsky/blob/master/ows.go#L510 will certainly fail. In fact, we don't need the exception template; we can directly write any error messages to the client side.
GetCapabilities requests require the VERSION parameter to successfully return a GetCapabilities document. VERSION parameter is optional according to following excerpt from section 6.2.4 of the OGC WMS 1.3.0 specification:
In response to a GetCapabilities request (for which the VERSION parameter is optional)
that does not specify a version number, the server shall respond with the highest version it supports.