Comments (24)
About naming: we now have a read_stars and a read_ncdf. Is the idea to merge these into read_stars, or to keep them split into read_gdal and read_ncdf?
That depends on the level of complexity that you want the ncdf reader to reach, I guess. It's worth to split it, if you want it to be full-featured, for clarity's sake.
from stars.
I never replied about the read_ncdf/read_gdal/read_stars name issue.
I guess we need an abstraction over the ncsub and RasterIO idioms, I don't see that either of them is a good user-interface, but at least ncsub is the general form if you take it to cbind(start, count, outlength)
.
Use that as the general form and provide a bunch of helpers, to convert between geotransform/extent-dim-res/coord-ranges/RasterIO/start-count-stride? I've kind of been collecting these little components slowly, not being sure how to express them all in a single place.
Users will want a bounding box in coordinate space, and an output size or resolution - and often they'll be created by drawing or dragging sliders rather than directly numerically, or by a bit of back and forth guesswork reading the extent/proxy. Does it make sense to use the NetCDF start/count model, and add NearestNeighbour resampling option for outlength (after getting from the source), and then the gdal version can convert those to offset/window/outsize and also allow added resampling options?
from stars.
Thanks for your thoughts - great ideas; these should go probably into a new issue, given their importance and the topic of this one.
Continuing on support for curvilinear grids, two thoughts came up related to st_transform
ing them:
- transforming a curvilinear grid should be a simple matter of transforming the two coordinate matrices
- transforming any non-curvilinear grid should, at least optionally, return a curvilinear grid without doing any resampling.
from stars.
Just had a first look at this file
starsdata/sentinel5p/S5P_NRTI_L2__NO2____20180717T120113_20180717T120613_03932_01_010002_20180717T125231.nc
and it's not readable either by CRAN RNetCDF or ncdf4, because it's composed of groups. RNetCDF is aiming for group support in 2.0.0 but afaik no such plans (or appetite) exist for ncdf4 - oops, sorry I was wrong ncdf4 has had groups for a while now.
You could definitely read it with hdf5r, but I see you use GDAL with SDS syntax. (There might be other better hdf5 packages, it's been pretty dynamic lately).
I'd parked this with tidync, but it's probably time to get across the groups and the data model. It's been a while but with RNetCDF 2.0 from Github you need the extra group levels but the variables are accessible:
s5p = system.file("sentinel5p/S5P_NRTI_L2__NO2____20180717T120113_20180717T120613_03932_01_010002_20180717T125231.nc", package = "starsdata")
con = open.nc(s5p)
grp.inq.nc(con, "/PRODUCT/SUPPORT_DATA/GEOLOCATIONS")
## get latitude from the right group
var.get.nc(grp.inq.nc(con, "/PRODUCT")$self, "latitude")
from stars.
I've personally never seen a real curvilinear grid - that is, a grid which is NOT rectilinear in some projection - in a NetCDF file, is this something that you've seen often, or is it just a (very convenient) way to work around the many projected files which lack proper projection information?
Most programs people use in climate (e.g. CDO, GrADS) just define the grid as curvilinear whenever they do not understand the projection.
Are curvilinear grids going to play well with st_as_sf
? What about raster
conversion (iirc raster only support regular grids)?
(will test but can't right now, internet too slow to download starsdata
)
from stars.
@adrfantini I think it boils down to what you mean by some projection. If you mean a known, parametric projection (e.g. known PROJ string) then an example might be quite a few level 0 or level 1 satellite datasets; think of a raw dove satellite image. And yes, for model data it seems that you can choose a projection (although, as we've seen, if you rotate the Earth first, it doesn't make things easier), in which case it is more of a way of cheating your way out of doing it properly. A third where they arise: when you transform a regular or rectilinear grid to anything else, but don't want to resample (yet).
The plot
commands you see in the data_model
vignette first convert to sf
and then plots.
from stars.
These lines btw convert from raster centers to (the coordinates of) raster cell outlines. I simply take the first and last step size and copy it one cell out. I have the feeling that it will not be robust against grids covering poles or date lines (but who knows).
from stars.
There are heaps of examples of all kinds. The case @adrfantini mentions is pretty common in polar data, but also for Mercator -no record of the regular grid is kept in but sometimes can be inferred. It's not for automatic detection, though. There's an intermediate between points and polygons, for both the AREA and POINT cases - way more efficient to store and plot as meshes. There's no model for this in stars but I think it can hang off the bare structures as they are (storing rectilnear vectors or coordinate arrays). I got distracted by the groups thing in Sentinel but hoping to illustrate this soon.
from stars.
@edzer Thanks for the clarification, got it now.
@mdsumner would love to see meshes mesh with stars ;)
from stars.
Testing curvilinear grids right now... nice!
I have ben trying with a cropped version of the dataset in #52 (link, 2.3MB):
library(stars)
hmr = '/dev/shm/hmr.nc'
lat = read_stars(hmr, sub='lat') #Does not work, reads 'pr' instead. Separate issue?
library(raster) #Let's use raster to get the data in then
lat = raster(hmr, varname = 'lat') %>% st_as_stars %>% adrop
lon = raster(hmr, varname = 'lon') %>% st_as_stars %>% adrop
nit = read_stars(hmr) %>% adrop
ll = setNames(c(lon, lat), c("x", "y"))
nit.c = st_as_stars(nit, curvilinear = ll)
st_crs(nit.c) = 4326
plot(nit.c, breaks = "equal", reset = FALSE, axes = TRUE, as_points = FALSE, border = NA)
library(rnaturalearth)
plot(st_geometry(ne_countries(scale = "medium", returnclass="sf")), add = TRUE, border = 'grey')
It's still not clear to me what happens when exporting to Raster:
as(nit.c, 'Raster')
class : RasterLayer
dimensions : 248, 223, 55304 (nrow, ncol, ncell)
resolution : 0.07258437, 0.05175483 (x, y)
extent : 4.015204, 20.20152, 35.5148, 48.35 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
data source : in memory
names : layer
values : 0, 51.40039 (min, max)
The resulting dataset is equally spaced in x-y (which afaik is the only kind of data that raster support). Isn't this wrong?
from stars.
Here is very rough form from your example of treating this as a triangulated mesh, with the coordinate values as the vertices of the triangles.
coords <- cbind(as.vector(st_dimensions(nit.c)$x$values),
as.vector(st_dimensions(nit.c)$y$values))
tri <- RTriangle::triangulate(RTriangle::pslg(P = cbind(rep(seq_len(nrow(nit.c)), ncol(nit.c)),
rep(seq_len(ncol(nit.c)), each = nrow(nit.c))))
)
graphics.off()
plot(coords, type = "n", asp = 1/cos(42 * pi/180))
XY <- coords[t(tri$T), ]
ID <- rep(1:nrow(tri$T), each = 3)
## watch out here, we need a triangle vertex to get the colours in the right order
COL <- grey(seq(0, 1, length = 12))[scales::rescale(nit.c$hmr_ita.nc[tri$T[,1]], c(1, 12))]
isLL <- FALSE
vps <- gridBase::baseViewports()
grid::pushViewport(vps$inner, vps$figure, vps$plot)
grid::grid.polygon(XY[,1], XY[,2], ID, gp = grid::gpar(col = NA, fill = COL),
default.units = "native")
grid::popViewport(3)
plot(st_geometry(ne_countries(scale = "medium", returnclass="sf")), add = TRUE, border = 'grey')
Essentially we triangulate the index of the coordinates, then apply the values of the coordinates as we like - so you can trivially reproject them too, and it'll scale better than conversion to POLYGON.
The triangulation is on the coordinates, not the implied corners of the cells - so it's shrunk in a bit, I make this distinction more obvious here: http://rpubs.com/cyclemumner/415712
(I'm not suggesting we use RTriangle to do this basic index-to-triangulation, I just don't have the brain space to get the indexing logic right at the moment - I'll fix it soonish!)
I have to use grid::polygon
because it's fast, it's vectorized over ID so doesn't need to repeatedly call to draw different colours. So, I also need to mix grid and base graphics here.
I worked out a few other things here, I've been chasing this for a while from work in quadmesh::mesh_plot
- and it made me realize what's wrong with quadmesh's interpolation with raster::extract
: http://rpubs.com/cyclemumner/421078
from stars.
That is indeed fast. Took 0.4s on my machine, as compared to 11s for the normal sf::plot
.
Impressive.
from stars.
I made a comment about ncdf4 and groups above that was wrong, it does already support them - they are discoverable inside nc_open(file)$groups
.
from stars.
Back to the NetCDF groups:
ncdf4
doesn't seem to support groups in a useable way, but I'll keep exploring. RNetCDF
2.0.0 definitely will support it well, but doesn't have a release schedule afaik (the project is active, though).
An alternative is rhdf5
from BioConductor, installed with "BiocInstaller::biocLite" - the nice things is the data frame info upfront with all the variables and their groups nice arranged. The dim
is in there as a character column as well.
(s5p = system.file("sentinel5p/S5P_NRTI_L2__NO2____20180717T120113_20180717T120613_03932_01_010002_20180717T125231.nc", package = "starsdata"))
info <- rhdf5::h5ls(s5p)
paths <- file.path(info$group, info$name)
lon <- rhdf5::h5read(s5p, paths[grep("^longitude$", info$name)])
lat <- rhdf5::h5read(s5p, paths[grep("^latitude$", info$name)])
nit <- rhdf5::h5read(s5p, paths[grep("^nitrogendioxide_summed_total_column$", info$name)])
str(lon)
# num [1:450, 1:278, 1] 3.38 3.46 3.55 3.63 3.71 ...
str(nit)
# num [1:450, 1:278, 1] 9.97e+36 5.72e-05 6.01e-05 6.05e-05 6.78e-05 ...
A downside is we don't get missing value handling as per ncdf4 or GDAL, I haven't explored further
from stars.
Another correction, for ncdf4
: the group-path qualified variable names are simply available as names in the usual way.
I.e. get latitude
nc <- nc_open(s5p)
ncvar_get(nc, names(nc$var)[1])
Apologies for the long exploration, ncdf4 is definitely the best option!
from stars.
I am trying to reproduce this using a domain file from our climate model: here it is.
Everything is fine but nothing at all is plotted:
lon = read_stars('/home/netapp-clima-scratch/afantini/regcm_domain/EURO-CORDEX_81_DOMAIN000.nc', sub = 'xlon')
lat = read_stars('/home/netapp-clima-scratch/afantini/regcm_domain/EURO-CORDEX_81_DOMAIN000.nc', sub = 'xlat')
topo = read_stars('/home/netapp-clima-scratch/afantini/regcm_domain/EURO-CORDEX_81_DOMAIN000.nc', sub = 'topo')
ll = setNames(c(lon, lat), c("x", "y"))
topo.c = st_as_stars(topo, curvilinear = ll)
plot(topo.c, breaks = "equal", reset = FALSE, axes = TRUE, as_points = TRUE, pch = 16, key.pos = NULL)
from stars.
from stars.
Note, btw, that the refsys is misleading:
> topo.c
stars object with 2 dimensions and 1 attribute
attribute(s):
NETCDF:"EURO-CORDEX_81_DOMAIN000.nc":topo [m]
Min. : 0.00
1st Qu.: 0.00
Median : 83.59
Mean : 265.15
3rd Qu.: 354.25
Max. :3202.35
dimension(s):
from to offset delta refsys point
x 1 530 NA NA +proj=lcc +lat_1=30 +lat_... NA
y 1 530 NA NA +proj=lcc +lat_1=30 +lat_... NA
values
x [530x530] -63.2598 [°], ..., 82.8959 [°] [x]
y [530x530] 14.6105 [°], ..., 76.5259 [°] [y]
curvilinear grid
> st_crs(topo.c)
Coordinate Reference System:
No EPSG code
proj4string: "+proj=lcc +lat_1=30 +lat_2=65 +lat_0=48 +lon_0=9.75 +x_0=-6000 +y_0=-6000 +a=6371229 +b=6371229 +units=m +no_defs"
from stars.
It looks like we should be ignoring the lon,lat arrays in this case and just using jx,jy which seems like eminently sensible regular rectlinear coordinates.
nc <- ncdf4::nc_open("EURO-CORDEX_81_DOMAIN000_54/EURO-CORDEX_81_DOMAIN000.nc")
image(ncdf4::ncvar_get(nc, "jx"),
ncdf4::ncvar_get(nc, "iy"),
ncdf4::ncvar_get(nc, "topo"), useRaster = TRUE)
plot(st_transform(rnaturalearth::ne_coastline(returnclass = "sf"), "+proj=lcc +lat_1=30.00 +lat_2=65.00 +lat_0=48.00 +lon_0=9.75 +x_0=-6000. +y_0=-6000. +ellps=sphere +a=6371229. +b=6371229. +units=m +no_defs")$geometry, add = TRUE, border = "black")
I've never seen it done so clearly, the actual regular grid is right there. Most examples I've seen are more cryptic.
from stars.
GDAL does it right.
gdalinfo NETCDF:"EURO-CORDEX_81_DOMAIN000.nc":topo
Size is 530, 530
Coordinate System is:
PROJCS["unnamed",
GEOGCS["unknown",
DATUM["unknown",
SPHEROID["Spheroid",6371229,0]],
PRIMEM["Greenwich",0],
UNIT["degree",0.0174532925199433]],
PROJECTION["Lambert_Conformal_Conic_2SP"],
PARAMETER["standard_parallel_1",30],
PARAMETER["standard_parallel_2",65],
PARAMETER["latitude_of_origin",48],
PARAMETER["central_meridian",9.75],
PARAMETER["false_easting",-6000],
PARAMETER["false_northing",-6000],
UNIT["metre",1,
AUTHORITY["EPSG","9001"]]]
Origin = (-3180000.000000000000000,3180000.000000000000000)
Pixel Size = (12000.000000000000000,-12000.000000000000000)
from stars.
Note, btw, that the refsys is misleading:
@edzer why do you say so?
from stars.
This is also the way read_stars
reads it:
o = read_stars("EURO-CORDEX_81_DOMAIN000.nc", sub = "topo")
plot(o, axes=T)
st_crs(o)
# Coordinate Reference System:
# No EPSG code
# proj4string: "+proj=lcc +lat_1=30 +lat_2=65 +lat_0=48 +lon_0=9.75 +x_0=-6000 +y_0=-6000 +a=6371229 +b=6371229 +units=m +no_defs"
I guess that when we create a curvilinear (long/lat) grid out of it in the second st_as_stars
step, we should drop the native (lcc) CRS and replace it with ... st_crs(4326)
?
from stars.
I'm not sure. Both grids are correct: a rectilinear equally spaced grid in x-y and a curvilinear lat-lon grid, both indicating the same points (albeit grid corners might be slightly different?). It might be handy to have both.
from stars.
Updated to latest stars
, works. Might have been a few commits behind on this machine, sorry. The title is garbled, tho.
from stars.
Related Issues (20)
- Rotate around the centroid in st_geotransform
- Exporting s3 methods from namespace HOT 3
- Reading local Zarr files into stars HOT 13
- netcdf CF 1-8 conform data is missing meta data HOT 1
- Behaviour of `aggregate.stars` when multiple overlaps occur in the same raster pixel HOT 6
- Pixels in the border get weird values with `st_rasterize()` HOT 5
- Enhancement Request: Direct Access to Band Values Using Band Names in stars Object HOT 2
- Non `udunits` confirm units break the read in of netcdf files HOT 5
- st_rasterize Error in rep_len HOT 4
- Unexpected behavior: st_transform a raster and transforming back, it is not identical unless applying st_warp HOT 7
- Three different stars object descriptions from the same file HOT 4
- Proposal: Record time dimension in stars objects using CFtime package HOT 6
- Issues with particular file: North American Land Cover (NALC) Dataset HOT 5
- Additional nest in dimension while reading HDF5 file and terra workaround solution HOT 7
- GDAL Access window out of range on read-in of stars_proxy HOT 2
- aggregate.stars mean by time (by = "2 days") HOT 2
- netcdf dimensions get messed up if they are out of order and if axis attributes are available HOT 12
- starsdata not available, stars fails revdeps. HOT 7
- issue using `st_cells()` on a cropped image HOT 3
- workflow recommendations for simple spatial mean computations HOT 5
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from stars.