pmav99 / inspectds
A CLI utility to print the metadata of datasets in various formats (NetCDF, zarr, grib, selafin etc)
License: The Unlicense
$ inspectds ../20220125000000-24h-oper-fc.grib2
Couldn't open netcdf dataset: [Errno -51] NetCDF: Unknown file format: '../20220125000000-24h-oper-fc.grib2'
I am trying to run
pipx install 'git+https://github.com/pmav99/inspectds.git'
but I get this error:
ERROR: Command errored out with exit status 1:
command: /tmp/tmp_rg9cnoj/bin/python /home/nik/.local/pipx/shared/lib/python3.8/site-packages/pip/_vendor/pep517/in_process/_in_process.py prepare_metadata_for_build_wheel /tmp/tmp0b5qvxbs
cwd: /tmp/pip-req-build-_o9kujoi
Complete output (14 lines):
Traceback (most recent call last):
File "/home/nik/.local/pipx/shared/lib/python3.8/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 363, in <module>
main()
File "/home/nik/.local/pipx/shared/lib/python3.8/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 345, in main
json_out['return_val'] = hook(**hook_input['kwargs'])
File "/home/nik/.local/pipx/shared/lib/python3.8/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 164, in prepare_metadata_for_build_wheel
return hook(metadata_directory, config_settings)
File "/tmp/pip-build-env-dh_nw11_/overlay/lib/python3.8/site-packages/poetry/core/masonry/api.py", line 43, in prepare_metadata_for_build_wheel
poetry = Factory().create_poetry(Path(".").resolve(), with_dev=False)
File "/tmp/pip-build-env-dh_nw11_/overlay/lib/python3.8/site-packages/poetry/core/factory.py", line 43, in create_poetry
raise RuntimeError("The Poetry configuration is invalid:\n" + message)
RuntimeError: The Poetry configuration is invalid:
- Additional properties are not allowed ('group' was unexpected)
----------------------------------------
WARNING: Discarding git+https://github.com/pmav99/inspectds. Command errored out with exit status 1: /tmp/tmp_rg9cnoj/bin/python /home/nik/.local/pipx/shared/lib/python3.8/site-packages/pip/_vendor/pep517/in_process/_in_process.py prepare_metadata_for_build_wheel /tmp/tmp0b5qvxbs Check the logs for full command output.
ERROR: Command errored out with exit status 1: /tmp/tmp_rg9cnoj/bin/python /home/nik/.local/pipx/shared/lib/python3.8/site-packages/pip/_vendor/pep517/in_process/_in_process.py prepare_metadata_for_build_wheel /tmp/tmp0b5qvxbs Check the logs for full command output.
Cannot determine package name from spec 'git+https://github.com/pmav99/inspectds'. Check package spec for errors.
Trying to understand if the error is caused by some local misconfiguration or if something needs to be updated in the repository.
Can't create file 'GRIB/ssrd/ssrd_CDS_era5_2018.grib.923a8.idx'
Traceback (most recent call last):
File "/home/mavropa/.local/pipx/venvs/inspectds/lib/python3.8/site-packages/cfgrib/messages.py", line 343, in from_indexpath_or_filestream
with compat_create_exclusive(indexpath) as new_index_file:
File "/scratch/mavropa/conda/jeodpp/lib/python3.8/contextlib.py", line 113, in __enter__
return next(self.gen)
File "/home/mavropa/.local/pipx/venvs/inspectds/lib/python3.8/site-packages/cfgrib/messages.py", line 264, in compat_create_exclusive
fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
OSError: [Errno 30] Read-only file system: 'GRIB/ssrd/ssrd_CDS_era5_2018.grib.923a8.idx'
Can't read index file 'GRIB/ssrd/ssrd_CDS_era5_2018.grib.923a8.idx'
Traceback (most recent call last):
File "/home/mavropa/.local/pipx/venvs/inspectds/lib/python3.8/site-packages/cfgrib/messages.py", line 353, in from_indexpath_or_filestream
index_mtime = os.path.getmtime(indexpath)
File "/scratch/mavropa/conda/jeodpp/lib/python3.8/genericpath.py", line 55, in getmtime
return os.stat(filename).st_mtime
FileNotFoundError: [Errno 2] No such file or directory: 'GRIB/ssrd/ssrd_CDS_era5_2018.grib.923a8.idx'
I.e. use xr.open_mfdataset()
I.e. instead of this:
Dimensions: (step: 73, latitude: 285, longitude: 313)
Coordinates:
number int64 ...
time datetime64[ns] ...
* step (step) timedelta64[ns] 00:00:00 01:00:00 ... 3 days 00:00:00
surface float64 ...
* latitude (latitude) float64 74.97 74.9 74.83 74.76 ... 55.15 55.08 55.01
* longitude (longitude) float64 -29.95 -29.88 -29.81 ... -8.085 -8.015
valid_time (step) datetime64[ns] ...
display this:
Dimensions: (step: 73, latitude: 285, longitude: 313)
Coordinates:
number int64 0
time datetime64[ns] 2018-10-01
* step (step) timedelta64[ns] 00:00:00 01:00:00 ... 3 days 00:00:00
surface float64 0.0
* latitude (latitude) float64 74.97 74.9 74.83 74.76 ... 55.15 55.08 55.01
* longitude (longitude) float64 -29.95 -29.88 -29.81 ... -8.085 -8.015
valid_time (step) datetime64[ns] 2018-10-01 ... 2018-10-04
We now throw an exception similar to this:
inspectds /home/user/Downloads/20220725.00.tropical_cyclone.grib
Traceback (most recent call last):
File "/home/user/.local/bin/inspectds", line 8, in <module>
sys.exit(app())
^^^^^
File "/home/user/.local/pipx/venvs/inspectds/lib/python3.11/site-packages/inspectds/cli.py", line 126, in inspect_dataset
dataset_type = infer_dataset_type(path)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/.local/pipx/venvs/inspectds/lib/python3.11/site-packages/inspectds/cli.py", line 96, in infer_dataset_type
dataset_type = DATASET_TYPE.GRIB
^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/enum.py", line 786, in __getattr__
raise AttributeError(name) from None
AttributeError: GRIB
We should suggest installing cfgrib instead.
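One way to turn a missing optional backend into a readable hint is sketched below. This is only an illustration, not the project's actual code: the helper name `require_optional_dependency` and the wording of the message are assumptions, and it relies only on the standard library's `importlib.util.find_spec`.

```python
import importlib.util


def require_optional_dependency(module_name: str, package_name: str) -> None:
    """Raise a readable error if an optional backend cannot be imported.

    Hypothetical helper: the name and the message are illustrative only.
    `module_name` must be a top-level module name (no dots).
    """
    # find_spec returns None when a top-level module is not installed.
    if importlib.util.find_spec(module_name) is None:
        raise RuntimeError(
            f"Reading this dataset requires the optional package '{package_name}'. "
            f"Try installing it, e.g. with: pip install {package_name}"
        )
```

Calling `require_optional_dependency("cfgrib", "cfgrib")` before the GRIB branch would replace the bare AttributeError with an installation hint. For a pipx-managed install, `pipx inject inspectds cfgrib` is probably the more fitting suggestion.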
The new typer API suggests using Annotated. We should switch to that.
We should simplify the UI. This is what we should be aiming for
inspectds /path/to/dataset
inspectds --dataset_type netcdf /path/to/dataset
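The two invocations above could be backed by a small resolver that infers the type from the path unless the user overrides it. The following is a rough sketch under stated assumptions: `DatasetType`, `SUFFIX_TO_TYPE` and `resolve_dataset_type` are hypothetical names, and the suffix table is a guess rather than the list inspectds really supports.

```python
from enum import Enum
from pathlib import Path
from typing import Optional


class DatasetType(str, Enum):
    # Hypothetical enum; the real project may name its members differently.
    NETCDF = "netcdf"
    ZARR = "zarr"
    GRIB = "grib"
    SELAFIN = "selafin"


# Assumed mapping from file suffix to dataset type (illustrative only).
SUFFIX_TO_TYPE = {
    ".nc": DatasetType.NETCDF,
    ".zarr": DatasetType.ZARR,
    ".grib": DatasetType.GRIB,
    ".grib2": DatasetType.GRIB,
    ".slf": DatasetType.SELAFIN,
}


def resolve_dataset_type(path: str, dataset_type: Optional[str] = None) -> DatasetType:
    """Return the explicit type if one was given, else infer it from the suffix."""
    if dataset_type is not None:
        # Raises ValueError for names that are not members of the enum.
        return DatasetType(dataset_type.lower())
    suffix = Path(path).suffix.lower()
    try:
        return SUFFIX_TO_TYPE[suffix]
    except KeyError:
        raise ValueError(
            f"Cannot infer the dataset type of '{path}'; pass --dataset_type explicitly."
        ) from None
```

With this shape, `--dataset_type` is needed only when the suffix is missing or ambiguous, which keeps the common case down to `inspectds /path/to/dataset`.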
Not sure if it is possible since eccodes seems to be a prerequisite
There are netcdf files that are not compatible with xarray (source). We should either allow the user to drop the problematic variables or automatically detect them and drop them.
pinging @tomsail
For example:
$ inspectds http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/ncep.reanalysis/surface/air.sig995.2020.nc
Should output something like this:
<xarray.Dataset> Size: 62MB
Dimensions: (lat: 73, lon: 144, time: 1464)
Coordinates:
* lat (lat) float32 292B 90.0 87.5 85.0 82.5 ... -82.5 -85.0 -87.5 -90.0
* lon (lon) float32 576B 0.0 2.5 5.0 7.5 10.0 ... 350.0 352.5 355.0 357.5
* time (time) datetime64[ns] 12kB 2020-01-01 ... 2020-12-31T18:00:00
Data variables:
air (time, lat, lon) float32 62MB ...
Attributes:
Conventions: COARDS
title: 4x daily NMC reanalysis (2014)
history: created 2017/12 by Hoop (netCDF2.3)
description: Data is from NMC initialized reanalysis\...
platform: Model
dataset_title: NCEP-NCAR Reanalysis 1
_NCProperties: version=2,netcdf=4.6.3,hdf5=1.10.5
References: http://www.psl.noaa.gov/data/gridded/dat...
DODS_EXTRA.Unlimited_Dimension: time