Comments (14)

hayesgb avatar hayesgb commented on May 23, 2024

When I tested this with adlfs 0.5.1, it passed. Did it pass for you with 0.5.1?

I'll also note that this bug report did highlight a missing dunder method, which has been added in the xarray_test branch. Can you take a look at that one?

from adlfs.

basnijholt avatar basnijholt commented on May 23, 2024

Oops, wrapping the call in a function makes the error disappear.
I have edited my original post.

The error still happens in 0.5.1 and in the xarray_test branch BTW.

hayesgb avatar hayesgb commented on May 23, 2024

When I run this, it passes. Can you try it without the custom **storage_options method?

basnijholt avatar basnijholt commented on May 23, 2024

The only thing that method does is return a dict:

{'account_name': 'acnt_name_here',
 'account_key': 'keyhere=='}

Sorry, that wasn't very obvious from my code.

If I leave that out, I don't have access.

basnijholt avatar basnijholt commented on May 23, 2024

Maybe it's because of the specific versions I have?

import xarray as xr
xr.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.1 | packaged by conda-forge | (default, Jan 29 2020, 14:55:04) 
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-957.27.2.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4

xarray: 0.16.0
pandas: 1.1.1
numpy: 1.19.1
scipy: 1.5.2
netCDF4: 1.5.4
pydap: None
h5netcdf: 0.8.1
h5py: 2.10.0
Nio: None
zarr: None
cftime: 1.2.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2.25.0
distributed: 2.25.0
matplotlib: 3.3.1
cartopy: None
seaborn: 0.10.0
numbagg: None
pint: None
setuptools: 45.1.0.post20200119
pip: 20.2.3
conda: 4.8.4
pytest: None
IPython: 7.18.1
sphinx: None

hayesgb avatar hayesgb commented on May 23, 2024

Looks like the bug occurs when I install netCDF4.

I started with the environment below, without h5netcdf and netCDF4. After I installed h5netcdf v0.8.1, the file opened. Once I installed netCDF4 v1.5.4, I replicated your error.

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.6 | packaged by conda-forge | (default, Jan  7 2020, 22:05:27) 
[Clang 9.0.1 ]
python-bits: 64
OS: Darwin
OS-release: 19.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.6.3

xarray: 0.16.0
pandas: 1.0.4
numpy: 1.18.5
scipy: 1.5.2
netCDF4: 1.5.4
pydap: None
h5netcdf: 0.8.1
h5py: 2.10.0
Nio: None
zarr: None
cftime: 1.2.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.25.0
distributed: 2.25.0
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 45.1.0.post20200119
pip: 20.0.2
conda: None
pytest: 5.4.2
IPython: 7.14.0
sphinx: None

stacktrace...

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-0df956816f6e> in <module>
      1 with fsspec.open("abfs://" + loc, **storage_options) as f:
----> 2     x  = xr.open_dataset(f).load()

~/anaconda3/envs/adlfs/lib/python3.7/site-packages/xarray/core/dataset.py in load(self, **kwargs)
    664         for k, v in self.variables.items():
    665             if k not in lazy_data:
--> 666                 v.load()
    667 
    668         return self

~/anaconda3/envs/adlfs/lib/python3.7/site-packages/xarray/core/variable.py in load(self, **kwargs)
    379             self._data = as_compatible_data(self._data.compute(**kwargs))
    380         elif not hasattr(self._data, "__array_function__"):
--> 381             self._data = np.asarray(self._data)
    382         return self
    383 

~/anaconda3/envs/adlfs/lib/python3.7/site-packages/numpy/core/_asarray.py in asarray(a, dtype, order)
     83 
     84     """
---> 85     return array(a, dtype, copy=False, order=order)
     86 
     87 

~/anaconda3/envs/adlfs/lib/python3.7/site-packages/xarray/core/indexing.py in __array__(self, dtype)
    675 
    676     def __array__(self, dtype=None):
--> 677         self._ensure_cached()
    678         return np.asarray(self.array, dtype=dtype)
    679 

~/anaconda3/envs/adlfs/lib/python3.7/site-packages/xarray/core/indexing.py in _ensure_cached(self)
    672     def _ensure_cached(self):
    673         if not isinstance(self.array, NumpyIndexingAdapter):
--> 674             self.array = NumpyIndexingAdapter(np.asarray(self.array))
    675 
    676     def __array__(self, dtype=None):

~/anaconda3/envs/adlfs/lib/python3.7/site-packages/numpy/core/_asarray.py in asarray(a, dtype, order)
     83 
     84     """
---> 85     return array(a, dtype, copy=False, order=order)
     86 
     87 

~/anaconda3/envs/adlfs/lib/python3.7/site-packages/xarray/core/indexing.py in __array__(self, dtype)
    651 
    652     def __array__(self, dtype=None):
--> 653         return np.asarray(self.array, dtype=dtype)
    654 
    655     def __getitem__(self, key):

~/anaconda3/envs/adlfs/lib/python3.7/site-packages/numpy/core/_asarray.py in asarray(a, dtype, order)
     83 
     84     """
---> 85     return array(a, dtype, copy=False, order=order)
     86 
     87 

~/anaconda3/envs/adlfs/lib/python3.7/site-packages/xarray/core/indexing.py in __array__(self, dtype)
    555     def __array__(self, dtype=None):
    556         array = as_indexable(self.array)
--> 557         return np.asarray(array[self.key], dtype=None)
    558 
    559     def transpose(self, order):

~/anaconda3/envs/adlfs/lib/python3.7/site-packages/xarray/backends/h5netcdf_.py in __getitem__(self, key)
     27     def __getitem__(self, key):
     28         return indexing.explicit_indexing_adapter(
---> 29             key, self.shape, indexing.IndexingSupport.OUTER_1VECTOR, self._getitem
     30         )
     31 

~/anaconda3/envs/adlfs/lib/python3.7/site-packages/xarray/core/indexing.py in explicit_indexing_adapter(key, shape, indexing_support, raw_indexing_method)
    835     """
    836     raw_key, numpy_indices = decompose_indexer(key, shape, indexing_support)
--> 837     result = raw_indexing_method(raw_key.tuple)
    838     if numpy_indices.tuple:
    839         # index the loaded np.ndarray

~/anaconda3/envs/adlfs/lib/python3.7/site-packages/xarray/backends/h5netcdf_.py in _getitem(self, key)
     36         with self.datastore.lock:
     37             array = self.get_array(needs_lock=False)
---> 38             return array[key]
     39 
     40 

~/anaconda3/envs/adlfs/lib/python3.7/site-packages/h5netcdf/core.py in __getitem__(self, key)
    144 
    145     def __getitem__(self, key):
--> 146         return self._h5ds[key]
    147 
    148     def __setitem__(self, key, value):

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

~/anaconda3/envs/adlfs/lib/python3.7/site-packages/h5py/_hl/dataset.py in __getitem__(self, args)
    541             arr = numpy.ndarray(selection.mshape, dtype=new_dtype)
    542             for mspace, fspace in selection:
--> 543                 self.id.read(mspace, fspace, arr, mtype)
    544             if len(names) == 1:
    545                 arr = arr[names[0]]

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/h5d.pyx in h5py.h5d.DatasetID.read()

h5py/_proxy.pyx in h5py._proxy.dset_rw()

h5py/_proxy.pyx in h5py._proxy.H5PY_H5Dread()

h5py/defs.pyx in h5py.defs.H5Dread()

h5py/h5fd.pyx in h5py.h5fd.H5FD_fileobj_read()

~/Documents/Projects/adlfs/adlfs/spec.py in readinto(self, b)
   1547         https://docs.python.org/3/library/io.html#io.RawIOBase.readinto
   1548         """
-> 1549         data = self.read(len(b))
   1550         memoryview(b).cast("B")[: len(data)] = data
   1551         return len(data)

~/Documents/Projects/adlfs/adlfs/spec.py in read(self, length)
   1566             length = self.size - self.loc
   1567         if self.closed:
-> 1568             raise ValueError("I/O operation on closed file.")
   1569         logger.debug("%s read: %i - %i" % (self, self.loc, self.loc + length))
   1570         if length == 0:

ValueError: I/O operation on closed file.
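
Since the error above only appeared once netCDF4 was installed, it can help to record which backend packages are importable in each environment before comparing results. The following is a minimal sketch using only the standard library; `backend_availability` is a hypothetical helper name, not part of xarray or adlfs. Passing `engine="h5netcdf"` to `xr.open_dataset` explicitly may also help isolate the backend, though nothing in this thread confirms that it avoids the error.

```python
from importlib.util import find_spec


def backend_availability(names=("netCDF4", "h5netcdf", "h5py")):
    """Map each top-level module name to whether it can be imported here.

    Hypothetical helper for illustration; it only checks importability
    and does not report versions.
    """
    return {name: find_spec(name) is not None for name in names}


# Print a short report for the current environment.
for name, present in backend_availability().items():
    print(f"{name}: {'installed' if present else 'missing'}")
```

Running this in both the passing and the failing environment gives a quick side-by-side view that complements the full `xr.show_versions()` output.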

hayesgb avatar hayesgb commented on May 23, 2024

I get the same error with adlfs==0.4.0

hayesgb avatar hayesgb commented on May 23, 2024

This seems to be related to the nested open and load methods. The following seems to work:

with fsspec.open("abfs://" + loc, **storage_options) as f:
    x  = xr.open_dataset(f)

with fsspec.open("abfs://" + loc, **storage_options) as f:
    x  = xr.open_dataset(f)

Can you provide a working configuration?

basnijholt avatar basnijholt commented on May 23, 2024

I can confirm that the code you posted works too.

However, adding a single .load() outside of the context manager, like

with fsspec.open("abfs://" + loc, **storage_options) as f:
    x  = xr.open_dataset(f)

with fsspec.open("abfs://" + loc, **storage_options) as f:
    x  = xr.open_dataset(f)

x = x.load()  # <-------- this

raises the exception again.

And what do you mean by:

Can you provide a working configuration?
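
The `.load()` failure outside the context manager is consistent with lazy loading: `xr.open_dataset` defers reading variable data, so a later `.load()` has to read through a handle that the `with` block has already closed. Below is a minimal sketch of that mechanism using only the standard library. `LazyReader` is a hypothetical stand-in for a lazily loaded dataset, not xarray's implementation, and the sketch does not explain the earlier traceback in which `.load()` failed inside the `with` block.

```python
import io


class LazyReader:
    """Hypothetical stand-in for a lazily loaded dataset.

    It keeps a reference to the file object and reads only on demand.
    """

    def __init__(self, fileobj):
        self._f = fileobj

    def load(self):
        # Reads through the stored handle; fails if it has been closed.
        self._f.seek(0)
        return self._f.read()


with io.BytesIO(b"payload") as f:
    lazy = LazyReader(f)
    eager = lazy.load()             # handle still open: read succeeds
    buffered = io.BytesIO(f.read()) # f is at EOF here, so this would be empty
    buffered = io.BytesIO(eager)    # in-memory copy that outlives the handle

error_message = None
try:
    lazy.load()                     # handle closed by the with block
except ValueError as exc:
    error_message = str(exc)

print(error_message)
print(buffered.getvalue())
```

Copying the bytes into an in-memory buffer before the handle closes is one possible workaround for the lazy case, at the cost of holding the whole file in memory; whether it resolves the adlfs behaviour reported here is not confirmed in the thread.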

hayesgb avatar hayesgb commented on May 23, 2024

You indicated that the code above worked with adlfs==0.4. However, when I downgraded adlfs in the above config, I got the same error. Can you provide an xr.show_versions() for your working code? I'm trying to determine if the issue lies in adlfs, or elsewhere.

Thanks.

basnijholt avatar basnijholt commented on May 23, 2024

Sorry, if I had posted that originally then that was incorrect. However, there is no mention above about it working with 0.4.0. Perhaps it's because I copied some text from #94 (including something about it working with 0.4.0) and then later edited it out. Sorry for the confusion.

hayesgb avatar hayesgb commented on May 23, 2024

No worries. So it hasn't ever worked with adlfs. Do you know if it works with either s3fs or gcsfs?

I'm asking because adlfs passes the existing unit tests for the context manager, and my debugging efforts suggest this is related to calling .load() with NetCDF4. However, both of those packages are also based on fsspec, so if one of them works, then that may give me another route to explore.

basnijholt avatar basnijholt commented on May 23, 2024

Unfortunately, I do not have access to any other cloud storage.

Maybe I can try to get access to one of them next week.

hayesgb avatar hayesgb commented on May 23, 2024

Closing as stale.
