
h5py's Introduction

HDF5 for Python

h5py is a thin, pythonic wrapper around HDF5, which runs on Python 3 (3.8+).

Installation

Pre-built h5py can be installed either via your Python distribution (e.g. Continuum Anaconda, Enthought Canopy) or from PyPI via pip. h5py is also packaged in many Linux distributions (e.g. Ubuntu, Fedora) and in the macOS package managers Homebrew, MacPorts, and Fink.

More detailed installation instructions, including how to install h5py with MPI support, can be found at: https://docs.h5py.org/en/latest/build.html.

Reporting bugs

Open a bug at https://github.com/h5py/h5py/issues. For general questions, ask on the HDF forum (https://forum.hdfgroup.org/c/hdf-tools/h5py).

h5py's People

Contributors

ajelenak, andrewcollette, aparamon, aragilar, cydanil, ddale, dguest, dota17, hameerabbasi, helveg, jakirkham, jreadey, juliagarriga, kif, kmuehlbauer, llchan, matham, mkitti, mraspaud, payno, pierlauro, satra, scopatz, t20100, tacaswell, takluyver, ulrikpedersen, valiantljk, vallsv, vasole

h5py's Issues

Support generic VLEN types

Original author: [email protected] (June 18, 2009 18:49:11)

Now that vlen strings are supported, generic variable-length types should
also be possible. They will be represented by object arrays containing 1D
ndarray "vectors". At this point the biggest challenge is how to handle
recursively nested arrays (vlens which contain vlens), and how to inform
the conversion machinery about the intended "sub-dtype" of the object array.

Original issue: http://code.google.com/p/h5py/issues/detail?id=48

Switch to standard Python method names

Original author: [email protected] (April 29, 2009 23:21:03)

Objects which expose a dictionary-style interface (Group, AttributeManager)
currently use non-standard names for the methods .keys(), .iterkeys(),
.values() and .itervalues(). The motivation for these odd names no longer
applies. For h5py 1.2, we should:

(1) Rename these to their standard Python equivalents
(2) Keep the old-style methods and have them issue a DeprecationWarning.
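
The two steps above can be sketched in plain Python. This is only an illustration of the deprecation pattern, not h5py's code: the class `DictLike` and the old-style name `legacy_keys` are hypothetical stand-ins, since the issue does not name the methods being replaced.

```python
import warnings


class DictLike:
    """Toy stand-in for a dictionary-style object such as Group."""

    def __init__(self, items):
        self._items = dict(items)

    def keys(self):
        """Standard Python name: return the member names."""
        return list(self._items)

    def values(self):
        """Standard Python name: return the member objects."""
        return list(self._items.values())

    def legacy_keys(self):
        """Hypothetical old-style name, kept only for compatibility."""
        warnings.warn(
            "legacy_keys() is deprecated; use keys() instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.keys()


group = DictLike({"a": 1, "b": 2})
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    names = group.legacy_keys()

print(names)
print(caught[0].category.__name__)
```

Passing `stacklevel=2` makes the warning point at the caller's line rather than at the wrapper itself, which is usually what someone migrating their code wants to see.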

Original issue: http://code.google.com/p/h5py/issues/detail?id=36

Non-monotonically increasing lists in fancy selections not detected

Original author: [email protected] (February 20, 2009 09:04:25)

This can be seen in, for example:

{{{
>>> dset[[1,4,2]]
  File "/home/faltet/python/h5py/trunk/h5py/highlevel.py", line 900, in __getitem__
    self.id.read(mspace, fspace, arr)
  File "/home/faltet/python/h5py/trunk/h5py/_sync.py", line 74, in wrap
    return func(*args, **kwds)
  File "h5d.pyx", line 203, in h5py.h5d.DatasetID.read (h5py/h5d.c:1836)
h5py.h5.DatasetError: Can't read data (H5Dread)
HDF5 Error Stack:
    0: "Can't read data" at H5Dread
    1: "Src and dest data spaces have different sizes" at H5D_read
}}}

With the attached patch the detection works well:

{{{
>>> dset[[1,4,2]]
  File "/home/faltet/python/h5py/trunk/h5py/highlevel.py", line 889, in __getitem__
    selection = sel.select(self.shape, args)
  File "/home/faltet/python/h5py/trunk/h5py/selections.py", line 42, in select
    sel[args]
  File "/home/faltet/python/h5py/trunk/h5py/selections.py", line 336, in __getitem__
    raise ValueError("Selection lists must be in increasing order")
ValueError: Selection lists must be in increasing order
}}}
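
A check of this kind might look roughly like the sketch below. It is not the attached patch: the function name `check_increasing` is hypothetical, and rejecting repeated indices (strictly increasing order) is an assumption. Only the error message is taken from the traceback above.

```python
def check_increasing(indices):
    """Raise ValueError unless indices are strictly increasing."""
    # Compare each index with its successor; any drop or repeat is rejected.
    for previous, current in zip(indices, indices[1:]):
        if current <= previous:
            raise ValueError("Selection lists must be in increasing order")
    return list(indices)


print(check_increasing([1, 2, 4]))

try:
    check_increasing([1, 4, 2])
except ValueError as exc:
    print("rejected:", exc)
```

Validating the list before it reaches HDF5 replaces the opaque "Src and dest data spaces have different sizes" failure with a message that names the actual problem.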

Original issue: http://code.google.com/p/h5py/issues/detail?id=29

Several lists in fancy selections do not work correctly

Original author: [email protected] (February 20, 2009 09:58:28)

The following code shows the problem:

{{{
import numpy as np
import h5py

SHAPE = (10, 5)
DTYPE = np.dtype('i')
SIZE = np.prod(SHAPE)  # np.product was removed in NumPy 2.0

f = h5py.File('/tmp/test.hdf5', 'w')
mydata = np.arange(SIZE, dtype=DTYPE).reshape(SHAPE)
dset = f.create_dataset("mydata", data=mydata)
f.flush()

# Index both objects with the same pair of lists and compare the results
print("h5py -->", dset[[1, 3], [1, 2]])
print("numpy -->", mydata[[1, 3], [1, 2]])
f.close()
}}}

and the output:

{{{
h5py --> [[ 6 7]
[16 17]]
numpy --> [ 6 17]
}}}
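
The two outputs correspond to two different selection semantics, which NumPy alone can reproduce. The sketch below uses only NumPy (no HDF5 file) with the same shape as the report: plain list indexing pairs the lists element by element, while `np.ix_` selects every row/column combination, which matches what h5py returned here.

```python
import numpy as np

mydata = np.arange(50, dtype="i").reshape(10, 5)
rows, cols = [1, 3], [1, 2]

# NumPy pairs the lists element by element: positions (1, 1) and (3, 2).
pointwise = mydata[rows, cols]

# np.ix_ builds an open mesh, selecting every row/column combination.
outer = mydata[np.ix_(rows, cols)]

print(pointwise)
print(outer)
```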

Original issue: http://code.google.com/p/h5py/issues/detail?id=30

Support automatic creation of intermediate groups

Original author: [email protected] (June 19, 2009 19:58:55)

The HDF5 link-creation API supports auto-creation of missing groups. If a
file contains no groups, and a dataset is created at e.g. /a/b/dataset, then
HDF5 1.8 will auto-create groups a and b. It's preferable to use native
HDF5 functionality rather than a Python-level workaround.

This will require:

  1. Adding H5Pset/get_create_intermediate_group()
  2. Auditing h5g/h5d create functions for lcpl support
  3. Transparent high-level support based on version.api_version

Original issue: http://code.google.com/p/h5py/issues/detail?id=49

VLEN string support via object arrays

Original author: [email protected] (April 23, 2009 22:19:50)

Variable-length strings are very common in real-world HDF5 files. There
needs to be some facility in h5py to both read and write (and possibly,
create) variable-length data. For now, this enhancement is limited to
variable-length byte strings (vlen H5T_STRING types).

The only standard NumPy facility for this is object arrays (dtype "O").
Since object arrays do not enforce restrictions on the classes of their
elements, this requires the following modifications to h5py:

  1. A new HDF5 type (opaque?) representing an object pointer

  2. A C (or Cython) HDF5 conversion callback which can translate from HDF5
    vlen byte strings to native Python strings. The callback would need to be
    bi-directional and be prepared to do a certain amount of type checking and
    possibly type coercion.

  3. A standard way of associating a "declared type" with a dtype of kind
    "O", representing the type of its elements. This could use the previously
    investigated technique of encoding information via field metadata.

To begin with, this enhancement specifies the following:

  1. Only the case of vlen string -> Python string (and reverse) is considered.

  2. Python-side object arrays may be required to consist only of "str" or
    "None" objects, with an error raised at conversion time if an incompatible
    entry is found.
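
The conversion-time check in point 2 could be as simple as the following sketch. The helper name `check_string_objects` and the error wording are hypothetical, and it runs in plain Python rather than inside an HDF5 conversion callback.

```python
def check_string_objects(values):
    """Return values as a list, raising TypeError on an incompatible entry."""
    checked = []
    for position, value in enumerate(values):
        # Only str and None are accepted, as proposed in the issue.
        if value is not None and not isinstance(value, str):
            raise TypeError(
                f"entry {position} has type {type(value).__name__}; "
                "expected str or None"
            )
        checked.append(value)
    return checked


print(check_string_objects(["alpha", None, "gamma"]))

try:
    check_string_objects(["alpha", 3.5])
except TypeError as exc:
    print("rejected:", exc)
```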

A new branch of the project, based on 1.1, should be used to keep things sane.

Original issue: http://code.google.com/p/h5py/issues/detail?id=35

Support more standard Python exceptions

Original author: [email protected] (April 23, 2009 22:02:25)

The current technique of raising H5Error subclasses for all HDF5 errors is
incompatible with standard Python expectations, and produces especially
surprising results at the high-level interface.
ValueError/TypeError/KeyError subclasses should be raised whenever it's
possible to figure this out.

Trunk revision 304 now contains an implementation based on a static mapping
using HDF5 minor error codes. Exception classes are created dynamically at
run-time to inherit from both the current standard H5Error subclass and an
appropriate native-Python exception class.
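
The idea of building such classes at run time can be illustrated with `type()`. Everything named here is a stand-in: `H5Error` and `DatasetError` are redefined locally, and the mapping keys are illustrative labels rather than the minor error codes actually used in revision 304.

```python
class H5Error(Exception):
    """Stand-in for the library's base error class."""


class DatasetError(H5Error):
    """Stand-in for an existing H5Error subclass."""


# Illustrative minor-code labels mapped to native Python exceptions.
MINOR_CODE_TO_NATIVE = {
    "NOTFOUND": KeyError,
    "BADTYPE": TypeError,
    "BADVALUE": ValueError,
}


def make_exception_class(h5_class, minor_code):
    """Create a class deriving from both h5_class and a native exception."""
    native = MINOR_CODE_TO_NATIVE.get(minor_code)
    if native is None:
        return h5_class  # no mapping known: keep the plain H5Error subclass
    name = f"{h5_class.__name__}_{native.__name__}"
    return type(name, (h5_class, native), {})


NotFound = make_exception_class(DatasetError, "NOTFOUND")
try:
    raise NotFound("no such object")
except KeyError as exc:  # caught through the native-Python base class
    caught = exc

print(isinstance(caught, H5Error), isinstance(caught, KeyError))
```

Because the generated class inherits from both parents, existing code that catches `H5Error` keeps working while new code can catch `KeyError`, `TypeError` or `ValueError` as usual.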

Original issue: http://code.google.com/p/h5py/issues/detail?id=34

Flaky behavior on 32-bit platforms where dimensions > 2**32 are involved

Original author: [email protected] (October 07, 2008 23:26:47)

A few places in the codebase don't like dimensions bigger than a long int.
The official policy, on both 32 and 64 bit platforms, is:

  • h5py must accept large (64-bit) dimensions anywhere HDF5 does
  • The maximum dimension size allowed is the limit of a signed 64-bit int
    (roughly 2**63).

The following steps will address this issue:

  • Identify every API call which can take a dimension > 2**32 and add unit
    test conditions which explicitly check large values
  • Since __len__ cannot return a value larger than sys.maxsize (2**31 - 1
    in 32-bit Python), add a custom len() method to each class which requires
    this behavior.
  • Add a description of this policy to the documentation
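
The `len` limitation is easy to reproduce on any platform by exceeding `sys.maxsize`. The class `BigDataset` below is a hypothetical toy, not an h5py class. It shows that the built-in `len()` raises `OverflowError` for oversized values while an ordinary method can return them unchanged.

```python
import sys


class BigDataset:
    """Toy object whose first dimension may exceed the native index size."""

    def __init__(self, length):
        self._length = length

    def __len__(self):
        # Python requires this value to fit in a C ssize_t (at most sys.maxsize).
        return self._length

    def len(self):
        """Return the length as an ordinary, unbounded Python int."""
        return self._length


big = BigDataset(sys.maxsize + 1)
try:
    len(big)
    overflowed = False
except OverflowError:
    overflowed = True

print(overflowed)
print(big.len() == sys.maxsize + 1)
```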

Original issue: http://code.google.com/p/h5py/issues/detail?id=3

Array flags validation is unreliable

Original author: [email protected] (October 07, 2008 22:21:04)

Requirements for C-style ordering are erratically enforced, thanks to
idiotic misuse of bitwise operators in the validation routines. This can
lead to data corruption or inappropriate failure when attempting to
store/read an array from/into non-contiguous memory or Fortran order. What
needs to happen:

  • Fix low-level array flags validation and add separate unit tests just for
    these validation routines
  • Remove requirement that an array owns its data, as long as obj->data
    pointer is valid and data is C-contiguous
  • Standardize exception for incorrect array flags; should be TypeError
    everywhere

Additional required behavior:

  • The low-level modules should continue to raise an exception for an
    illegal input array
  • The h5py.highlevel routines should be modified to transparently coerce an
    illegal array into flat C-order
  • Data read back from HDF5 using h5py.highlevel routines will continue to
    always be provided in plain C-contiguous format
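
The split between strict low-level validation and transparent high-level coercion might be sketched as follows. Both function names are hypothetical, and the check reads NumPy's named flag attributes instead of masking flag bits, which sidesteps the class of bitwise mistakes described above.

```python
import numpy as np


def require_c_contiguous(arr):
    """Low-level check: reject anything that is not a C-contiguous ndarray."""
    if not isinstance(arr, np.ndarray) or not arr.flags.c_contiguous:
        raise TypeError("Array must be C-contiguous")
    return arr  # ownership of the data is deliberately not checked


def coerce_for_storage(data):
    """High-level helper: copy into C order only when that is needed."""
    return require_c_contiguous(np.ascontiguousarray(data))


fortran = np.array([[1, 2, 3], [4, 5, 6]], order="F")
view = np.arange(24, dtype=np.uint8)[2:10].reshape(2, 4)  # does not own its data

print(require_c_contiguous(view) is view)
print(coerce_for_storage(fortran).flags.c_contiguous)
```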

Reported behavior (thanks Z. Pincus):

  • Fortran-contiguous arrays can be stored as a dataset; however they
    come out a bit garbled:

In : f = h5py.File('test.dat', 'w')
In : a = numpy.array([[1,2,3],[4,5,6]], order='F')
In : a
Out:
array([[1, 2, 3],
[4, 5, 6]])
In : a.strides
Out: (4, 8)
In : f['a'] = a
In : f['a'].value
Out:
array([[1, 4, 2],
[5, 3, 6]])
In : f['a'].value.strides
Out: (12, 4)
In : fixed = numpy.asarray(f['a'].value)
In : fixed.strides = (4,8)
In : fixed
Out:
array([[1, 2, 3],
[4, 5, 6]])

  • Non-contiguous arrays that do own their own data can be "stored"
    but of course everything breaks:

In : b = numpy.arange(2*3*4, dtype=numpy.uint8)
In : b
Out:
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23], dtype=uint8)
In : b.shape=(2,3,4)
In : b.strides=(0,1,1)
In : b
Out:
array([[[0, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5]],

    [[0, 1, 2, 3],
     [1, 2, 3, 4],
     [2, 3, 4, 5]]], dtype=uint8)

In : b.flags
Out:
C_CONTIGUOUS : False
F_CONTIGUOUS : False
OWNDATA : True
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
In : f['b'] = b
In : f['b'].value
Out:
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],

    [[12, 13, 14, 15],
     [16, 17, 18, 19],
     [20, 21, 22, 23]]], dtype=uint8)
  • C-contiguous arrays that don't own their own data can be stored
    just fine:

In : b = numpy.arange(2*3*4, dtype=numpy.uint8)
In : c = numpy.ndarray(buffer=b, offset=2, shape=(2,4), dtype=numpy.uint8)
In : c
Out:
array([[2, 3, 4, 5],
[6, 7, 8, 9]], dtype=uint8)
In : c.flags
Out:
C_CONTIGUOUS : True
F_CONTIGUOUS : False
OWNDATA : False
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
In : f['c'] = c
In : f['c'].value
Out:
array([[2, 3, 4, 5],
[6, 7, 8, 9]], dtype=uint8)

  • Fortran-contiguous arrays that do not own their own data cannot be
    stored (as expected):

In : b = numpy.arange(2*3*4, dtype=numpy.uint8)
In : d = numpy.ndarray(buffer=b, offset=2, shape=(2,4), strides=(1,2), dtype=numpy.uint8)
In : d
Out:
array([[2, 4, 6, 8],
[3, 5, 7, 9]], dtype=uint8)
In : d.flags
Out:
C_CONTIGUOUS : False
F_CONTIGUOUS : True
OWNDATA : False
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
In : f['d'] = d
ValueError: Array must be C-contiguous and own its data.

  • Non-contiguous arrays that do not own their data cannot be stored
    (as expected):

In : b = numpy.arange(2*3*4, dtype=numpy.uint8)
In : e = numpy.ndarray(buffer=b, offset=2, shape=(2,4), strides=(-1,1), dtype=numpy.uint8)
In : e
Out:
array([[2, 3, 4, 5],
[1, 2, 3, 4]], dtype=uint8)
In : e.flags
Out:
C_CONTIGUOUS : False
F_CONTIGUOUS : False
OWNDATA : False
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
In : f['e'] = e
ValueError: Array must be C-contiguous and own its data.

Original issue: http://code.google.com/p/h5py/issues/detail?id=1

Don't unconditionally overwrite attributes

Original author: [email protected] (May 07, 2009 18:28:38)

The current high-level behavior for attributes simply replaces them on
assignment. This has the following problems:

(1) The type is not preserved; for example, assigning "42" to an existing
int8 attribute results in a new int32 attribute.

(2) It is non-atomic, as the existing attribute is removed before the new
one is inserted.

(3) There is no way to create a new attribute from a shape and dtype, only
from an existing scalar or array.

The behavior of AttributeManager.__setitem__ will be changed in h5py 1.2 to
conform to the following:

(1) If no attribute with the indicated name exists, create it as normal
(2) If an attribute already exists, try to write to it using the standard
HDF5 type-conversion machinery
(3) If the types are not conversion-compatible, raise TypeError (presumably
automatic as a result of (2) failing)

A new method, AttributeManager.create(name, value=None, shape=None,
dtype=None) will provide a way to create uninitialized attributes, and will
implicitly overwrite existing attributes regardless of type.
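
The proposed rules can be modelled in memory to make them concrete. This is a toy under stated assumptions: `AttributeModel` is a hypothetical class, and NumPy's casting stands in for the HDF5 type-conversion machinery, which follows different rules in detail.

```python
import numpy as np


class AttributeModel:
    """In-memory toy; NumPy casting stands in for HDF5 type conversion."""

    def __init__(self):
        self._attrs = {}

    def __getitem__(self, name):
        return self._attrs[name]

    def __setitem__(self, name, value):
        if name not in self._attrs:
            # (1) No such attribute yet: create it as normal.
            self._attrs[name] = np.array(value)
            return
        try:
            # (2) Write into the existing storage, keeping its dtype.
            self._attrs[name][...] = value
        except (TypeError, ValueError) as exc:
            # (3) The types are not conversion-compatible.
            raise TypeError(f"cannot convert value for {name!r}") from exc

    def create(self, name, value=None, shape=None, dtype=None):
        """Create or overwrite an attribute, regardless of any existing type."""
        if value is None:
            empty_shape = () if shape is None else shape
            self._attrs[name] = np.empty(empty_shape, dtype=dtype)  # uninitialized
        else:
            self._attrs[name] = np.array(value, dtype=dtype)


attrs = AttributeModel()
attrs.create("answer", value=0, dtype=np.int8)
attrs["answer"] = 42  # stays int8 instead of becoming a new int32 attribute
print(attrs["answer"].dtype, int(attrs["answer"]))
```

Writing into the existing storage never removes the attribute first, which is the property that addresses the non-atomic replace described in problem (2).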

Original issue: http://code.google.com/p/h5py/issues/detail?id=37

NumPy elements not allowed in fancy indexing

Original author: [email protected] (February 23, 2009 18:35:13)

Fancy selections like mydata[2, l] are supported when l is a list or
tuple, but not when it is a NumPy array. The error is:

{{{
  File "/home/faltet/python/h5py/trunk/h5py/selections.py", line 37, in select
    if not isinstance(a, slice) and a != Ellipsis:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
}}}

Attached is a patch for allowing NumPy arrays in fancy selections.
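
The traceback suggests the failure comes from comparing an array to Ellipsis with `!=`, which yields an elementwise result that has no single truth value. One way to avoid it (not necessarily what the attached patch does) is an identity test. The sketch below uses a hypothetical `ArrayLike` class in place of a NumPy array so that it runs with the standard library alone.

```python
class ArrayLike:
    """Minimal stand-in for an array: comparisons are elementwise."""

    def __init__(self, values):
        self.values = list(values)

    def __ne__(self, other):
        # Like an array, return one result per element instead of a bool.
        return ArrayLike(v != other for v in self.values)

    def __bool__(self):
        raise ValueError("The truth value of an array is ambiguous")


def is_fancy_index(a):
    """True when a is neither a slice nor Ellipsis, using an identity test."""
    return not isinstance(a, slice) and a is not Ellipsis


idx = ArrayLike([0, 2])
try:
    bool(idx != Ellipsis)  # the pattern from the traceback
    ambiguous = False
except ValueError:
    ambiguous = True

print(ambiguous)
print(is_fancy_index(idx), is_fancy_index(Ellipsis), is_fancy_index(slice(None)))
```

Since Ellipsis is a singleton, `is not` gives the intended answer for every argument type and never invokes the argument's comparison operators.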

Original issue: http://code.google.com/p/h5py/issues/detail?id=32

High-level input coercion is incomplete

Original author: [email protected] (October 07, 2008 22:32:45)

The high-level routines do not always coerce input data (for example,
nested sequences) to the correct array data-type and order.

  • This affects dataset creation, attribute creation, and slice assignment
  • Wherever a NumPy array is accepted, a nested sequence should also be
    accepted, perhaps using asarray
  • Unit tests with non-NumPy types should be added for all three cases
  • NumPy input arrays (in all three cases) must NOT have their types
    converted (e.g. =i4 to <f8); the HDF5 library can do this on the fly
    without having to copy the entire array to a new buffer
  • NumPy input arrays MUST be coerced to C-contiguous, in all three cases

Therefore "asarray(input, order='C')" seems the best choice.
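
A quick NumPy-only check of that choice is sketched below, with illustrative variable names. It shows that nested sequences become arrays, that Fortran-ordered input is converted to C order without a dtype change, and that input which is already C-contiguous is passed through without a copy.

```python
import numpy as np

nested = [[1, 2, 3], [4, 5, 6]]

# A nested sequence is converted to a C-ordered array.
from_list = np.asarray(nested, order="C")

# Fortran-ordered input is copied into C order; the dtype is left alone.
fortran = np.array(nested, dtype="<f8", order="F")
from_fortran = np.asarray(fortran, order="C")

# Input that is already C-contiguous is returned as the same object.
already_c = np.array(nested, dtype="=i4")
unchanged = np.asarray(already_c, order="C")

print(from_list.flags.c_contiguous)
print(from_fortran.flags.c_contiguous, from_fortran.dtype == fortran.dtype)
print(unchanged is already_c)
```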

Thanks Z. Pincus for original report.

Original issue: http://code.google.com/p/h5py/issues/detail?id=2

h5py doesn't play well with Pytables on Windows

Original author: [email protected] (December 04, 2008 13:37:13)

h5py and Pytables can't be used at the same time when installed from the
windows installers. This appears to be due to an API incompatibility
between the HDF5 dll's used. The h5py installer provides the 1.8 API by
default and the Pytables installer provides the 1.6 API.

>>> import h5py
>>> import tables # fails to find dll entry point

or

>>> import tables
>>> import h5py # fails to find dll entry point

This will probably be surprising to many users and should at least be
documented in the name of the installer that is offered.

  • What version of h5py, and HDF5

Summary of h5py config

HDF5: 1.8.2
1.6 API: True
1.8 API: True
Thread-aware: True
Diagnostic mode: False
Complex names: ('r', 'i')

  • What operating system (Linux/Windows/etc)

Windows XP

  • What major version of Python (2.5 or 2.6)

Python 2.5

Original issue: http://code.google.com/p/h5py/issues/detail?id=16

Validate dimensions in fancy selection fails

Original author: [email protected] (February 20, 2009 12:06:41)

Here is a script showing the problem:

{{{
import numpy as np
import h5py

SHAPE = (4, 5)
DTYPE = np.dtype('i')
SIZE = np.prod(SHAPE)  # np.product was removed in NumPy 2.0

f = h5py.File('/tmp/test.hdf5', 'w')
mydata = np.arange(SIZE, dtype=DTYPE).reshape(SHAPE)
dset = f.create_dataset("mydata", data=mydata)
f.flush()

keys = (slice(None), [0, 1, 2, 3, 4])  # all rows, all five columns
a = mydata[keys]
print("numpy-->", a)
b = dset[keys]
print("h5py-->", b)
assert np.all(a == b)  # np.alltrue was removed in NumPy 2.0
f.close()
}}}

which raises the following error:

{{{
numpy--> [[ 0 1 2 3 4]
[ 5 6 7 8 9]
[10 11 12 13 14]
[15 16 17 18 19]]
Traceback (most recent call last):
  File "multidim.py", line 16, in <module>
    b = dset[keys]
  File "/home/faltet/python/h5py/trunk/h5py/highlevel.py", line 889, in __getitem__
    selection = sel.select(self.shape, args)
  File "/home/faltet/python/h5py/trunk/h5py/selections.py", line 42, in select
    sel[args]
  File "/home/faltet/python/h5py/trunk/h5py/selections.py", line 338, in __getitem__
    validate_number(select_idx, length)
  File "/home/faltet/python/h5py/trunk/h5py/selections.py", line 295, in validate_number
    raise IndexError('Index out of bounds: %d' % num)
IndexError: Index out of bounds: 5
}}}

Attached is a fix for this.

Original issue: http://code.google.com/p/h5py/issues/detail?id=31

setup.py build doesn't work correctly for multi-platform builds

Original author: [email protected] (January 14, 2009 22:04:44)

Noticed the following when building h5py 1.0.0 for multiple architectures.
The first build (on ia32) worked without problems. But then when I logged
in on an AMD64 machine, went to the same directory (it's shared via NFS)
and ran "python setup.py build" again, it just printed a couple of lines to
the screen without doing any work. The result of the Cythoning step can
probably be shared between architectures, but I'm pretty sure the result of
compiling the C files output by Cython can't.

This was handled correctly for all python packages I compiled up to now.
(Architecture-dependent files go in different subdirectories for each arch.)
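
A per-architecture subdirectory name can be derived from the standard library, roughly as in the sketch below. The helper `platform_build_dir` and the `temp.<platform>-<version>` naming are assumptions for illustration only. The exact scheme used by distutils or setuptools varies between versions.

```python
import sys
import sysconfig
from pathlib import Path


def platform_build_dir(root="build"):
    """Return a build subdirectory unique to this platform and Python version."""
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    tag = f"{sysconfig.get_platform()}-{version}"
    return Path(root) / f"temp.{tag}"


print(platform_build_dir())
```

Keeping compiled objects under a platform-tagged directory lets an ia32 build and an AMD64 build share one NFS checkout without either one mistaking the other's object files for up-to-date output.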

Original issue: http://code.google.com/p/h5py/issues/detail?id=21

h5py user guide in PDF

Original author: [email protected] (March 03, 2009 17:18:58)

Hi,

Would it be possible to add a link somewhere in
http://h5py.alfven.org/docs/ for the pdf version of the documentation?
I see that you used sphinx to generate it, so it shouldn't be hard to
create a PDF file, and it would be very helpful for people working on
private networks (like me and my 1000+ colleagues...).

Thanks very much,
And thank you for this excellent module!

Pierre

Original issue: http://code.google.com/p/h5py/issues/detail?id=33
