Comments (12)

gmarkall commented on June 2, 2024

The error message for "np.vstack(gen())"
TypeError: arrays to stack must be passed as a "sequence" type such as list or tuple.
seems to be OK as per Python's definition of a Sequence.

The part of the message I thought was a problem was the implication that a list must be passed - your second example in the issue report appears to be passing a list but that still doesn't work.

I have seen the function being used in #4271
This led me to the conclusion it might be supported.

I think if any of the snippets in that issue did run at some point, they might have fallen back to object mode - I can't get any of the code in that issue to run at present. A variation on those snippets I can get to run is:

from numba import jit
import numpy as np


@jit(forceobj=True)
def func(x):
    return x[0]


@jit(forceobj=True)
def error():
    topo = [chr(ord('a') + i) for i in range(5)]
    types = [(var_name, "int") for var_name in topo]
    sampled = np.zeros(10, dtype=types)
    return np.fromiter([func(x) for x in sampled], dtype=int)


print(error())

but you don't really want to run in object mode, since its only effect is to remove a tiny bit of interpreter loop overhead within your function.

from numba.

gmarkall commented on June 2, 2024

What you observe above is expected. types.containers.ListType is a type in Numba's type system, whereas the object returned when you create a typed list is an instance of numba.typed.List:

from numba import njit, typed

@njit
def give_a_list():
    return typed.List([1, 2, 3])

print(isinstance(give_a_list(), typed.List))

prints

$ python listtype.py 
True

gmarkall commented on June 2, 2024

Thanks for the nice report - I think this needs labelling as a feature request to support these functions.

I also tried not using the list in the vstack version:

import numpy as np
import numba as nb

@nb.njit
def gen():
    """Generator function returning 1D-arrays."""
    for i in range(3):
        yield np.arange(i, i+3)

@nb.njit
def use_gen_vstack():
    return np.vstack(gen())

print(use_gen_vstack())

and the error message says:

TypeError: arrays to stack must be passed as a "sequence" type such as list or tuple.

which is a bit odd, given that the list created from the generator isn't accepted, as in your original report.
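
Until generators are accepted there, one possible workaround is to preallocate the 2D result and copy each yielded row into it. This is only a sketch and not something proposed in the issue: it assumes the number of rows and the row length are known in advance (n_rows and n_cols are hypothetical parameters), and it falls back to plain Python if Numba cannot be imported.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs without Numba (nothing is compiled then).
    def njit(func):
        return func

@njit
def gen(n_rows, n_cols):
    """Generator yielding 1D arrays of length n_cols."""
    for i in range(n_rows):
        yield np.arange(i, i + n_cols)

@njit
def stack_rows(n_rows, n_cols):
    """Fill a preallocated 2D array row by row from the generator."""
    out = np.empty((n_rows, n_cols), dtype=np.int64)
    row = 0
    for arr in gen(n_rows, n_cols):
        out[row, :] = arr
        row += 1
    return out

print(stack_rows(3, 3))
# [[0 1 2]
#  [1 2 3]
#  [2 3 4]]
```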

OyiboRivers commented on June 2, 2024

@gmarkall , thank you very much for your quick response.
The generator function returns an iterable of 1d-numpy arrays.
The error message for "np.vstack(gen())"
TypeError: arrays to stack must be passed as a "sequence" type such as list or tuple.
seems to be OK as per Python's definition of a Sequence.

from collections.abc import Sequence, Iterable
import numpy as np

def gen():
    for i in range(3):
        yield np.arange(i, i+3)

iterable = gen()
isinstance(iterable, Sequence)
# False
isinstance(iterable, Iterable)
# True

The same error appears if you use pure numpy.

import numpy as np

def gen():
    """Generator function returning 1D-arrays."""
    for i in range(3):
        yield np.arange(i, i+3)

np.vstack(gen())
# TypeError: arrays to stack must be passed as a "sequence" type such as list or tuple.
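
For comparison, in pure NumPy the call goes through once the generator has been materialized into a list or tuple. A short sketch using the same gen as above:

```python
import numpy as np

def gen():
    """Generator function returning 1D-arrays."""
    for i in range(3):
        yield np.arange(i, i + 3)

# Materialize the generator first; list and tuple are both sequences.
stacked_from_list = np.vstack(list(gen()))
stacked_from_tuple = np.vstack(tuple(gen()))

print(stacked_from_list)
# [[0 1 2]
#  [1 2 3]
#  [2 3 4]]
print(np.array_equal(stacked_from_list, stacked_from_tuple))
# True
```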

The overloaded Numba function in arrayobj.py expects a "BaseTuple".

@overload(np.vstack)
def impl_np_vstack(tup):
    if isinstance(tup, types.BaseTuple):
        def impl(tup):
            return _np_vstack(tup)
        return impl

As the function does not receive a BaseTuple as argument it seems to return the original numpy error message.
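
For contrast, passing a tuple of arrays should take the BaseTuple branch of that overload. A minimal sketch; use_tuple_vstack is a hypothetical name, and the try/except only exists so the snippet also runs where Numba is not installed:

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs without Numba (nothing is compiled then).
    def njit(func):
        return func

@njit
def use_tuple_vstack():
    # In nopython mode a tuple of same-typed arrays is a BaseTuple subtype.
    rows = (np.arange(0, 3), np.arange(1, 4), np.arange(2, 5))
    return np.vstack(rows)

print(use_tuple_vstack())
# [[0 1 2]
#  [1 2 3]
#  [2 3 4]]
```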

OyiboRivers commented on June 2, 2024

It seems that numpy.fromiter is not supported in your current version of Numba.
TypingError: Use of unsupported NumPy function 'numpy.fromiter' or unsupported use of the function.
I have seen the function being used in #4271.
This led me to the conclusion it might be supported.

OyiboRivers commented on June 2, 2024

Observation: numpy.asarray fails to generate 2D-array from list of 1D-arrays

Converting the generator into a list of 1D-arrays allows numpy.asarray to return a 2D-array.
Utilizing numpy.vstack would not be necessary for generating a 2D array in NumPy.
The numba implementation of numpy.asarray throws an error.

import numpy as np
import numba as nb

@nb.njit
def gen():
    """Generator function returning 1D-arrays."""
    for i in range(3):
        yield np.arange(i, i+3)

@nb.njit
def gen_as_list():
    """Convert generator to list of 1D-arrays."""
    return nb.typed.List(gen())

@nb.njit
def gen_as_array_nb():
    """Convert generator to 2D array."""
    return np.asarray(gen_as_list())

def gen_as_array_np():
    """Convert generator to 2D array."""
    return np.asarray(gen_as_list())

print(gen_as_array_np())
# [[0 1 2]
#  [1 2 3]
#  [2 3 4]]

print(gen_as_array_nb())
# TypingError: No implementation of function Function(<built-in function asarray>) found for signature:
# asarray(ListType[array(int64, 1d, C)])

If you apply the type checks from @overload(np.asarray) in arraymath.py (from line 4291), the function type_can_asarray rejects the ListType[array(...)].

import numpy as np
import numba as nb
from numba import types
from numba.np.numpy_support import type_can_asarray

@nb.njit
def gen():
    """Generator function returning 1D-arrays."""
    for i in range(3):
        yield np.arange(i, i+3)

@nb.njit
def gen_as_list():
    """Convert generator to list of 1D-arrays."""
    return nb.typed.List(gen())

def gen_type_check():
    a = gen_as_list()
    # type checking as per @overload(np.asarray) in arraymath.py from line 4291
    if not type_can_asarray(a):
        return 1
    if isinstance(a, types.Array):
        return 2
    elif isinstance(a, (types.Sequence, types.Tuple)):
        return 3
    elif isinstance(a, (types.Number, types.Boolean)):
        return 4
    elif isinstance(a, types.containers.ListType):
        return 5
    elif isinstance(a, types.StringLiteral):
        return 6
    else:
        return 7

print(gen_type_check())
# 1

# def type_can_asarray(arr):
#     ok = (types.Array, types.Sequence, types.Tuple, types.StringLiteral,
#           types.Number, types.Boolean, types.containers.ListType)
#     return isinstance(arr, ok)

print(type_can_asarray(gen_as_list()))
# False

print(isinstance(gen_as_list(), types.containers.ListType))
# False

This behavior seems to be related to:
#6803

OyiboRivers commented on June 2, 2024

Unfortunately, the implementation of numpy.fromiter seems to be problematic.

  1. The output array shape and type must be determined by the value of a function argument, "dtype", not by an argument type.
@nb.njit
def np_fromiter_impl(iter, dtype):
    arraylist = nb.typed.List(iter)
    size = len(arraylist)
    out = np.empty(size, dtype=dtype)
    for i in range(size):
        out[i] = arraylist[i]
    return out

dtype = np.dtype(('<i8', (3,)))

print(np_fromiter_impl.py_func(gen(), dtype=dtype))
# [[0 1 2]
#  [1 2 3]
#  [2 3 4]]

print(np_fromiter_impl(gen(), dtype=dtype))
# TypingError: non-precise type pyobject
# During: typing of argument at /tmp/ipykernel_1004317/1459046023.py (1)
  2. You can't specify advanced array types which would be necessary to describe the shape and type of the output array.
@nb.njit
def make_dtype():
    return np.dtype(('<i8', (3,)))

print(make_dtype.py_func())
# ('<i8', (3,))

print(make_dtype())
# TypingError: No implementation of function Function(<class 'numpy.dtype'>) found for signature:
# dtype(Tuple(Literal[str](<i8), UniTuple(int64 x 1)))
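
For context on the py_func result in the first item: in NumPy, a subarray dtype passed to np.empty is absorbed into the shape of the array, which is why a length-3 allocation ends up 2D. A small NumPy-only sketch (sub_dtype is just an illustrative name):

```python
import numpy as np

# Subarray dtype: each element is itself three little-endian int64 values.
sub_dtype = np.dtype(('<i8', (3,)))

out = np.empty(3, dtype=sub_dtype)
print(out.shape)                     # (3, 3)
print(out.dtype == np.dtype('<i8'))  # True, the subarray part moved into the shape

# Assigning a length-3 array to out[i] therefore fills one row of a 2D array.
for i in range(3):
    out[i] = np.arange(i, i + 3)
print(out)
# [[0 1 2]
#  [1 2 3]
#  [2 3 4]]
```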

@gmarkall should I open a separate issue to support advanced data types in the numpy.dtype implementation?

gmarkall commented on June 2, 2024

@gmarkall should I open a separate issue to support advanced data types in the numpy.dtype implementation?

I think that seems like a good thing to do, to keep the discussions simpler to follow in issues - many thanks!

OyiboRivers commented on June 2, 2024

@gmarkall no problem, I will open another issue for advanced use of numpy.dtype as feature request #9527

What about the issue in np.asarray where a typed list is not identified as a ListType?

print(isinstance(gen_as_list(), types.containers.ListType))
# False

This seems weird to me. Should I also open an issue on this matter or is this the expected behavior?

OyiboRivers commented on June 2, 2024

@gmarkall thank you.

The implementation of numpy.asarray converts a typed list of scalars into a numpy array.
The function fails to convert a typed list of 1D-numpy arrays into a 2D-numpy array.
This feature does not seem to be supported. Is that correct?

import numpy as np
from numba import njit, typed

@njit
def give_list():
    return typed.List([1, 2, 3])

@njit
def give_arraylist():
    return typed.List([np.array([1, 2, 3]), np.array([1, 2, 3])])

@njit
def asarray(a):
    return np.asarray(a)

print(asarray(give_list()))
# [1 2 3]

print(asarray(give_arraylist()))
# TypingError: No implementation of function Function(<built-in function asarray>) found for signature:
# asarray(ListType[array(int64, 1d, C)])

gmarkall commented on June 2, 2024

This feature does not seem to be supported. Is that correct?

I think that is correct. The docs are not very clear about what it might accept though: https://numba.readthedocs.io/en/stable/reference/numpysupported.html#other-functions

OyiboRivers commented on June 2, 2024

@gmarkall thank you.
If asarray were able to convert a typed list of 1D-arrays into a 2D-array, that could theoretically make it possible to implement np.fromiter, although I'm not sure about the performance and safety of this operation, since you can't influence the dtype for typed.List(iterable). You may have to cast the final array to the desired dtype.
np.fromiter => np.asarray(typed.List(iterable), dtype)
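
In plain NumPy, the equivalence this mapping relies on can be checked for scalars with an ordinary list standing in for typed.List. A rough sketch, with float64 chosen arbitrarily to show the cast:

```python
import numpy as np

def gen_scalars():
    """Generator function returning scalars."""
    for i in range(3):
        yield i

# Reference behaviour: np.fromiter consumes the iterable directly.
expected = np.fromiter(gen_scalars(), dtype=np.float64)

# Emulation: materialize to a list, then convert with the target dtype.
emulated = np.asarray(list(gen_scalars()), dtype=np.float64)

print(np.array_equal(expected, emulated))  # True
print(emulated.dtype)                      # float64
```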

Should I open an issue to support conversion of a typed list of 1D-arrays into a 2D-array using asarray?

Edit:
You can already implement np.fromiter for scalars using reflected lists in combination with asarray.

import numpy as np
from numba import njit, types
from numba.extending import overload
from numba.core.errors import TypingError

@overload(np.fromiter)
def ovl_fromiter(iter, dtype):
    def np_fromiter_impl(iter, dtype):
        return np.asarray(list(iter), dtype)
    # Type check
    if not isinstance(iter, types.IterableType):
        raise TypingError("First argument must be an iterable.")
    else:
        return np_fromiter_impl

@njit
def gen_scalars():
    """Generator function returning scalars."""
    for i in range(3):
        yield i

@njit
def use_fromiter_scalars(dtype):
    return np.fromiter(gen_scalars(), dtype)

print(use_fromiter_scalars(np.int64))
# [0 1 2]

Unfortunately, this method sometimes works and sometimes fails using typed lists.

from numba import typed  # needed in addition to the imports above

@overload(np.fromiter)
def ovl_fromiter(iter, dtype):
    def np_fromiter_impl(iter, dtype):
        return np.asarray(typed.List(iter), dtype)
    # Type check
    if not isinstance(iter, types.IterableType):
        raise TypingError("First argument must be an iterable.")
    else:
        return np_fromiter_impl

Edit 2:
I can't reproduce the error I received earlier using typed lists. Now both methods work on my machine. Not sure what the problem was.
Both methods fail on generators returning 1D-arrays.

Edit 3:
I've just realized that Numba's np.asarray surprisingly works on iterables of scalars without the intermediate step of converting to a list. NumPy's asarray does not iterate over a generator; it only wraps the generator object, so the two implementations deviate in behavior.

import numpy as np
from numba import njit, types
from numba.extending import overload
from numba.core.errors import TypingError

@overload(np.fromiter)
def ovl_fromiter(iter, dtype):
    def np_fromiter_impl(iter, dtype):
        return np.asarray(iter, dtype)
    # Type check
    if not isinstance(iter, types.IterableType):
        raise TypingError("First argument must be an iterable.")
    else:
        return np_fromiter_impl

@njit
def gen():
    """Generator function returning scalars."""
    for i in range(3):
        yield i

@njit
def use_fromiter(dtype):
    return np.fromiter(gen(), dtype)

print(use_fromiter(np.int64))
# [0 1 2]

print(np.asarray(gen.py_func()))
# <generator object gen at 0x7f7782e64880>
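
What NumPy returns in that last call can be inspected more closely. A small NumPy-only sketch, with g and wrapped as illustrative names:

```python
import numpy as np

def gen():
    """Generator function returning scalars."""
    for i in range(3):
        yield i

g = gen()
wrapped = np.asarray(g)

# NumPy treats the generator as a single opaque object.
print(wrapped.shape)        # ()
print(wrapped.dtype)        # object
print(wrapped.item() is g)  # True, the generator was not consumed

# np.fromiter is the NumPy call that actually iterates over it.
print(np.fromiter(g, dtype=np.int64))  # [0 1 2]
```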
