Choose your favorite numpy-workalike!
This is aimed particularly at DOFArray.array_context, which is AFAIK the only instance of this, but a pervasive one. This change, if implemented, is rather break-the-world-y, so we would have to do this slowly and carefully.
Why is it worth doing? It's forcing arraycontext.container.traversal.{freeze,thaw} to be @singledispatch over the container type, just for a way to remove the stored .array_context. For batch-freezing in the array context, this is inconvenient (see discussion in #158).
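For illustration, a minimal sketch of the dispatch pattern in question, with a toy container type standing in for DOFArray (not arraycontext's actual implementation):

```python
from dataclasses import dataclass
from functools import singledispatch


@dataclass(frozen=True)
class ToyDOFArray:  # hypothetical stand-in for DOFArray
    array_context: object
    data: tuple


@singledispatch
def freeze(ary, actx=None):
    raise TypeError(f"don't know how to freeze {type(ary).__name__}")


@freeze.register
def _freeze_toy_dofarray(ary: ToyDOFArray, actx=None):
    # the per-container registration exists mainly to drop the stored
    # .array_context; batch-freezing in the actx cannot easily hook in here
    return ToyDOFArray(array_context=None, data=ary.data)
```

This shows why the container-stored array context forces one `freeze.register` per container type, which is the inconvenience being described.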
What does it break?
- grudge gets its array context from the arrays it's passed for many operations.
- actx.compile'd code, because that's its sole source of array context.

That seems like a lot of breakage for limited gain...
@alexfikl @kaushikcfd @majosm Thoughts?
x-ref: inducer/grudge#121 (comment)
Currently, kernels of partitioned DAGs all have the same name. It would be helpful for debugging to have different names for each part.
This is a longer-shot companion issue to https://github.com/inducer/pytato/issues/164 (and #100):
Suppose an actx.compile'd function is passed an array container (for the sake of argument, imagine a mirgecom ConservedVars) that somehow holds on to already-computed, dependent state that is then subsequently used in the compile'd function. Right now, any such association is destroyed when "placeholderizing" the inputs. It'd be nice to instead realize that something we're about to (re)compute has already been computed and just use it instead.
A few (substantial) challenges:
In [1]: from arraycontext import _acf
In [2]: actx = _acf()
In [3]: actx.np.sin(3.14)
150 actx = self._array_context
151 prg = _get_scalar_func_loopy_program(actx,
--> 152 c_name, nargs=len(args), naxes=len(args[0].shape))
153 outputs = actx.call_loopy(prg,
154 **{"inp%d" % i: arg for i, arg in enumerate(args)})
AttributeError: 'float' object has no attribute 'shape'
Argh, accidentally pressed Enter. Ignore for the moment. (See below.)
Move PyOpenCLFakeNumpyNamespace (and the linalg bit) to a submodule so that we can import pyopencl globally there and avoid all these repetitive, performance-killing imports:
arraycontext/arraycontext/impl/pyopencl.py, lines 114 to 127 in 322c18f
while still avoiding a hard dependency on pyopencl.
For the following script:

import pyopencl as cl
from arraycontext import (PytatoPyOpenCLArrayContext as BasePytatoArrayContext,
                          to_numpy)

class PytatoArrayContext(BasePytatoArrayContext):
    def transform_loopy_program(self, t_unit):
        return t_unit

cl_ctx = cl.create_some_context()
cq = cl.CommandQueue(cl_ctx)
actx = PytatoArrayContext(queue=cq)

u = actx.freeze(actx.zeros(10, dtype="float64"))
actx.to_numpy(u)   # Works
to_numpy(u, actx)  # Fails!
I get the following error:

TypeError: array of type 'TaggableCLArray' not in supported types (<class 'pytato.array.Array'>,)

My proposal would be to keep arraycontext.container.to_numpy's implementation lean by simply calling actx.to_numpy on every leaf array and letting the to_numpy method raise TypeError whenever appropriate.
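A self-contained toy of the proposed shape (rec_map_leaves and ToyActx are stand-ins for rec_map_array_container and a real array context, not the actual API):

```python
import numpy as np


def rec_map_leaves(f, ary):
    # toy stand-in for rec_map_array_container: recurse into tuples/lists
    # (the "containers"), apply f to everything else (the "leaf arrays")
    if isinstance(ary, (tuple, list)):
        return type(ary)(rec_map_leaves(f, a) for a in ary)
    return f(ary)


class ToyActx:
    def to_numpy(self, ary):
        # the actx-level method owns the type check and raises TypeError
        # for array types it does not handle
        if not isinstance(ary, np.ndarray):
            raise TypeError(
                f"array of type '{type(ary).__name__}' not supported")
        return ary  # already numpy in this toy


def to_numpy(ary, actx):
    # the container-level function stays lean: just map over the leaves
    return rec_map_leaves(actx.to_numpy, ary)
```

The point of the design: the traversal code never needs to know which leaf array types an array context supports.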
Should we support np.select? It's a bit controversial because one could chain np.where to emulate np.select, but I thought this would be a cleaner way to implement inducer/meshmode#355.
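For reference, the chained-np.where emulation mentioned above can be sketched as follows (plain numpy; an array context version would substitute actx.np):

```python
import numpy as np


def select_via_where(condlist, choicelist, default=0):
    # fold np.where from the last condition to the first, so that the
    # first matching condition wins, matching np.select's semantics
    result = np.full_like(np.asarray(choicelist[0]), default)
    for cond, choice in zip(reversed(condlist), reversed(choicelist)):
        result = np.where(cond, choice, result)
    return result


x = np.array([-2, -1, 0, 1, 2])
condlist = [x < 0, x > 0]
choicelist = [-x, 10 * x]
print(select_via_where(condlist, choicelist))  # [ 2  1  0 10 20]
```

Note the emulation takes the result dtype from choicelist[0], whereas np.select computes a common result type; a native implementation could avoid that wrinkle.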
Trying out the outlining capability in #221 with mirgecom:
To reproduce this, install mirgecom:
emirge/install.sh --branch=production-outlining
Then run any example with CNS operator (cd examples):
python -m mpi4py combozzle-mpi.py
The error is as follows:
frozen_inv_metric_deriv_vol: check array access within bounds: started 3s ago
frozen_inv_metric_deriv_vol: check array access within bounds: completed (3.94s wall 1.00x CPU)
frozen_inv_metric_deriv_vol: generate code: completed (2.18s wall 1.00x CPU)
build program: kernel 'frozen_inv_metric_deriv_vol' was part of a lengthy source build resulting from a binary cache miss (2.60 s)
/p/gpfs1/mtcampbe/CEESD/AutomatedTesting/MIRGE-Timing/timing/y3-prediction-testing/emirge/miniforge3/envs/prediction.env/lib/python3.11/site-packages/pyopencl/invoker.py:366: UserWarning: Kernel 'frozen_inv_metric_deriv_vol_0' has 468 arguments with a total size of 3744 bytes, which approaches the limit of 4352 bytes on <pyopencl.Device 'Tesla V100-SXM2-16GB' on 'Portable Computing Language' at 0x101f47d38>. This might lead to compilation errors, especially on GPU devices.
warn(f"Kernel '{function_name}' has {num_cl_args} arguments with "
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/p/gpfs1/mtcampbe/CEESD/AutomatedTesting/MIRGE-Timing/timing/y3-prediction-testing/emirge/miniforge3/envs/prediction.env/lib/python3.11/site-packages/mpi4py/__main__.py", line 7, in <module>
main()
File "/p/gpfs1/mtcampbe/CEESD/AutomatedTesting/MIRGE-Timing/timing/y3-prediction-testing/emirge/miniforge3/envs/prediction.env/lib/python3.11/site-packages/mpi4py/run.py", line 198, in main
run_command_line(args)
File "/p/gpfs1/mtcampbe/CEESD/AutomatedTesting/MIRGE-Timing/timing/y3-prediction-testing/emirge/miniforge3/envs/prediction.env/lib/python3.11/site-packages/mpi4py/run.py", line 47, in run_command_line
run_path(sys.argv[0], run_name='__main__')
File "<frozen runpy>", line 291, in run_path
File "<frozen runpy>", line 98, in _run_module_code
File "<frozen runpy>", line 88, in _run_code
File "combozzle-mpi.py", line 1309, in <module>
main(use_logmgr=args.log, use_leap=args.leap, input_file=input_file,
File "/p/gpfs1/mtcampbe/CEESD/AutomatedTesting/MIRGE-Timing/timing/y3-prediction-testing/emirge/mirgecom/mirgecom/mpi.py", line 200, in wrapped_func
func(*args, **kwargs)
File "combozzle-mpi.py", line 1211, in main
advance_state(rhs=my_rhs, timestepper=timestepper,
File "/p/gpfs1/mtcampbe/CEESD/AutomatedTesting/MIRGE-Timing/timing/y3-prediction-testing/emirge/mirgecom/mirgecom/steppers.py", line 439, in advance_state
_advance_state_stepper_func(
File "/p/gpfs1/mtcampbe/CEESD/AutomatedTesting/MIRGE-Timing/timing/y3-prediction-testing/emirge/mirgecom/mirgecom/steppers.py", line 164, in _advance_state_stepper_func
state = timestepper(state=state, t=t, dt=dt, rhs=maybe_compiled_rhs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/p/gpfs1/mtcampbe/CEESD/AutomatedTesting/MIRGE-Timing/timing/y3-prediction-testing/emirge/mirgecom/mirgecom/integrators/lsrk.py", line 66, in euler_step
return lsrk_step(EulerCoefs, state, t, dt, rhs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/p/gpfs1/mtcampbe/CEESD/AutomatedTesting/MIRGE-Timing/timing/y3-prediction-testing/emirge/mirgecom/mirgecom/integrators/lsrk.py", line 53, in lsrk_step
k = coefs.A[i]*k + dt*rhs(t + coefs.C[i]*dt, state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/p/gpfs1/mtcampbe/CEESD/AutomatedTesting/MIRGE-Timing/timing/y3-prediction-testing/emirge/arraycontext/arraycontext/impl/pytato/compile.py", line 316, in __call__
output_template = self.f(
^^^^^^^
File "combozzle-mpi.py", line 1154, in cfd_rhs
ns_operator(
File "/p/gpfs1/mtcampbe/CEESD/AutomatedTesting/MIRGE-Timing/timing/y3-prediction-testing/emirge/mirgecom/mirgecom/navierstokes.py", line 462, in ns_operator
grad_cv = get_grad_cv(state)
^^^^^^^^^^^^^^^^^^
File "/p/gpfs1/mtcampbe/CEESD/AutomatedTesting/MIRGE-Timing/timing/y3-prediction-testing/emirge/arraycontext/arraycontext/impl/pytato/outline.py", line 159, in __call__
call_site_output = func_def(**call_parameters)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/p/gpfs1/mtcampbe/CEESD/AutomatedTesting/MIRGE-Timing/timing/y3-prediction-testing/emirge/pytato/pytato/function.py", line 172, in __call__
if expected_arg.dtype != kwargs[argname].dtype:
~~~~~~^^^^^^^^^
KeyError: '_actx_in_1_0_mass_0'
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
As an initial guess, it looks like maybe the outlining infrastructure picks apart the array containers down to DOFArrays, but then looks for the function arguments to be the DOFArrays instead of the containers?
For lazy mode, actx.tag(MyTag(), 2*x+3) works great. For eager mode, this doesn't help at all, as the result will already be computed (i.e. the kernel whose transformation would have benefited from the metadata was already executed). Is there a tagging interface that can work for both eager and lazy?
cc @alexfikl @thomasgibson @nchristensen @kaushikcfd
x-ref: inducer/meshmode#176
In full-on combustion mirgecom, temperature is found via Newton iteration in https://github.com/ecisneros8/pyrometheus. This is expensive, and it's made substantially cheaper/more reliable by the availability of a starting guess (typically, the last temperature). In eager eval, this is easy: the last value can be sent "along for the ride" in the ConservedVars and looked at when needed. There are two aspects here:

1. Initially, there is no previous value, so the stored guess may be None. The container arithmetic should be able to take that. (Conceivably we could compute those increments, and the arithmetic might start making sense... but I'm not sure we want to.) So this is the first question: Do we teach with_container_arithmetic about what's needed? How? (Include Optional in the field type spec? More arguments like allow_none=["field1", "field2"]?)
2. The stored value needs to make it into actx.compile'd code. As long as it's a somewhat normal part of an array container, that happens automatically. (But @MTCam: the "just copy it from the template" mode we discussed likely won't work with lazy, because it won't get "placeholderized" upon entry to a compile'd function.)

This is common practice throughout much of meshmode and grudge at the moment. The Pytato array context has to work around it:
arraycontext/arraycontext/impl/pytato/__init__.py
Lines 92 to 94 in 1525116
On the one hand, it's at least a little bit contradictory that this would be OK for call_loopy, but not OK for things like frozen * thawed. On the other hand, the array context is explicit when call_loopy is used. I'm a bit torn. Thoughts?
x-ref: 3bb91f9
A risk I perceive is that the dataclass implements an __eq__ which just goes, hey, let's equality-compare all members, then decides, incorrectly, that, yup, things are equal.
I don't know that it's mishandling stuff (as in, I haven't observed unsafe behavior). I'm writing this issue as a reminder to check (and write a test).
x-ref: https://github.com/inducer/arraycontext/pull/46/files#r659211749
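The hazard is easy to show with a toy dataclass holding a plain numpy field (this Foo is illustrative, not the container in question): the generated __eq__ compares members with ==, which for arrays yields an elementwise result rather than a single bool. With numpy this fails loudly; with other array types the elementwise result could be coerced silently, which is the "incorrectly equal" risk above.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class Foo:  # toy container with an array member
    u: np.ndarray


a = Foo(np.zeros(3))
b = Foo(np.zeros(3))

# the dataclass-generated __eq__ evaluates (a.u,) == (b.u,); the
# elementwise array comparison cannot be reduced to one truth value
try:
    a == b
except ValueError as e:
    print("comparison failed:", e)
```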
Would it be helpful to move this loopy execution from BaseFakeNumpyNamespace to PyOpenCLArrayContext, so that we can remove BaseFakeNumpyNamespace's dependence on having a functional call_loopy, and have it look something like this?
Is there a specific reason why Loopy should be a hard dependency? My understanding was that transform_loopy_program will kick in only when the implementation demands it.
Thanks!
On running:

@with_container_arithmetic(bcast_obj_array=True, rel_comparison=True)
@dataclass_array_container
@dataclass(frozen=True)
class State:
    u: np.ndarray

mystate = State(np.zeros(10))
rec_map_array_container(lambda x: x, mystate)

I get an error saying: ValueError: only object arrays are supported, given dtype 'float64'. Should dataclass_array_container not contain primitive-dtyped numpy arrays?
I was expecting to get a shallow copy of mystate after the traversal.
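For context, my reading of the distinction the error is drawing (not an authoritative statement of the library's rules): object-dtype numpy arrays hold arbitrary Python objects and can be recursed into as containers, while primitive-dtyped arrays are the data itself, with nothing container-like inside.

```python
import numpy as np

# an object array holds arbitrary Python objects (e.g. further arrays);
# a traversal can meaningfully recurse into its entries
obj_ary = np.empty(2, dtype=object)
obj_ary[0] = np.zeros(3)
obj_ary[1] = np.ones(3)

# a primitive-dtyped array is a leaf: its entries are scalars, not
# objects a container traversal could descend into
leaf = np.zeros(10, dtype=np.float64)

print(obj_ary.dtype, leaf.dtype)  # object float64
```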
#186 got me worried that we might not be passing allocators in other places where they're needed.
Maybe make a way to turn off this logic in Loopy:
and then use it?
cc @alexfikl
Based on #60 (comment)
As pointed out by @MTCam in illinois-ceesd/mirgecom#355 (comment):

E TypeError: unsupported operand type(s) for *: 'DOFArray' and 'ConservedVars'

(Here, DOFArrays clearly want to be inside ConservedVars, so there must be a rule to specify who is the "outer" structure.)
The biggest question here is which type should broadcast over which other type, and how the rules are to be communicated.
Some proposals:
- Give with_container_arithmetic a bcast_container_types=(type1, type2) argument.
- A per-class _container_broadcast_priority: the higher, the more "outer". All containers broadcast over all other containers (or at least ones that have this enabled in with_container_arithmetic). Kind of like numpy's __array_priority__.
- A method may_broadcast_over_container_type(other_type).

Some thoughts:
- bcast_obj_array here, unfortunately, casually defines them as "outer". I think we'll have to deprecate that argument, barely a week after it was introduced...
- I like bcast_container_types the best. The priority thing feels restrictive and ill-defined. may_broadcast_over_container_type is probably slow.
- DOFArray has no reason to know about ConservedVars (so it shouldn't have to know who its outer types are). Meanwhile, if we let object arrays participate in the hierarchy, then many types will want them as their outermost type, but we can't ask object arrays who their inner types are.

Thoughts?
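For comparison, numpy's own answer to "who is outer" in mixed operations: a wrapper type can set __array_ufunc__ = None (the modern relative of the __array_priority__ mentioned above) to make ndarray defer to the wrapper's reflected operators. A toy ConservedVars-like wrapper (names hypothetical):

```python
import numpy as np


class CV:  # hypothetical ConservedVars-like "outer" container
    __array_ufunc__ = None  # tell numpy ufuncs to defer to our reflected ops

    def __init__(self, mass):
        self.mass = mass

    def __mul__(self, other):
        # broadcast the operand over our (single) leaf
        return CV(self.mass * other)

    __rmul__ = __mul__


x = np.ones(3)
result = x * CV(2.0)  # ndarray.__mul__ yields NotImplemented -> CV.__rmul__
print(result.mass)    # [2. 2. 2.]
```

The appeal is that the "inner" type (ndarray here) never has to know the "outer" type by name, which is exactly the DOFArray/ConservedVars layering constraint described above.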
@kaushikcfd raised the question after attempting a precise description in a paper.
Here are some aspects:
Maybe that suffices?
Here's a log of the CI failure: https://gitlab.tiker.net/inducer/arraycontext/-/jobs/462367
Host memory grows to 6-ish gigabytes when running this by hand and then keels over. If run with --sw
, this fails as before, but it succeeds upon resuming when rerun. I suspect there's a memory leak here somewhere.
@matthiasdiener Could you take a look?
This is a longer-shot companion issue to https://github.com/inducer/pytato/issues/164 (and #99).
Suppose an actx.compile'd function evaluates something that's memoized (for the sake of argument, imagine a temperature in mirgecom). After the compiled evaluation concludes, the memoization of dependent data becomes unavailable, since we're now working on a different array (/array container) instance. It'd be nice if that weren't the case.
(This isn't immensely relevant for mirgecom, since the output of the compile'd function will typically be fed into a time integrator, which would invalidate any memoized temperature anyhow.)
cc @kaushikcfd
Right now, I think we bake constants into the code no matter what, at considerable expense. (For example, we might recompile every time the time step changes.) Maybe defaulting the other way (replace every numerical constant with a variable) is better?
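A toy sketch of the "replace constants with variables" idea (the expression encoding and cache shape here are hypothetical, just to show the mechanics): normalizing constants out of the expression means one compiled kernel can serve all values of, say, dt.

```python
def parameterize_constants(expr):
    """Replace float literals in a nested-tuple expression with parameter
    slots, returning the normalized skeleton plus the extracted values."""
    values = []

    def walk(e):
        if isinstance(e, float):
            values.append(e)
            return ("param", len(values) - 1)
        if isinstance(e, tuple):
            return tuple(walk(x) for x in e)
        return e

    return walk(expr), values


# two RHS evaluations with different time steps...
skel_a, vals_a = parameterize_constants(("mul", 0.5, "state"))
skel_b, vals_b = parameterize_constants(("mul", 0.25, "state"))

# ...share one skeleton, so a code cache keyed on the skeleton
# would not recompile when only the constant changes
assert skel_a == skel_b == ("mul", ("param", 0), "state")
```

The trade-off, of course, is that baked-in constants can enable better code, so some heuristic (or user control) over which constants get parameterized would likely still be needed.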
I have modified pyrometheus slightly to get a better estimate for Cp. Something like this:
return self._pyro_make_array([ self.usr_np.where(self.usr_np.greater(temperature, 6000.0), 15.857914857900802, self.usr_np.where(self.usr_np.greater(temperature, 1000.0), 6.21785087461938 + 0.0066860330395617*temperature + -1.79535702793147e-06*temperature**2 + 2.18865397063934e-10*temperature**3 + -1.01220733135537e-14*temperature**4, self.usr_np.where(self.usr_np.greater(temperature, 50.0), 3.73312012902641 + -0.00225161088114689*temperature + 2.35442451235782e-05*temperature**2 + -1.37084841614577e-08*temperature**3, 3.6776866372578287)))
where before it was like this
return self._pyro_make_array([ self.usr_np.where(self.usr_np.greater(temperature, 2000.0), 11.3144488322845 + 0.00199411450797709*temperature + -2.99633189643295e-07*temperature**2 + 1.89518716793814e-11*temperature**3 + -3.37402443785122e-16*temperature**4, 3.37402443785122 + 0.00843506109462805*temperature + 2.95775949418238e-07*temperature**2 + -1.43695501015373e-09*temperature**3 + 2.69921955028098e-13*temperature**4),
The biggest difference I can see is the nesting of actx.np.where(). If I run with a lazy array context, this is fine; however, when I run with an eager array context, I get something like:

File "/Users/someguy/work/CEESD/MirgeCom/Drivers/CEESD-Y3_prediction/emirge/arraycontext/arraycontext/impl/pyopencl/fake_numpy.py", line 362, in where
  return rec_multimap_array_container(where_inner, criterion, then, else_)
File "/Users/someguy/work/CEESD/MirgeCom/Drivers/CEESD-Y3_prediction/emirge/arraycontext/arraycontext/container/traversal.py", line 338, in rec_multimap_array_container
  return _multimap_array_container_impl(
File "/Users/someguy/work/CEESD/MirgeCom/Drivers/CEESD-Y3_prediction/emirge/arraycontext/arraycontext/container/traversal.py", line 193, in _multimap_array_container_impl
  return f(*args)
File "/Users/someguy/work/CEESD/MirgeCom/Drivers/CEESD-Y3_prediction/emirge/arraycontext/arraycontext/impl/pyopencl/fake_numpy.py", line 359, in where_inner
  return cl_array.if_positive(inner_crit != 0, inner_then, inner_else,
File "/Users/someguy/software/Install/Conda/envs/mirgeDriver.Y3prediction/lib/python3.11/site-packages/pyopencl/array.py", line 3054, in if_positive
  criterion.queue, criterion.shape, then_.dtype,
AttributeError: 'numpy.bool_' object has no attribute 'queue'
I was able to reproduce it in a small example; it seems to be related to the temperature coming in as a np.float64 vs a plain old float.
small reproducer:
https://gist.github.com/anderson2981/0cc6f47e0b034caef7c76853e07a6b00
When reviewing #91, it occurred to me that, at least for all our current use cases, we're completely fine with "positional" deserialization, and the keys are kind of a waste. Should we get rid of them?
One counterpoint is that the keys do provide metadata that pytato uses to generate better names.
cc @alexfikl @thomasgibson @kaushikcfd Thoughts?
- Add a compile method to ArrayContext. ArrayContext.compile would take a python callable, f, and return a JIT-ed function corresponding to f.
- Uses pytato for lazy evaluation; concretely valued arrays are represented via pyopencl arrays.
- freeze takes a Union[pytato.Array, pyopencl.array] and returns a pyopencl.array without the actx's queue.
- thaw takes a pyopencl.array and returns a pytato.DataWrapper with the actx's namespace and queue attached to the cl array.
- compile returns A: CompiledOperator, which would have a method A.__call__(self, x: np.ndarray[DofArray[PyOpenCLArray]]) -> np.ndarray[DofArray[PyOpenCLArray]]. The pyopencl arrays fed into and out of A.__call__ would still have the actx's command queue.

A failing test log:
_ test_array_context_np_workalike[<_PyOpenCLArrayContextForTests for <pyopencl.Device 'pthread-Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz' on 'Portable Computing Language'>>0-sum-1-complex64] _
[gw2] linux -- Python 3.11.3 /home/runner/work/pytato/pytato/arraycontext/.miniforge3/envs/testing/bin/python3
Traceback (most recent call last):
File "/home/runner/work/pytato/pytato/arraycontext/test/test_arraycontext.py", line 391, in test_array_context_np_workalike
assert_close_to_numpy_in_containers(actx, evaluate, args)
File "/home/runner/work/pytato/pytato/arraycontext/test/test_arraycontext.py", line 307, in assert_close_to_numpy_in_containers
assert_close_to_numpy(actx, op, args)
File "/home/runner/work/pytato/pytato/arraycontext/test/test_arraycontext.py", line 298, in assert_close_to_numpy
assert np.allclose(
AssertionError: assert False
+ where False = <function allclose at 0x7ff320bf0680>(array(-0.20130634-0.00813293j, dtype=complex64), (-0.20131063-0.00812912j))
+ where <function allclose at 0x7ff320bf0680> = np.allclose
+ and array(-0.20130634-0.00813293j, dtype=complex64) = <bound method PyOpenCLArrayContext.to_numpy of <test_arraycontext._PyOpenCLArrayContextForTests object at 0x7ff2f59e3690>>(cl.TaggableCLArray(-0.20130634-0.00813293j, dtype=complex64))
+ where <bound method PyOpenCLArrayContext.to_numpy of <test_arraycontext._PyOpenCLArrayContextForTests object at 0x7ff2f59e3690>> = <test_arraycontext._PyOpenCLArrayContextForTests object at 0x7ff2f59e3690>.to_numpy
+ and cl.TaggableCLArray(-0.20130634-0.00813293j, dtype=complex64) = <function test_array_context_np_workalike.<locals>.evaluate at 0x7ff2f5a40400>(<arraycontext.impl.pyopencl.fake_numpy.PyOpenCLFakeNumpyNamespace object at 0x7ff30583e310>, *[cl.TaggableCLArray([-2.43978456e-01+0.12956458j, 9.78083789e-01+0.5325213j ,\n 1.09432292e+00+0.54306734j, -1.... -4.12956858e+00+0.09275813j,\n 4.27362353e-01-0.51016074j, -1.46767467e-01+0.9700044j ],\n dtype=complex64)])
+ where <arraycontext.impl.pyopencl.fake_numpy.PyOpenCLFakeNumpyNamespace object at 0x7ff30583e310> = <test_arraycontext._PyOpenCLArrayContextForTests object at 0x7ff2f59e3690>.np
+ and (-0.20131063-0.00812912j) = <function test_array_context_np_workalike.<locals>.evaluate at 0x7ff2f5a40400>(np, *[array([-2.43978456e-01+0.12956458j, 9.78083789e-01+0.5325213j ,\n 1.09432292e+00+0.54306734j, -1.12375605e+00+... -4.12956858e+00+0.09275813j,\n 4.27362353e-01-0.51016074j, -1.46767467e-01+0.9700044j ],\n dtype=complex64)])
From https://github.com/inducer/pytato/actions/runs/5062185418/jobs/9087382913
(Maybe this is an arraycontext issue? Not sure.)
Suppose a moderately complex expression is built and evaluated. Then it is evaluated again. Repeated codegen (most likely cached, though see inducer/pytato#163) and repeated evaluation ensues. This isn't ideal. Possibly, a symbolic array, once evaluated, should hold on to its value, to avoid reevaluation. If we don't deal with this somehow, then pytato's arrays can't be meaningfully memoized (e.g. with @memoize/@memoize_method from pytools): sure, they memoize the expression, but the actual evaluation is done again and again.
A simple approach would be to stuff the evaluated value into a hidden attribute (_evaluated?) on the expression nodes. I'm aware that this introduces mutation into otherwise immutable objects... but it's the cache-y kind of mutability, and it doesn't change the semantics of the object in the slightest. So I suspect it's OK.
Another possible concern is the lifetime of the evaluated value (i.e. it might outstay its welcome). I don't think that'll be a big concern because
Generally, codegen should stop at already-evaluated nodes (and treat them as DataWrappers).
@MTCam is hitting this with cache retrievals upon freeze when exercising lazy evaluation for chemistry in https://github.com/illinois-ceesd/mirgecom.
cc @kaushikcfd
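A toy sketch of that cache-y mutation (the expression class and the _evaluated attribute are illustrative, not pytato's actual design):

```python
import operator


class Expr:
    """Toy immutable-ish expression node with a hidden evaluation cache."""

    def __init__(self, op, *args):
        self.op = op
        self.args = args
        self._evaluated = None  # cache-y mutation: filled on first evaluation


EVAL_COUNT = 0


def evaluate(expr):
    global EVAL_COUNT
    if not isinstance(expr, Expr):
        return expr  # a plain value, nothing to do
    if expr._evaluated is None:
        EVAL_COUNT += 1
        expr._evaluated = expr.op(*(evaluate(a) for a in expr.args))
    return expr._evaluated


e = Expr(operator.add, Expr(operator.mul, 2, 3), 4)
assert evaluate(e) == 10
assert evaluate(e) == 10   # second call hits the cache
assert EVAL_COUNT == 2     # each node was computed exactly once
```

Note the object's observable semantics never change; only the cost of re-asking for its value does, which is the argument made above for why the mutation is acceptable.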
Consider this script:

actx = _acf()

@with_container_arithmetic(
    bcast_obj_array=True,
    bcast_numpy_array=True,
    rel_comparison=True,
    _cls_has_array_context_attr=True)
@dataclass_array_container
@dataclass(frozen=True)
class Foo:
    u: DOFArray

foo = Foo(DOFArray(actx, (actx.zeros(3, dtype=np.float64) + 41, )))
print(foo)      # prints: Foo(u=DOFArray((cl.Array([41., 41., 41.]),)))
print(foo + 1)  # prints: Foo(u=DOFArray((cl.Array([42., 42., 42.]),)))
print(foo + actx.from_numpy(np.ones(()))) # ERROR!

I don't see a good reason why we don't support this.
I think we should also go a step ahead and make sure that we broadcast every cl.Array to be added (if legal) with all the leaf arrays of the array container. One way I could see this working is if every ArrayContext comes with an ARRAY_TYPE static, frozen attribute, which gives us a hint to broadcast to all successive arrays.
I'm a little unclear on whether ArrayContainerT is supposed to indicate an array container only, or if it represents either an array container or an underlying array. The documentation suggests (to me) the former, but it seems like it's being used more like the latter (for example, in some of the mapping functions in container/traversal.py). Is there any consensus on what it should mean?
- actx.np.zeros and actx.zeros
- zeros_like doesn't currently retain metadata from its argument, resulting in lazy compile issues.

This is for consistency with the pytato context and for performance: if all that's needed is a device scalar, there's no real need for the device/host/device round-trip.
Unfortunately, that's a rather large compatibility break. One way to do this would be to introduce a flag on the CL array context that changes the behavior, and whose old default value is deprecated. The pytato array context kind of leads the charge here anyway: it requires that to_numpy is called on lazy scalars before they're "normally" usable.