
multipledispatch's Introduction

Multiple Dispatch


A relatively sane approach to multiple dispatch in Python.

This implementation of multiple dispatch is efficient, mostly complete, performs static analysis to avoid conflicts, and provides optional namespace support. It looks good too.

See the documentation at https://multiple-dispatch.readthedocs.io/

Example

>>> from multipledispatch import dispatch

>>> @dispatch(int, int)
... def add(x, y):
...     return x + y

>>> @dispatch(object, object)
... def add(x, y):
...     return "%s + %s" % (x, y)

>>> add(1, 2)
3

>>> add(1, 'hello')
'1 + hello'

What this does

  • Dispatches on all non-keyword arguments
  • Supports inheritance
  • Supports instance methods
  • Supports union types, e.g. (int, float)
  • Supports builtin abstract classes, e.g. Iterator, Number, ...
  • Caches for fast repeated lookup
  • Identifies possible ambiguities at function definition time
  • Provides hints to resolve ambiguities when they occur
  • Supports namespaces with optional keyword arguments
  • Supports variadic dispatch
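The core mechanics behind the bullet points above can be illustrated with a toy dispatcher. This is a simplified sketch, not the library's implementation: the real Dispatcher orders signatures by specificity and caches lookups, while this toy just checks registration order.

```python
# Toy sketch of type-tuple dispatch (NOT the library's implementation):
# signatures map tuples of types to implementations, and a call picks
# the first signature whose types match the arguments via isinstance.
class ToyDispatcher:
    def __init__(self, name):
        self.name = name
        self.funcs = {}  # (type, ...) -> implementation

    def register(self, *types):
        def decorator(func):
            self.funcs[types] = func
            return func
        return decorator

    def __call__(self, *args):
        for sig, func in self.funcs.items():
            if len(sig) == len(args) and all(
                    isinstance(a, t) for a, t in zip(args, sig)):
                return func(*args)
        raise NotImplementedError(self.name)

add = ToyDispatcher('add')

@add.register(int, int)
def _(x, y):
    return x + y

@add.register(object, object)
def _(x, y):
    return "%s + %s" % (x, y)

assert add(1, 2) == 3
assert add(1, 'hello') == '1 + hello'
```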

What this doesn't do

  • Diagonal dispatch:

    a = arbitrary_type()

    @dispatch(a, a)
    def are_same_type(x, y):
        return True
  • Efficient update: The addition of a new signature requires a full resolve of the whole function. This becomes troublesome after you get to a few hundred type signatures.

Installation and Dependencies

multipledispatch is on the Python Package Index (PyPI):

pip install multipledispatch

It is pure Python and depends only on the standard library, making it a lightweight dependency.

License

New BSD. See License file.

multipledispatch's People

Contributors

adamchainz, andrew-peng, certik, cpcloud, d2207197, encukou, eriknw, francesco-ballarin, krastanov, llllllllll, mariusvniekerk, movermeyer, mrocklin, mystic-mirage, orsinium, ossdev07, skirpichev, sobolevn, stevenlees0ht


multipledispatch's Issues

Python 2 benchmark consistently reports min of 0

I got reasonable numbers for Python 3.6, but 2.7 is giving me trouble.

------------------------------------------------------------------------------------------------------------- benchmark: 4 tests ------------------------------------------------------------------------------------------------------------
Name (time in ns)                                       Min                     Max                   Mean                StdDev                 Median                   IQR            Outliers  OPS (Kops/s)            Rounds  Iterations
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_call_multiple_dispatch[val0]          0.0000 (1.0)       21,934.5093 (1.16)        806.1531 (1.11)       525.9095 (1.0)         953.6743 (1.0)          0.0000 (1.0)    17675;30524    1,240.4592 (0.90)      83887           1
test_benchmark_call_single_dispatch[1]               0.0000 (1.0)       18,835.0677 (1.0)         728.6384 (1.0)        539.7561 (1.03)        953.6743 (1.0)        953.6743 (inf)      11750;21    1,372.4229 (1.0)       41528           1
test_benchmark_call_single_dispatch[a]               0.0000 (1.0)       22,172.9279 (1.18)        731.4872 (1.00)       594.9043 (1.13)        953.6743 (1.0)        953.6743 (inf)      24387;70    1,367.0779 (1.00)      83887           1
test_benchmark_add_and_use_instance             46,968.4601 (inf)      148,057.9376 (7.86)     49,903.3053 (68.49)    6,154.7847 (11.70)    48,875.8087 (51.25)    1,192.0929 (inf)      355;1806       20.0388 (0.01)      11097           1
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Idea: Dispatch on a fixed number of arguments

Optionally fix the number of arguments Dispatcher is dispatching on. This would allow accepting extra positional arguments:

collide = Dispatcher('collide', 2)

@collide.register(Rocket, Asteroid)
def death(rocket, asteroid, collision_point=None):
    exit('Game Over')

@collide.register(Asteroid, Asteroid)
def collide_split(a1, a2, collision_point):
    a1.split_at_point(collision_point)
    a2.split_at_point(collision_point)

Currently, extra parameters can be passed as keyword arguments. (I don't think it's really documented anywhere, though; is it?)

Also this would enable compatibility with the "functional form" of singledispatch's register in Python 3, since we could then allow the following:

collide.register(Rocket, FriendlyFire, lambda rocket, fire: 'nothing happens')

This would allow multipledispatch to eventually evolve into a drop-in singledispatch replacement - see #33

Dispatch with optional arguments

It seems to me that there's no way to use dispatch with optional arguments, is there?

@dispatch(int)
def f(x, y=0):
    print("f(%d, %d)" % (x, y))

While the following calls work

f(1)
f(1,y=2)

this does not work

f(1,2)

Issue with using Union type of multiple arguments

Consider f

@dispatch(object, object)
def f(a,b):
    return a*b 

@dispatch(((object, type(None)), (type(None), object)))
def f(a,b):
    return b if b is not None else a

and g:

@dispatch(object, object)
def g(a,b):
    return a*b

@dispatch(type(None), object)
def g(a,b):
    return b

@dispatch(object, type(None))
def g(a,b):
    return a

and ignore the ambiguity warning for g.

Can anyone explain to me what is wrong with f (compare f(1, None) and f(None, 1))? IMHO both calls should be dispatched to the second definition of f and thus should return 1; f(None, 1), however, dispatches to the (object, object) definition. By comparison, g behaves as expected.

Support *args without dispatch

One way of handling *args would be to simply not do matching on that part. This would allow defining n-ary operators in terms of the binary version in the following form:

@dispatch(Expr, Expr)
def add(x, y, *args):
    result = add(x, y)
    for arg in args:
        result = add(result, arg)
    return result

Find docstring and sourcecode of function given particular inputs

If I have some multiple-dispatched function f and some inputs a and b, I might want to find out what f does on a and b. Sadly, help and source on f will point to the general dispatcher, not the particular implementation. Maybe we need something like the following

f.doc(a, b)

Or

f.doc(type(a), type(b))

and

f.source(a, b)

cc @cpcloud

Use mro(), not issubclass to resolve function dispatch

With the current Dispatcher.resolve, the following will always fail:

from multipledispatch import dispatch

class A(object):
    pass

class B(object):
    pass

class AB(A, B):
    pass

class BA(B, A):
    pass

@dispatch(A)
def f(x):
    return 'a'

@dispatch(B)
def f(x):
    return 'b'

assert f(A()) == 'a'
assert f(B()) == 'b'
# One of the following will fail
assert f(AB()) == 'a'
assert f(BA()) == 'b'

Although multiple inheritance is often frowned upon (but not by all), there are situations where it is a good option, and I think it would be awesome if multipledispatch passed the above test.

This can work perfectly with #25 (left-most arguments get top priority) with two changes:

  1. issubclass should not be used in Dispatcher.resolve. Instead, iterate over the types in each argument's MRO until a match is found for that argument.
  2. Dispatcher.ordering should no longer be a list. A nested dictionary would be better, where each level maps based on the nth argument type. This can also scale more efficiently than the current approach.
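The first change can be sketched in a few lines; this is a pure-Python illustration of the proposal for a single argument, not the library's resolve:

```python
# Resolve by walking the argument's MRO instead of testing issubclass
# against registered types: the earliest MRO entry with a registered
# implementation wins, so AB (A first) and BA (B first) differ.
def resolve_by_mro(registry, arg):
    for cls in type(arg).__mro__:
        if cls in registry:
            return registry[cls]
    raise NotImplementedError(type(arg))

class A: pass
class B: pass
class AB(A, B): pass
class BA(B, A): pass

registry = {A: 'a', B: 'b'}
assert resolve_by_mro(registry, AB()) == 'a'  # A precedes B in AB.__mro__
assert resolve_by_mro(registry, BA()) == 'b'  # B precedes A in BA.__mro__
```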

Cythonize `Dispatcher.__call__/resolve`

Dispatched functions currently incur about a microsecond latency. This stops them from being used in tight inner loops. The Dispatcher.__call__/resolve methods could be written in optimized Cython to reduce this duration.

multipledispatch should still be accessible from a Pure Python environment. The Cython code should be optional and should fall back onto the Python version if a C compiler is not present upon installation.

CC @eriknw , in case he's looking for another project to accelerate :)

Subtle scoping issue with dispatch in closures

I have recently been bitten by namespace issues with @dispatch inside a closure. As an example, look at

def f():

    @dispatch(int)
    def inner(x): pass

    @dispatch(str)
    def inner(x): pass

    return inner 

and notice that f() is f() evaluates to True. This really surprised me, because with

def closure():
    def inner():
        pass
    return inner

closure() is closure() is obviously False.

Using the namespace argument of dispatch solves this issue (see below), I think however that

  1. The documentation fails to warn the user about this behaviour, as it only suggests (emphasis mine)

[...] If several projects use this global namespace unwisely then conflicts may arise, causing difficult to track down bugs.

and proposes that one should

[...] establish a namespace for an entire project [...]

which imho is not clear at all on the point of scoping and namespaces.

  2. With multiple dispatching being a concept (more) at home in the world of functional programming, the f() is f() behaviour feels wrong to the point that I would consider this a bug.

  3. The solution, quite frankly, is ugly as hell:

    def makefunc(arg):
        d = dict()
    
        @dispatch(int, namespace=d)
        def inner(x): pass
    
        @dispatch(str, namespace=d)
        def inner(x): pass
    
        return inner 
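A toy model of why the global namespace produces this behaviour (an assumed, heavily simplified mechanism, not the library's actual code):

```python
# With a single global registry keyed by function name, every call to
# the closure fetches and augments the SAME shared object, so
# f() is f() holds; an ordinary closure builds a fresh object per call.
_global_namespace = {}

def toy_dispatcher(name):
    return _global_namespace.setdefault(name, object())

def f():
    inner = toy_dispatcher('inner')  # stand-in for @dispatch's lookup
    return inner

def plain_closure():
    def inner():
        pass
    return inner

assert f() is f()                              # shared global dispatcher
assert plain_closure() is not plain_closure()  # fresh function each call
```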
    

Keyword-argument dispatch

Hello. I ran into needing keyword-argument dispatch in my own project (although it's hardly a project, more a wrapper around Numpy and multipledispatch).

I need to override Numpy functions of the form func(arg1, arg2, kwarg1=default1, kwarg2=default2) (as an example).

Fortunately, I can write separate signatures that accept 2, 3 or 4 arguments.

Unfortunately, these will ignore the types of keyword-arguments, and it needs three signatures to work.

So, here is the API I propose (feel free to discuss)

from multipledispatch import Dispatcher

func1 = Dispatcher('func1')

def func1_default(arg1, arg2, kwarg1=default1, kwarg2=default2):
    pass

func1.add((type1, type2, ('kwarg1', type3), ('kwarg2', type4)), func1_default)

Or,

from multipledispatch import dispatch

@dispatch(type1, type2, ('kwarg1', type3), ('kwarg2', type4))
def func1(arg1, arg2, kwarg1=default1, kwarg2=default2):
    pass

I would have liked something like the following, but unfortunately, it isn't possible due to dicts being unordered in Python < 3.6 and even in 3.6 it being an implementation detail rather than a documented feature:

from multipledispatch import dispatch

@dispatch(type1, type2, kwarg1=type3, kwarg2=type4)
def func1(arg1, arg2, kwarg1=default1, kwarg2=default2):
    pass
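The (name, type) pairs in the proposed signature could be split from the positional types with a small helper; this is a hypothetical sketch of the normalization step, not part of the proposal's code:

```python
# Split a mixed signature into positional types and a keyword-type map.
# A tuple whose first element is a str is treated as a (name, type)
# pair; anything else (including union-type tuples) stays positional.
def normalize(signature):
    positional, keyword = [], {}
    for item in signature:
        if isinstance(item, tuple) and item and isinstance(item[0], str):
            keyword[item[0]] = item[1]
        else:
            positional.append(item)
    return tuple(positional), keyword

pos, kw = normalize((int, str, ('kwarg1', float), ('kwarg2', bytes)))
assert pos == (int, str)
assert kw == {'kwarg1': float, 'kwarg2': bytes}
```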

CC-ing interested parties:

If someone is willing to point me to code, I'll try my hand at a PR if I understand anything. 😅

Add a reference to singledispatch somewhere

@mrocklin wrote:

Multipledispatch supports a few interfaces, one of which is the interface used by singledispatch in Python 3.4's functools.

It would be nice to provide a link to this in the docs and mention it in the design that this is the supported interface.

Upstream integration of some new features

Dear multipledispatch developers,
I have done some customization to multipledispatch to fit the need of one of my libraries.
While some of the changes are very specific to my needs, I think that most of the changes might be interesting to commit upstream.

The patched version of @dispatch decorator is available at this link, as well as some unit tests.

I would like to have your suggestions on which (if any) of my changes you would deem interesting upstream. I try to summarize them here:

  • new features:
    • do not store Dispatchers in a dict, but rather in a module. Note that this BREAKS backward compatibility. The motivation for doing this is strongly related to my specific needs (there is some summary in the multiline comment at line 366), but nonetheless I think that it might be a more natural way to store Dispatchers and to be able to import them. There are a few tests on this: lines 366, 394.

    • allow @dispatch to be applied to classes. Tests at line 500, 647. This feature depends on the backward incompatible change concerning Dispatcher storage inside a module (rather than a dict).

    • allow dispatched functions to be replaced through @dispatch kwargs replaces=... and replaces_if=... . Implementation from line 79. Test at line 459, 669, 715. This feature depends on the backward incompatible change concerning Dispatcher storage inside a module (rather than a dict).

    • implementation of dispatching based on subtypes for dict, list, tuple, etc. This is already discussed in issue #72, and there is also a WIP pull request #69. My implementation is very different from the two aforementioned approaches, and relies on custom-defined classes named list_of(TYPES), tuple_of(TYPES), set_of(TYPES), dict_of(TYPES_FROM, TYPES_TO), where TYPES can be a tuple of types. I had laid out this part before learning about pytypes, so I am open to suggestions on replacing my custom-made classes with pytypes. However, I personally do not like the square brackets in their notation: when declaring a @dispatch to take either int or float, one would use @dispatch((int, float)), where "OR" is denoted by parentheses, while for a pytypes object one would need to write @dispatch(Tuple[int, float]), where "OR" is denoted by square brackets. The notation of my implementation is instead list_of((int, float)). More implementation details:

      • the definition of custom *_of classes goes from line 428 to line 522
      • I have a strong assumption in my code that, whenever I dispatch on iterable, I will always know the subtype. For this reason, I have disabled the plain iterables in a validation stage at line 524. This assumption might be too strong for general purposes, and is definitely NOT backward compatible.
      • when calling the Dispatcher iterable input arguments are converted to the corresponding *_of types in the auxiliary function at line 553. In a similar way tuple expansion has been patched at line 599
      • the implementation of conflict operators for *_of classes is custom made from line 635 to line 729

      There are a few tests on this part:

      • adaptation of upstream tests at lines 128, 144, 158
      • custom conflict operators for *_of classes are tested at lines 224, 286, 342
      • validation that standard iterable types have been disabled is on the test at line 552
      • additional few tests related to class initialization at lines 564, 572, 596
      • tests that subtypes provided to *_of properly account for inheritance at lines 731, 754, 776, 798, 822, 845, 866, 888, 910, 934
      • one additional minor test at line 953
    • have MethodDispatcher account for inheritance. In the current master version, dispatched methods are not inherited. The work from line 192 to line 307 allows dispatched class methods to be inherited and possibly overridden. Tests for this are at line 967, 993, 1002, 1040.

    • allow lambda functions to be passed instead of hardcoded types. The typical use case for which I needed this was in the definition of arithmetic operators, e.g.

      class F(object):
          def __init__(self, arg):
              self.arg = arg
          
          @dispatch(F)
          def __mul__(self, other):
              return self.arg*other.arg
      

      Python interpreter fails on @dispatch(F) because F is not fully defined yet. My proposed solution has thus been to use

      class F(object):
          def __init__(self, arg):
              self.arg = arg
          
          @dispatch(lambda cls: cls)
          def __mul__(self, other):
              return self.arg*other.arg
      

      Note that in this case the evaluation of the signature is delayed to the first time the dispatched method is called.
      Tests for this feature are:

      • basic unit test at 1070
      • a test to show a caveat of the delayed signature computation, line 1096
      • compatibility with inheritance: lines 1113, 1137
    • compatibility with optional default None argument. Implementation at line 732, test at line 1193

  • minor changes:
    • slightly changed the implementation for what concerns annotations. There are a few tests on this:
      • adaptation of upstream tests at lines 196, 208
      • new test for function dispatchers at line 416, 424, 439, 530
      • new test on wrong number of annotation being provided, line 1218
    • new overload decorator, which can be called as @overload rather than @overload() if the signature is provided through annotations. Implementation at line 419, test at line 1169. This is essentially a duplicate of @dispatch, so I might be willing to rename it back to dispatch before pushing upstream.
    • additional minor tests related to calling method from class rather than instance at line
  • minor changes that break backward compatibility, and thus may not be worthy to push upstream:
    • enforce an error rather than a warning in case of ambiguity. See lines 47, 58 for the implementation. See test at line 104.
    • slight customization of the error which is raised in case a signature is not available. See lines 36, 161 for the implementation.

Apologies for the long post, necessary to clearly present all features of the patched version of the library in order to understand from you which features might be of general interest.

ping @mrocklin @hameerabbasi @llllllllll @mariusvniekerk @shoyer who were involved in the aforementioned issues and pull requests.

function overloading with keyword arguments

Hello. I was trying to achieve function overloading with keyword arguments and ran into an issue.

In [3]: from multipledispatch import dispatch

In [4]: class abc():
    def __init__(self):
        print('class abc')
    @dispatch(a=str, b=int)
    def x(self, a, b):
        print('x')
    @dispatch(a=str)
    def x(self, a):
        print('only string')
   ...:

In [5]: y = abc()
class abc

In [6]: y.x(a='a')
only string

In [7]: y.x(a='a', b=1)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-51c5936a9f65> in <module>()
----> 1 y.x(a='a', b=1)

/phaedrus/home/athamma/eureka_repo/eureka_perf/PerfRobo/env2.7/lib/python2.7/site-packages/multipledispatch/dispatcher.pyc in __call__(self, *args, **kwargs)
    360             raise NotImplementedError('Could not find signature for %s: <%s>' %
    361                                       (self.name, str_signature(types)))
--> 362         return func(self.obj, *args, **kwargs)
    363
    364

TypeError: x() got an unexpected keyword argument 'b'

License file in pypi archive

The pypi source archive isn't including the LICENSE.txt file. Would it be possible to add it? It is very helpful when packaging this for Linux distributions. Thank you.

str vs unicode

I've got problems making @dispatch(str) work. It doesn't match unicode strings, while @dispatch(unicode) doesn't match ordinary strings :-(. Perhaps a virtual type that would match both would be helpful in this common case.

Cutting a release

There have been quite a few improvements in multipledispatch since the last release.
Is it time to make a 0.4.10 / 0.5.0 with the import time improvements and basic py3 type annotation support?

@mrocklin @llllllllll

Add support for dispatching to descriptors

Consider this example:

from multipledispatch import dispatch

class Frob(object):
    @dispatch(int)
    @classmethod
    def bar(cls, things):
        ...

This will fail with AttributeError: 'classmethod' object has no attribute '__name__', and similarly with staticmethod and any other descriptor that doesn't proxy access to __name__ (property aside).

There's a few ways to address this issue:

  1. The simple and easy way: Simply type check for classmethod and staticmethod. There's nothing wrong with this, unless someone uses their own descriptors (say a caching/memoizing descriptor). In which case, they're out of luck for using multiple dispatch with it.
  2. The extendable but slightly questionable way -- this works, as I implemented functools.singledispatch as a descriptor using this method; however, it chokes on staticmethod in Python 2.7 (due to the bound/unbound distinction there).
  3. The middleground, where a registry of allowed descriptors is kept. Pros: the check just becomes isinstance(method, one_of_these_descriptor_types) and it won't choke on unraveling staticmethod. Cons: global state, or passing a registry to dispatch and its subordinates.

Serialization with Pickle

This test fails

from multipledispatch.dispatcher import Dispatcher
import pickle

f = Dispatcher('f')

@f.register(int)
def inc(x):
    return x + 1

@f.register(float)
def dec(x):
    return x - 1

def test_serialize_register():
    pickle.dumps(f)

The issue is that inc and dec have been decorated and so aren't what they claim to be. Pickle doesn't interact well with decorated functions.

See this simpler example showing the limitations of pickle.

def identity(f):
    def newf(*args, **kwargs):
        return f(*args, **kwargs)
    return newf

@identity
def foo(x):
    return x

def test_serialize_simple():
    pickle.dumps(foo)

Dill does fine though.
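For the simple identity example above, copying function metadata with functools.wraps is enough to make pickle-by-reference work again. This is a sketch of the general decorator fix; whether it helps Dispatcher itself is a separate question:

```python
import functools
import pickle

# functools.wraps copies __module__ and __qualname__ onto the wrapper,
# so pickle's lookup-by-name finds the decorated function again.
def identity(f):
    @functools.wraps(f)
    def newf(*args, **kwargs):
        return f(*args, **kwargs)
    return newf

@identity
def foo(x):
    return x

restored = pickle.loads(pickle.dumps(foo))
assert restored(3) == 3
```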

Eventlet threads?

Is this safe to use with eventlet threads?

I seem to be getting some weird behavior, and I'm not sure whether it's my code or the multipledispatch.

Better ambiguity resolution based on parameter order

Consider the ambiguity example from the documentation:

@dispatch(object, object)
def f(x, y):
    return x + y

@dispatch(object, float)
def f(x, y):
    """ Square the right hand side if it is a float """
    return x + y**2

@dispatch(float, object)
def f(x, y):
    """ Square left hand side if it is a float """
    return x**2 + y

The call f(2.0, 10.0) is considered ambiguous, and one of the last two f functions will be used arbitrarily. This particular example doesn't need to be arbitrary, although I think an ambiguity warning should still be issued.

The strategy I propose is to give higher precedence to earlier parameters that match exactly. Hence, the function with signature (float, object) would be used. To illustrate, the keys for the three f functions could be (1, 1), (1, 0), and (0, 1) respectively, and the function with the smallest key is selected. I think this strategy is used in other languages; I'll try to find a supporting reference.

It would be straightforward to add this to Dispatcher.resolve, but, as currently structured, it will require iterating over every signature in self.ordering.
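The scoring idea can be sketched as follows (my reading of the proposal, with 1 marking an exact match and left-most positions compared first; not library code):

```python
# Score each candidate signature by per-position exact matches against
# the call's argument types; lexicographic comparison gives left-most
# parameters the highest precedence.
def exactness(signature, argtypes):
    return tuple(int(t is a) for t, a in zip(signature, argtypes))

sigs = [(object, object), (object, float), (float, object)]
call = (float, float)

best = max(sigs, key=lambda s: exactness(s, call))
assert best == (float, object)  # left-most exact match wins
```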

Flatten example for strings does not work

I tried the example in the documentation for flatten, which works for the first examples but fails on string:

RuntimeError: maximum recursion depth exceeded while calling a Python object

The problem is, I guess, that string is a subclass of Iterable:

In [27]: isinstance("string", Iterable)
Out[27]: True

When iterating through a string, each element is again a string:
In [26]: for c in "string":
....:     print("{} : {}".format(c, c.__class__))
....:
s : <class 'str'>
t : <class 'str'>
r : <class 'str'>
i : <class 'str'>
n : <class 'str'>
g : <class 'str'>

This creates an endless loop in the dispatch.

Has the example not been tested, or do I do something wrong?
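The usual fix is to handle the str base case before the generic Iterable case; sketched here in plain Python rather than with @dispatch:

```python
from collections.abc import Iterable

# The str base case must come first: iterating a str yields length-1
# strs, which would otherwise recurse forever through the Iterable case.
def flatten(x):
    if isinstance(x, str):
        return [x]
    if isinstance(x, Iterable):
        return [y for item in x for y in flatten(item)]
    return [x]

assert flatten([1, [2, "ab"], 3]) == [1, 2, "ab", 3]
assert flatten("ab") == ["ab"]
```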

Should multiple dispatch handle NotImplementedErrors intelligently?

Should we continue searching for valid implementations if the function we call raises a NotImplementedError?

Example

@dispatch(object)
def foo(x):
    return 'foo'

@dispatch(int)
def foo(x):
    if x % 2 == 0:
        return 'even!'
    else:
        raise NotImplementedError()

>>> foo('hello')
'foo'
>>> foo(4)
'even!'

>>> foo(3)  # old behavior
Exception...
>>> foo(3)  # proposed new behavior
'foo'
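The proposed behavior can be sketched independently of the library: candidates are tried from most to least specific, falling through on NotImplementedError.

```python
# Try implementations in order, moving to the next candidate whenever
# one signals NotImplementedError (the proposed new behavior).
def call_with_fallback(candidates, *args):
    for func in candidates:
        try:
            return func(*args)
        except NotImplementedError:
            continue
    raise NotImplementedError('no implementation accepted the arguments')

def foo_int(x):
    if x % 2 == 0:
        return 'even!'
    raise NotImplementedError

def foo_object(x):
    return 'foo'

assert call_with_fallback([foo_int, foo_object], 4) == 'even!'
assert call_with_fallback([foo_int, foo_object], 3) == 'foo'
```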

Thoughts on multiple dispatch for object creation?

I tripped over this recently: I wanted to use multiple dispatch to handle the instantiation of different classes -- I tried something like

class Foo(object):
    def __init__(self, a):
        pass

class Bar(object):
    def __init__(self, a, b):
        pass

@dispatch((Foo, float, int))
def build(cls):
    return cls(0)

@dispatch(Bar)
def build(cls):
    return cls(1, 2)

build(Foo)

This doesn't work, because dispatch looks at the types of the inputs, and the type of Foo is type not Foo. For my code to work properly I would need reference instances for each type, which is the thing I was trying to abstract into build in the first place.

I assume that dispatch is using isinstance, whereas the lookup logic I want in this case is issubclass. Do you think there's value in being able to toggle which method to use (maybe as a dispatch kwarg), or is this too special-purpose?

Improve compatibility with singledispatch

This was discussed in #23, which probably wasn't the best place. The takeaway from there is:

Seems like there would be some value to having multipledispatch be a drop-in replacement for singledispatch.

Here's what would need to happen for that:

  • Have the Dispatcher.register decorator return the undecorated function, to enable stacking

  • Provide a Dispatcher.dispatch method to return the implementation for a signature (what resolve does currently, but with a slightly different signature -- taking *args instead of a tuple)

  • Provide a decorator that constructs Dispatcher from a function, copying its name, docstring, and signature, and registering the function as "default implementation" (the signature copying might be tricky in Python 2, and I'm not sure if registering of a default is desirable in the multipledispatch case)

  • Provide a singledispatch decorator :)

  • Provide the "functional form" of @register. This might not actually be possible -- singledispatch allows something like the following, where multipledispatch would consider (int, str) a type signature (unless it remembers the number of arguments it dispatches on? Still it would be confusing):

    @prettyprint.register(list)
    def print_list(x):
        return '[%s]' % ', '.join(prettyprint(e) for e in x)
    
    prettyprint.register(int, str)
    
  • Use a resolution algorithm compatible with singledispatch (C3 linearization with ABCs taken into account, invalidation based on ABC cache token). (the singledispatch backport does this using private ABCMeta API for Python < 3.4)

  • Provide a read-only registry attribute that maps signatures to implementations.

  • "Faced with ambiguity, resist the temptation to guess." Raise RuntimeError rather than AmbiguityWarning on conflicts by default. (Maybe a per-Dispatcher policy for this?)

Improving introspection for functions decorated with dispatch()

The root cause of several other issues like breaking pickle (#19) and help() and source() not working (#41, #30) are due to closure-based decorators breaking introspection. Graham Dumpleton discusses this in detail on his blog at http://blog.dscpl.com.au/search/label/decorators, but the short version is that you need a transparent object proxy as a wrapper. Dumpleton has written a library called wrapt, https://pypi.python.org/pypi/wrapt, that includes a transparent object proxy in both pure-Python and C versions and some tools for building decorators that preserve introspection. This would also improve compatibility with singledispatch by preserving name, docstring, and signature (#33) and might allow multiple dispatch to work with class and static methods. If there's interest, I could probably use wrapt to fix these problems, but I don't know if adding an additional dependency is verboten.

Remove duplicate unit tests in Python 3 annotations branch

And identify key differences to test.

From @mrocklin:
"""
Rather than duplicate the entire test suite I suggest that we select a few tests that ensure that Python 3 annotations are correctly interpreted as type signatures. While copying the entire suite is comprehensive I don't think we should expect future contributions to make two tests for every behavior. Ideally the code is set up so that once the signatures are ingested there is no difference between the two approaches to be tested.
"""

Per the conversation here:
https://github.com/mrocklin/multipledispatch/pull/4/files#diff-adc1e6d88c0a054cafbf357b7c11c941R183

This can be done in the following branch:
https://github.com/oubiwann/multipledispatch/tree/python3-annotations

No coverage

The coverage badge shows "unknown" status (gray). The other badges in README.rst are also broken.

Personally, I would also suggest using codecov.io instead of coveralls.io.

Stack dispatch decorators

We should be able to simultaneously register multiple signatures by stacking dispatch decorators

@dispatch(int, float)
@dispatch(float, int)
def f(x, y):
    pass
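Stacking works whenever the registering decorator returns the original function instead of a wrapper; a toy illustration (not the library's code):

```python
# Each decorator records a signature and returns the undecorated
# function, so the next decorator in the stack sees and registers the
# same function object.
class ToyRegistry:
    def __init__(self):
        self.sigs = []

    def dispatch_on(self, *types):
        def decorator(func):
            self.sigs.append((types, func))
            return func  # returning func makes the decorators stackable
        return decorator

d = ToyRegistry()

@d.dispatch_on(int, float)
@d.dispatch_on(float, int)
def f(x, y):
    pass

assert len(d.sigs) == 2
assert d.sigs[0][1] is d.sigs[1][1] is f
```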

Add signature performance

Currently whenever you add a new signature we recompute the dependency graph and the toposort. This gets expensive once a function has a few hundred registered type signatures. This cost could be significantly reduced if we kept the graph around and only computed the toposort when necessary.

better support for metaclass

It would be interesting to better support metaclasses

Feature request

For example, one may want to create a function whoami which returns different results when running whoami(int) and whoami(str). (PS, I meant literally whoami(int), not whoami(1))

@dispatch(type(str))
def whoami(x):
    print("you are a string")

@dispatch(type(int))
def whoami(x):
    print("you are a int")

It would work if type(str) and type(int) had different types. Unfortunately, they don't; they are both type.

It could, however, be accomplished by using a different type function. Consider

class DataType(type):
    def __instancecheck__(self, instance):
        return self.t == instance

    def __subclasscheck__(self, subclass):
        return isinstance(subclass, DataType) and self.t == subclass.t


_data_types = {}

def MyType(t):
    if isinstance(t, type) and t is not object:
        if t not in _data_types:
            _data_types[t] = DataType(
                "{}_type".format(t.__name__),
                (type,),
                {"t": t, "__new__": lambda cls: cls.t})
        return _data_types[t]
    else:
        return type(t)

With this new type function, I could determine the type of str and int with a new system.

isinstance(str, MyType(str))
#> True
MyType(str) == MyType(int)
#> False

Finally, it allows me to do what I originally wanted to do

@dispatch(MyType(str))
def whoami(x):
    print("you are a string")

@dispatch(MyType(int))
def whoami(x):
    print("you are an int")

whoami(int)
#> you are an int
whoami(str)
#> you are a string

Changes required in multipledispatch

In the above example, I would need to replace the type call on this line with MyType, though.

In all, I am requesting a customizable type function when the dispatcher is created.

Support for annotated functions?

@dispatch(float)
def foo(a: float): 
    pass

will get an error: Function has keyword-only arguments or annotations, use getfullargspec() API which can support them

From multipledispatch/core.py, line 79: spec = inspect.getargspec(func)
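A likely fix, as the error message itself suggests, is to switch to inspect.getfullargspec, which handles annotated and keyword-only signatures (getargspec was deprecated and ultimately removed in Python 3.11):

```python
import inspect

def foo(a: float):
    pass

# getargspec() raised on annotated functions; getfullargspec() does not.
spec = inspect.getfullargspec(foo)
print(spec.args)         # ['a']
print(spec.annotations)  # {'a': <class 'float'>}
```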

Dispatching to list subtypes

As a (temporary?) alternative to full support for Python typing (#69), I'd like to propose adding multipledispatch.TypedList. The use case here is separate functions for different list subtypes, e.g., lists of strings vs. lists of integers. See pydata/xarray#1938 for discussion.

I have an example implementation here and am happy to work on putting together a PR if desired:
https://colab.research.google.com/drive/18zdyUpWLNFzFaz08GUOC5vs1GxE_jHg-#scrollTo=XDL0cBeS-lub

Example usage:

@dispatch(TypedList[int])
def f(args):
  print('integers:', args)

@dispatch(TypedList[str])
def f(args):
  print('strings:', args)

@dispatch(TypedList[str, int])
def f(args):
  print('mixed str-int:', args)

f([1, 2])  # integers: [1, 2]
f([1, 2, 'foo'])  # mixed str-int: [1, 2, 'foo']
f(['foo', 'bar'])  # strings: ['foo', 'bar']
f([[1, 2]])  # NotImplementedError: Could not find signature for f: <TypedList[list]>

The exact public-facing API is up for discussion. I'm tentatively calling this TypedList for clarity and to distinguish it from typing.List (it is actually equivalent to typing.List[typing.Union[...]]).
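For reference, the core mechanics can be sketched in a few lines with a metaclass whose `__instancecheck__` inspects the elements. All names below are illustrative, not the linked implementation, and a real version would also need `__subclasscheck__` so the dispatcher can order signatures:

```python
# Illustrative sketch of TypedList: a parametric class whose instance
# check verifies every element's type.
_typedlist_cache = {}

class _TypedListMeta(type):
    def __instancecheck__(cls, obj):
        return isinstance(obj, list) and all(
            isinstance(el, cls.element_types) for el in obj)

class TypedList(metaclass=_TypedListMeta):
    element_types = (object,)

    def __class_getitem__(cls, types):
        # TypedList[int] and TypedList[str, int] both work; results are
        # cached so repeated subscription yields the same class object.
        if not isinstance(types, tuple):
            types = (types,)
        if types not in _typedlist_cache:
            name = "TypedList[%s]" % ", ".join(t.__name__ for t in types)
            _typedlist_cache[types] = _TypedListMeta(
                name, (cls,), {"element_types": types})
        return _typedlist_cache[types]
```

With this, `isinstance([1, 2], TypedList[int])` is True while `isinstance([1, "a"], TypedList[int])` is False.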

How to dispatch for generic types?

I'm trying to write wrappers for different "Tensor"-like objects and give them a unified interface. One typical use case is a unified shape function for np.ndarray and tf.Tensor; the problem is that tf.Tensor's shape property does not return a list/tuple.

import tensorflow as tf
import numpy as np
from typing import Generic, List, Optional, TypeVar

T = TypeVar('T')
class Tensor(Generic[T]):
    def __init__(self, data: T):
        self.data = data

@dispatch()
def shape(t: Tensor[np.ndarray]) -> List[int]:
    return list(t.data.shape)

@dispatch()
def shape(t: Tensor[tf.Tensor]) -> List[Optional[int]]:
    return t.data.shape.as_list()

To make things easier, we may first consider a minimal example: can we make the following code work?

@dispatch()
def foo(xs: List[int]):
    return [x**2 for x in xs]

@dispatch()
def foo(xs: List[str]):
    return [x + '1' for x in xs]

foo([1,2,3]) # [1, 4,  9]
foo(['1', '2']) # ['11', '12']

I've noticed these problems:

  • Generic types in typing cannot be used with issubclass or isinstance;
  • the covariance/contravariance/invariance problem: this might be marked with TypeVar, but how would it be used in multipledispatch? (Julia seems to have a good solution for this)
  • TypeVar information is not kept after __init__; there may be no actual difference between objects of different concrete types after instantiation, e.g.:
from typing import Generic, TypeVar

T = TypeVar('T')
class D(Generic[T]):
    def __init__(self, data: T):
        self.data = data

d = D[int](1)
s = D[str]('s')
type(d) is type(s) # True

There might be many more problems than I've noticed. A solution I can imagine is creating a new Generic-like metaclass and letting it produce truly concrete types when subscripted, like Generic[T] (caching may be needed).

IMO, the type system might not be expressive enough in Python. That is fine if we don't use multipledispatch, but it becomes inconvenient if we use multipledispatch heavily in a program.

I have not seriously used or studied type theory yet, so this might be entirely a misunderstanding.
So,

  • is this a "true" issue, or just a misunderstanding/wrong usage of Generic or multipledispatch?
  • if it is a proper issue, is there a solution/workaround for it?
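The workaround described above (a metaclass or base class that produces cached, truly concrete types on subscription) can be sketched with stdlib tools; `Parametric` is a made-up name, not part of typing:

```python
# Stdlib sketch: subscription produces a cached, distinct concrete class,
# so the type parameter survives instantiation and isinstance checks.
_param_cache = {}

class Parametric:
    param = object

    def __class_getitem__(cls, param):
        key = (cls, param)
        if key not in _param_cache:
            _param_cache[key] = type(
                "%s[%s]" % (cls.__name__, param.__name__),
                (cls,), {"param": param})
        return _param_cache[key]

class D(Parametric):
    def __init__(self, data):
        self.data = data

d = D[int](1)
s = D[str]("s")
print(type(d) is type(s))  # False -- unlike plain Generic[T]
```

Because D[int] and D[str] are now genuinely different classes (that both subclass D), isinstance-based dispatch can tell them apart.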

Adding a convenient, shorter syntax variant for Dispatcher

I often encounter use cases where I want to dispatch just a few different types to very short functions (such that lambdas would suffice). Take for example:

@dispatch(int)
def f(x): return 2*x

@dispatch(str)
def f(x): return 3*x

What is, IMHO, missing is a nicer, shorter syntax. I propose something like

f = Map(int=lambda x: 2*x, str=lambda x: 3*x)

where Map is

def Map(**kwargs):
    f = Dispatcher('lambda')
    for key, func in kwargs.items():
        # the keyword name ('int', 'str', ...) is evaluated to the type
        f.register(eval(key))(func)
    return f
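For comparison, here is a self-contained variant of the same idea that avoids eval() by resolving the keyword names against the builtins module. This is a sketch of the syntax, not a proposed implementation (it does linear isinstance checks instead of real dispatch):

```python
import builtins

# Stdlib-only sketch of the proposed Map helper.
def Map(**kwargs):
    # 'int' -> builtins.int, 'str' -> builtins.str, etc.
    table = {getattr(builtins, name): func for name, func in kwargs.items()}
    def dispatcher(x):
        for typ, func in table.items():
            if isinstance(x, typ):
                return func(x)
        raise NotImplementedError(type(x))
    return dispatcher

f = Map(int=lambda x: 2 * x, str=lambda x: 3 * x)
print(f(2))    # 4
print(f("a"))  # aaa
```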

help() is not useful for functions that use multipledispatch

If I'm working in the interpreter and use help() on a function that uses multipledispatch, it only returns information about multipledispatch itself. However, in IPython, the ? magic returns useful help for such a function. For example:

Results from help(blaze.into)

In [1]: import blaze

In [2]: help(blaze.into)

Help on Dispatcher in module multipledispatch.dispatcher object:

class Dispatcher(__builtin__.object)
 |  Methods defined here:
 |  
 |  __call__(self, *args, **kwargs)
 |  
 |  __getstate__(self)
 |  
 |  __init__(self, name, doc=None)
 |  
 |  __repr__ = __str__(self)
 |  
 |  __setstate__(self, d)
 |  
 |  __str__(self)
 |  
 |  add(self, signature, func, on_ambiguity=<function ambiguity_warn>)
 |      Add new types/method pair to dispatcher
 |      
 |      >>> D = Dispatcher('add')
 |      >>> D.add((int, int), lambda x, y: x + y)
 |      >>> D.add((float, float), lambda x, y: x + y)
 |      
 |      >>> D(1, 2)
 |      3
 |      >>> D(1, 2.0)
 |      Traceback (most recent call last):
 |      ...
 |      NotImplementedError: Could not find signature for add: <int, float>
 |      
 |      When ``add`` detects a warning it calls the ``on_ambiguity`` callback
 |      with a dispatcher/itself, and a set of ambiguous type signature pairs
 |      as inputs.  See ``ambiguity_warn`` for an example.
 |  
 |  dispatch(self, *types)
 |      Deterimine appropriate implementation for this type signature
 |      
 |      This method is internal.  Users should call this object as a function.
 |      Implementation resolution occurs within the ``__call__`` method.
 |      
 |      >>> from multipledispatch import dispatch
 |      >>> @dispatch(int)
 |      ... def inc(x):
 |      ...     return x + 1
 |      
 |      >>> implementation = inc.dispatch(int)
 |      >>> implementation(3)
 |      4
 |      
 |      >>> print(inc.dispatch(float))
 |      None
 |      
 |      See Also:
 |          ``multipledispatch.conflict`` - module to determine resolution order
 |  
 |  dispatch_iter(self, *types)
 |  
 |  register(self, *types, **kwargs)
 |      register dispatcher with new implementation
 |      
 |      >>> f = Dispatcher('f')
 |      >>> @f.register(int)
 |      ... def inc(x):
 |      ...     return x + 1
 |      
 |      >>> @f.register(float)
 |      ... def dec(x):
 |      ...     return x - 1
 |      
 |      >>> @f.register(list)
 |      ... @f.register(tuple)
 |      ... def reverse(x):
 |      ...     return x[::-1]
 |      
 |      >>> f(1)
 |      2
 |      
 |      >>> f(1.0)
 |      0.0
 |      
 |      >>> f([1, 2, 3])
 |      [3, 2, 1]
 |  
 |  reorder(self, on_ambiguity=<function ambiguity_warn>)
 |  
 |  resolve(self, types)
 |      Deterimine appropriate implementation for this type signature
 |      
 |      .. deprecated:: 0.4.4
 |          Use ``dispatch(*types)`` instead
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  doc
 |  
 |  funcs
 |  
 |  name
 |  
 |  ordering

Results from ?blaze.into

Type:       Dispatcher
String Form:<dispatched into>
File:       /usr/local/lib/python2.7/dist-packages/multipledispatch/dispatcher.py
Definition: blaze.into(self, *args, **kwargs)
Docstring:
Multiply dispatched method: into
    0  Alice  100
    1    Bob  200

Inputs: <Collection, JSON>
---------------------------
into function which converts TSV/CSV/JSON into a MongoDB Collection
    Parameters
    ----------
    if_exists : string
        {replace, append, fail}
    json_array : bool
        Accepts the import of data expressed with multiple MongoDB documents within a single JSON array.

Inputs: <Collection, CSV>
--------------------------
Convert from TSV/CSV into MongoDB Collection

    Parameters
    ----------
    if_exists : string
        {replace, append, fail}
    header: bool (TSV/CSV only)
        Flag to define if file contains a header
    columns: list (TSV/CSV only)
        list of column names
    ignore_blank: bool
        Ignores empty fields in csv and tsv exports. Default: creates fields without values
...

Bugfix release?

@mrocklin, could you do at least a bugfix release that includes the py3.6 fixes (#58)? The latest release is more than a year old.

Adding predicative dispatch and annotations

This is more an informational note than a real issue, most likely. I have forked the project to https://github.com/ContinuumIO/multipledispatch. I might also rename the fork, but I'm not sure what name I would use yet.

There are two major features I'd like to add, but these quite likely are not things you'd want in this project itself; I've forked to the ContinuumIO namespace in case I find time for them. The first change is that I'd like to (optionally) support annotations. For example, whereas the current spelling is:

@dispatch(int, int)
def ceil_add(i, j):
    return i+j

@dispatch(float, float)
def ceil_add(x, y):
    from math import ceil
    return int(ceil(x+y))

I'd like to be able to spell this in a manner more-or-less consistent with PEP 484

@multimethod
def ceil_add(i: int, j: int) -> int:
    return i+j

@multimethod
def ceil_add(x: float, y: float) -> int:
    return int(ceil(x+y))

multimethod may or may not be just another name for dispatch, but it would certainly share most of the same machinery.


The second thing was inspired by a group discussion involving Peter Wang. He spoke affectionately of Picture clauses in COBOL (or at least the concept: https://en.wikipedia.org/wiki/COBOL#PICTURE_clause). That is, Peter likes the idea of the function following the "shape" of the data, not only its type. A million years ago, Phillip J Eby's PEAK did something like this, i.e. "predicate dispatch" and I've been wanting to implement this in a more modern context for a long time (See http://gnosis.cx/publish/programming/charming_python_b22.html for a discussion). I had an even more ancient multimethods package that I once wrote (http://gnosis.cx/download/gnosis/magic/multimethods.py), but starting with this project feels like it makes more sense.

So in particular, I'd like to be able to write something like this:

from operator import mul
from functools import reduce
from math import sqrt, e, pi as π

@multimethod
def approx_factorial(n: Predicate["int < 100_000"]) -> int:
    return reduce(mul, range(1,n+1), 1)

@multimethod
def approx_factorial(n: int) -> int:
    return int(sqrt(2*π*n)) * int(n/e)**n

Note: This is a terrible approximation because casting int(n/e) introduces large errors. However, if we do not do it, floats quickly overflow in the **n. What we actually need to do is to take n/e to the largest power we can without float overflow, then multiply those partial exponents together as ints until we've got n worth of them. But the one-liner is meant to illustrate the interface not the function implementation.

For data in something like more like a picture clause (e.g. in processing a CSV file), we might have something like:

@multimethod
def process_date(dt: DateTime["yyyy-mm-dd"]) -> float:
    # ... do something with some sort of datetime object
    return the_answer

@multimethod
def process_date(dt: DateTime["mm/dd/yy"]) -> float:
    # ... do something with some sort of datetime object
    return the_answer
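Neither multimethod nor Predicate exists today. As a sketch of the intended semantics under stated assumptions — Predicate here takes a callable rather than a string, implementations are tried in registration order, and the threshold is lowered so the exact branch stays fast — the annotation-driven, predicate-aware dispatch could look like:

```python
import inspect
from math import sqrt, e, pi

# Hypothetical API: annotations may be types or Predicate objects.
class Predicate:
    def __init__(self, test):
        self.test = test  # a callable on the argument value
    def matches(self, value):
        return bool(self.test(value))

_impls = {}

def multimethod(func):
    # Collect per-parameter annotations as the signature to match.
    checks = tuple(
        p.annotation for p in inspect.signature(func).parameters.values())
    _impls.setdefault(func.__name__, []).append((checks, func))
    def call(*args):
        for checks, impl in _impls[func.__name__]:
            if all(c.matches(a) if isinstance(c, Predicate)
                   else isinstance(a, c)
                   for c, a in zip(checks, args)):
                return impl(*args)
        raise NotImplementedError(func.__name__)
    return call

@multimethod
def approx_factorial(n: Predicate(lambda n: isinstance(n, int) and n < 1000)) -> int:
    out = 1                      # exact for small n
    for i in range(2, n + 1):
        out *= i
    return out

@multimethod
def approx_factorial(n: int) -> int:
    # Stirling-style integer approximation, as in the text above.
    return int(sqrt(2 * pi * n)) * int(n / e) ** n
```

So approx_factorial(5) takes the exact branch, while approx_factorial(2000) falls through the failed predicate to the approximation.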

Interaction between py.test and mutually recursive functions containing local numpy arrays is bad

Here's the smallest example I could come up with: a mutually recursive flatten function (it's a simplistic example, but it's easy to understand and shows the error well). There's a lot of mutual recursion inherent in certain classes of problems such as expression evaluation. We see this a lot in blaze (for example).

import numpy as np
from multipledispatch import dispatch
from collections.abc import Iterable  # just `collections` on Python 2


@dispatch(tuple)
def flatten(t):
    return flatten(list(t))


@dispatch(np.ndarray)
def flatten(s):
    raise NotImplementedError('mwahaha!')


@dispatch(Iterable)
def flatten(lst):
    for el in lst:
        for e in flatten(el):
            yield e


def test_flatten():
    r = np.array([1, 2, 3])
    x = list(flatten([(r,)]))
    assert x is not None

Here's a breakdown of why this happens in py.test:

  1. py.test tries to detect recursion by checking if a raised error is an instance of RuntimeError:

    a. NotImplementedError happens to be a subclass of RuntimeError (I didn't know this), and therefore recursion is "detected" (in quotes because a NotImplementedError hardly seems like an error that would be thrown during endless recursion)

  2. This happens when py.test tries to repr the traceback of, in this case, a NotImplementedError. In doing so it caches the local variables of all the stack frames leading up to the supposed recursive call (stack frames are hashed on their code object). So, if S' is a recursive call to S, py.test asserts that S' and S have the same local variables. But since you cannot get a scalar boolean from a == b when a or b is an instance of ndarray, another error is thrown, hence the all-caps shouting of INTERNALERROR>.

For reference, py.test's internal stack during the traceback repr looks like this:

[<TracebackEntry /Users/pcloud/test_pytest.py:29>,  # we entered here
 <TracebackEntry /Users/pcloud/test_pytest.py:23>,  # yield
 <TracebackEntry /Users/pcloud/test_pytest.py:23>,  # yield
 <TracebackEntry /Users/pcloud/Documents/code/py/multipledispatch/multipledispatch/dispatcher.py:160>,  # dispatch on Series
 <TracebackEntry /Users/pcloud/test_pytest.py:13>]  # raise NotImplementedError

There are two ways (that I've thought of) to go about solving this:

  1. Have multipledispatch throw something else (the easiest, quickest solution). I'd vote for a TypeError, or maybe a subclass of it
  2. Submit a patch to py.test that checks specifically for instances of RuntimeError itself and not anything else
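The class relationship the report hinges on can be verified directly, along with a sketch of option 1 (the error-class name below is hypothetical):

```python
# Option 1 sketch: raise something outside the RuntimeError hierarchy so
# py.test's recursion heuristic is not triggered.
class MDNotImplementedError(TypeError):
    """Raised when no matching signature is found (hypothetical)."""

print(issubclass(NotImplementedError, RuntimeError))    # True -- the culprit
print(issubclass(MDNotImplementedError, RuntimeError))  # False
```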

Consistency between calls of the random choice for resolving ambiguities

This is about documentation:

Could you specify in the last line of https://github.com/mrocklin/multipledispatch/blob/master/docs/source/resolution.rst

If you do not resolve ambiguities by creating more specific functions then one of the competing functions will be selected pseudo-randomly.

or somewhere else in the documentation, whether this pseudo-random choice is made once and cached (so it is consistent across future calls) or whether the decision is made randomly on each call.

I am sorry if this is already specified and I have just missed it.
