scikit-hep / formulate

Easy conversions between different styles of expressions

License: BSD 3-Clause "New" or "Revised" License

Python 99.58% Makefile 0.42%

formulate's Issues

Evaluate Expression objects directly

parsing `from_root` can be slow

Hi,

This is an excellent library. I am incorporating it into my ATLAS analysis code, together with @jpivarski's uproot and awkward, as a drop-in replacement for TTree::Draw.

I noticed that some expressions can be slow to parse. For example, this takes almost 2 seconds:

$> cat slow.py 
import formulate
formulate.from_root('((weight * (n_mu > 0)) * ((tt_cat==0 || tt_cat==3 || tt_cat==6)))')
$> time python slow.py 
real	0m1.803s
user	0m1.995s
sys	0m0.094s

Is there any way this can be sped up?

Thanks,
Lukas

Note: of course it's not slow in an absolute sense, but if you perform such an operation very often the cost accumulates quickly, and I was surprised that parsing such an expression would have a noticeable time cost.
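
When the same selection string is parsed many times, one possible workaround on the caller's side is to memoise the parse. The following is a minimal sketch, not part of formulate: make_cached_parser and slow_parse are hypothetical names, and slow_parse is only a stand-in so the example runs without formulate installed. In real code one would presumably pass formulate.from_root instead, assuming the returned objects can safely be shared between callers.

```python
from functools import lru_cache


def make_cached_parser(parse, maxsize=1024):
    """Wrap a string -> expression parser so each distinct string is parsed once."""

    @lru_cache(maxsize=maxsize)
    def cached(expression):
        return parse(expression)

    return cached


# Stand-in for formulate.from_root (assumption: any callable taking a string works).
calls = []


def slow_parse(expression):
    calls.append(expression)  # record how often the real parser is invoked
    return ("parsed", expression)


cached_parse = make_cached_parser(slow_parse)
selection = "((weight * (n_mu > 0)) * ((tt_cat==0 || tt_cat==3 || tt_cat==6)))"
first = cached_parse(selection)   # parsed here
second = cached_parse(selection)  # served from the cache
```

This only removes the cost of repeated parses of identical strings; the first parse of each expression is as slow as before.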

Backends should patch `to_X` and `from_X` methods on to the `Expression` object

Add constants

Add support for variable scoping

I've started looking into using this package for a project I'm working on. We'll want to be able to specify branches which might be nested within our trees, e.g. branch.sub_branch. However, I tested this briefly with version 0.0.7 of formulate and found that such variables cannot be identified:

$ python -m formulate --from-numexpr 'branch.sub_branch < 4' --variables
ERROR:formulate:TODO TRACEBACK: ('branch.sub_branch < 4', 6, 'Expected end of text')
ERROR:formulate:Error parsing: branch.sub_branch < 4
ERROR:formulate:                     ▲
ERROR:formulate:                     ┃
ERROR:formulate:                     ┗━━━━━━ Error here or shortly after

Would it be possible to add support for this? PyParsing has a specific helper method which might be useful here, delimitedList. I think the easiest option for the user is to return a single variable in this case, i.e. branch.sub_branch in the example above. That might mean just including the . in the definition of the Word for Variables?
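
To illustrate the "one variable per dotted name" behaviour proposed above, here is a minimal sketch that uses a regular expression as a stand-in rather than formulate's actual pyparsing grammar. The names DOTTED_NAME and find_variables are hypothetical, and the sketch does not distinguish function names from variables.

```python
import re

# An identifier, optionally followed by ".identifier" segments.
# The lookbehind stops the pattern from matching inside numeric
# literals such as 1e3 or after a stray dot.
DOTTED_NAME = re.compile(
    r"(?<![\w.])[A-Za-z_][A-Za-z0-9_]*(?:\.[A-Za-z_][A-Za-z0-9_]*)*"
)


def find_variables(expression):
    """Return dotted variable names in order of first appearance."""
    seen = []
    for name in DOTTED_NAME.findall(expression):
        if name not in seen:
            seen.append(name)
    return seen


variables = find_variables("branch.sub_branch < 4")
```

With this pattern, branch.sub_branch is reported as a single variable rather than as branch followed by a parse error at the dot.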

Optimise deep recursion

Currently deeply nested functions result in a RecursionError. This could be optimised by having a simpler parser find functions and then evaluate their arguments iteratively instead of recursively.
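
As a rough illustration of the iterative idea, the sketch below parses nested calls with an explicit stack instead of recursion. It is not formulate's parser: parse_calls and max_depth are hypothetical names, and the toy grammar only understands names, numbers, parentheses and commas (no operators).

```python
import re

# A token is a name/number (possibly with "::" or ".") or one of ( ) ,
TOKEN = re.compile(r"\s*([\w:.]+|[(),])")


def parse_calls(text):
    """Parse e.g. 'f(g(x), y)' into (name, [args]) nodes without recursion."""
    text = text.strip()
    root = ("<root>", [])
    stack = [root]  # nodes whose argument list is currently open
    last = None     # most recently read name, candidate for a following "("
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        token, pos = match.group(1), match.end()
        if token == "(":
            if last is None:
                raise ValueError("'(' must follow a function name")
            stack.append(last)
            last = None
        elif token == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            stack.pop()
            last = None
        elif token == ",":
            last = None
        else:
            last = (token, [])
            stack[-1][1].append(last)
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return root[1]


def max_depth(nodes):
    """Nesting depth of the parsed tree, computed level by level."""
    depth, frontier = 0, list(nodes)
    while frontier:
        depth += 1
        frontier = [child for _, args in frontier for child in args]
    return depth


deep = parse_calls("f(" * 5000 + "x" + ")" * 5000)
```

Because the open calls live in an ordinary list, the nesting depth is limited by memory rather than by Python's recursion limit.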

Add pypi password as secret for CD

Can we add the PyPI password as a secret? This is needed for the CD and 0.1.0. Or should this even be an org secret?

@henryiii @eduardo-rodrigues?

Remove unneeded parentheses in output strings

Add support for converting more TMath functions to numexpr equivalents

Currently requested:

TMath::Odd
TMath::Even

Add ROOT to travis tests

Add support for comparing Expression objects

idea: add sympy conversion

As an idea, we could add conversion to sympy. This would allow us to use its full power, including:

latex formatting
simplification of expression
resolve #2 by converting back and forth, as sympy removes redundant brackets automatically
functional form with lambdify allowing to use "arbitrary" backend

The implementation should be straightforward:

create all variables as symbols and inject in future sympify calls
convert an expression to numexpr and sympify it

The only caveat I see so far is that the == operator has a different meaning in Sympy and is not a logical equality (https://docs.sympy.org/latest/gotchas.html#double-equals-signs). Furthermore, booleans and numerical values do not mix implicitly.

So it seems to me an interesting idea where we may gain a lot, but it could also be a nightmare if the differences in behavior are too large. I am not a Sympy expert, so maybe others have an opinion?

BUG: pow operator without whitespace errors

Hi all, thanks a lot for the great package!

There seems to be a bug with the power operator ** when whitespace is missing. It can only be parsed if whitespace is inserted before it.

formulate.from_numexpr('a**2') raises an error
formulate.from_numexpr('a ** 2') works

Other operators work fine without whitespace.

This is the full error:

In [12]: formulate.from_numexpr('a**2')                                         
ERROR:formulate:TODO TRACEBACK: ('a**2', 1, 'Expected end of text')
ERROR:formulate:Error parsing: a**2
ERROR:formulate:                ▲
ERROR:formulate:                ┃
ERROR:formulate:                ┗━━━━━━ Error here or shortly after
---------------------------------------------------------------------------
ParseException                            Traceback (most recent call last)
~/anaconda3/envs/rkq37/lib/python3.7/site-packages/formulate/parser.py in to_expression(self, string)
    232         try:
--> 233             result = self._parser.parseString(string, parseAll=True)
    234             assert len(result) == 1, result

~/anaconda3/envs/rkq37/lib/python3.7/site-packages/pyparsing.py in parseString(self, instring, parseAll)
   1954                     exc.__traceback__ = self._trim_traceback(exc.__traceback__)
-> 1955                 raise exc
   1956         else:

~/anaconda3/envs/rkq37/lib/python3.7/site-packages/pyparsing.py in parseImpl(self, instring, loc, doActions)
   3813         if loc < len(instring):
-> 3814             raise ParseException(instring, loc, self.errmsg, self)
   3815         elif loc == len(instring):

ParseException: Expected end of text, found '*'  (at char 1), (line:1, col:2)

During handling of the above exception, another exception occurred:

ParsingException                          Traceback (most recent call last)
<ipython-input-12-c3a50f695328> in <module>
----> 1 formulate.from_numexpr('a**2')

~/anaconda3/envs/rkq37/lib/python3.7/site-packages/formulate/parser.py in to_expression(self, string)
    244             exception = ParsingException()
    245             exception.__context__ = None
--> 246             raise exception
    247         else:
    248             return result

ParsingException:

Any ideas?

Test against ROOT and numexpr

Add support for dividing Expression objects

Drop all dependencies except pyparsing

Add support for TMath::Functions with variable numbers of arguments

See backends/ROOT.py for a list.

Allow mathematical operations on the Expression object

Get exact values of constants from ROOT

Return numexpr expression for given container

Would it be possible to return the numexpr for instance for a dict of numpy arrays / a numpy record array?

For instance like this

arrays = {"X_PX": array(....), "X_PY": array(....), "X_PZ": array(....)}
momentum = formulate.from_root('TMath::Sqrt(X_PX**2 + X_PY**2 + X_PZ**2)')
momentum.to_numexpr(arrays)
which returns
'sqrt(((arrays["X_PX"] ** 2) + (arrays["X_PY"] ** 2) + (arrays["X_PZ"] ** 2)))'

Could it be evaluated by numexpr then ?
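
The string rewriting requested here can be sketched independently of formulate, starting from an already-converted numexpr string. qualify_variables is a hypothetical helper, not an existing to_numexpr option, and the sketch assumes the variable names are known (for example, the dict keys).

```python
import re


def qualify_variables(expression, names, container="arrays"):
    """Replace each bare variable in `names` with container["name"]."""
    if not names:
        return expression
    # Longest names first so that e.g. X_PX_ERR is tried before X_PX.
    alternatives = "|".join(
        re.escape(name) for name in sorted(names, key=len, reverse=True)
    )
    pattern = re.compile(r"(?<![\w.])(" + alternatives + r")(?!\w)")
    return pattern.sub(lambda m: f'{container}["{m.group(1)}"]', expression)


plain = "sqrt(((X_PX ** 2) + (X_PY ** 2) + (X_PZ ** 2)))"
qualified = qualify_variables(plain, ["X_PX", "X_PY", "X_PZ"])
```

It may be simpler to leave the expression unqualified and hand the container to numexpr directly, since numexpr.evaluate accepts a local_dict argument mapping names to arrays.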

Matt

Release tagging

Tags should have vX.Y.Z format, not X.Y.Z (only 0.1.0 is this way - GitHub's own UI gives this recommendation), and a GitHub release should always be made when tagging, so it shows up in the UI. @mayou36, perhaps you can fix?

more flexible to_string conversion to support ternary operators

First off, thanks a lot for this great package!

Numexpr supports where, whereas ROOT supports an equivalent ternary operator from C++ (Expression1 ? Expression2 : Expression3). While the former is already implemented and works fine with the current function registry, the latter cannot (AFAIK) be supported with the current approach of joining the arguments within brackets with a comma.

Two ideas (using as an example sqrt):

1. To enable support for this conversion, I would propose a to_string method that can be registered with the function in PFunction. It takes the name of the method and its arguments, and defaults to the current conversion.
Disadvantage: we have too much freedom. For sqrt, for example, one could write
('sqrt', 1, lambda f, args: f + "(" + args[0] + ")")
or duplicate the name:
('sqrt', 1, lambda args: 'sqrt' + "(" + args[0] + ")")
2. Use string formatting: require the signature to be contained in the string definition, e.g. ('sqrt({})', 1).
Disadvantage: how would an arbitrary number of arguments be handled?

I would propose to go for the first solution.
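
A minimal sketch of the first idea follows, written outside formulate's actual PFunction machinery. All names here (default_to_string, ternary_to_string, ROOT_FORMATTERS, render) are hypothetical and only illustrate a per-function formatter with a default fallback.

```python
def default_to_string(name, args):
    """Current behaviour: name(arg1, arg2, ...)."""
    return f"{name}({', '.join(args)})"


def ternary_to_string(name, args):
    """Render a three-argument where() as a C++ ternary expression."""
    condition, if_true, if_false = args
    return f"(({condition}) ? ({if_true}) : ({if_false}))"


# Hypothetical registry: only functions needing special output are listed.
ROOT_FORMATTERS = {"where": ternary_to_string}


def render(name, args, formatters=ROOT_FORMATTERS):
    """Use the registered formatter if there is one, else the default."""
    return formatters.get(name, default_to_string)(name, args)


plain_call = render("sqrt", ["x"])
ternary = render("where", ["a > 0", "a", "b"])
```

Passing the name to every formatter keeps the default reusable for all ordinary functions, while special cases such as the ternary operator are free to ignore it.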
