Comments (7)
You can simplify gplearn expressions with sympy as follows:
...
model = SymbolicRegressor(**kwargs)
...
import sympy
from sympy import sympify, simplify

# Map gplearn's function names onto sympy operations so the program
# string can be parsed directly.  (Note: naming this dict `locals`
# shadows the builtin locals(), but matches sympify's keyword.)
locals = {
    'sub': lambda x, y: x - y,
    'div': lambda x, y: x / y,
    'mul': lambda x, y: x * y,
    'add': lambda x, y: x + y,
    'neg': lambda x: -x,
    'pow': lambda x, y: x ** y,
    'cos': lambda x: sympy.cos(x)
}
simplify(sympify(str(model._program), locals=locals))
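As a quick check, the same mapping works on a bare prefix string with no fitted model at all; the `program_str` below is just a stand-in for `str(model._program)`:

```python
import sympy
from sympy import sympify, simplify

# Stand-in for str(model._program); no fitted gplearn model needed.
program_str = "sub(X1, sub(mul(X1, X1), mul(mul(X0, 0.131), X0)))"

# Only the functions that actually appear in the string are required here.
locals = {
    'sub': lambda x, y: x - y,
    'mul': lambda x, y: x * y,
}
print(simplify(sympify(program_str, locals=locals)))
# → 0.131*X0**2 - X1**2 + X1
```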
This is helpful for simplifying and displaying equations. I wonder, though, whether there is a way to convert the sympy expression back into gplearn's format using the same locals dictionary? Then it could be done automatically at each step to trim individual expressions.
from gplearn.
To answer: a couple of years ago I decided it would be easier to just write an SR package from scratch - https://github.com/MilesCranmer/PySR. That one does some simplification throughout the search, but I found that doing it too often prevents the genetic algorithm from exploring effectively, since redundant, unsimplified expressions can act as stepping stones to more optimal ones.
Thanks for the feedback @jamartinh ... I'll have to look into sympy. My first concern would be the requirement in GP for closure: we use "safe division", strange concoctions of square roots/logs, and so on, to avoid infs sneaking into the results. I'll leave the issue open and take a peek at what sympy is capable of, though!
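For reference, gplearn's protected division returns 1 whenever the denominator's magnitude is at or below 0.001. One way to carry that closure into sympy is a `Piecewise`; this is an illustrative sketch, not gplearn's API, and `protected_div` is a made-up helper name:

```python
import sympy

# Sketch: mirror gplearn's protected division (returns 1 when
# |denominator| <= 0.001) as a sympy Piecewise, so the closure
# survives the round-trip into sympy.
def protected_div(x1, x2):
    return sympy.Piecewise((x1 / x2, sympy.Abs(x2) > 0.001), (1, True))

x, y = sympy.symbols('x y')
expr = protected_div(x, y)
print(expr.subs({x: 6, y: 2}))  # 3
print(expr.subs({x: 6, y: 0}))  # 1 (protected branch)
```

Pointing the `'div'` entry of the locals dictionary at a function like this, instead of plain `/`, would keep the semantics; the caveat is that `simplify` may rewrite the `Piecewise` in ways the GP closure did not intend.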
Okay, it looks like srepr does the reverse:
srepr(simplify(sympify(str(model._program), locals=locals)))
This converts it back into the print(model._program) format, though the output is slightly different:
Add(Mul(Float('0.13100000000000001', precision=53), Pow(Symbol('X0'), Integer(2))), Mul(Integer(-1), Pow(Symbol('X1'), Integer(2))), Symbol('X1'))
So here's some regex to put it in the same format:
print('GPLearn:', model._program)
>>> GPLearn: sub(X1, sub(mul(X1, X1), mul(mul(X0, 0.131), X0)))
print('sympy:', simplify(sympify(str(model._program), locals=locals)))
>>> sympy: 0.131*X0**2 - X1**2 + X1
import re
from sympy import srepr

sympy_string = srepr(simplify(sympify(str(model._program), locals=locals)))
sympy_string = (
    re.sub(r"x([0-9]+?)", r"X\1",
        re.sub(r"Float\('([\-0-9\.]+)', precision=[0-9]+?\)", r"\1",
            re.sub(r"Integer\(([\-0-9]+)\)", r"\1",
                re.sub(r"Symbol\('(.+?)'\)", r"\1",
                    sympy_string))).lower())
)
print('srepr:', sympy_string)
>>> srepr: add(mul(0.13100000000000001, pow(X0, 2)), mul(-1, pow(X1, 2)), X1)
Can I pass this string back into gplearn somehow? Then it would be easy to auto-simplify with sympy every loop.
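One half of that round trip can be sketched under the assumption that a gplearn program is a flat prefix list of functions, integer feature indices, and float constants. `to_prefix_list` is a hypothetical helper: it keeps function names as strings, whereas real code would look them up in gplearn's function set to get `_Function` objects (and would have to rewrite `pow`, which is not in gplearn's default set):

```python
import re

def to_prefix_list(s):
    """Tokenize a prefix-notation string such as
    'add(mul(0.131, pow(X0, 2)), X1)' into a flat prefix list,
    the layout gplearn uses internally for a program."""
    out = []
    for tok in re.findall(r"X\d+|[a-z]+|[-+]?\d*\.\d+|[-+]?\d+", s):
        if tok.startswith('X'):
            out.append(int(tok[1:]))   # feature index: 'X0' -> 0
        elif tok[0].isalpha():
            out.append(tok)            # function name (placeholder for _Function)
        else:
            out.append(float(tok))     # numeric constant
    return out

print(to_prefix_list("add(mul(0.131, pow(X0, 2)), X1)"))
# → ['add', 'mul', 0.131, 'pow', 0, 2.0, 1]
```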
Each individual in gplearn is internally saved as a list in prefix notation. So all you have to do is convert your string, which already seems to be in prefix notation, into such a list. I also began experimenting with that here: https://github.com/wulfihm/gplearn_ba/blob/master/gplearn/_program.py
Some issues I encountered were that simplify can fail, take too long to simplify the expression (I am talking about 30 minutes here), or never finish at all. My expressions were sometimes around 150 symbols long, which could be why. Maybe I also did something wrong, I am not sure anymore. As far as I remember I also got invalid individuals sometimes and couldn't figure out why, so definitely make sure to implement some safeguards there.
Anyway, from a genetic programming standpoint I would advise also experimenting with simplifying only every N loops. Simplifying every loop could destroy important "genetic" information, making it harder to reach a good solution, and may also negatively affect the genetic operators. Another idea I just had was to treat simplification like subtree mutation: instead of mutating the subtree, simplify it with some probability Ps. That saves computation time, does not completely "destroy" the individuals, and may be more beneficial for the whole genetic programming process.
Simplifying in GP is also not novel; there are definitely papers out there where people experiment with it, so existing work could be a useful starting point.
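The probabilistic variant described above can be sketched in a few lines. This uses plain sympy expressions for illustration; `maybe_simplify` and `p_s` are made-up names, and in gplearn the individuals would first need converting to and from program form:

```python
import random
import sympy

def maybe_simplify(expr, p_s=0.1, rng=random):
    """Simplify an expression with probability p_s, analogous to
    applying a mutation operator; otherwise leave it untouched."""
    if rng.random() < p_s:
        return sympy.simplify(expr)
    return expr

x = sympy.Symbol('x')
e = sympy.cos(x) ** 2 + sympy.sin(x) ** 2
print(maybe_simplify(e, p_s=1.0))  # 1 (always simplified)
print(maybe_simplify(e, p_s=0.0))  # sin(x)**2 + cos(x)**2 (never touched)
```

In a real run `p_s` would be small, so most individuals keep their redundant subtrees as raw material for the genetic operators.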
Awesome, thanks for sharing this advice! Very useful to know.
Good point - I agree that in practice a short cutoff time for simplification could be enough; if the timeout hits, the expression just stays the same. I recall Mathematica's FullSimplify has a TimeConstraint option, which is also used to pick faster, less complete simplification strategies; maybe sympy has something similar.
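As far as I know sympy has no built-in simplify timeout, but a process-based cutoff along those lines is easy to sketch; `simplify_with_timeout` is a made-up helper, not part of any library:

```python
import multiprocessing as mp
import sympy

def _simplify_worker(expr_str, out_queue):
    # Round-trip through strings so nothing unpicklable crosses the
    # process boundary.
    out_queue.put(str(sympy.simplify(sympy.sympify(expr_str))))

def simplify_with_timeout(expr, timeout=5.0):
    """Run sympy.simplify in a child process; if it has not finished
    after `timeout` seconds, kill it and keep the original expression."""
    out_queue = mp.Queue()
    proc = mp.Process(target=_simplify_worker, args=(str(expr), out_queue))
    proc.start()
    proc.join(timeout)
    if proc.is_alive():
        proc.terminate()  # timed out: fall back to the unsimplified form
        proc.join()
        return expr
    try:
        return sympy.sympify(out_queue.get(timeout=1))
    except Exception:
        return expr  # worker died or produced nothing
```

Spawning a process per expression is heavy; batching a whole generation per worker, or only simplifying a sampled subset, would amortize the cost.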
Could this kind of functionality (simplifying with sympy and converting back to a tree) be provided through a helper function, to reduce the complexity of applying it? :) I also wonder whether a parameter on SymbolicRegressor to apply sympy periodically could be helpful (possibly asynchronously / with some timeout, so that it is "unfactorization-proof").