Comments (14)
So, as is evident, I did not get around to doing this.
I don't know if I will ever have the motivation to, as writing it in the first place already took some time. Essentially, I implemented ParetoGP and Eplex in my own fork of gplearn (created for my bachelor thesis). The fork also includes other adaptations I made, which is why a clean PR from it is currently not possible. I am also not sure whether my implementations are satisfactory for you code-wise, since I never intended the code to be maintainable by other people; I didn't expect anyone to read it in the first place.
See: https://github.com/wulfihm/gplearn_ba
Maybe I will get around to doing a clean PR, maybe not; in the meantime, if anyone wants to do it, feel free. If you have any questions about my code above, I can answer them.
Eplex and ParetoGP are defined here: https://github.com/wulfihm/gplearn_ba/blob/master/gplearn/selection.py
While the ParetoFront is created here: https://github.com/wulfihm/gplearn_ba/blob/master/gplearn/genetic.py
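For readers unfamiliar with epsilon lexicase (eplex), here is a minimal sketch of one selection event under the standard definition (my own illustration, with median-absolute-deviation epsilons; this is not code from the fork):

```python
import numpy as np

def epsilon_lexicase_select(error_matrix, rng=None):
    """Pick one parent index via epsilon-lexicase selection.

    error_matrix: (n_individuals, n_cases) array of per-case errors.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_individuals, n_cases = error_matrix.shape
    # Per-case epsilon: median absolute deviation of errors on that case.
    medians = np.median(error_matrix, axis=0)
    eps = np.median(np.abs(error_matrix - medians), axis=0)
    candidates = np.arange(n_individuals)
    # Filter the pool one training case at a time, in random order.
    for case in rng.permutation(n_cases):
        errs = error_matrix[candidates, case]
        threshold = errs.min() + eps[case]
        candidates = candidates[errs <= threshold]
        if len(candidates) == 1:
            break
    # If several survive all cases, pick one at random.
    return int(rng.choice(candidates))
```

Each call returns one parent; in a full run this would be invoked once per parent needed.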
I also tried NSGA-II, but it did not work at all for me.
I also implemented other things: geometric semantic crossover and mutation, simplification of gplearn's solutions (not finished), another complexity measure I named 'kommenda' after M. Kommenda et al., and the R2 score for regression.
I also changed the math operators to be more "precise".
Another note: everything above was implemented with regression in mind only. I completely ignored the SymbolicTransformer.
If you are interested at all here is my bachelor thesis:
https://www.researchgate.net/publication/335842681_Genetic_Programming_for_Automotive_Modeling_Applications
from gplearn.
That's awesome! I'm really looking forward to trying it out this week.
Thanks for putting this online and offering assistance in configuring it.
Cheers,
Miles
Epsilon Lexicase
It'd be great if you could provide a bit more detail @echo66 ...
It's already implemented in https://github.com/lacava/few/.
More info at:
- https://github.com/lacava/emo-lex
- https://github.com/lacava/epsilon_lexicase
- http://williamlacava.com/research/lexicase
I have privately made my own additions to gplearn, which include ParetoGP and EPLEX. See also #33
I would be more than happy to build a PR from my work. It will need some time, though, since I am currently in the process of finishing up my thesis.
This would be most welcome @wulfihm 👍
Beginning this now. Not sure when I will be done.
No worries @wulfihm, take your time, no rush. I'll be interested to see what you come up with. Would it be possible to abstract the selector out as its own class?
Really appreciate you sharing this @wulfihm , if someone wants to take up the torch, it'd be very cool to see these added. Otherwise, maybe I'll take a few rainy weekend days this winter to play with your code 😄
+1 for this!
@wulfihm do you have any docs on how to use those techniques in your code? Or even a jupyter notebook with a simple example maybe? Anything helps.
I've been trying to switch to GPLearn from Eureqa lately, and I'm also very interested in this. For context, I have a recent paper on converting neural networks into analytic equations to discover new physical laws: https://arxiv.org/abs/2006.11287.
We use a Pareto-front technique where we look for the sharpest drop in log-error over expression length. It seems to work pretty well on a range of noisy datasets, better than jointly optimizing loss and length. But I'd also be interested in trying out these other methods.
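A minimal sketch of that model-selection rule as I understand it from the description above (my own illustration; the function name and the exact scoring are assumptions, not code from the paper):

```python
import math

def best_by_log_error_drop(front):
    """Pick the expression sitting after the sharpest drop in log-error.

    front: list of (length, error) pairs on the Pareto front, with
    strictly increasing length and decreasing error (assumed here).
    """
    front = sorted(front)
    best_idx, best_score = 0, -math.inf
    for i in range(1, len(front)):
        d_len = front[i][0] - front[i - 1][0]
        # Drop in log-error per unit of added expression length.
        d_log_err = math.log(front[i - 1][1]) - math.log(front[i][1])
        score = d_log_err / d_len
        if score > best_score:
            best_idx, best_score = i, score
    return front[best_idx]
```

The idea is that a large relative improvement in error for a small increase in length marks the "knee" of the front, which tends to be robust to noise.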
@MilesCranmer What exactly do you mean by "how to use these techniques"? Programmatically? Theoretically? :D
I mean programmatically - i.e., how I can configure those methods for gplearn's .fit() loop on a particular problem if I were to use your fork.
I added additional hyperparameters/options:
- complexity => 'kommenda' (for the Kommenda complexity measure)
- selection => 'eplex' (for epsilon lexicase)
- paretogp => True or False
- paretogp_lengths => (a, b)
ParetoGP works by selecting the first parent randomly from the Pareto front (the archive). The second parent is selected from the normal population via the selection mechanism (which can be anything, e.g. tournament or eplex). See: https://doi.org/10.1007/0-387-23254-0_17
paretogp_lengths limits the size of the solutions in the archive; since there is no parsimony penalty anymore, individuals could otherwise grow without bound. paretogp_lengths = (5, 250) seems large enough to me. Keep the lower limit above 3 or 4, or else it may cause issues.
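The parent-pairing step described above can be sketched like this (my own illustration, using a plain tournament as the second selector; not the fork's actual code):

```python
import random

def paretogp_parents(archive, population, fitness, tournament_size=3, rng=None):
    """Pick a ParetoGP-style parent pair.

    Parent 1 is drawn uniformly from the Pareto-front archive; parent 2
    comes from the regular population via tournament selection (any
    selector, e.g. eplex, could stand in here instead).
    """
    rng = rng or random.Random()
    parent1 = rng.choice(archive)
    contenders = rng.sample(population, tournament_size)
    parent2 = min(contenders, key=fitness)  # lower fitness = lower error
    return parent1, parent2
```

Crossover then combines the archive member with the population member, which is what keeps the search anchored to the front.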
I used the code here: https://github.com/wulfihm/ba_code/blob/master/main.py; it works via command-line arguments.
The elitism_size option could be interesting to you if you don't use ParetoGP. With original gplearn your population can get worse, since it does not retain anything from the previous generation: the next generation simply replaces the old one. Elitism is only in effect if ParetoGP is disabled.
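Putting those options together, a hypothetical configuration of the fork might look like this (parameter names are taken from the comment above; I have not run the fork, so the exact signature may well differ):

```python
# Hypothetical usage of the fork's estimator; only population_size is a
# stock gplearn parameter, the rest are the fork's additions as described.
from gplearn.genetic import SymbolicRegressor

est = SymbolicRegressor(
    population_size=1000,
    complexity='kommenda',      # Kommenda complexity measure
    selection='eplex',          # epsilon-lexicase selection
    paretogp=True,              # maintain a Pareto-front archive
    paretogp_lengths=(5, 250),  # min/max solution size in the archive
)
# est.fit(X, y)
```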