
Comments (6)

jamartinh commented on June 19, 2024

Hello, I have also experienced the same issue, and can't run my experiments
for many iterations.

I thought it was a problem with garbage collection.

I think this could be exposed as a parameter, e.g. num_generations_history.
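
A minimal sketch of that idea, assuming a hypothetical num_generations_history setting (it is only a suggestion here, not an existing gplearn parameter) and placeholder strings in place of real programs. A bounded deque drops the oldest population automatically:

```python
from collections import deque


def evolve(generations, population_size, num_generations_history=2):
    """Toy evolution loop that keeps only the most recent populations."""
    # A deque with maxlen discards its oldest entry once it is full.
    history = deque(maxlen=num_generations_history)
    for gen in range(generations):
        # Placeholder "programs"; a real run would breed them from history[-1].
        population = [f"program_{gen}_{i}" for i in range(population_size)]
        history.append(population)
    return history


history = evolve(generations=100, population_size=5, num_generations_history=2)
print(len(history))    # 2
print(history[-1][0])  # program_99_0
```

With this layout the memory held by stored populations is bounded by num_generations_history * population_size programs, however many generations are run.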

Cheers,
Jose A.

2016-02-11 16:08 GMT+01:00 guyko81:

Hi Trevor,

it's a very nice implementation - I had been searching for such a solution
for a long time, so thank you very much!

I have only one issue: with a long evolution (generations =
some_huge_number; or population_size = some_huge_number + generations =
some_number) the program runs out of memory. I checked the code and it
saves every iteration's population. Do you think that's necessary? As I
understand it, we only need the current population and the best of the
previous one at the beginning of each iteration.

What do you think: could the code be changed to reset
self._programs = []
before every iteration and keep just the previous population in
self._programs_prev (or something similar)?



/ .- .-.. .-.. / -.-- --- ..- / -. . . -.. / .. ... / .-.. --- ...- .
José Antonio Martín H. (PhD)
Computer Science Faculty Phone: (+34) 91 3947650
Complutense University of Madrid Fax: (+34) 91 3947527
C/ Prof. José García Santesmases,s/n 28040 Madrid, Spain
web: http://www.dacya.ucm.es/jam/
LinkedIn: http://www.linkedin.com/in/jamartinh (Let's connect)
.-.. --- ...- . / .. ... / .- .-.. .-.. / .-- . / -. . . -..

from gplearn.

trevorstephens commented on June 19, 2024

Thanks for the report! I'll look into your hypothesis @guyko81 but suspect the issue is more likely with numpy arrays being stored as the equations are recursively evaluated. These /should/ be garbage collected by Python as they are never stored in the object, but I'll check that out as well @jamartinh

I have seen this issue as well, and was thinking that an eval_size parameter might help by evaluating fewer samples at once, rather than the whole dataset. I've been meaning to work on a v0.2 for a while now. This should be top of the list.
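
A rough sketch of what chunked evaluation could look like, assuming a hypothetical eval_size argument and a program that is just a Python callable taking one row. This is not gplearn's code, only an illustration of scoring a dataset without holding all predictions at once:

```python
def mean_squared_error_chunked(program, X, y, eval_size):
    """Score `program` on (X, y), evaluating at most `eval_size` rows at a time."""
    if eval_size < 1:
        raise ValueError("eval_size must be a positive integer")
    if len(X) == 0:
        raise ValueError("X must contain at least one row")
    total = 0.0
    for start in range(0, len(X), eval_size):
        rows = X[start:start + eval_size]
        targets = y[start:start + eval_size]
        # Only this chunk's predictions are alive at any one time.
        predictions = [program(row) for row in rows]
        total += sum((p - t) ** 2 for p, t in zip(predictions, targets))
    return total / len(X)


def program(row):
    # Stand-in for an evolved expression: x0**2 + x1
    return row[0] * row[0] + row[1]


X = [[float(i), 1.0] for i in range(10)]
y = [row[0] * row[0] + 1.0 for row in X]
score = mean_squared_error_chunked(program, X, y, eval_size=3)
print(score)  # 0.0
```

Peak memory for intermediate results then scales with eval_size rather than with the number of samples, at the cost of a Python-level loop over chunks.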

For now, you might find using n_jobs=1 more stable (fewer evaluations at once) or ramping up the parsimony to keep the programs smaller.
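
As a hedged illustration of that workaround: n_jobs and parsimony_coefficient are parameters of gplearn's SymbolicRegressor, but the value 0.01 below is an arbitrary assumption for the example, not a figure from this thread. The import is guarded so the snippet still runs where gplearn is not installed:

```python
# Workaround settings; the numeric values are illustrative assumptions.
params = {
    "n_jobs": 1,                    # evaluate programs in a single process
    "parsimony_coefficient": 0.01,  # larger values penalize long programs more
}

try:
    from gplearn.genetic import SymbolicRegressor
except ImportError:
    SymbolicRegressor = None  # gplearn not available; params still shows the idea

if SymbolicRegressor is not None:
    est = SymbolicRegressor(**params)
    # est.fit(X, y) would then run the evolution with these settings.
```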


guyko81 commented on June 19, 2024

Thanks Trevor! I can't say more than that, so thank you :)


trevorstephens commented on June 19, 2024

I've located the main culprit. It is due almost entirely to saving the indices of X & y used for evaluating a program's fitness when max_samples is used. These indices are retained even when no under-sampling takes place. I am working on a fix now, and can still retain all prior populations for inspecting the lineage of a final program.
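
To get a feel for the scale, here is a back-of-the-envelope estimate. It assumes one 8-byte integer index per sample is kept for every program in every generation; the sizes used are made-up example numbers, not measurements from gplearn:

```python
def stored_index_bytes(n_samples, population_size, generations, bytes_per_index=8):
    """Rough memory needed if every program keeps its own sample indices."""
    return n_samples * population_size * generations * bytes_per_index


estimate = stored_index_bytes(n_samples=10_000, population_size=1_000, generations=20)
print(f"{estimate / 1e9:.1f} GB")  # 1.6 GB
```

Under these assumptions the cost grows linearly with each of the three factors, which would be consistent with long runs or large populations running out of memory.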


trevorstephens commented on June 19, 2024

I have also added a check at each generation to see whether older generations are still relevant, i.e. whether any of their "dna" exists in the current generation. Any irrelevant programs are removed from the old generation's population by replacing them with None. This results in a massive reduction in the number of programs stored and should help significantly with memory use.
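
A toy sketch of this kind of pruning, not the actual gplearn implementation. It assumes each program is a dict whose hypothetical "parents" list holds indices into the previous generation, and it walks backwards from the newest generation, keeping only ancestors:

```python
def prune_irrelevant(generations):
    """Replace programs with no descendant in the newest generation by None."""
    if not generations:
        return generations
    # Every program in the newest generation is kept.
    needed = set(range(len(generations[-1])))
    for gen in range(len(generations) - 1, 0, -1):
        # Collect the parents of every program still needed in this generation.
        parents_needed = set()
        for idx in needed:
            parents_needed.update(generations[gen][idx]["parents"])
        previous = generations[gen - 1]
        for idx in range(len(previous)):
            if idx not in parents_needed:
                previous[idx] = None  # no surviving "dna", so drop it
        needed = parents_needed
    return generations


gens = [
    [{"parents": []}, {"parents": []}, {"parents": []}],        # generation 0
    [{"parents": [0]}, {"parents": [1, 2]}, {"parents": [2]}],  # generation 1
    [{"parents": [0]}, {"parents": [0, 2]}],                    # generation 2
]
prune_irrelevant(gens)
print([p is None for p in gens[0]])  # [False, True, False]
print([p is None for p in gens[1]])  # [False, True, False]
```

The list positions are preserved, so parent indices stored in later generations stay valid and the lineage of a final program can still be traced.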


trevorstephens commented on June 19, 2024

Mostly fixed by #19 ... Please re-open if problems still persist in the master branch or the next release.

