Comments (15)

ankane commented on August 15, 2024

Hey Luca, 9 hours is a really long time. What does your search_data method look like? Are you doing any N + 1 queries?

from searchkick.

codker commented on August 15, 2024

I index only 3 string fields, nothing heavy. I also noticed that indexing is slow in general: even with another model, it manages only about 1000 records every 5-7 seconds.

Using find_in_batches in the console is much much faster than that.

ankane commented on August 15, 2024

Is Elasticsearch installed locally? (I want to rule out the possibility of network latency.)

codker commented on August 15, 2024

Yeah, with the default configuration

ankane commented on August 15, 2024

How long are the string fields? Really long fields would increase indexing time.

ankane commented on August 15, 2024

Also, does indexing start out slow, or does it become slow after a while? (possible memory issue)

codker commented on August 15, 2024

The fields are 80 chars max. Indexing starts at about 1000 records every 5 seconds and slows down to 1000 every 7-8 seconds.

The bigger problem is memory usage, which grows constantly to over 6 GB. With find_in_batches this shouldn't happen.
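
To illustrate why batched loading is expected to keep memory flat, here is a minimal plain-Ruby sketch. It is not the poster's code: each_slice stands in for ActiveRecord's find_in_batches, and the record source is hypothetical. Only one batch of records is materialized at a time, so memory should stay roughly constant unless something else holds on to the records.

```ruby
# Plain-Ruby stand-in for batched loading. each_slice plays the role of
# find_in_batches here; the lazy record source is hypothetical.
records = (1..10_000).lazy.map { |id| { id: id, name: "record #{id}" } }

largest_batch = 0
indexed = 0

records.each_slice(1_000) do |batch|
  # Only this batch is held in memory; earlier batches can be garbage collected.
  largest_batch = [largest_batch, batch.size].max
  indexed += batch.size # a real app would send the batch to the search index here
end

puts "indexed=#{indexed} largest_batch=#{largest_batch}"
```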

I will try to replicate this behavior with a small app and put it on github.

codker commented on August 15, 2024

I made a new app with the bare minimum to test searchkick, with the same model and fields as my main app... and creating and indexing 100,000 entries is a lot faster.

Maybe some other gem in my app messes things up. I will run some tests on them and report back… argh 😔

ankane commented on August 15, 2024

One thing I've noticed is tire calls to_hash when indexing documents - regardless of search_data. Might help with the investigation...

codker commented on August 15, 2024

Yeah, I noticed it too. I profiled the import; take a look at the graph and the stack. It seems CarrierWave is a culprit too (I have two uploaders in the model).

codker commented on August 15, 2024

I overrode to_hash to return only the data I needed and the importing flies! That was the problem: the default one puts every field in the hash 😓
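
A minimal sketch of the idea, assuming a hypothetical Product class with made-up attribute names (the thread does not show the actual model or its to_hash). The plain-Ruby class below only mimics the situation: a default serializer that dumps every attribute, next to an overridden to_hash limited to the three indexed fields.

```ruby
require "json"

# Hypothetical stand-in for the model; attribute names are assumptions.
class Product
  ATTRIBUTES = %i[id name brand category description image_meta].freeze
  INDEXED    = %i[name brand category].freeze

  attr_reader(*ATTRIBUTES)

  def initialize(attrs)
    attrs.each { |key, value| instance_variable_set("@#{key}", value) }
  end

  # Mimics a default serializer that puts every field in the hash.
  def full_hash
    ATTRIBUTES.to_h { |attr| [attr, public_send(attr)] }
  end

  # Override: return only the fields that should be indexed.
  def to_hash
    INDEXED.to_h { |attr| [attr, public_send(attr)] }
  end
end

product = Product.new(
  id: 1, name: "Lamp", brand: "Acme", category: "Home",
  description: "x" * 10_000,
  image_meta: { "versions" => %w[thumb large] }
)

puts "slim payload: #{product.to_hash.to_json.bytesize} bytes"
puts "full payload: #{product.full_hash.to_json.bytesize} bytes"
```

The smaller hash means less serialization work per record, and it avoids touching expensive attributes (such as uploader fields) during indexing.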

codker commented on August 15, 2024

I tried to reindex with that change and it takes only 1 hour now, but the RAM usage is still high: 2.4 GB.

codker commented on August 15, 2024

OK, I can close this. A gem in the development group of my Gemfile (I still need to find which one) was causing the RAM issue. I commented out the whole group and now it runs smoothly: 250,000 entries and only 148 MB of RAM 😄

For others having the same problem, watch out! :)

Thanks for your help @ankane!

ankane commented on August 15, 2024

Let me know what you find. I'll see what I can do about to_hash.

codker commented on August 15, 2024

The problem is the Bullet gem. I think it keeps some kind of cache of the objects, but I haven't checked the source.

Without it and by redefining to_hash, the indexing is fast. Done in 45 minutes using only 150MB of memory :)
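
The caching explanation above is the poster's unverified guess, so the following is only a sketch of the general mechanism, not of how Bullet actually works. The Tracker class and index_in_batches helper are hypothetical. It shows how any component that keeps references to observed records defeats batching: each batch goes out of scope, yet every record stays reachable.

```ruby
# Hypothetical instrumentation object that remembers everything it observes.
class Tracker
  attr_reader :seen

  def initialize
    @seen = []
  end

  def observe(record)
    @seen << record # the reference keeps the record alive after its batch ends
  end
end

# Hypothetical helper: builds records batch by batch, optionally reporting
# each one to a tracker.
def index_in_batches(total:, batch_size:, tracker: nil)
  (1..total).each_slice(batch_size) do |ids|
    batch = ids.map { |id| { id: id } }
    batch.each { |record| tracker&.observe(record) }
  end
end

tracker = Tracker.new
index_in_batches(total: 5_000, batch_size: 1_000, tracker: tracker)

puts "records still referenced after batching: #{tracker.seen.size}"
```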
