
Comments (7)

sleimanx2 commented on July 1, 2024

Fetching the data from the relational database will help with data consistency; however, we should look at the drawbacks this approach might cause.

1- On each search request we would call both Elasticsearch and the RDB, which is avoidable in the majority of scenarios.

2- There are many scenarios where you want to store a relationship value with the parent object in Elasticsearch, for instance a post's tags. In those scenarios we won't be able to eager load the relationships, so we will face the N+1 problem.

3- In a read-only application where the data is stored by another application only in Elasticsearch, we won't be able to use the package because we will try to hit the RDB!

I think having something like this could help:

$result = Book::search()->match('title','foo')->get();

$resultFresh = $result->hits()->fresh()->with('tags','author')->get();

What do you think?

from plastic.

hipsterjazzbo commented on July 1, 2024

Those are good points. My thoughts:

  1. IMHO, selecting from the DB isn't a big deal — Laravel itself does it all the time (there's even a trait for it), and selecting by ID is nice and fast.

  2. Regarding relationships, we absolutely could handle that if I understand you correctly. Given your example, if the raw elastic results include a tags property, we could simply use lazy eager loading:

    // ... looping through elastic result properties
    if ($property instanceof Relationship) {
        // Eager load relationship from DB in a single query
        $model->load($property);
    }

  3. Again, IMHO, the fact that Plastic always returns Model instances implicitly links them to the DB. I believe the correct approach would be to fetch from the DB by default, and to continue to offer raw results on demand, via the direct access to the DSL that Plastic offers.

    In other words, I don't believe that read-only application example is valid, because the application would already need to have Model classes defined, which would by definition break without a DB anyway.

Again, if you agree on these points, I'm more than happy to contribute a PR :)


sleimanx2 commented on July 1, 2024

@hipsterjazzbo I think we are getting closer to a solution.

The lazy-eager loading idea could solve a lot of performance issues when re-fetching the document from the DB.

On the other hand, the performance difference can still be significant. The SerializesModels trait that you mentioned is used in queues, which totally makes sense in that context. For search requests, where you need your app to be as interactive as possible, you need to reduce unnecessary I/O and take advantage of the speed of ES.

Moreover, since we are going to hit the database another time why not give the user the ability to add additional query filters before executing the query?

What are your concerns about the proposed API?

As a side note, the models don't break unless you try to persist, so read-only is okay and convenient. I know a couple of projects that use it as read-only.

Finally, the most important thing is that in production you will most probably have ES clustered, as opposed to MySQL, where you might have only one node. If you force retrieval from the DB, you discard all the benefits of having a cluster (load, speed, maximum number of requests ...).


hipsterjazzbo commented on July 1, 2024

I still feel like it wouldn't be an issue in terms of hitting the DB. It'd be one fast select statement.

Also, the whole reason this came up is that my models are in fact broken — I customise the data I store in elastic for performance reasons. One of the things I don't store is a team_id column, which means I can no longer load that relationship for models that come from Plastic.


nmkr commented on July 1, 2024

+1 for an option to fetch fresh values.
I have a trait and a custom method for outputting an image from storage; it gets detected as a relationship and we cannot output the right format.


hipsterjazzbo commented on July 1, 2024

I've been thinking about this more.

The core issue here is that hits() returns model instances that are not real model instances. If one customises the data that is stored in the index, instead of just using the default, the models are then broken.

I propose this:

  1. hits() returns elastic documents directly, basically just the json_decode()-ed results.
  2. A new method on the results class, models(), which does something similar to my suggestion above: it fetches the models on demand from the DB, in a single query.

This keeps the advantages of the current implementation, in that it stays fast and doesn't force models to be selected, while also 1. making models available very easily if needed, and 2. not presenting models that don't actually relate to the database like they should.
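
To make the split concrete, here is a rough sketch in plain PHP rather than Plastic's actual code. The class name SearchResult and the $fetchByIds callable are hypothetical stand-ins; in a real Eloquent app the callable would presumably wrap a single whereIn query keyed by primary key. It only illustrates the idea that hits() stays a cheap view of the raw response while models() does one batched lookup and keeps the search ranking.

```php
<?php

// Hypothetical result wrapper; an illustration, not Plastic's real class.
final class SearchResult
{
    /** @var array Decoded Elasticsearch response. */
    private $response;

    /** @var callable Takes a list of IDs, returns records keyed by ID. */
    private $fetchByIds;

    public function __construct(array $response, callable $fetchByIds)
    {
        $this->response = $response;
        $this->fetchByIds = $fetchByIds;
    }

    /** Raw documents exactly as the search engine returned them. */
    public function hits(): array
    {
        return $this->response['hits']['hits'] ?? [];
    }

    /** Records fetched on demand with one lookup, in search-ranking order. */
    public function models(): array
    {
        $ids = array_column($this->hits(), '_id');
        if ($ids === []) {
            return [];
        }

        // One call for all IDs instead of one call per hit.
        $byId = ($this->fetchByIds)($ids);

        // Re-apply the ranking; IDs missing from the store (stale index
        // entries) are skipped rather than returned as broken records.
        $ordered = [];
        foreach ($ids as $id) {
            if (isset($byId[$id])) {
                $ordered[] = $byId[$id];
            }
        }

        return $ordered;
    }
}

// Usage with an in-memory array standing in for the database.
$db = [
    1 => ['id' => 1, 'title' => 'foo', 'team_id' => 7],
    2 => ['id' => 2, 'title' => 'foo bar', 'team_id' => 9],
];

$response = ['hits' => ['hits' => [
    ['_id' => 2, '_source' => ['title' => 'foo bar']],
    ['_id' => 1, '_source' => ['title' => 'foo']],
    ['_id' => 3, '_source' => ['title' => 'deleted from the DB']],
]]];

$result = new SearchResult(
    $response,
    function (array $ids) use ($db): array {
        return array_intersect_key($db, array_flip($ids));
    }
);

$raw = $result->hits();       // three raw documents, no DB access
$models = $result->models();  // records 2 and 1, including team_id
```

Note that the records returned by models() carry columns such as team_id that were never indexed, which is the case that breaks today.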

Thoughts, @sleimanx2?


hipsterjazzbo commented on July 1, 2024

Further thoughts:

Doing it this way also opens up the door for future, more advanced usage. For example:

In addition to hits() and models(), we could add something like query(), which would return an Illuminate\Database\Eloquent\Builder like I put above:

$model->newQueryWithoutScopes()->whereIn($model->getQualifiedKeyName(), $ids);

Which could then be built on if needed.

