Our application needs to partially update one field (the rank field) of 200 million documents dail

partial Update: elasticsearch/solr vs vespa [CLOSED]

vespa-engine commented on April 28, 2024

partial Update: elasticsearch/solr vs vespa

from vespa.

Comments (8)

bratseth commented on April 28, 2024

200M a day is about 2k per second. That should work fine for any kind of field even on a single node.
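The per-second figure can be checked with a quick calculation. This is only a sketch, and it assumes the updates are spread evenly over the day; a bursty feed would have a higher peak rate.

```python
# Back-of-the-envelope check of the sustained update rate.
UPDATES_PER_DAY = 200_000_000    # figure from the question
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds

# Assumption: updates arrive uniformly throughout the day.
updates_per_second = UPDATES_PER_DAY / SECONDS_PER_DAY
print(f"{updates_per_second:.0f} updates/sec")
```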

jobergum commented on April 28, 2024

Just the fields that you want to update; that is why we call it a partial update. Numeric fields such as byte, int, and float are faster to update than string fields. Agree with @bratseth: 200M updates a day should be no problem even for a single-core machine.

jobergum commented on April 28, 2024

Vespa supports partial updates of existing indexed documents. Updates are fastest for numeric fields defined with 'attribute'. See http://docs.vespa.ai/documentation/reference/document-json-update-format.html for the update JSON syntax.
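For illustration, a partial update that assigns a new value to a single numeric field could be built roughly like this. It is a minimal sketch: the namespace `mynamespace`, document type `mydoctype`, document id `doc-1`, and the field name `rank` are hypothetical placeholders, and the exact syntax should be checked against the reference linked above.

```python
import json

def rank_update(doc_id: str, new_rank: int) -> dict:
    """Build an update operation that touches only the 'rank' field."""
    return {
        "update": doc_id,                          # full document id
        "fields": {"rank": {"assign": new_rank}},  # only this field changes
    }

# Hypothetical document id; replace with your own namespace and type.
op = rank_update("id:mynamespace:mydoctype::doc-1", 42)
print(json.dumps(op, indent=2))
```

Only the field named under `fields` is included in the operation; the rest of the document is not resent.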

zhuxiang1981 commented on April 28, 2024

Does Vespa's partial update reindex only the updated fields? ES reindexes all the fields.

ddorian commented on April 28, 2024

@zhuxiang1981 Solr has in-place updates, but with some caveats (the field must be non-indexed, etc.): https://lucene.apache.org/solr/guide/6_6/updating-parts-of-documents.html#UpdatingPartsofDocuments-In-PlaceUpdates

jobergum commented on April 28, 2024

Any further questions on this topic, @zhuxiang1981? Thanks.

vandit-thakkar commented on April 28, 2024

We recently saw 16k updates/sec succeed in one of our experiments on a 3-node cluster, although all were integer updates. That is good enough for now. We want to reach 100k updates/sec, which we expect to achieve by scaling horizontally. However, update throughput dropped sharply (to 4k/sec) when we ran a benchmark at the same time and hit the system with lots of queries. Any suggestions?
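As a rough sizing sketch for the numbers above: if update throughput scaled linearly with node count (an assumption that real clusters may not satisfy), the node count for a target rate could be estimated like this. The helper `nodes_needed` is a hypothetical name used only for this illustration.

```python
import math

def nodes_needed(target_rate: int, measured_rate: int, measured_nodes: int) -> int:
    """Estimate nodes for target_rate, assuming linear scaling per node."""
    # Multiply before dividing so exact ratios are not rounded up by mistake.
    return math.ceil(target_rate * measured_nodes / measured_rate)

TARGET = 100_000  # desired updates/sec

# 16k updates/sec on 3 nodes with no query load.
print(nodes_needed(TARGET, 16_000, 3))
# 4k updates/sec on the same 3 nodes under heavy query load.
print(nodes_needed(TARGET, 4_000, 3))
```

Under that assumption the estimate is 19 nodes without query load and 75 nodes at the throughput observed under query load, which shows how much the feed/query trade-off matters for sizing.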

baldersheim commented on April 28, 2024

In order to tell whether your numbers make sense, I need to know the machine config you are using. Your search definition and services file would also be helpful. There are some tricks that can be applied to push it even further in some cases. Feed performance will go down during query load; how much depends on the number of threads on your machine. As it is a search engine, it is designed to favour queries over feed. This can be tuned, but that has not been done very often, so it requires experimentation in each case. I also do not remember how well documented it is.

On Tue, May 22, 2018 at 10:47 PM, Vandit Thakkar wrote: We recently saw that 16k updates/sec were successful in one of our experiments with a cluster having 3 nodes, although all were integer updates. It's a good enough for now. We want to achieve 100k/sec updates which we would horizontally scale and achieve. Though we found that update throughput got very low after we simultaneously ran benchmarking and hit the system with lots of queries. Any suggestions ?
