Comments (5)
Thank you for the detailed response.
I've been using a hand-rolled search function until the index is built. But the actual data set I'm working with is even larger than the tests I've been running, so indexing is taking as much as 30 seconds. I guess my use case might just not be a good fit for MiniSearch. But I'll keep experimenting. Thanks again.
from minisearch.
Hi @yet-another-dev,
The index is stored in the browser memory, so that’s essentially the limitation.
Search performance shouldn’t be an issue even with extremely large indexes, as the algorithms used scale independently of the index size (the exception being fuzzy search, but even this shouldn’t be a problem unless one is using large fuzziness factors). In short, you shouldn’t be limited by search performance, as long as the index fits in memory.
Indexing performance depends on the number of documents and their size, so it can get slower with huge collections. Still, I find that re-indexing client side on page load is in most cases the right approach, and serializing/caching the index is reserved for corner cases.
Even in challenging use cases where indexing takes 2 or 3 seconds (this would be the case only on huge collections of documents), I often solve this at the UI level: I perform indexing asynchronously with addAllAsync and show the search field as “loading” while in progress. Normally, by the time the user chooses to interact with the search, it is ready.
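A minimal sketch of this UI-level approach, under stated assumptions: `miniSearch` is taken to be an already constructed MiniSearch instance, and `searchInput` is a hypothetical search `<input>` element (any object with `disabled` and `placeholder` properties works). The function name and the placeholder texts are illustrative, not part of the library.

```javascript
// Sketch: keep the search field in a "loading" state while indexing runs.
// Assumptions: `miniSearch` is a MiniSearch instance, `searchInput` is the
// search <input> element. Both names are hypothetical.
async function indexWithLoadingState (miniSearch, documents, searchInput) {
  searchInput.disabled = true
  searchInput.placeholder = 'Loading...'
  try {
    // addAllAsync indexes the documents in chunks, so rendering is not
    // blocked for the whole duration of the indexing
    await miniSearch.addAllAsync(documents)
  } finally {
    // Re-enable the field even if indexing throws
    searchInput.disabled = false
    searchInput.placeholder = 'Search'
  }
}
```

The `finally` block is a design choice of this sketch: it ensures the field does not stay disabled if indexing fails.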
I routinely use MiniSearch for production applications indexing tens of thousands of documents on the fly on page reload (e.g. products in a product search). Performance has never been an issue, and there is no noticeable lag compared to smaller use cases. These apps are often used on mobile browsers, including rather old smartphones, and we have never received reports of problems there either.
In sum, I’d say memory is the main limit, but even that is much further away than one would expect, thanks to the compact index data structure.
I hope this provides the info you need.
I will close this issue for now, but feel free to comment on it if you have more questions or doubts.
Hi,
I am running a test with 5000 400-word documents and indexing with addAllAsync is taking over 10 seconds. Is that the type of performance you would expect indexing a collection of that size or does it suggest that I may be doing something wrong?
Thanks.
Hi @dustfoxer,
Over 10 seconds sounds a bit too slow, though not impossible for large documents. I would start by verifying that you are not performing some slow operation during indexing. For example, I saw one application trying to create a map of ID to document with a loop like the following:
// this is slow, because it creates a new hash
// on each iteration:
const docById = documents.reduce((byId, document) => (
  { ...byId, [document.id]: document }
), {})
In that app, indexing looked very slow, but it was actually this loop that took most of the time.
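For comparison, here is a sketch of two ways to build the same lookup without copying the accumulator on every iteration. The function names are illustrative; the only assumption is that each document has a unique `id` property, as in the loop above.

```javascript
// Sketch: build the ID-to-document lookup in linear time by mutating a
// single object instead of spreading a new one on each iteration.
function indexById (documents) {
  const docById = {}
  for (const document of documents) {
    docById[document.id] = document
  }
  return docById
}

// Alternative: a Map, which also preserves the original type of the IDs
// (plain object keys are always converted to strings).
function mapById (documents) {
  return new Map(documents.map((document) => [document.id, document]))
}
```

The spread version copies all previously collected entries on each step, so its cost grows quadratically with the number of documents, while both versions here do a constant amount of work per document.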
That said, consider that addAllAsync is slower than addAll (but has the advantage of not blocking the rendering). I would recommend experimenting with different batch sizes to find the right compromise between indexing speed and UI responsiveness: the larger the batch, the faster indexing can be, but each batch might block the UI.
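One way to run such an experiment is sketched below. It relies on the `chunkSize` option of `addAllAsync` to set the batch size; `createIndex` is a hypothetical factory that returns a fresh MiniSearch instance for each run, and the function names and candidate sizes are illustrative only.

```javascript
// Sketch: measure total indexing time for a given batch size.
// `createIndex` is a hypothetical factory returning a new, empty
// MiniSearch instance, so that each run starts from scratch.
async function timeIndexing (createIndex, documents, chunkSize) {
  const miniSearch = createIndex()
  const start = performance.now()
  await miniSearch.addAllAsync(documents, { chunkSize })
  return performance.now() - start
}

// Try several batch sizes in sequence and collect the timings (in ms).
async function compareChunkSizes (createIndex, documents, chunkSizes = [10, 100, 1000]) {
  const timings = {}
  for (const chunkSize of chunkSizes) {
    timings[chunkSize] = await timeIndexing(createIndex, documents, chunkSize)
  }
  return timings
}

// Possible usage (the field names are assumptions about your documents):
// compareChunkSizes(() => new MiniSearch({ fields: ['title', 'text'] }), documents)
//   .then(console.table)
```

Note that this only measures total indexing time. How responsive the UI feels with each batch size still has to be judged in the actual application.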
Finally, in some applications, the way the UI is designed can make a difference too, even without making the indexing faster: in single-page apps one can sometimes render everything and just temporarily disable the search until the indexing is done. That is often good enough for most users, who might not need to initiate a search immediately.