Comments (28)
What you have now still consumes a lot of operations (as you need to re-push all the records to a tmp index on each push). Switching to algolia-indexing would drastically reduce this usage.
I tried to make the package as easy as possible to use (there is one method to call with credentials, settings and records; everything else is automated), so as to reduce the effort needed for a switch, but I'd be interested in knowing how I could make this even easier.
Or maybe we're talking about the same thing with different names. Maybe what you call live diff is what I call full atomic :)
from gatsby-plugin-algolia.
I've made a pull request. It now supports a generic hash version that will only update objects that have changed. Works well on Netlify as long as the cache persists. Once cache is removed it updates everything again.
from gatsby-plugin-algolia.
Yeh, that looks good.
from gatsby-plugin-algolia.
Yes sure, I would really want this to work so can help out :)
from gatsby-plugin-algolia.
@Haroenv That approach will still try to index every object on every environment/machine that this website is built on. For common deployment targets like Netlify that periodically clear the build cache anyways you're going to be making excessive calls routinely. Algolia ought to offer a way of making this easier.
@u12206050 For what it's worth, I just went with another approach that updates Algolia via an external process instead of using this plugin. In hindsight, trying to couple indexing with build didn't make sense for a structured object search like I have anyways; if you're in a similar situation, that may be much less work.
from gatsby-plugin-algolia.
@Haroenv I think the algolia-indexing project would be the best place to start. It is still a beta and heavy work in progress, but it does solve a few of the issues mentioned in this thread. It uses the Algolia indexes and records themselves to do a smart diff between what is already in the index and what is about to be pushed to reduce the number of operations used.
As a full disclosure, I no longer work at Algolia, but I intend to keep working on `algolia-indexing` when time permits, to improve it even further. The current version can be greatly improved (see the issues for an explanation).
from gatsby-plugin-algolia.
This would be great to get in! I also had to move to an external process due to excess records being indexed when they were basically all the same.
from gatsby-plugin-algolia.
Good call. I would suggest using `algolia-indexing` instead of `atomic-algolia` (used in production on TalkSearch and with a better test suite).
`algolia-indexing` currently only implements what I call "full atomic" indexing. This will make sure to use as few operations as possible (by only applying a diff of changes), but to do so in an atomic way, it requires a plan that can hold twice the number of records actually used.
I planned on implementing another mode, called "live diff" that will be similar, the only difference being that it won't be atomic (making the diff live on the production index), still using as few operations as possible, but not needing a large plan.
Both modes have their merits, it's all a question of trade-offs. Considering that your current implementation already requires a plan that can hold twice the number of records, I think going with a full atomic can only be an improvement and implementing the live diff can wait.
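To make the trade-off concrete, here is a rough back-of-the-envelope model of the three approaches discussed in this thread. The function name `compareStrategies` and the cost formulas are assumptions for illustration only: they ignore fixed overhead (settings updates, index copy/move calls) and are not taken from Algolia's billing documentation.

```javascript
// Rough model (assumption): one operation per record written or deleted,
// ignoring fixed overhead such as copying or moving indices.
function compareStrategies(totalRecords, changedRecords) {
  return {
    // Current plugin behaviour described above: re-push everything to a tmp index.
    tmpIndexRepush: {
      operations: totalRecords,
      peakRecords: 2 * totalRecords,
      atomic: true,
    },
    // "Full atomic": apply only the diff, but on a copy of the index.
    fullAtomic: {
      operations: changedRecords,
      peakRecords: 2 * totalRecords,
      atomic: true,
    },
    // "Live diff": apply only the diff, directly on the production index.
    liveDiff: {
      operations: changedRecords,
      peakRecords: totalRecords,
      atomic: false,
    },
  };
}

// Example: a 628-record index where a single record changed.
const estimate = compareStrategies(628, 1);
console.log(estimate);
```

Under this simplified model, both diff-based modes save the same number of operations; they differ only in plan size and atomicity.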
from gatsby-plugin-algolia.
What I did now is already fairly close to "full atomic" I think, but taking the generated objects as source of truth (create temp index and switch), so it's not exactly worth the "effort" to switch. But a live diff would be nice (since everything always has hashes in GraphQL, this would be possible to leverage). Is this something you have the bandwidth to collaborate on, @pixelastic?
from gatsby-plugin-algolia.
Any update on this? I am using way too many operations between builds and 99% of all the data indexed is still the same. Any info on how I could manually implement this "algolia-indexing" you are talking about? Links or docs could be helpful :) Thanks though for the plugin.
from gatsby-plugin-algolia.
There was no update because nobody commented here in months, so I worked on other things. Are you interested in contributing here? I can give some pointers where to start.
from gatsby-plugin-algolia.
- Find out where long-term storage can be done (somewhere in gatsby’s cache / somewhere on the file system / in a different Algolia index)
- When indexing, compute a hash of each object
- Before indexing, compare the newly computed hash with the previously stored hash (a Map of objectID: oldHash)
- Only index deleted / added / modified objects
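The steps above could be sketched roughly as follows. This is a minimal illustration, not the plugin's implementation: `hashRecord` and `diffRecords` are hypothetical names, and loading/saving the `previousHashes` map (Gatsby cache, file system, or another index) is left out.

```javascript
const crypto = require("crypto");

// Hash a record's content. JSON.stringify is sensitive to key order,
// so this assumes records are built with a stable key order.
function hashRecord(record) {
  return crypto.createHash("sha1").update(JSON.stringify(record)).digest("hex");
}

// previousHashes: Map of objectID -> hash, loaded from long-term storage.
// records: the objects generated by the current build.
function diffRecords(previousHashes, records) {
  const nextHashes = new Map();
  const toSave = []; // added or modified records

  for (const record of records) {
    const hash = hashRecord(record);
    nextHashes.set(record.objectID, hash);
    if (previousHashes.get(record.objectID) !== hash) {
      toSave.push(record);
    }
  }

  // Anything stored previously but absent from this build was deleted.
  const toDelete = [...previousHashes.keys()].filter((id) => !nextHashes.has(id));

  return { toSave, toDelete, nextHashes };
}
```

After indexing `toSave` and deleting `toDelete`, `nextHashes` would be persisted for the next build. If the stored map is lost (for example, a cleared build cache), every record looks new and is pushed again.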
from gatsby-plugin-algolia.
I tried to search for algolia-indexing and mainly ended up at the Algolia docs, but I wouldn't know how or where to start making changes within this plugin to accomplish what you guys mentioned. I am running my builds on Netlify, so the only thing I can use is the Netlify cache to keep track of indexed objects.
from gatsby-plugin-algolia.
Ok so from the sounds of it I need some external key:hash storage space that I can query to check before indexing objects since Netlify's cache gets cleared. I'll see if I can first implement a fork that uses Netlify's cache or as a function that is optional whereby anyone can give the hash for a given object key.
from gatsby-plugin-algolia.
Am I correct in assuming that, in the current state of the plugin, if I simply filter out what has changed, it then only adds those objects to the `${indexName}_tmp` index and then overwrites the existing index once done, meaning that only the changed objects will actually be in Algolia and everything else that didn't change will be lost?
Meaning I have to remove that piece of code and update the main index directly?
from gatsby-plugin-algolia.
If you do it that way, there will be a flash of wrong or no results
from gatsby-plugin-algolia.
Thanks, I have removed my previous pull request and made a new one using Algolia to check for updates. It compares specified fields to see if an object should be updated, inserted, removed or just ignored.
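As an illustration of that idea (not the code from the pull request), comparing a set of fields to sort records into insert / update / remove / ignore could look roughly like this. `planUpdates` is a hypothetical helper; it assumes the existing records have already been fetched from the index and that the compared fields hold primitive values.

```javascript
// existing: records currently in the index (only objectID + match fields needed)
// incoming: records generated by the current build
// matchFields: field names whose values decide whether a record changed
function planUpdates(existing, incoming, matchFields) {
  const existingById = new Map(existing.map((r) => [r.objectID, r]));
  const plan = { insert: [], update: [], remove: [], ignore: [] };

  for (const record of incoming) {
    const current = existingById.get(record.objectID);
    if (!current) {
      plan.insert.push(record); // not in the index yet
    } else if (matchFields.some((f) => current[f] !== record[f])) {
      plan.update.push(record); // at least one match field differs
    } else {
      plan.ignore.push(record.objectID); // unchanged, no operation needed
    }
    existingById.delete(record.objectID);
  }

  // Whatever is left exists in the index but not in this build.
  plan.remove = [...existingById.keys()];
  return plan;
}
```

Note that a match field missing from the incoming records (`undefined`) never equals the indexed value, so such records would be treated as modified on every build.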
from gatsby-plugin-algolia.
Did this get pushed?
from gatsby-plugin-algolia.
It has not been published yet (sorry), but as far as I can tell @u12206050 has published his fork on npm: https://yarnpkg.com/en/package/gatsby-plugin-algolia-search
from gatsby-plugin-algolia.
@Haroenv thanks for the quick update. I followed the instructions, but I can see my operations are increasing with every build. The idea was that it would only need to update changed records, right?
from gatsby-plugin-algolia.
@danvernon Have you tried this: gatsby-plugin-algolia-search?
from gatsby-plugin-algolia.
@u12206050 yes, that's what I just implemented. It's doing about 800 actions per build. I have 628 records. Here's my code.
```js
{
  resolve: `gatsby-plugin-algolia-search`,
  options: {
    appId: process.env.GATSBY_ALGOLIA_APP_ID,
    apiKey: process.env.ALGOLIA_ADMIN_KEY,
    queries,
    chunkSize: 10000, // default: 1000
    enablePartialUpdates: true, // default: false
    matchFields: ['slug', 'modified'], // Array<String> default: ['modified']
  },
}
```

```js
const productQuery = `{
  products: allShopifyProduct {
    edges {
      node {
        objectID: id
        title
        handle
        description
        images {
          originalSrc
        }
        variants {
          price
        }
      }
    }
  }
}`

const flatten = arr =>
  arr.map(({ node: { ...rest } }) => ({
    ...rest,
  }))

const settings = {
  attributesToSnippet: [`description:20`],
}

const queries = [
  {
    query: productQuery,
    transformer: ({ data }) => flatten(data.products.edges),
    indexName: `Products`,
    settings,
    matchFields: ['slug', 'modified'], // Array<String> overrides main match fields, optional
  },
]

module.exports = queries
```
from gatsby-plugin-algolia.
It needs both the `slug` and `modified` fields for comparing. If you don't have those fields, change the `matchFields` in options to something like `updated` and then fetch the `updated` field from your source:
```js
const productQuery = `{
  products: allShopifyProduct {
    edges {
      node {
        objectID: id
        title
        updated
        handle
        description
        images {
          originalSrc
        }
        variants {
          price
        }
      }
    }
  }
}`
```
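One way to check for this before a build is to verify that every transformed record actually carries the configured match fields. The helper below is a hypothetical sketch, not part of the plugin; `findMissingMatchFields` and the sample records are made up for illustration.

```javascript
// Returns one entry per record that lacks at least one of the match fields.
function findMissingMatchFields(records, matchFields) {
  return records
    .map((record) => ({
      objectID: record.objectID,
      missing: matchFields.filter((field) => record[field] === undefined),
    }))
    .filter((entry) => entry.missing.length > 0);
}

// Example with records shaped like the query output above (sample data).
const sampleRecords = [
  { objectID: "p1", title: "Shirt", handle: "shirt", updated: "2020-01-01" },
  { objectID: "p2", title: "Hat", handle: "hat" },
];
console.log(findMissingMatchFields(sampleRecords, ["handle", "updated"]));
```

It could be called inside a query's `transformer` so that a missing field shows up as a build-time warning rather than as unexpected operations.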
from gatsby-plugin-algolia.
@u12206050 I don't have slug, so I can just change it to `matchFields: ['handle', 'updatedAt']`, yeah?
from gatsby-plugin-algolia.
@u12206050 Hmm, I'm not sure this is working as intended. It seemed to work when I pushed a build from code, but when the build was triggered by the hook after changing 1 product, it seemed to use up the 800 actions again.
from gatsby-plugin-algolia.
Hmm, that is strange. I can assure you it should work, though, as we have been using this for months now without fail. We check a date field `modified`, and if/when that value changes, then only that post gets updated. One thing it could be: if you are using the URL to check, make sure it doesn't change between development and production environments, or just remove it from the `matchFields`.
from gatsby-plugin-algolia.
It could also be that you are modifying every object on build
from gatsby-plugin-algolia.
This has been implemented in 0.8.0 as `enablePartialUpdates`, thanks @u12206050 :)
from gatsby-plugin-algolia.