Comments (28)

pixelastic avatar pixelastic commented on May 24, 2024 2

What you have now still consumes a lot of operations (as you need to re-push all the records to a tmp index on each push). Switching to algolia-indexing would drastically reduce this usage.

I tried to make the package as easy as possible to use (there is one method to call with credentials, settings and records, everything else is automated), as to reduce the amount of effort needed for a switch, but I'd be interested in knowing how I could make this even easier.

Or maybe we're talking about the same thing with different names. Maybe what you call live diff is what I call full atomic :)

from gatsby-plugin-algolia.

u12206050 avatar u12206050 commented on May 24, 2024 2

I've made a pull request. It now supports a generic hash version that will only update objects that have changed. Works well on Netlify as long as the cache persists. Once cache is removed it updates everything again.

u12206050 avatar u12206050 commented on May 24, 2024 2

Yeh, that looks good.

u12206050 avatar u12206050 commented on May 24, 2024 1

Yes sure, I would really want this to work so can help out :)

coreyward avatar coreyward commented on May 24, 2024 1

@Haroenv That approach will still try to index every object on every environment/machine that this website is built on. For common deployment targets like Netlify that periodically clear the build cache anyways you're going to be making excessive calls routinely. Algolia ought to offer a way of making this easier.

@u12206050 For what it's worth, I just went with another approach that updates Algolia via an external process instead of using this plugin. In hindsight, trying to couple indexing with build didn't make sense for a structured object search like I have anyways; if you're in a similar situation, that may be much less work.

pixelastic avatar pixelastic commented on May 24, 2024 1

@Haroenv I think the algolia-indexing project would be the best place to start. It is still a beta and heavy work in progress, but it does solve a few of the issues mentioned in this thread. It uses the Algolia indexes and records themselves to do a smart diff between what is already in the index and what is about to be pushed to reduce the number of operations used.

As a full disclosure, I no longer work at Algolia, but I intend to keep working on algolia-indexing when time permits, to improve it even further. The current version can be greatly improved (see the issues for an explanation).

fraserisland avatar fraserisland commented on May 24, 2024 1

This would be great to get in! I also had to move to an external process due to excess records being indexed when they were basically all the same.

pixelastic avatar pixelastic commented on May 24, 2024

Good call. I would suggest using algolia-indexing instead of atomic-algolia (it is used in production on TalkSearch and has a better test suite).

algolia-indexing currently only implements what I call "full atomic" indexing. This will make sure to use as few operations as possible (by only applying a diff of changes), but to do so in an atomic way, it requires a plan that can hold twice the number of records actually used.

I planned on implementing another mode, called "live diff" that will be similar, the only difference being that it won't be atomic (making the diff live on the production index), still using as few operations as possible, but not needing a large plan.

Both modes have their merits; it's all a question of trade-offs. Considering that your current implementation already requires a plan that can hold twice the number of records, I think going with full atomic can only be an improvement, and implementing the live diff can wait.

Haroenv avatar Haroenv commented on May 24, 2024

What I did now is already fairly close to "full atomic" I think, but taking the generated objects as source of truth (create temp index and switch), so not exactly worth the "effort" to switch. But a live diff would be nice (since everything always has hashes in GraphQL, this would be possible to leverage). Is this something you have the bandwidth to collaborate on, @pixelastic?

u12206050 avatar u12206050 commented on May 24, 2024

Any update on this? I am using way too many operations between builds, and 99% of all the data indexed is still the same. Any info on how I could manually implement this "algolia-indexing" you are talking about? Links or docs could be helpful :) Thanks though for the plugin.

Haroenv avatar Haroenv commented on May 24, 2024

There was no update because nobody commented here in months, so I worked on other things. Are you interested in contributing here? I can give some pointers where to start.

Haroenv avatar Haroenv commented on May 24, 2024
  1. Find out where long-term storage can be done (somewhere in Gatsby's cache / somewhere on the file system / in a different Algolia index)
  2. When indexing, compute a hash of each object
  3. Before indexing, compare each newly computed hash with the previously stored one (Map with objectID: oldHash)
  4. Only index deleted / added / modified objects
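
The steps above could be sketched roughly like this. It is only an illustration under some assumptions: `hashRecord` and `diffRecords` are hypothetical names, not part of the plugin, the stored hashes are assumed to be a `Map` of objectID to hash, and where that Map is persisted (step 1) is left open.

```javascript
const crypto = require("node:crypto");

// Hypothetical helper: hash a record's content.
// Top-level keys are sorted so property order does not change the hash
// (nested objects are serialized as-is in this sketch).
function hashRecord(record) {
  const sorted = {};
  for (const key of Object.keys(record).sort()) {
    sorted[key] = record[key];
  }
  return crypto.createHash("sha1").update(JSON.stringify(sorted)).digest("hex");
}

// Hypothetical helper: compare new records against the stored hashes.
// previousHashes is a Map of objectID -> hash from the last build.
function diffRecords(records, previousHashes) {
  const nextHashes = new Map();
  const toIndex = []; // added or modified records
  for (const record of records) {
    const hash = hashRecord(record);
    nextHashes.set(record.objectID, hash);
    if (previousHashes.get(record.objectID) !== hash) {
      toIndex.push(record);
    }
  }
  // Anything stored previously but absent now has been deleted.
  const toDelete = [...previousHashes.keys()].filter((id) => !nextHashes.has(id));
  return { toIndex, toDelete, nextHashes };
}

// First build: nothing stored yet, so everything is indexed.
const firstBuild = diffRecords(
  [{ objectID: "a", title: "A" }, { objectID: "b", title: "B" }],
  new Map()
);

// Second build: "a" is unchanged, "b" is gone, "c" is new.
const secondBuild = diffRecords(
  [{ objectID: "a", title: "A" }, { objectID: "c", title: "C" }],
  firstBuild.nextHashes
);

console.log(secondBuild.toIndex.map((r) => r.objectID), secondBuild.toDelete);
```

In the second build only "c" needs indexing and only "b" needs deleting. If the stored Map is lost (for example when a build cache is cleared), every record looks new again and gets re-indexed.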

u12206050 avatar u12206050 commented on May 24, 2024

I tried to search for algolia-indexing and mainly ended up at the Algolia docs, but I wouldn't know how or where to start making changes within this plugin to accomplish what you guys mentioned. I am running my builds on Netlify, so the only thing I can use is the Netlify cache to keep track of indexed objects.

u12206050 avatar u12206050 commented on May 24, 2024

Ok so from the sounds of it I need some external key:hash storage space that I can query to check before indexing objects since Netlify's cache gets cleared. I'll see if I can first implement a fork that uses Netlify's cache or as a function that is optional whereby anyone can give the hash for a given object key.

u12206050 avatar u12206050 commented on May 24, 2024

Am I correct in assuming that, in the current state of the plugin, if I simply filter the records down to what has changed, it only adds those objects to ${indexName}_tmp and then overwrites the existing index once done? That would mean only the changed objects will actually be in Algolia, and everything else that didn't change will be lost.

Meaning I have to remove that piece of code and update the main index directly?
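
To make the concern concrete, here is a small in-memory simulation. It does not call Algolia at all: `indices`, `saveObjects`, `moveIndex` and `seedProducts` are stand-ins written for this sketch, and it assumes that moving the tmp index fully replaces the destination index, as described in this thread.

```javascript
// In-memory stand-in for an Algolia app: index name -> Map(objectID -> record).
const indices = new Map();

// Add or overwrite records in an index, creating the index if needed.
function saveObjects(indexName, records) {
  if (!indices.has(indexName)) indices.set(indexName, new Map());
  const index = indices.get(indexName);
  for (const record of records) index.set(record.objectID, record);
}

// Assumption: a move replaces the destination entirely with the source.
function moveIndex(source, destination) {
  indices.set(destination, indices.get(source));
  indices.delete(source);
}

// Reset to a production index holding three records.
function seedProducts() {
  indices.clear();
  saveObjects("Products", [
    { objectID: "1", title: "Shirt" },
    { objectID: "2", title: "Hat" },
    { objectID: "3", title: "Socks" },
  ]);
}

// A later build in which only record 2 changed.
const changedRecords = [{ objectID: "2", title: "Bigger hat" }];

// Scenario A: push only the changed records to the tmp index, then move it.
seedProducts();
saveObjects("Products_tmp", changedRecords);
moveIndex("Products_tmp", "Products");
const afterTmpSwap = [...indices.get("Products").keys()];

// Scenario B: push only the changed records to the main index directly.
seedProducts();
saveObjects("Products", changedRecords);
const afterDirectUpdate = [...indices.get("Products").keys()];

console.log(afterTmpSwap, afterDirectUpdate);
```

In scenario A the production index ends up with record 2 only, while scenario B keeps all three records. The trade-off in scenario B is that the update is applied to the live index rather than swapped in atomically.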

Haroenv avatar Haroenv commented on May 24, 2024

If you do it that way, there will be a flash of wrong or no results

u12206050 avatar u12206050 commented on May 24, 2024

Thanks, I have removed my previous pull request and made a new one using Algolia to check for updates. It compares specified fields to see if an object should be updated, inserted, removed or just ignored.
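
As a rough sketch of what such a field-based comparison could look like (this is not the code from the pull request): `planUpdates` is a hypothetical helper, the existing records are assumed to have been fetched from Algolia already, and the match fields are assumed to hold primitive values so that strict equality is enough.

```javascript
// Hypothetical helper: decide what to do with each record by comparing
// the given matchFields between the existing and the new version.
function planUpdates(existingRecords, newRecords, matchFields) {
  const existingById = new Map(existingRecords.map((r) => [r.objectID, r]));
  const plan = { insert: [], update: [], ignore: [], remove: [] };

  for (const record of newRecords) {
    const existing = existingById.get(record.objectID);
    if (!existing) {
      plan.insert.push(record); // not in the index yet
    } else if (matchFields.some((field) => existing[field] !== record[field])) {
      plan.update.push(record); // at least one match field differs
    } else {
      plan.ignore.push(record); // all match fields are identical
    }
    existingById.delete(record.objectID);
  }

  // Whatever is left exists in the index but not in the new data.
  plan.remove = [...existingById.keys()];
  return plan;
}

const existing = [
  { objectID: "a", modified: "2020-01-01" },
  { objectID: "b", modified: "2020-01-01" },
  { objectID: "c", modified: "2020-01-01" },
];
const incoming = [
  { objectID: "a", modified: "2020-01-01" },
  { objectID: "b", modified: "2020-02-01" },
  { objectID: "d", modified: "2020-01-01" },
];

const plan = planUpdates(existing, incoming, ["modified"]);
console.log(plan);
```

Here "a" is ignored, "b" is updated, "d" is inserted and "c" is removed. A comparison like this only helps if the chosen fields are actually present in the records and stay stable when the content has not changed.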

danvernon avatar danvernon commented on May 24, 2024

Thanks, I have removed my previous pull request and made a new one using Algolia to check for updates. It compares specified fields to see if an object should be updated, inserted, removed or just ignored.

Did this get pushed?

Haroenv avatar Haroenv commented on May 24, 2024

It has not been published yet (sorry), but as far as I can tell @u12206050 has published his fork on npm: https://yarnpkg.com/en/package/gatsby-plugin-algolia-search

danvernon avatar danvernon commented on May 24, 2024

@Haroenv thanks for the quick update. I followed the instructions, but I can see my operations increasing with every build. The idea was that it would only need to update changed records, right?

u12206050 avatar u12206050 commented on May 24, 2024

@danvernon Have you tried this: gatsby-plugin-algolia-search?

danvernon avatar danvernon commented on May 24, 2024

@u12206050 yes, that's what I just implemented. It's doing about 800 actions per build, and I have 628 records. Here's my code.

// Plugin entry in the Gatsby config
{
  resolve: `gatsby-plugin-algolia-search`,
  options: {
    appId: process.env.GATSBY_ALGOLIA_APP_ID,
    apiKey: process.env.ALGOLIA_ADMIN_KEY,
    queries,
    chunkSize: 10000, // default: 1000
    enablePartialUpdates: true, // default: false
    matchFields: ['slug', 'modified'], // Array<String> default: ['modified']
  },
}

// Queries module (imported above as `queries`)
const productQuery = `{
  products: allShopifyProduct {
    edges {
      node {
        objectID: id
        title
        handle
        description
        images {
          originalSrc
        }
        variants {
          price
        }
      }
    }
  }
}`

const flatten = arr =>
  arr.map(({ node: { ...rest } }) => ({
    ...rest,
  }))

const settings = {
  attributesToSnippet: [`description:20`],
}

const queries = [
  {
    query: productQuery,
    transformer: ({ data }) => flatten(data.products.edges),
    indexName: `Products`,
    settings,
    matchFields: ['slug', 'modified'], // Array<String> overrides main match fields, optional
  },
]

module.exports = queries

u12206050 avatar u12206050 commented on May 24, 2024

It needs both the slug and modified fields for comparing. If you don't have those fields, change the matchFields in options to something like updated, and then fetch the updated field from your source:

const productQuery = `{
  products: allShopifyProduct {
    edges {
      node {
        objectID: id
        title
        updated
        handle
        description
        images {
          originalSrc
        }
        variants {
          price
        }
      }
    }
  }
}`

danvernon avatar danvernon commented on May 24, 2024

@u12206050 I don't have slug, so I can just change it to matchFields: ['handle', 'updatedAt'], yeah?

danvernon avatar danvernon commented on May 24, 2024

@u12206050 hmm, not sure this is working as intended. It seemed to work when I pushed a build from code, but when the build was triggered by the hook after changing 1 product, it seemed to use up the 800 actions again.

u12206050 avatar u12206050 commented on May 24, 2024

Hmm, that is strange. I can assure you it should work, though, as we have been using this for months now without fail. We check a date field, modified, and only when that value changes does that post get updated. One thing it could be: if you are using the url to check, make sure it doesn't change between development and production environments, or just remove it from the matchFields.

Haroenv avatar Haroenv commented on May 24, 2024

It could also be that you are modifying every object on build

Haroenv avatar Haroenv commented on May 24, 2024

This has been implemented in 0.8.0 as enablePartialUpdates, thanks @u12206050 :)
