JorenSix commented on May 31, 2024

Interesting to hear your experiences. I have indeed tested Olaf with 800GB of music and have not experimented with larger datasets or non-music (less information rich) signals.

Some insights/pointers/possible optimisations:

  • Depending on how similar the reference and query audio are and how long a query is allowed to be, the configuration can be optimised: if you want to support short (1s), noisy queries you need a lot of fingerprints; if you can deal with long (20s+), clear queries, the system can be configured to use far fewer fingerprints.
  • To speed up db creation you might want to look at LMDB bulk import, which expects sorted keys. The idea here is to extract fingerprints, sort them externally and do a single bulk import. This is not supported by Olaf by default but is a potential performance improvement.
  • There will be a lot of duplicate fingerprints in your dataset. I suspect the slowdown and size increase to be related to these hash matches. A B+tree essentially becomes an inefficient list if there are many hash collisions.
  • To further reduce the number of fingerprints you might want to look at a silence threshold. Currently Olaf also extracts fingerprints from quiet parts, which can perhaps be skipped for your use case.
  • Olaf can be distributed: perhaps you can maintain 10 separate instances and put an API in front of them to distribute queries and return results.
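
To make the "sort externally, then bulk import" idea a bit more concrete, here is a rough Python sketch of only the sorting half. It is not part of Olaf, and the record layout (hash, time offset, audio id) and the names `sort_fingerprints`, `write_sorted_runs` and `read_run` are assumptions made for illustration. It writes sorted runs to temporary files and merges them, so the full fingerprint set never has to fit in memory. The resulting stream is in key order, which is what an append-style bulk load (in LMDB, a put with the `MDB_APPEND` flag) expects.

```python
import heapq
import os
import struct
import tempfile

# Hypothetical record layout (an assumption, not Olaf's on-disk format):
# 64-bit fingerprint hash, 32-bit time offset, 32-bit audio identifier.
RECORD = struct.Struct(">QII")


def write_sorted_runs(fingerprints, run_size, directory):
    """Split the fingerprint stream into sorted runs stored as temporary files."""
    paths, chunk = [], []

    def flush():
        chunk.sort()
        fd, path = tempfile.mkstemp(dir=directory, suffix=".run")
        with os.fdopen(fd, "wb") as handle:
            for record in chunk:
                handle.write(RECORD.pack(*record))
        paths.append(path)
        chunk.clear()

    for record in fingerprints:
        chunk.append(record)
        if len(chunk) >= run_size:
            flush()
    if chunk:
        flush()
    return paths


def read_run(path):
    """Yield the records of one sorted run, in order."""
    with open(path, "rb") as handle:
        while True:
            data = handle.read(RECORD.size)
            if not data:
                break
            yield RECORD.unpack(data)


def sort_fingerprints(fingerprints, run_size=1_000_000):
    """Yield all fingerprints in sorted order using bounded memory."""
    with tempfile.TemporaryDirectory() as directory:
        paths = write_sorted_runs(fingerprints, run_size, directory)
        # heapq.merge lazily combines the already sorted runs.
        yield from heapq.merge(*(read_run(path) for path in paths))
```

The sorted stream would then be written to the database in a single write transaction. That loading step is left out here because it depends on the binding you use.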

Good luck with your project.
