addictedcs / soundfingerprinting

Open source audio fingerprinting in .NET. An efficient algorithm for acoustic fingerprinting written purely in C#.

Home Page: https://emysound.com

License: MIT License

C# 99.95% Batchfile 0.03% Shell 0.02%
audio fingerprints algorithm acoustic-fingerprints recognition locality-sensitive-hashing nearest-neighbor-search shazam c-sharp audio-processing

soundfingerprinting's Introduction

Audio/Video fingerprinting and recognition in .NET

Join the chat at https://gitter.im/soundfingerprinting/Lobby

soundfingerprinting is a C# framework designed for companies, enthusiasts, and researchers in the fields of digital signal processing, data mining, and audio/video recognition. It implements an efficient algorithm that provides fast insertion and retrieval of acoustic and video fingerprints with high precision and recall.

Documentation

Full documentation is available on the Wiki page.

The code snippet below shows how to extract acoustic fingerprints from an audio file and later use them as identifiers to recognize an unknown audio query. These fingerprints are stored in a configurable datastore.

private readonly IModelService modelService = new InMemoryModelService(); // store fingerprints in RAM
private readonly IAudioService audioService = new SoundFingerprintingAudioService(); // default audio library

public async Task StoreForLaterRetrieval(string file)
{
    var track = new TrackInfo("GBBKS1200164", "Skyfall", "Adele");

    // create fingerprints
    var avHashes = await FingerprintCommandBuilder.Instance
                                .BuildFingerprintCommand()
                                .From(file)
                                .UsingServices(audioService)
                                .Hash();
								
    // store hashes in the database for later retrieval
    modelService.Insert(track, avHashes);
}

Querying

Once you've inserted the fingerprints into the datastore, you can query the storage to recognize the song your samples come from. The query samples may originate from a file, URL, microphone, radio tuner, etc. Where you get the samples from is up to your application.

public async Task<TrackData> GetBestMatchForSong(string file)
{
    int secondsToAnalyze = 10; // number of seconds to analyze from query file
    int startAtSecond = 0; // start at the beginning
	
    // query the underlying database for similar audio sub-fingerprints
    var queryResult = await QueryCommandBuilder.Instance.BuildQueryCommand()
                                         .From(file, secondsToAnalyze, startAtSecond)
                                         .UsingServices(modelService, audioService)
                                         .Query();
    
    return queryResult.BestMatch.Track;
}

Fingerprints Storage

The default storage, which comes bundled with the SoundFingerprinting NuGet package, is a plain in-memory storage, available via the InMemoryModelService class. If you plan to use external persistent storage for fingerprints, Emy is the preferred choice. Emy provides a community version which is free for non-commercial use. More about Emy can be found on the wiki page.

Supported audio/video formats

Read the Supported Media Formats page for details about processing different file formats or real-time streams.

Video fingerprinting support since version 8.0.0

Video fingerprinting support was added in v8.0.0. Similar to audio fingerprinting, video fingerprints are generated from video frames and used to insert and later query the datastore for exact and similar matches. You can use SoundFingerprinting to fingerprint audio content, video content, or both at the same time. More details about video fingerprinting are available here.

Version Matrix

If you are using FFmpegAudioService as described in the wiki, follow the version matrix below.

SoundFingerprinting   SoundFingerprinting.Emy   FFmpeg
8.x                   8.x                       4.x
9.x                   9.x                       5.x
10.x                  10.x                      6.x

FAQ

  • Can I apply this algorithm for speech recognition purposes?

No. The granularity of one fingerprint is roughly 1.46 seconds, which is too coarse to resolve individual words or phonemes.

  • Can the algorithm detect the exact position of the query in the resulting track?

Yes.

  • Can I use SoundFingerprinting to detect ads in radio streams?

Yes. This is the most frequent use case in which SoundFingerprinting has been used successfully.

  • How many tracks can I store in InMemoryModelService?

100 hours of content with DefaultFingerprintingConfiguration will consume ~5GB of RAM.
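
The figure above can be turned into a rough capacity estimate. The sketch below assumes memory use grows linearly with content duration, which the FAQ does not state explicitly; MemoryEstimate and its method names are hypothetical and not part of the library.

```csharp
using System;

public static class MemoryEstimate
{
    // Figure quoted in the FAQ: ~5 GB of RAM per 100 hours of content
    // with DefaultFingerprintingConfiguration.
    private const double GigabytesPer100Hours = 5.0;

    // Linear extrapolation (assumption): RAM needed for the given amount of content.
    public static double EstimateGigabytes(double hoursOfContent)
    {
        if (hoursOfContent < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(hoursOfContent));
        }

        return hoursOfContent * GigabytesPer100Hours / 100.0;
    }

    // Inverse estimate: hours of content that fit into a given RAM budget.
    public static double EstimateHours(double availableGigabytes)
    {
        if (availableGigabytes < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(availableGigabytes));
        }

        return availableGigabytes * 100.0 / GigabytesPer100Hours;
    }
}
```

For example, under this assumption a machine with 16 GB available for fingerprints would hold roughly 320 hours of content.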

Get it on NuGet

Install-Package SoundFingerprinting

How it works

Audio Fingerprinting.

Video Fingerprinting.

Demo

My description of the algorithm, alongside the demo project, can be found on CodeProject. The article is from 2011 and may be outdated. The demo project is an Audio File Duplicates Detector. Its latest source code can be found here. It's a WPF MVVM project that uses the algorithm to detect which files are perceptually very similar.

Contribute

If you want to contribute, you are welcome to open issues or join the discussion on the issues page. Feel free to contact me with any remarks, ideas, bug reports, etc.

License

The framework is provided under the MIT license.

© Soundfingerprinting, 2010-2024

soundfingerprinting's People

Contributors

addictedcs, gitter-badger, gviorel, jakoss, kuleshov-aleksei, lvaleriu, tristanhessell, vbota1

soundfingerprinting's Issues

Detecting a song within a stream of songs

With the current version 2.0, is it possible to detect a particular song within a single stream or a long audio file containing various songs, and to obtain the start and end times at which the song was matched?

Querying Song From Android

Sound fingerprints are obtained for a master list of songs and stored in the database using the Fill database functionality.

To recognize songs from Android, is it possible to match them through an API? How can audio samples be computed on Android and matched against the existing fingerprints in the database?

Implement NAudio ReadMonoFromURL and RecordFromMicrophoneToFile methods.

The following two methods from NAudioService have to be implemented:

public float[] ReadMonoFromURL(string urlToResource, int sampleRate, int secondsToDownload)
{
    throw new NotImplementedException("Use Bass.NET");
}

public float[] RecordFromMicrophoneToFile(string pathToFile, int sampleRate, int secondsToRecord)
{
    throw new NotImplementedException("Use Bass.NET");
}

Add support for a NoSQL data storage

The current backend, MSSQL, has its limits in terms of performance and scalability. Storing a large number of songs requires a data storage which scales horizontally with ease and performs well.

After studying a large number of NoSQL solutions, I would recommend MongoDB as the winner. Redis would fit perfectly as a caching layer, since it seems to perform worse than other NoSQL database systems when the stored set exceeds the amount of RAM available on a dedicated machine, and the number of songs and fingerprints will definitely exceed RAM capacity. CouchDB seems very similar to Mongo. Because we don't really need master-master replication, as the data will mostly be inserted (with no frequent updates), I would go for Mongo here as well. Other database systems have also been analyzed, though mainly these 3 were considered good candidates.

  • Use Aggregation capabilities in order to perform the main query on HashTable - HashBin
  • Add a compound index on HashTable, HashBin in order to boost performance
  • Map data models to the same structure as in MSSQL (connect entities via references)
  • Store Album Title as a part of Track to minimize lookups for Album. If the album is not known, do not store "untitled"
  • Sharding (if required) will be performed over the HashTables (25 elements fit perfectly)

Query One Song with Hash Signature

Your code is awesome.

When I try to find a song, your algorithm uses a Hamming distance calculation, which in turn uses the fingerprints (binaries) of the original song. Is there a way to find the song accurately using only the hashes (int[])? If so, kindly provide code for it.

Thanks
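
For readers comparing the two approaches in this question, here is a minimal sketch of both measures. It assumes fingerprints are equal-length bool[] arrays and hash signatures are equal-length int[] arrays compared position by position. FingerprintSimilarity and its method names are hypothetical; this is not the library's implementation.

```csharp
using System;

public static class FingerprintSimilarity
{
    // Number of positions at which two equal-length binary fingerprints differ.
    public static int HammingDistance(bool[] a, bool[] b)
    {
        if (a == null) throw new ArgumentNullException(nameof(a));
        if (b == null) throw new ArgumentNullException(nameof(b));
        if (a.Length != b.Length)
        {
            throw new ArgumentException("Fingerprints must have the same length.");
        }

        int distance = 0;
        for (int i = 0; i < a.Length; i++)
        {
            if (a[i] != b[i])
            {
                distance++;
            }
        }

        return distance;
    }

    // Number of positions at which two hash signatures hold the same value.
    // A higher count suggests more similar originals (assumed comparison rule).
    public static int MatchingHashes(int[] x, int[] y)
    {
        if (x == null) throw new ArgumentNullException(nameof(x));
        if (y == null) throw new ArgumentNullException(nameof(y));
        if (x.Length != y.Length)
        {
            throw new ArgumentException("Signatures must have the same length.");
        }

        int matches = 0;
        for (int i = 0; i < x.Length; i++)
        {
            if (x[i] == y[i])
            {
                matches++;
            }
        }

        return matches;
    }
}
```

The hash comparison needs only the int[] signatures, so the original binary fingerprints would not have to be loaded, at the cost of a coarser similarity estimate.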

Show similarities of 2 audio files

If we have 2 audio files A1 and A2 which could be drawn as follows:
A1 : XXXXXYYYYZZZZZZZZZZZZZZ (where XXXXXXXXX, YYYYY, ZZZZZZZZZZZ indicate 3 audio regions of different lengths)
A2 : SSSSSSSSSYYYYYTTTTTTTTT (where SSSSSSSSSS, YYYYY, TTTT indicate 3 audio regions of different lengths)

then we could call a similarity function which will tell us that A1 and A2 have a "similar" region of length N seconds which starts at time t1 in A1 and t2 in A2.

For the moment I'm working myself on this, but feel free to help/contribute if you wish.

Thanks,
Valeriu
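
One possible way to approach this, sketched under strong simplifying assumptions: each character stands for one fixed-length audio frame, and two frames are "similar" only when their symbols are equal. Real fingerprints would need a tolerance-based comparison instead. RegionMatcher is a hypothetical name; the technique is the classic longest-common-substring dynamic program.

```csharp
using System;

public static class RegionMatcher
{
    // Returns the start index in a, the start index in b, and the length of the
    // longest run of identical consecutive symbols shared by both sequences.
    // Length is 0 when the sequences share no symbol.
    public static (int StartA, int StartB, int Length) LongestCommonRegion(string a, string b)
    {
        if (a == null) throw new ArgumentNullException(nameof(a));
        if (b == null) throw new ArgumentNullException(nameof(b));

        // previous[j] / current[j]: length of the common run ending at
        // a[i - 2], b[j - 1] and a[i - 1], b[j - 1] respectively.
        var previous = new int[b.Length + 1];
        var current = new int[b.Length + 1];
        int bestLength = 0, bestEndA = 0, bestEndB = 0;

        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                current[j] = a[i - 1] == b[j - 1] ? previous[j - 1] + 1 : 0;
                if (current[j] > bestLength)
                {
                    bestLength = current[j];
                    bestEndA = i;
                    bestEndB = j;
                }
            }

            // Reuse the two rows instead of allocating a full matrix.
            (previous, current) = (current, previous);
        }

        return (bestEndA - bestLength, bestEndB - bestLength, bestLength);
    }
}
```

For the two strings drawn above, the call reports the shared Y region: start 5 in A1, start 9 in A2, length 4. Multiplying the indexes and the length by the frame duration would give t1, t2 and N in seconds.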

Fix the associated problem with SubFingerprint vs Fingerprint.

The Fingerprint Service method should return a strongly typed object (e.g. Fingerprint) instead of bool[]. I don't know yet whether this is critical.

public interface IFingerprintService
{
   List<bool[]> CreateFingerprints(float[] samples, IFingerprintingConfiguration fingerprintingConfiguration);
}

For FingerprintUnit

RunAlgorithm - return Fingerprints
RunAlgorithmWithHashing - return SubFingerprint

new version versus the one published in CodeProject article

Hi,

When I download the exe published in the code-project article (http://www.codeproject.com/Articles/206507/Duplicates-detector-via-audio-fingerprinting?msg=4829666#xx4829666xx) and evaluate on 250 songs, it is giving accurate results. But when I download the source code from here and test on the same songs, it is not giving the same results. Can I get the source code for the one published in code-project? I want to work on it...

Thanks,
Sateesh

Out of Memory Exception

The following code was throwing an out-of-memory exception in my application:

[File Name : BassAudioService.cs]
// get re-sampled/mono data
int bytesRead = Bass.BASS_ChannelGetData(mixerStream, buffer, buffer.Length * 4);
if (bytesRead == 0)
{
    break;
}

bytesRead was returning -1 when the limit was reached, but the while loop kept running for a long time and chunks.Add(chunk) eventually caused the out-of-memory exception. Do I need to change the check to

if (bytesRead == -1)
{
    break;
}

Am I missing something?
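
A defensive version of such a loop treats every non-positive return value as "stop", which covers both 0 and the -1 that BASS returns on error or end of stream. The sketch below replaces the BASS call with a delegate so that it stays self-contained; ChunkReader and readChunk are hypothetical names, not part of BassAudioService.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkReader
{
    // readChunk fills the buffer and returns the number of BYTES written,
    // or a value <= 0 when the stream has ended or an error occurred.
    public static List<float[]> ReadAll(Func<float[], int> readChunk, int bufferLength)
    {
        if (readChunk == null) throw new ArgumentNullException(nameof(readChunk));
        if (bufferLength <= 0) throw new ArgumentOutOfRangeException(nameof(bufferLength));

        var chunks = new List<float[]>();
        var buffer = new float[bufferLength];

        while (true)
        {
            int bytesRead = readChunk(buffer);
            if (bytesRead <= 0)
            {
                break; // stop on 0 as well as on the -1 error/end code
            }

            // Copy only the samples that were actually read.
            int samplesRead = Math.Min(bytesRead / sizeof(float), buffer.Length);
            var chunk = new float[samplesRead];
            Array.Copy(buffer, chunk, samplesRead);
            chunks.Add(chunk);
        }

        return chunks;
    }
}
```

With a check for == 0 only, a reader that keeps returning -1 never leaves the loop, which matches the unbounded growth of the chunk list described above.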

Split soundfingerprinting repository into multiple git repos

The soundfingerprinting repository has grown immensely over the last few years. It's time to split it up according to responsibilities:

  • soundfingerprinting
    • soundfingerprinting.audio
    • soundfingerprinting.audio.naudio // as the default audio engine
    • soundfingerprinting.infrastructure
    • soundfingerprinting.math
    • soundfingerprinting.minhash.permutations
    • soundfingerprinting.fft.exocortex

List of dependent libraries on soundfingerprinting core (with their own release planning etc.):

  • soundfingerprinting.audio.bass
  • soundfingerprinting.sql
  • soundfingerprinting.mongodb
  • soundfingerprinting.neuralhasher
  • soundfingerprinting.soundtools
  • soundfingerprinting.duplicatesdetector

Dependent libraries should contain their own build cmdlets, as well as unit test suites.

Remove bindings for public interfaces

Public interfaces should not contain any default bindings, so that a default implementation is never used inadvertently without the caller knowing about it.

Only internal interfaces should have them.

Implement Timsort in FingerprintDescriptor class

Two adjacent fingerprints from the same audio snippet are usually very similar in terms of their top 200 wavelet locations. To take advantage of this characteristic, we can keep track of the indexes of the previously sorted array and reuse them when FingerprintDescriptor.ExtractTopWavelets is called.

A good implementation of Timsort algorithm can be found here.

The method below has to be modified:

public bool[] ExtractTopWavelets(float[][] frames, int topWavelets)
{
    int rows = frames.GetLength(0); /*128*/
    int cols = frames[0].Length; /*32*/
    float[] concatenated = new float[rows * cols]; /* 128 * 32 */
    for (int row = 0; row < rows; row++)
    {
        Buffer.BlockCopy(frames[row], 0, concatenated, row * frames[row].Length * 4, frames[row].Length * 4);
    }

    int[] indexes = Enumerable.Range(0, concatenated.Length).ToArray();
    Array.Sort(concatenated, indexes, absComparator); // TODO Use Timsort
    bool[] result = EncodeFingerprint(concatenated, indexes, topWavelets);
    return result;
}
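
The reuse idea can be illustrated without a full Timsort. In the sketch below, a plain insertion sort stands in for Timsort (it is not Timsort): like Timsort it is adaptive, so an index array that is already nearly in order needs very few moves. AdaptiveIndexSort is a hypothetical helper, and descending order by absolute value is an assumption about what absComparator does.

```csharp
using System;

public static class AdaptiveIndexSort
{
    // Reorders 'indexes' in place so that Math.Abs(values[indexes[k]]) is
    // non-increasing in k. The values array itself is left untouched.
    // Insertion sort costs O(n + inversions), so passing in the ordering
    // produced for the previous frame is cheap when the frames are similar.
    public static void SortByAbsDescending(float[] values, int[] indexes)
    {
        if (values == null) throw new ArgumentNullException(nameof(values));
        if (indexes == null) throw new ArgumentNullException(nameof(indexes));

        for (int i = 1; i < indexes.Length; i++)
        {
            int candidate = indexes[i];
            float key = Math.Abs(values[candidate]);
            int j = i - 1;

            // Shift entries with a smaller magnitude one slot to the right.
            while (j >= 0 && Math.Abs(values[indexes[j]]) < key)
            {
                indexes[j + 1] = indexes[j];
                j--;
            }

            indexes[j + 1] = candidate;
        }
    }
}
```

A caller would keep the indexes array between consecutive frames instead of rebuilding it with Enumerable.Range each time, then sort the new frame's values starting from that previous ordering.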

Error running build.cmd

Hi.
I'm trying to run build.cmd, but it ends with the following error:

D:\Documentos\soundfingerprinting-master\build\Build.proj(33,2): error MSB3073: The command "..\tools\MSTest.exe /testcontainer:Release\UnitTests\SoundFingerprinting.Tests.dll" exited with code -532462766.

What could be the reason?
My system is Windows 8.1 x64.

Thanks, and sorry for the poor English.

Substitute DirectSound with NAudio

The DirectSound service has to be deleted from the project. Its usability is very limited (it reads only .wav files).
Substitute DirectSound with NAudio in the unit test project and delete DirectSound (amen).

Denormalize Album class and embed it to Track entity

The Album model can be safely embedded into Track, as it represents mainly read-only information (it will be updated only if an error has been introduced, which is unlikely). Since it is a typical read-only model, I would suggest denormalizing it into Track:

  • Album name
  • Album release year (is it really useful?)

Mark those fields as nullable (for cases where the album information is not known).
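
A minimal sketch of what the denormalized entity could look like. DenormalizedTrack and its property names are hypothetical and chosen for illustration only; the actual Track entity may differ.

```csharp
using System;

public sealed class DenormalizedTrack
{
    public DenormalizedTrack(string isrc, string title, string artist,
                             string albumName = null, int? albumReleaseYear = null)
    {
        Isrc = isrc ?? throw new ArgumentNullException(nameof(isrc));
        Title = title ?? throw new ArgumentNullException(nameof(title));
        Artist = artist ?? throw new ArgumentNullException(nameof(artist));
        AlbumName = albumName;
        AlbumReleaseYear = albumReleaseYear;
    }

    public string Isrc { get; }

    public string Title { get; }

    public string Artist { get; }

    // Embedded album information; null when the album is not known.
    public string AlbumName { get; }

    public int? AlbumReleaseYear { get; }
}
```

With the album embedded this way, reading a track never requires a second lookup, and an unknown album is represented by null rather than by a placeholder title.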

Audio matching from smartphone records

Hi,

I tried to use your library to build a "Shazam-like" app for myself. For testing purposes only, I recorded an audio sample with my smartphone, imported it onto my computer and queried the database with it.

Unfortunately, with the default configuration, I never get a correct match for the song. The recording is not too bad; there is no significant noise, since it was recorded at home in a quiet room from my main speaker system.

Still, the default configuration works great for finding song duplicates, so I'm just wondering whether there is any configuration parameter that can be specifically tuned to reach my goal.

Thanks

Xavier Q.

Remove option of storing permutations into the database

Permutations don't have to be stored in the database. They rarely (actually never) get changed.

Remove database permutations related code from the SoundFingerprinting project. Create a patch to delete them from the database (as well as their associated stored procedures).

Hide methods from IModelService which are database specific

IModelService contains methods which expose the underlying database structure. In order to integrate NoSQL data support seamlessly, the underlying model structure has to be hidden.
IModelService has to contain only the following methods:

public interface IModelService
{
    IModelReference InsertTrack(TrackData track);
    void InsertHashDataForTrack(IEnumerable<HashData> hashes, IModelReference trackReference);
    IList<SubFingerprintData> ReadSubFingerprintDataByHashBucketsWithThreshold(long[] buckets, int threshold);
}

Rename the ReadSubFingerprintDataByHashBucketsWithThreshold method to better describe the underlying functionality.

Include permutations for MinHash algorithm in a compiled class

Currently, permutations are read either from database or from a local file. In order to have a resilient library, include a class with predefined permutations declared in a static array. This will allow deploying the dlls without accompanying files.
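
A minimal sketch of the idea, with made-up data: the two permutations below are illustrative values over an 8-bit fingerprint, not the ones used by the library, and the class names are hypothetical. The min-hash of a binary fingerprint under a permutation is taken here as the position, in permuted order, of the first bit that is set.

```csharp
using System;

public static class CompiledPermutations
{
    // Illustrative values only: each row is a permutation of the bit indexes 0..7.
    public static readonly int[][] Values =
    {
        new[] { 3, 0, 7, 5, 1, 6, 2, 4 },
        new[] { 5, 2, 4, 1, 7, 0, 6, 3 },
    };
}

public static class MinHashSketch
{
    // For each permutation, returns the position (in permuted order) of the
    // first set bit, or the permutation length when no bit is set.
    public static int[] Compute(bool[] fingerprint, int[][] permutations)
    {
        if (fingerprint == null) throw new ArgumentNullException(nameof(fingerprint));
        if (permutations == null) throw new ArgumentNullException(nameof(permutations));

        var signature = new int[permutations.Length];
        for (int p = 0; p < permutations.Length; p++)
        {
            int[] permutation = permutations[p];
            signature[p] = permutation.Length; // sentinel: no bit set
            for (int i = 0; i < permutation.Length; i++)
            {
                if (fingerprint[permutation[i]])
                {
                    signature[p] = i;
                    break;
                }
            }
        }

        return signature;
    }
}
```

Because the permutations live in a static array, nothing has to be read from a database or a file at startup, which is the deployment benefit described above.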

Attempting to test audio fingerprinting using v2

I'm unsure how to test v2 of this library. I attempted to test using the NuGet version, but that seems to be 1.0.1.

How do I use version 2.0?

-- Just realised that the GitHub code trunk is currently at version 2...

Empty HashTables in the Database

Hi. When generating fingerprints for a song, the subfingerprint table is filled correctly. But when querying for a best match, the code looks for entries in the 25 hash tables, which are all empty, so a match is never found.

Is the code supposed to add rows to the 25 hash tables during creation of a fingerprint?
Thanks in advance

Remove all StyleCop violations

Getting rid of all StyleCop violations will improve code readability. No new violations should be added to the existing ones.

Profile MongoDb service

Profile MongoDbModelService. The time required to fingerprint data from an audio file is impractically long. This is strongly related to the group-by functionality used by the algorithm.
