Comments (5)

RetroCirce commented on May 30, 2024

Got it! I will find time to do this, possibly at the beginning of March.

from zero_shot_audio_source_separation.

RetroCirce commented on May 30, 2024

Let me make sure I understand what you want to do: do you want to use the 6 query samples to separate the mixture audio 6 times, once per query, instead of separating it only once with the average embedding of the 6 queries?

bubblegg commented on May 30, 2024

Yes, I think that is correct.

RetroCirce commented on May 30, 2024

In config.py, "inference_query" is a folder from which the average embedding of all query samples is computed. Since you want each embedding individually, there are several steps you can take:
(1) Look at main.py, lines 232-246, where we send all query samples into the "AutoTaggingWrapper" to get their average embedding. The code that outputs the average embedding is in asp_model.py, lines 692-702; you need to revise it to output each query embedding instead. Back in main.py, you will then have the embeddings of all query samples, for example as "all_queries".
(2) In main.py, lines 257-266, we have a "SeparatorModel" with a parameter called "avg_at". Wrap lines 257-266 in a for-loop, and in each iteration assign a different avg_at from your "all_queries". Each iteration then gives a different output waveform for your mixture (i.e., the separation of the mixture file according to a different query sample).
(3) One thing to note: in the SeparatorModel class, at lines 878-887 of asp_model.py, we write the waveform to a file. Since you now need to output many separations, you cannot give them all the same name; you need to create a new name for each one, using whatever scheme you prefer. You also don't need to output the original waveform every time, which saves inference time.
(4) Of course, there is a better way to do this: you can modify the SeparatorModel class to support all queries directly, but that might involve much more work than the steps above. I suggest using (1), (2), and (3) to get what you want, then reading the code in detail and revising it further to make it more efficient.
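
The control flow of steps (1)-(3) can be sketched roughly as below. This is not the repository's code: `separate_per_query`, `toy_separate`, and the `pred_<query>.wav` naming scheme are hypothetical names made up for illustration, and embeddings and waveforms are plain Python lists instead of tensors. The `separate` callable stands in for building and running SeparatorModel with avg_at set to a single query embedding.

```python
from pathlib import Path
from typing import Callable, Dict, List, Sequence

Embedding = List[float]
Waveform = List[float]


def average_embedding(embeddings: Sequence[Embedding]) -> Embedding:
    """What the thread describes as the current behaviour: one mean embedding."""
    count = len(embeddings)
    return [sum(values) / count for values in zip(*embeddings)]


def separate_per_query(
    mixture: Waveform,
    all_queries: Dict[str, Embedding],
    separate: Callable[[Waveform, Embedding], Waveform],
    out_dir: Path,
) -> Dict[Path, Waveform]:
    """Run the separator once per query embedding instead of once on the mean.

    `separate` is a hypothetical stand-in for the SeparatorModel call.
    Nothing is written to disk here; the returned dict maps each planned
    output path to its separated waveform.
    """
    results: Dict[Path, Waveform] = {}
    for query_name, embedding in all_queries.items():
        # Step (3): derive a unique output name from the query file name,
        # so that separations do not overwrite each other.
        out_path = out_dir / f"pred_{Path(query_name).stem}.wav"
        # Step (2): one separation per query embedding.
        results[out_path] = separate(mixture, embedding)
    return results


def toy_separate(mixture: Waveform, embedding: Embedding) -> Waveform:
    """Placeholder separator: scales the mixture by the first embedding value."""
    return [sample * embedding[0] for sample in mixture]


if __name__ == "__main__":
    queries = {"q1.wav": [1.0, 0.0], "q2.wav": [3.0, 2.0]}
    outputs = separate_per_query([0.5, -0.5], queries, toy_separate, Path("out"))
    for path, waveform in outputs.items():
        print(path, waveform)
```

Keying the queries by file name keeps the output names traceable to the query that produced them. In the real project, the per-query embeddings would come from the revised AutoTaggingWrapper output described in step (1).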

bubblegg commented on May 30, 2024

(1) Look at main.py, lines 232-246, where we send all query samples into the "AutoTaggingWrapper" to get their average embedding. The code that outputs the average embedding is in asp_model.py, lines 692-702; you need to revise it to output each query embedding instead. Back in main.py, you will then have the embeddings of all query samples, for example as "all_queries".

I apologize, but my knowledge of reading and understanding code is pretty much zero. Sadly, I doubt I can do it myself :(.
I can only ask whether you would be willing to modify the code for step (1) and paste it here.
