During my testing, I found out that 6 samples in one sample folder versus 6 individual

Is it possible to have the result sound kept for each individual sample? (zero_shot_audio_source_separation, CLOSED)

retrocirce commented on May 30, 2024

Is it possible to have the result sound kept for each individual sample?

from zero_shot_audio_source_separation.

Comments (5)

RetroCirce commented on May 30, 2024

Got it! I will find time to implement this, possibly at the beginning of March.


RetroCirce commented on May 30, 2024

Let me make sure I understand what you want: do you want to use the 6 query samples to separate the mixture audio 6 times, once per query, instead of separating it only once with the average embedding of the 6 queries?


bubblegg commented on May 30, 2024

Yes, I think that is correct.


RetroCirce commented on May 30, 2024

In config.py, "inference_query" is a folder from which the average embedding of all query samples is computed. Since you want each embedding separately, here are the steps you can take:
(1) Look at main.py, lines 232-246, where we send all query samples into the "AutoTaggingWrapper" to get their average embedding. The code that outputs the average embedding is in asp_model.py, lines 692-702; revise it to output each query embedding instead. Back in main.py, you then have the embeddings of all query samples, for example as "all_queries".
(2) In main.py, lines 257-266, we have a "SeparatorModel" with a parameter called "avg_at". Wrap lines 257-266 in a for-loop and assign a different avg_at from your "all_queries" on each iteration. Each pass then gives a different output waveform for your mixture (i.e. the separation of the mixture file for each query sample).
(3) Note that the SeparatorModel class, at lines 878-887 in asp_model.py, writes the waveform to a file on disk. Since you now output many separations, you cannot give them all the same name; create a new name for each one, using whatever scheme you prefer. You also don't need to write the original waveform every time, which saves inference time.
(4) Of course, there is a better way: you can modify the SeparatorModel class to support all queries directly, but that might involve much more work than the above. I suggest using steps (1), (2), and (3) to get what you want, then reading the code in detail and revising it further to make it more efficient.
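
Steps (1) to (3) can be sketched roughly as below. This is only an illustration of the loop structure, not the repository's code: `embed` and `separate` are hypothetical callables standing in for the "AutoTaggingWrapper" and "SeparatorModel" calls, embeddings and waveforms are plain lists of floats, and the output naming scheme is just one possible choice. Writing the audio to disk is left to the caller.

```python
from pathlib import Path
from typing import Callable, Dict, List, Sequence

Embedding = List[float]
Waveform = List[float]


def average_embedding(embeddings: Sequence[Embedding]) -> Embedding:
    """Element-wise mean of all query embeddings (the single-pass behaviour)."""
    count = len(embeddings)
    return [sum(values) / count for values in zip(*embeddings)]


def separate_per_query(
    mixture: Waveform,
    query_paths: Sequence[Path],
    embed: Callable[[Path], Embedding],                  # hypothetical stand-in
    separate: Callable[[Waveform, Embedding], Waveform],  # hypothetical stand-in
    output_dir: Path,
) -> Dict[Path, Waveform]:
    """Run one separation per query and key each result by a unique output path."""
    results: Dict[Path, Waveform] = {}
    for index, query_path in enumerate(query_paths):
        # Step (1): keep each query's own embedding instead of averaging.
        query_embedding = embed(query_path)
        # Step (2): one separation pass per query embedding.
        separated = separate(mixture, query_embedding)
        # Step (3): a distinct file name per query so outputs do not collide.
        out_path = output_dir / f"{index:02d}_{query_path.stem}_pred.wav"
        results[out_path] = separated
    return results


if __name__ == "__main__":
    # Toy stand-ins, only to show the call shape.
    toy_embeddings = {"kick": [1.0, 0.0], "snare": [0.0, 1.0]}
    outputs = separate_per_query(
        mixture=[0.5, 0.25],
        query_paths=[Path("kick.wav"), Path("snare.wav")],
        embed=lambda path: toy_embeddings[path.stem],
        separate=lambda mix, emb: [sample * emb[0] for sample in mix],
        output_dir=Path("out"),
    )
    for path in outputs:
        print(path)
```

The index prefix in the file name keeps outputs distinct even if two query files share the same stem.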


bubblegg commented on May 30, 2024

(1) Look at main.py, lines 232-246, where we send all query samples into the "AutoTaggingWrapper" to get their average embedding. The code that outputs the average embedding is in asp_model.py, lines 692-702; revise it to output each query embedding instead. Back in main.py, you then have the embeddings of all query samples, for example as "all_queries".

I apologize, but my ability to read and understand code is pretty much zero, so sadly I doubt I can do this myself :(
All I can do is ask: if you are willing, could you modify the code for step (1) and paste it here?

