Comments (4)

henrymaas commented on September 28, 2024

Certainly!
The expression 1e-2 represents 0.01, while 1e-3 is equal to 0.001. This is scientific (exponential) notation: the number after the e is the power of ten. If you prefer, you can set the value in plain decimal notation instead, for instance silence_threshold = 0.01 or silence_threshold = 0.0963 (whichever value better suits your audio).
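
To make the notation concrete, here is a tiny Python check. The variable names are only for illustration; the point is that both spellings produce the same float, so you change the whole literal rather than "the 1e" and "the 2" separately.

```python
# Scientific notation is just another way to write a float literal:
# XeN means X * 10**N.
same_as_decimal = (1e-2 == 0.01)   # 1 * 10**-2
smaller_value = (1e-3 == 0.001)    # 1 * 10**-3
other_mantissa = (5e-3 == 0.005)   # 5 * 10**-3

# Either spelling can be used for the threshold.
silence_threshold = 1e-2
silence_threshold_decimal = 0.01

print(same_as_decimal, smaller_value, other_mantissa)  # True True True
print(silence_threshold == silence_threshold_decimal)  # True
```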

Note that you don't need an exact number, just an approximation that matches your audio track. For example, if you record just your voice in your room with any microphone and say a few phrases, the default parameters should slice the recording wherever they find periods of silence.

In simpler terms, the smaller the amplitude on the Y axis of a 2D plot of the audio waveform, the lower the energy present. The combination of the silence window and the silence threshold determines what counts as "silence" and how long it lasts.
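
As a rough illustration of how a window and a threshold interact, here is a minimal sketch. It is not AudioSlicer's actual code: the function name find_silent_runs, the silence_window_s parameter, and the use of RMS as the energy measure are all assumptions made for the example.

```python
import numpy as np

def find_silent_runs(samples, sample_rate, silence_threshold=1e-2,
                     silence_window_s=0.1):
    """Return (start_s, end_s) for each run of consecutive silent windows.

    Hypothetical helper: a window is "silent" when its RMS energy is
    below silence_threshold.
    """
    win = int(round(sample_rate * silence_window_s))  # samples per window
    n_windows = len(samples) // win
    frames = samples[:n_windows * win].reshape(n_windows, win)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))       # energy per window
    silent = rms < silence_threshold

    runs, start = [], None
    for i, is_silent in enumerate(silent):
        if is_silent and start is None:
            start = i                                  # a silent run begins
        elif not is_silent and start is not None:
            runs.append((start * win / sample_rate, i * win / sample_rate))
            start = None
    if start is not None:                              # run reaches the end
        runs.append((start * win / sample_rate, n_windows * win / sample_rate))
    return runs

# Synthetic test signal: 1 s tone, 0.5 s of digital silence, 1 s tone.
sr = 16000
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 220 * t)
audio = np.concatenate([tone, np.zeros(sr // 2), tone])

runs = find_silent_runs(audio, sr)
print(runs)  # [(1.0, 1.5)]
```

Raising silence_threshold makes more windows count as silent; a longer window smooths over short dips in energy, so very brief pauses are less likely to be detected.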

I plan to elucidate this topic through a Jupyter Notebook with visual aids, likely by this weekend, and I'll share the details with you then.

henrymaas commented on September 28, 2024

Could someone explain how to set the right threshold for the silence?

It depends on your audio track. For instance, if you're recording your voice in a quiet room with minimal environmental noise, you can find a silence_threshold that works well for that recording. However, the same setting won't suit an audio interview recorded on a noisy street. It's essential to analyze your audio content to work out what constitutes "silence."

The silence_threshold is the energy level below which an audio window is categorized as silent. You can visually analyze the audio spectrogram and then experiment iteratively, trying different parameters until they match your audio.
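
One way to start that experimentation is to look at how per-window energy is distributed in your own recording. The sketch below is only an illustration under stated assumptions: window_rms is a hypothetical helper, RMS is assumed as the energy measure, and the audio is synthetic noise standing in for a real file you would load yourself.

```python
import numpy as np

def window_rms(samples, window_size):
    """RMS energy of each non-overlapping window (hypothetical helper)."""
    n = len(samples) // window_size
    frames = samples[:n * window_size].reshape(n, window_size)
    return np.sqrt(np.mean(frames ** 2, axis=1))

# Synthetic stand-in for a recording: loud "speech", quiet "room noise", loud again.
rng = np.random.default_rng(0)
loud = 0.2 * rng.standard_normal(16000)
quiet = 0.002 * rng.standard_normal(16000)
audio = np.concatenate([loud, quiet, loud])

rms = window_rms(audio, window_size=1600)
noise_floor = np.percentile(rms, 10)    # typical energy of the quietest windows
speech_level = np.percentile(rms, 90)   # typical energy of the loudest windows
print(f"noise floor ~ {noise_floor:.4f}, speech level ~ {speech_level:.4f}")

# A candidate threshold sits above the noise floor and well below speech.
candidate_threshold = 1e-2
print(noise_floor < candidate_threshold < speech_level)
```

If the candidate does not fall between the two levels for your recording, move it until it does, then listen to the resulting slices.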

A brief explanation of slicing can be found in this thread without delving too deeply into the theoretical aspects: #7

I intend to elucidate this subject with illustrations and include it in the readme. I've observed numerous people attempting to train models using this code, so a brief explanation might help.

alvar036 commented on September 28, 2024

Thanks for explaining, but I still don't really understand it.
And since the input is 1e-2, are we supposed to change the 1e AND the 2? Or how does that work...

Wish we could make a visual interface for it where you can preview the audio waveform and set a threshold, just like a gate setting works inside a DAW lol.

alvar036 commented on September 28, 2024

That's awesome! Thank you so much for the further explanation, and I will wait for your notebook :)
