farzadforoozanfar / speech-recognition

I recorded 10 utterances of the same words in my own voice and compared them with 10 recordings from another person. From the resulting distances I was able to find a threshold level that recognizes my own voice.

License: MIT License

Jupyter Notebook 100.00%
distance dtw dtw-algorithm jupyter-notebook python3 speech-processing speech-recognition speech-to-text

speech-recognition's Introduction

Project Description:

In this project, I first converted my recorded sounds to text using the librosa, dtw, and speech_recognition libraries. This can itself serve as a way of recognizing speech, the so-called text-dependent approach. A more reliable approach uses Mel or cepstral coefficients.

DTW (Dynamic Time Warping):

In time series analysis, dynamic time warping (DTW) is an algorithm for measuring similarity between two temporal sequences, which may vary in speed. For instance, similarities in walking could be detected using DTW, even if one person was walking faster than the other, or if there were accelerations and decelerations during the course of an observation. DTW has been applied to temporal sequences of video, audio, and graphics data; indeed, any data that can be turned into a linear sequence can be analyzed with DTW. A well-known application has been automatic speech recognition, to cope with different speaking speeds. Other applications include speaker recognition and online signature recognition. It can also be used in partial shape matching applications.

In general, DTW is a method that calculates an optimal match between two given sequences (e.g. time series) under the following restrictions and rules:

  • Every index from the first sequence must be matched with one or more indices from the other sequence, and vice versa

  • The first index from the first sequence must be matched with the first index from the other sequence (but it does not have to be its only match)

  • The last index from the first sequence must be matched with the last index from the other sequence (but it does not have to be its only match)

  • The mapping of the indices from the first sequence to indices from the other sequence must be monotonically increasing, and vice versa, i.e. if j>i are indices from the first sequence, then there must not be two indices l>k in the other sequence, such that index i is matched with index l and index j is matched with index k, and vice versa
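The rules above can be sketched as a small dynamic program. This is a minimal pure-Python illustration, not the project's code: the function name `dtw_align` and the use of the absolute difference as the local cost are assumptions made for the example. The boundary rules are enforced by starting the table at the first pair and reading the result at the last pair, and the monotonicity rule by only allowing steps that move forward in one or both sequences.

```python
import math


def dtw_align(a, b):
    """Return (total_cost, path) for two non-empty 1-D numeric sequences.

    cost[i][j] is the cheapest alignment of a[:i] with b[:j], using the
    absolute difference |a[i-1] - b[j-1]| as the local cost (an assumption).
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0  # forces the first indices to be matched together

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Allowed predecessors keep the mapping monotonically increasing.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the last pair to the first pair to recover the matches.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    path.reverse()
    return cost[n][m], path


# The same shape at two speeds: every value in `fast` appears twice in `slow`.
fast = [1, 2, 3]
slow = [1, 1, 2, 2, 3, 3]
total, path = dtw_align(fast, slow)
print(total)  # 0.0, because the warping absorbs the difference in speed
print(path)   # list of (index in fast, index in slow) pairs
```

Every index of both sequences appears in `path` at least once, the path starts at the first pair and ends at the last pair, and neither index ever decreases along it.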

[Figure: my voice signals, plotted with matplotlib.pyplot]

[Figure: another person's voice signals, plotted with matplotlib.pyplot]

[Figure: comparison of two audio signals (mine and another person's) using dtw.plot()]

[Table: distances between my own recordings, calculated with dtw.distance()]
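A table of this kind can be built by computing the distance for every pair of recordings. The sketch below is an illustration under assumptions: `dtw_cost` is a hypothetical pure-Python stand-in for the `dtw.distance()` call used in the project (it keeps only two rows of the cost table in memory), `pairwise_distances` is a hypothetical helper name, and the short lists are toy sequences rather than real audio.

```python
import math


def dtw_cost(a, b):
    """DTW cost only, with absolute difference as the local cost.

    Stand-in for a library distance call; only two rows of the
    cost table are kept at any time.
    """
    prev = [0.0] + [math.inf] * len(b)
    for x in a:
        curr = [math.inf]
        for j, y in enumerate(b, start=1):
            curr.append(abs(x - y) + min(prev[j], curr[j - 1], prev[j - 1]))
        prev = curr
    return prev[len(b)]


def pairwise_distances(recordings, dist=dtw_cost):
    """Return a symmetric table with dist(recordings[i], recordings[j])."""
    n = len(recordings)
    table = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # The distance is symmetric, so each pair is computed once.
            table[i][j] = table[j][i] = dist(recordings[i], recordings[j])
    return table


# Toy sequences (not real recordings): the second is the first with a
# longer pause at the start, the third has a larger amplitude.
recordings = [
    [0, 1, 2, 1, 0],
    [0, 0, 1, 2, 1, 0],
    [0, 2, 4, 2, 0],
]
table = pairwise_distances(recordings)
for row in table:
    print(row)
```

Passing a different callable as `dist` lets the same helper fill the table with a library distance instead of the stand-in.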

[Table: distances between my recordings and another person's recordings, calculated with dtw.distance()]
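Given the two sets of distances, one way to arrive at a threshold level is to place it between the largest same-speaker distance and the smallest other-speaker distance. The sketch below assumes that rule; the source does not state how the threshold was actually chosen. The function names are hypothetical, and the numbers are placeholders for illustration only, not the project's measurements.

```python
def pick_threshold(same_speaker, other_speaker):
    """Return the midpoint between the two groups, or None if they overlap.

    same_speaker:  distances between recordings of the same voice
    other_speaker: distances between that voice and another person's
    """
    highest_same = max(same_speaker)
    lowest_other = min(other_speaker)
    if highest_same >= lowest_other:
        return None  # no single threshold separates the two groups
    return (highest_same + lowest_other) / 2


def is_my_voice(distance, threshold):
    """Accept a recording when its distance falls below the threshold."""
    return distance < threshold


# Placeholder values, NOT taken from the project's tables.
same = [1.0, 2.0, 3.0]
other = [7.0, 8.0, 9.0]

threshold = pick_threshold(same, other)
print(threshold)                    # 5.0
print(is_my_voice(2.5, threshold))  # True
print(is_my_voice(8.0, threshold))  # False
```

When the two groups overlap, `pick_threshold` returns `None`, which signals that raw distances alone do not separate the speakers and different features would be needed.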

speech-recognition's People

Contributors

farzadforuozanfar
