
spoken_number_recognition's Introduction

Spoken Number Recognition

Read Project_Report.pdf for more details.

Objective

The goal is to recognize a single spoken digit from zero to nine: when a speaker says a digit (0 - 9), the program outputs the corresponding number.

Dataset

Thanks to Pannous for the dataset: http://pannous.net/files/spoken_numbers_pcm.tar

The dataset includes 2850 .wav files of 15 different speakers (male and female) saying the numbers 0 - 9. In addition, 400 .wav files recorded by me and my roommate were added to the dataset.

Python Libraries Required

tensorflow-gpu, keras, librosa, numpy, matplotlib, pyaudio, h5py

Main Idea

Draw the spectrogram of each .wav file and save it as an image. In this way, the speech recognition problem is transformed into an image recognition problem.
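The spectrogram step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it uses only NumPy and a synthetic 1 kHz tone so that it runs without any audio files, whereas the project lists librosa and matplotlib for this job. The function names, `n_fft`, `hop_length`, and the 80 dB dynamic range are all assumptions chosen for the example.

```python
import numpy as np


def log_spectrogram(signal, n_fft=512, hop_length=128):
    """Return a log-magnitude spectrogram of shape (n_fft // 2 + 1, n_frames).

    n_fft and hop_length are illustrative defaults, not the project's settings.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if len(signal) < n_fft:
        # Zero-pad very short clips so that at least one frame exists.
        signal = np.pad(signal, (0, n_fft - len(signal)))
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop_length
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop_length:i * hop_length + n_fft] * window
        for i in range(n_frames)
    ])
    # One FFT per frame; transpose so rows are frequency bins.
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T
    return 20.0 * np.log10(magnitude + 1e-10)


def to_uint8_image(spec_db, dynamic_range_db=80.0):
    """Clip to a fixed range below the peak and scale to 0..255 grayscale."""
    top = spec_db.max()
    clipped = np.clip(spec_db, top - dynamic_range_db, top)
    scaled = (clipped - (top - dynamic_range_db)) / dynamic_range_db
    # Flip vertically so low frequencies end up at the bottom of the image.
    return np.flipud((scaled * 255).astype(np.uint8))


# Synthetic stand-in for one recording: one second of a 1 kHz tone at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = log_spectrogram(tone)
image = to_uint8_image(spec)
```

For a real recording, the samples would be loaded from the .wav file instead of being synthesized, and the resulting array could be written to disk as an image (for example with matplotlib) before being fed to the classifier.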

Use a CNN to build a classifier for the dataset. The CNN model includes 5 Convolution layers and 2 Dense (fully connected) layers, with Max-Pooling and BatchNormalization layers in between.
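A model with this overall shape might look like the Keras sketch below. Only the layer counts (5 convolution layers, 2 Dense layers, with Max-Pooling and BatchNormalization) come from the description above; the input size, filter counts, kernel size, Dense width, optimizer, and the name `build_digit_cnn` are assumptions made for illustration, so see Project_Report.pdf for the actual architecture.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_digit_cnn(input_shape=(64, 64, 1), num_classes=10):
    """Build a small CNN: 5 conv blocks followed by 2 Dense layers.

    input_shape and the filter counts are hypothetical values for this sketch.
    """
    model = keras.Sequential()
    model.add(keras.Input(shape=input_shape))
    for filters in (16, 32, 64, 128, 128):
        # Each block: convolution, batch normalization, 2x2 max-pooling.
        model.add(layers.Conv2D(filters, 3, padding="same", activation="relu"))
        model.add(layers.BatchNormalization())
        model.add(layers.MaxPooling2D(2))
    model.add(layers.Flatten())
    model.add(layers.Dense(128, activation="relu"))
    # One output per digit 0-9.
    model.add(layers.Dense(num_classes, activation="softmax"))
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


model = build_digit_cnn()
```

With integer labels 0 to 9 and spectrogram images resized to the assumed input shape, training would go through `model.fit`, and the predicted digit for a new image is the index of the largest softmax output.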

Evaluation

At the training stage, the training accuracy (acc) reaches 100% after 20 epochs, and the validation accuracy (val_acc) reaches 98%. Very nice!

At the real-time test stage, the error rate stays very low. Nice!

References

[1] https://github.com/libphy/which_animal

[2] https://github.com/pannous/tensorflow-speech-recognition

[3] https://yerevann.github.io/2016/06/26/combining-cnn-and-rnn-for-spoken-language-identification/

spoken_number_recognition's People

Contributors

richardliuliu


spoken_number_recognition's Issues

issue in validation

Hello,
After training the CNN, I want to test the system with the reacorder.py code, but I can't get the predicted class for the input speech. It just prints "start" and "done" over and over.
Does anyone know how to solve this problem?

dataset

Hello! I have recently had a problem with the dataset: I cannot download it from http://pannous.net. Could you help me and send the dataset to me? Thanks a million!
