Giter VIP home page Giter VIP logo

timit_tools's Introduction

Preparing the dataset

With the TIMIT dataset (.wav sound files, .wrd words annotations and .phn phones annotations):

  1. Encode the wave sound in MFCCs: run python mfcc_and_gammatones.py --htk-mfcc $DATASET/train and python mfcc_and_gammatones.py --htk-mfcc $DATASET/test producing the .mfc files with HCopy according to wav_config (.mfc_unnorm is no normalization)

  2. Adapt the annotations given in .phn in frames into nanoseconds in .lab run python timit_to_htk_labels.py $DATASET/train and
    python timit_to_htk_labels.py $DATASET/test producing the .lab files

  3. Replace phones according to the seminal HMM paper of 1989: "Speaker-independant phone recognition using hidden Markov models", phones number (i.e. number of lines in the future labels dictionary) should go from 61 to 39. run python substitute_phones.py $DATASET/train and python substitute_phones.py $DATASET/test

  4. run python create_phonesMLF_and_labels.py $DATASET/train and python create_phonesMLF_and_labels.py $DATASET/test

You can also do that with a make prepare dataset=DATASET_PATH.

You're ready for training with HTK (mfc and lab files)!

Training the HMM models

Train monophones HMM:

make train_monophones dataset_train_folder=PATH_TO_YOUR_DATASET/train
make test_monophones dataset_test_folder=PATH_TO_YOUR_DATASET/test

Or, train triphones:

TODO
make train_triphones dataset_train_folder=PATH_TO_YOUR_DATASET/train
make test_triphones dataset_test_folder=PATH_TO_YOUR_DATASET/test

Replacing the GMM by DBNs

  1. Do full states forced alignment of the .mlf files with make align.

  2. Do a first preparation of the dataset with src/timit_to_numpy.py or src/mocha_timit_to_numpy.py (depending on the dataset) on the above aligned .mlf files.

  3. Train the deep belief networks on it, either using DBN/DBN_timit.py or DBN/DBN_Gaussian_timit.py or DBN/DBN_Gaussian_mocha_timit.py (see inside these files for parameters). Save (pickle at the moment) the DBN objects and the states/indices mappings.

  4. Use the serialized DBN objects and states/indices mappings with viterbi.py, just cd to DBN and do:

    python ../src/viterbi.py output_dbn.mlf /fhgfs/bootphon/scratch/gsynnaeve/TIMIT/test/test.scp ../tmp_train/hmm_final/hmmdefs --d ../dbn_5.pickle ../to_int_and_to_state_dicts_tuple.pickle

timit_tools's People

Contributors

syhw avatar maverick0122 avatar

Watchers

James Cloos avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.