deepvot's Introduction

Deprecated! Please use the following repo with a Python implementation: https://github.com/MLSpeech/Dr.VOT

DeepVOT - Automatic Measurement of Voice Onset Time and Prevoicing using Recurrent Neural Networks

Voice onset time (VOT) is defined as the time difference between the onset of the burst and the onset of voicing. When voicing begins before the burst, the stop is called prevoiced and the VOT is negative. When voicing begins after the burst, the VOT is positive. While most work on automatic measurement of VOT has focused on positive VOT, which is mostly evident in American English, in many languages the VOT can be negative. We propose an algorithm that estimates whether the stop is prevoiced and measures negative or positive VOT accordingly. More specifically, the input to the algorithm is a speech segment of arbitrary length containing a single stop consonant, and the output is the time of the burst onset, the duration of the burst, and the time of the prevoicing onset with a confidence. Manually labeled data is used to train a recurrent neural network that can model the dynamic temporal behavior of the input signal and outputs the events' onsets and durations. Results suggest that the proposed algorithm is superior to the current state of the art both in terms of VOT measurement and in terms of prevoicing detection.
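The sign convention in the definition above can be illustrated with a minimal sketch. The function names and the onset times are hypothetical and chosen only for illustration; they are not part of the repository.

```python
def vot_seconds(burst_onset, voicing_onset):
    """Return VOT in seconds: onset of voicing minus onset of the burst."""
    return voicing_onset - burst_onset


def is_prevoiced(burst_onset, voicing_onset):
    """A stop is prevoiced when voicing starts before the burst (negative VOT)."""
    return vot_seconds(burst_onset, voicing_onset) < 0


# Hypothetical onset times in seconds, for illustration only.
print(vot_seconds(0.100, 0.165))   # voicing after the burst: positive VOT
print(is_prevoiced(0.100, 0.040))  # voicing before the burst: prevoiced
```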

If you find our work useful, please cite: [Automatic Measurement of Voice Onset Time and Prevoicing using Recurrent Neural Networks](http://u.cs.biu.ac.il/~jkeshet/papers/AdiKeDmGo16.pdf)

@article{adi2016automatic,
  title={Automatic Measurement of Voice Onset Time and Prevoicing using Recurrent Neural Networks},
  author={Adi, Yossi and Keshet, Joseph and Dmitrieva, Olga and Goldrick, Matt},
  journal={Interspeech 2016},
  pages={3152--3155},
  year={2016}
}

Content

The repository contains code for VOT and prevoicing measurement, feature extraction and visualization tools.

  • back_end folder: contains the training algorithms. Use it to train the model on new datasets or with different features.
  • front_end folder: contains the feature extraction algorithm. Use it to configure the feature extraction parameters or just for visualization.
  • post_process folder: contains the post-processing algorithms that extract the measurements from the network's probability distribution.
  • visualization folder: contains feature visualization tools.
  • run_all folder: contains the scripts and models needed to run the code end-to-end.
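To make the post-processing step more concrete, here is a rough sketch of one simple way to turn per-frame class probabilities into an onset and a duration. It is not the algorithm in the post_process folder: the class indices, the 1 ms frame step, and the "longest run of the most likely class" rule are all assumptions made for illustration.

```python
def longest_run(labels, target):
    """Return (start_index, length) of the longest run of `target`, or None."""
    best = None
    start = None
    # A trailing sentinel closes a run that reaches the last frame.
    for i, lab in enumerate(list(labels) + [None]):
        if lab == target and start is None:
            start = i
        elif lab != target and start is not None:
            length = i - start
            if best is None or length > best[1]:
                best = (start, length)
            start = None
    return best


def event_from_probs(probs, target, frame_step=0.001):
    """Return (onset, duration) in seconds for class `target`, or None.

    probs is a list of per-frame probability rows (frames x classes).
    frame_step is an assumed frame spacing in seconds.
    """
    # Pick the most likely class for every frame.
    labels = [max(range(len(row)), key=row.__getitem__) for row in probs]
    run = longest_run(labels, target)
    if run is None:
        return None
    start, length = run
    return start * frame_step, length * frame_step


# Hypothetical two-class probabilities for five frames.
example = [[0.9, 0.1], [0.2, 0.8], [0.3, 0.7], [0.6, 0.4], [0.1, 0.9]]
print(event_from_probs(example, target=1))
```

A real implementation would likely smooth the probabilities or decode the whole sequence jointly rather than trust per-frame decisions.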

Installation

The code runs on MacOSX only.

Dependencies

The code uses the following dependencies:

  • Torch7 with RNN package
git clone https://github.com/torch/distro.git ~/torch --recursive
cd ~/torch; bash install-deps;
./install.sh 

# On Linux with bash
source ~/.bashrc
# On Linux with zsh
source ~/.zshrc
# On OSX or in Linux with none of the above.
source ~/.profile

# For rnn package installation
luarocks install rnn

Model Installation

Download the model from: [DeepVot Model](https://drive.google.com/file/d/0B69m3kcUfbmPOUE0VkpzOVF2TzQ/view?usp=sharing). Then move the model file to run_all/lua_scripts/model/ inside the project directory.
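Before running the tool, it can help to confirm that the installation steps above took effect. The sketch below is a hypothetical helper, not part of the repository. It only checks that the Torch7 `th` launcher is on the PATH and that the model directory contains at least one file; it does not verify that the rnn package is installed, and it makes no assumption about the model's file name.

```python
import shutil
from pathlib import Path


def missing_prerequisites(project_dir, which=shutil.which):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    # `th` is the Torch7 command-line launcher installed by the Torch distro.
    if which("th") is None:
        problems.append("Torch7 'th' executable not found on PATH")
    # The README asks for the model to be placed in this directory.
    model_dir = Path(project_dir) / "run_all" / "lua_scripts" / "model"
    if not model_dir.is_dir() or not any(model_dir.iterdir()):
        problems.append(f"no model file found in {model_dir}")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites("."):
        print("WARNING:", problem)
```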

Usage

To measure VOT, run the following from the run_all folder:

python predict.py "input wav file" "output text grid file" "start time to search" "end time to search"

Example

You can try the tool on the example files in the test_data folder and compare the output to the manual annotation. Change into the run_all directory and type:

python predict.py test_data/orig/bun.wav test_data/prediction/bun.TextGrid 0.0 0.2

or

python predict.py test_data/orig/bag.wav test_data/prediction/bag.TextGrid 0.56 0.65
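For running several files, the command line above can be wrapped in a small script. This is a minimal sketch: the helper names are hypothetical, and treating the two time arguments as seconds with 0 <= start < end is an assumption based on the examples, not something the tool documents.

```python
import subprocess


def build_predict_command(wav_path, textgrid_path, start, end):
    """Build the documented predict.py command line as an argument list."""
    # Assumed constraint: the search window is given in seconds, start before end.
    if not 0 <= start < end:
        raise ValueError("expected 0 <= start < end")
    return ["python", "predict.py", str(wav_path), str(textgrid_path),
            str(start), str(end)]


def run_predict(wav_path, textgrid_path, start, end, run_all_dir="."):
    """Run predict.py from the run_all directory; raises if it exits non-zero."""
    cmd = build_predict_command(wav_path, textgrid_path, start, end)
    return subprocess.run(cmd, cwd=run_all_dir, check=True)


if __name__ == "__main__":
    # The two examples from this README.
    jobs = [
        ("test_data/orig/bun.wav", "test_data/prediction/bun.TextGrid", 0.0, 0.2),
        ("test_data/orig/bag.wav", "test_data/prediction/bag.TextGrid", 0.56, 0.65),
    ]
    for job in jobs:
        print(" ".join(build_predict_command(*job)))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.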

deepvot's People

Contributors

adiyoss


deepvot's Issues

Too many paths to consider

Hi,

While I love the effort, there are many dependencies and library paths that need to be set up to get this to work.

On OS X Catalina (after installing according to the instructions), I get this:

$ python predict.py test_data/orig/bun.wav test_data/prediction/bun.TextGrid 0.0 0.2

1) Extracting features and classifying ...
12:17:37.946 [VotFrontEnd2] INFO: Processing 1 instances.
12:17:37.964 [VotFrontEnd2] INFO: Features extraction completed.

==> processing options	
==> load test file	

==> loading model	
/Users/frkkan96/torch/install/bin/luajit: /Users/frkkan96/torch/install/share/lua/5.1/torch/File.lua:343: unknown Torch class <nn.LinearNoBias>
stack traceback:
	[C]: in function 'error'
	/Users/frkkan96/torch/install/share/lua/5.1/torch/File.lua:343: in function 'readObject'
	/Users/frkkan96/torch/install/share/lua/5.1/torch/File.lua:369: in function 'readObject'
	/Users/frkkan96/torch/install/share/lua/5.1/torch/File.lua:369: in function 'readObject'
	/Users/frkkan96/torch/install/share/lua/5.1/nn/Module.lua:192: in function 'read'
	/Users/frkkan96/torch/install/share/lua/5.1/torch/File.lua:351: in function 'readObject'
	/Users/frkkan96/torch/install/share/lua/5.1/torch/File.lua:369: in function 'readObject'
	/Users/frkkan96/torch/install/share/lua/5.1/torch/File.lua:369: in function 'readObject'
	/Users/frkkan96/torch/install/share/lua/5.1/nn/Module.lua:192: in function 'read'
	/Users/frkkan96/torch/install/share/lua/5.1/torch/File.lua:351: in function 'readObject'
	/Users/frkkan96/torch/install/share/lua/5.1/torch/File.lua:369: in function 'readObject'
	...
	/Users/frkkan96/torch/install/share/lua/5.1/torch/File.lua:351: in function 'readObject'
	/Users/frkkan96/torch/install/share/lua/5.1/torch/File.lua:369: in function 'readObject'
	/Users/frkkan96/torch/install/share/lua/5.1/torch/File.lua:369: in function 'readObject'
	/Users/frkkan96/torch/install/share/lua/5.1/nn/Module.lua:192: in function 'read'
	/Users/frkkan96/torch/install/share/lua/5.1/torch/File.lua:351: in function 'readObject'
	/Users/frkkan96/torch/install/share/lua/5.1/torch/File.lua:409: in function 'load'
	classify_multi_class.lua:64: in main chunk
	[C]: in function 'dofile'
	...an96/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
	[C]: at 0x010c4e9b00

3) Extract Durations ...
Traceback (most recent call last):
  File "predict.py", line 57, in <module>
    predict(args.input_path, args.output_path, args.start_extract, args.end_extract)
  File "predict.py", line 38, in predict
    post_process(prob_file, predict_file)
  File "/Users/frkkan96/Documents/src/DeepVOT/run_all/post_process.py", line 12, in post_process
    with open(prob_file) as f:
IOError: [Errno 2] No such file or directory: '/Users/frkkan96/Documents/src/DeepVOT/run_all/tmp_files/tmp.prob'

It seems that torch cannot find the rnn library I installed. So, how do I make it find the library?

"Bad CPU type in executable" and other issues

Hi,

I am kind of a rookie.

I am running macOS 10.15 (Catalina).

I tried running the test example using

python predict.py test_data/orig/bun.wav test_data/prediction/bun.TextGrid 0.0 0.2

I get the following errors:

1) Extracting features and classifying ...
/bin/sh: sbin/sox: Bad CPU type in executable
00:24:20.532 [VotFrontEnd2] INFO: Processing 1 instances.
Errror: Could not open file tmp/tmp.wav for reading.

==> processing options
==> load test file

==> ERROR: cannot read file.
/Users/XXXX/torch/install/bin/lua: utils.lua:47: attempt to index global 'dims' (a nil value)
stack traceback:
utils.lua:47: in function 'load_data'
classify_multi_class.lua:53: in main chunk
[C]: in function 'dofile'
...rs02/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: in ?

3) Extract Durations ...
Traceback (most recent call last):
  File "predict.py", line 57, in <module>
    predict(args.input_path, args.output_path, args.start_extract, args.end_extract)
  File "predict.py", line 38, in predict
    post_process(prob_file, predict_file)
  File "/Users/XXXX/DeepVOT/run_all/post_process.py", line 12, in post_process
    with open(prob_file) as f:
IOError: [Errno 2] No such file or directory: '/Users/XXXX/DeepVOT/run_all/tmp_files/tmp.prob'

Any pointer will be highly appreciated!
