speaker_clustering's Introduction

ZHAW deep voice

ZHAW deep voice is a package of multiple neural networks that attempt to solve the speaker clustering task. The goal is to provide a uniform way of accessing and preprocessing the data and of analysing the results.

Note that the suite currently needs the TIMIT dataset to function. TIMIT is a paid product available from the LDC. The data also needs to be processed with the sph2pipe tool and placed in the folder common/data/training/TIMIT.
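The conversion step can be scripted. The following is only a minimal sketch, assuming that sph2pipe is on the PATH, that the TIMIT audio files end in .WAV, and that the converted files may sit next to the originals. The `_RIFF.WAV` suffix and both function names are hypothetical, chosen for illustration; check what the suite actually expects before relying on them.

```python
import subprocess
from pathlib import Path

# Folder named in the text above.
TIMIT_DIR = Path("common/data/training/TIMIT")

# Hypothetical output suffix; the suite may expect a different naming scheme.
RIFF_SUFFIX = "_RIFF.WAV"


def sph2pipe_command(src, suffix=RIFF_SUFFIX):
    """Build the argv list that converts one NIST SPHERE file to RIFF WAV."""
    src = Path(src)
    dst = src.with_name(src.stem + suffix)
    # "-f wav" asks sph2pipe to write a RIFF (MS WAV) header.
    return ["sph2pipe", "-f", "wav", str(src), str(dst)]


def convert_all(root=TIMIT_DIR, dry_run=True):
    """Collect one command per .WAV file below root; run them unless dry_run."""
    commands = [
        sph2pipe_command(path)
        for path in sorted(Path(root).rglob("*.WAV"))
        if not path.name.endswith(RIFF_SUFFIX)  # skip files already converted
    ]
    if not dry_run:
        for command in commands:
            subprocess.run(command, check=True)  # stop on the first failure
    return commands
```

Calling `convert_all()` with the default `dry_run=True` only returns the commands, which makes it easy to inspect them before any file is written.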

Using deep voice

If you simply want to use it, you can let Docker do the work for you and have it install all required packages.

Either way, whether you fork and pull the source code or let Docker handle it for you, the whole suite is controlled through a single-file interface, controller.py. It can be run from the console with the following calling structure:

controller.py [-h] [-setup] [-n network] [-train] [-test] [-plot] [-clear] [-debug] [-best] [-val# ne]

  • -help Display the help screen you are seeing here
  • -setup Create all
  • -n Specify which network should be used. Available: 'pairwise_lstm', 'pairwise_kldiv', 'flow_me', 'luvo' and 'all' (without the single quotes)
  • -train Specify to train the chosen network
  • -test Specify to test the chosen network
  • -plot Specify to plot the results of the chosen network. If network is 'all', all results will be displayed in one single plot
  • -clear Clear the folder in experiments
  • -debug Set the logging level of TensorFlow to Debug
  • -best Use only the best results of the networks in -plot
  • -val# Specify which number of speakers (40, 60 or 80) to use when testing the networks

As an example, to train and test the network pairwise_lstm without plotting the results, you would call:

controller.py -n pairwise_lstm -train -test
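To make the flag semantics concrete, here is a hedged sketch of how an interface like this could be declared with Python's argparse. It is not the actual controller.py: `build_parser`, the `dest` names, the help texts and the `None` defaults are assumptions made for illustration, and it is also an assumption that -val# takes one integer value. The built-in argparse help (`-h`) stands in for the help flag.

```python
import argparse

# Network names listed in the flag description above.
NETWORKS = ("pairwise_lstm", "pairwise_kldiv", "flow_me", "luvo", "all")


def build_parser():
    """Hypothetical re-creation of the documented flags, not the real code."""
    parser = argparse.ArgumentParser(prog="controller.py")
    parser.add_argument("-setup", action="store_true",
                        help="run the one-time setup")
    parser.add_argument("-n", dest="network", choices=NETWORKS, default=None,
                        help="network to use")
    parser.add_argument("-train", action="store_true",
                        help="train the chosen network")
    parser.add_argument("-test", action="store_true",
                        help="test the chosen network")
    parser.add_argument("-plot", action="store_true",
                        help="plot the results of the chosen network")
    parser.add_argument("-clear", action="store_true",
                        help="clear the folder in experiments")
    parser.add_argument("-debug", action="store_true",
                        help="set the TensorFlow logging level to debug")
    parser.add_argument("-best", action="store_true",
                        help="use only the best results in -plot")
    # Assumption: -val# takes a single integer argument.
    parser.add_argument("-val#", dest="val_number", type=int,
                        choices=(40, 60, 80), default=None,
                        help="number of speakers used for testing")
    return parser


# Same invocation as the example above: train and test, but do not plot.
args = build_parser().parse_args(["-n", "pairwise_lstm", "-train", "-test"])
print(args.network, args.train, args.test, args.plot)
```

Boolean flags are declared with `store_true`, so any flag that is omitted simply stays `False`, which matches the "train and test but not plot" reading of the example.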

General remarks

Before you start training, run the controller once with the -setup flag. This can take a while, roughly 10 minutes.
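The order of operations can be captured in a small driver script. This is a sketch only: `planned_runs` and `run_all` are hypothetical helper names, and it assumes that controller.py sits in the current working directory and that -setup can be passed on its own, without -n.

```python
import subprocess
import sys


def planned_runs(network="pairwise_lstm"):
    """Return the two invocations in order: one-time setup, then train and test."""
    setup = [sys.executable, "controller.py", "-setup"]
    train_and_test = [sys.executable, "controller.py",
                      "-n", network, "-train", "-test"]
    return [setup, train_and_test]


def run_all(runs):
    """Run each invocation in turn and stop as soon as one of them fails."""
    for argv in runs:
        subprocess.run(argv, check=True)
```

`run_all(planned_runs())` would execute the setup first and start training only if the setup succeeded.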

Contributors

ketharan
