sharc's Introduction

Supervised Hierarchical Clustering using Graph Neural Networks for Speaker Diarization

This is the implementation of the following paper:

  • Singh, Prachi, et al. (2023). "Supervised Hierarchical Clustering using Graph Neural Networks for Speaker Diarization." Proceedings of ICASSP 2023. (paper)

Overview

Prerequisites

The following packages are required to run the code.

Pretrained Models

The following pretrained models are provided.

  • ETDNN x-vector model.
  • PLDA models for the Voxconverse and AMI datasets.
  • SHARC models for Voxconverse and AMI.

Installation

Clone the repo and create a new virtual environment

  • Clone the repo:
$ git clone git@github.com:prachiisc/SHARC.git
$ cd SHARC
  • Create the environment: We recommend running the recipes from a fresh virtual environment. Make sure to activate the environment before proceeding.
$ conda create --name SHARC --file requirements.txt
$ conda activate SHARC
$ local_dir="Full_path_of_cloned_repository"
Then add the following as the first line of $local_dir/path.sh:
export KALDI_ROOT=/path_of_kaldi_directory/kaldi
  • Create soft links to the necessary directories:
$ local_dir="Full_path_of_cloned_repository"
$ cd $local_dir
$ . ./path.sh
$ ln -sf $KALDI_ROOT/egs/wsj/s5/utils .  # utils dir
$ ln -sf $KALDI_ROOT/egs/wsj/s5/steps .  # steps dir
  • Check the data directories in tools_diar/data. In tools_diar/data/datasetname/wav.scp, replace the paths with the location of your wav files.
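
Updating the wav.scp paths by hand can be tedious for large sets. The sketch below assumes the usual Kaldi wav.scp layout of one "<recording-id> <wav-path>" pair per line with plain file paths (not piped commands). The function name repoint_wav_scp and its arguments are hypothetical and not part of this repository. It keeps each recording ID and file name and swaps only the directory:

```shell
# Hypothetical helper (not part of SHARC): rewrite the directory part of every
# path in a Kaldi-style wav.scp while keeping recording IDs and file names.
# Assumes each line is "<recording-id> <path-to-wav>" with a plain path.
repoint_wav_scp() {
    scp_file=$1          # existing wav.scp, e.g. tools_diar/data/<datasetname>/wav.scp
    new_dir=${2%/}       # directory holding your wav files (trailing slash dropped)
    awk -v dir="$new_dir" 'NF >= 2 {
        n = split($2, parts, "/")      # parts[n] is the wav file name
        print $1, dir "/" parts[n]
    }' "$scp_file"
}

# Usage: write to a new file, inspect it, then replace the original.
#   repoint_wav_scp tools_diar/data/<datasetname>/wav.scp /my/wavs > wav.scp.new
```

The result goes to standard output rather than overwriting the file in place, so the original wav.scp stays available for comparison.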

Running the recipes

We include full recipes for reproducing the results on the Voxconverse and AMI datasets:

Testing on the Voxconverse dataset

Step 1: X-vector extraction, ground-truth label creation and lists directory formation

   bash services/test_xvec_preprocess.sh <vox_set> <nj>

<vox_set> : vox_diar/vox_diar_test

<nj> : number of jobs [min(40, number of processors available)]
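
The job count above is the smaller of a fixed cap and the number of available processors. A minimal sketch of that calculation follows; the function name pick_nj is hypothetical, and the optional second argument exists only to override the detected processor count:

```shell
# Hypothetical helper (not part of SHARC): print min(cap, number of processors).
pick_nj() {
    cap=$1
    # nproc is GNU coreutils; getconf is a fallback; 1 is the last-resort default.
    cpus=${2:-$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)}
    if [ "$cpus" -lt "$cap" ]; then
        echo "$cpus"
    else
        echo "$cap"
    fi
}

# Usage (cap of 40 as stated for Voxconverse):
#   bash services/test_xvec_preprocess.sh <vox_set> "$(pick_nj 40)"
```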

Step 2: Testing

   bash scripts/test_xvec_parallel.sh Vox

Testing on the AMI dataset

Step 1: X-vector extraction, ground-truth label creation and lists directory formation

   bash services/test_xvec_preprocess.sh <ami_set> <nj>

<ami_set> : ami_dev/ami_eval

<nj> : number of jobs [min(15, number of processors available)]

Step 2: Testing the SHARC model

   bash scripts/test_xvec_parallel.sh AMI
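
The two testing steps for either dataset can be chained so that step 2 runs only if step 1 succeeds. The wrapper below is a sketch under that assumption. The function name run_test_recipe and the DRY_RUN variable are hypothetical; only the two script paths come from the recipes above. With DRY_RUN set, it prints the commands instead of running them:

```shell
# Hypothetical wrapper (not part of SHARC): run preprocessing, then testing.
run_test_recipe() {
    dataset=$1     # Vox or AMI, passed to the testing script
    data_set=$2    # value for <vox_set> or <ami_set>
    nj=$3          # number of parallel jobs
    case "$dataset" in
        Vox|AMI) ;;
        *) echo "unknown dataset: $dataset (expected Vox or AMI)" >&2; return 1 ;;
    esac
    # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is non-empty.
    # The && stops the chain if step 1 fails.
    ${DRY_RUN:+echo} bash services/test_xvec_preprocess.sh "$data_set" "$nj" &&
    ${DRY_RUN:+echo} bash scripts/test_xvec_parallel.sh "$dataset"
}

# Usage: preview the commands first, then run them for real.
#   DRY_RUN=1 run_test_recipe AMI <ami_set> 15
#   run_test_recipe AMI <ami_set> 15
```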

Training the model

Step 1: X-vector extraction, ground-truth label creation and lists directory formation

   bash services/test_xvec_preprocess.sh <train_set> <nj>

Step 2: Training for Voxconverse/AMI

   bash scripts/train_xvec.sh <Vox/AMI>

Cite

If you are using the resource, please cite as follows:

@INPROCEEDINGS{10095372,  
  author={Singh, Prachi and Kaul, Amrit and Ganapathy, Sriram},
  booktitle={2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},  
  title={Supervised Hierarchical Clustering Using Graph Neural Networks for Speaker Diarization},  
  year={2023}, 
  volume={}, 
  pages={1-5}, 
  doi={10.1109/ICASSP49357.2023.10095372}}
  

Contact

If you have any comments or questions, please contact [email protected]
