Automatic Arabic Dialect Detection Task

This code reflects the work described in the InterSpeech'2016 paper on Automatic Dialect Detection in Arabic Broadcast Speech.

It also contains a baseline system for the VarDial'2017 shared task on Arabic Dialect Identification.

Requirements

Provided data:

  • We provide data for five Arabic dialects: Egyptian (EGY), Levantine (LAV), Gulf (GLF), North African (NOR), and Modern Standard Arabic (MSA).

  • The data comes from broadcast news.

Shared data and features for the VarDial'2017 shared task:

  • By default, the VarDial'2017 baseline uses data/train.vardial2017/ for training and data/dev.vardial2017/ for development.
  • For each dialect, there are two feature files:
  • $dialect.words -- lexical features generated by an LVCSR system (the QCRI MGB-2 submission).
  • $dialect.ivec -- i-vectors based on bottleneck features, with a fixed length of 400 per utterance.
  • wav.lst -- links to the original audio files (WAVE audio, Microsoft PCM, 16 bit, mono, 16000 Hz).
  • Baseline results: 57.28% accuracy with bottleneck i-vectors and 48.43% with lexical features.
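The file layout described above can be checked before training. The following is a minimal sketch, not part of the repository: it assumes that `$dialect` expands to the upper-case codes listed earlier (EGY, LAV, GLF, NOR, MSA), which the README does not state explicitly, and the helper names `expected_feature_files` and `missing_feature_files` are hypothetical.

```python
from pathlib import Path

# Dialect codes as listed in the README; using them verbatim as file
# name prefixes is an assumption.
DIALECTS = ["EGY", "LAV", "GLF", "NOR", "MSA"]
# The two per-dialect feature files named for the VarDial'2017 data.
FEATURE_EXTENSIONS = ["words", "ivec"]


def expected_feature_files(data_dir, dialects=DIALECTS, extensions=FEATURE_EXTENSIONS):
    """Map each dialect to the feature file paths expected under data_dir."""
    root = Path(data_dir)
    return {
        dialect: {ext: root / f"{dialect}.{ext}" for ext in extensions}
        for dialect in dialects
    }


def missing_feature_files(data_dir):
    """Return the expected feature files that do not exist on disk."""
    return [
        path
        for files in expected_feature_files(data_dir).values()
        for path in files.values()
        if not path.is_file()
    ]
```

Calling `missing_feature_files("data/train.vardial2017")` from the repository root should return an empty list if the naming assumption holds; a non-empty list shows which files use a different name.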

Shared data for the InterSpeech'2016 paper:

  • To reproduce the InterSpeech'2016 results, point the script to data/train.IS2016/ for training and data/test.IS2016/ for testing.
  • $dialect.words -- lexical features generated by an LVCSR system.
  • $dialect.ivec -- i-vectors based on bottleneck features, with a fixed length of 400 per utterance.
  • $dialect.phones -- phoneme sequence from an automatic phoneme recognition system.
  • $dialect.phone_duration -- phoneme sequence with the duration of each phone in milliseconds, e.g., w_030 means phone w lasting 30 milliseconds.
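The `phone_duration` token format (phone, underscore, zero-padded duration) can be decoded as in the sketch below. It assumes each line holds only whitespace-separated tokens of this form; whether the files also carry an utterance ID or other fields is not stated here, so adjust accordingly. The function name `parse_phone_duration` is hypothetical.

```python
from typing import List, Tuple


def parse_phone_duration(line: str) -> List[Tuple[str, int]]:
    """Split tokens such as 'w_030' into (phone, duration in milliseconds)."""
    pairs = []
    for token in line.split():
        # Split on the last underscore so phone labels that themselves
        # contain underscores are kept intact.
        phone, _, duration = token.rpartition("_")
        if not phone or not duration.isdigit():
            raise ValueError(f"unexpected token format: {token!r}")
        pairs.append((phone, int(duration)))
    return pairs
```

For example, `parse_phone_duration("w_030")` yields `[("w", 30)]`, matching the README's description of phone w lasting 30 milliseconds.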

Sample code

Run 'run.sh' for an example that uses the code and the data. Its main settings are:

  • features=phones -- you can use words, phones or ivectors;
  • context=6 -- for some features, less context might be enough;
  • NOTE 1: The regularization parameters can be optimized for better performance.
  • NOTE 2: System combination can be explored as well.
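To make the `context` setting concrete, here is a small illustration of n-gram counting over a token sequence. This is a sketch under a loud assumption: it treats `context` as the maximum n-gram order, which this README does not confirm, and it is not the code that `run.sh` runs. The function name `ngram_counts` is hypothetical.

```python
from collections import Counter
from typing import Sequence


def ngram_counts(tokens: Sequence[str], max_order: int = 6) -> Counter:
    """Count all n-grams of order 1..max_order in a token sequence.

    max_order stands in for the README's `context` setting (an assumption).
    """
    counts = Counter()
    for n in range(1, max_order + 1):
        # Slide a window of n tokens across the sequence.
        for start in range(len(tokens) - n + 1):
            counts[" ".join(tokens[start:start + n])] += 1
    return counts
```

With short utterances or a large vocabulary (as with words), most high-order n-grams occur only once, which is consistent with the remark that less context might be enough for some features.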

Citing

This data and the baseline system are described in this paper:

@inproceedings{ali2016automatic,
  author={Ali, Ahmed and Dehak, Najim and Cardinal, Patrick and Khurana, Sameer and Yella, Sree Harsha and Glass, James and Bell, Peter and Renals, Steve},
  title={Automatic Dialect Detection in Arabic Broadcast Speech},
  booktitle={Interspeech},
  address={San Francisco, CA, USA},
  pages={2934--2938},
  year={2016}
}
