SG_streets

An evaluation set for the speech recognition experiments in [1]. The set contains 6 interview sessions, recorded with close-talk microphones, in which Singaporean students read passages about Singapore streets. Each interview session consists of 2 recordings, one for the interviewer and one for the interviewee.

| No. | Recording ID | Street names | Speaker gender | Duration (hh:mm:ss) | # Words |
|-----|--------------|--------------|----------------|---------------------|---------|
| 1 | mml-14-feb-2018-a-session4 | Balestier, Ang Mo Kio | Female | 00:09:57 | 1,479 |
| 2 | mml-15-dec-2017-session4 | Kreta Ayer, Aljunied | Male | 00:10:25 | 1,207 |
| 3 | mml-19-dec-2017-b-session4 | Marsiling, Bedok | Male | 00:07:24 | 862 |
| 4 | mml-19-jan-2018-b-session4 | Boon Lay | Female | 00:09:30 | 1,695 |
| 5 | mml-24-jan-2018-a-session4 | Bukit Batok | Male | 00:07:41 | 1,081 |
| 6 | mml-24-jan-2018-b-session4 | Sembawang | Female | 00:09:03 | 1,018 |

Other notes:

  • Compound street names are joined with an underscore, e.g. 'boon lay' -> 'boon_lay'.
  • To simulate the scenario where these street names are rare words, ensure that they are either absent from the training set or appear in it only a few times (1-3).
  • The main application of the evaluation set is to recognize these street names correctly without degrading the overall word error rate (WER).
  • The list of named entities is given in the file named_entities.txt.
  • A list of other Singapore street names can be found at: https://geographic.org/streetview/singapore/
  • The pronunciation lexicon can be obtained from grapheme-to-phoneme (G2P) tools such as http://www.speech.cs.cmu.edu/tools/lextool.html. For example, for the word "boon_lay", first generate the pronunciations of "boon" and "lay" separately, then combine all of their pronunciation variants.
  • The readers are Singaporeans, whose accent differs from that of English speakers in other countries, so a training set containing Singapore English is recommended, e.g. https://www2.imda.gov.sg/NationalSpeechCorpus. (Note: in the paper we used a strong Singapore English ASR system trained on our own dataset.)
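
The text-preparation steps in the notes above (joining compound names, checking how often a name occurs in the training text, and combining component pronunciations) could be sketched roughly as follows. This is a minimal illustration, not the scripts used in [1]: the function names, the assumption of lower-cased whitespace-tokenized text, and the phone sequences in the example lexicon are all hypothetical and are not output of the lextool service.

```python
import re
from collections import Counter
from itertools import product


def join_compounds(text, names):
    """Replace multi-word street names with underscore-joined tokens.

    Assumes `text` and `names` use the same casing (e.g. all lower case).
    """
    # Handle longer names first so that a longer name takes precedence
    # over any shorter name it contains.
    for name in sorted(names, key=len, reverse=True):
        words = name.split()
        pattern = r"\b" + r"\s+".join(re.escape(w) for w in words) + r"\b"
        text = re.sub(pattern, "_".join(words), text)
    return text


def count_tokens(train_text, targets):
    """Count how often each target token occurs in whitespace-tokenized text."""
    counts = Counter(train_text.split())
    return {t: counts[t] for t in targets}


def combine_pronunciations(parts, lexicon):
    """Build every pronunciation of a compound from its component words.

    `lexicon` maps a word to a list of pronunciations, each a list of phones.
    The result is the Cartesian product of the component variants.
    """
    combos = product(*(lexicon[w] for w in parts))
    return [[phone for pron in combo for phone in pron] for combo in combos]


# Hypothetical example data (phones are illustrative only).
names = ["boon lay", "ang mo kio"]
joined = join_compounds("she lives near boon lay and ang mo kio", names)
counts = count_tokens("a boon_lay b boon_lay", ["boon_lay", "bedok"])
lexicon = {
    "boon": [["B", "UW", "N"], ["B", "UH", "N"]],
    "lay": [["L", "EY"]],
}
variants = combine_pronunciations(["boon", "lay"], lexicon)
print(joined)
print(counts)
print(variants)
```

A name whose count in the training text exceeds 3 would fall outside the rare-word condition described above, so its surplus training sentences would need to be filtered out before training.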

References

[1] Khassanov, Y., Zeng, Z., Pham, V.T., Xu, H., Chng, E.S. (2019). Enriching Rare Word Representations in Neural Language Models by Embedding Matrix Augmentation. Proc. Interspeech 2019, 3505-3509. DOI: 10.21437/Interspeech.2019-1858.
