awesome-speech

This is a treasure-house of speech resources.

Contents

Speech Recognition

page

Xingyu Na

Language Processing and Pattern Recognition at the University of Aachen

Fernando de la Calle Silos

open source library/toolbox/code

HTK

Py2HTK

parallel-htk

HTK_C_MATLAB_tools

Kaldi:

Kaldi official documentation (Chinese edition)

Kaldi models

Corpus Phonetics Tutorial

py-kaldi-asr

Dan's DNN implementation:

pytorch-kaldi

kaldi-lstm

kaldi-ctc

keras-kaldi

python wrapper for kaldi-online-decoder

Kaldi+PDNN

tfkaldi

Kaldi_CNTK_AMI

kaldi-io-for-python

kaldi-pyio

kaldi-tree-conv

kaldi-ivector

kaldi-yesno-tutorial

Kaldi nnet3 tutorial

Josh Meyer's Website

Adapting your own Language Model for Kaldi

Some Kaldi Notes

kaldi_tutorial

Online decoder for Kaldi NNET2 and GMM speech recognition models with Python bindings

ResNet-Kaldi-Tensorflow-ASR

Kaldi ASR: Extending the ASpIRE model

FastCGI support for Kaldi ASR

alignUsingKaldi

kaldi-readers-for-tensorflow

kaldi-iot

lattice-info

lattice-char-to-word

lattice-word-length-distribution

kaldi-lattice-word-index

kaldi-decoders

lattice-remove-ctc-blank

kaldi-lattice-search

htk2kaldi

parallel-kaldi

Building an online Chinese speech recognition system with Kaldi

kaldi-docker

CSLT-Sparse-DNN-Toolkit

featxtra

Sphinx

OpenFst

MIT Spoken Language Systems

Julius

Bavieca

Simon

SIDEKIT

SRILM

awd-lstm-lm

ISIP

MIT Finite-State Transducer (FST) Toolkit

MIT Language Modeling (MITLM) Toolkit

OpenGrm

RNNLM

faster-rnnlm

CUED-RNNLM Toolkit

Using RNNLM to rescore a sentence in a Chinese ASR system

KenLM

rwthlm

word-rnn-tensorflow

tensorlm

SpeechRecognition

SpeechPy

Aalto

google-cloud-speech

apiai

https://pypi.org/project/apiai/

wit

Nabu

asr-study

dejavu

uSpeech

Juicer

PMLS

dragonfly

SPTK

pysptk

RWTH ASR

Palaver

Praat

Speech Recognition Grammar Specification

Automatic_Speech_Recognition

speech-to-text-wavenet

tensorflow-speech-recognition

tensorflow_end2end_speech_recognition

tensorflow_speech_recognition_demo

AVSR-Deep-Speech

TTS and ASR

CTC + Tensorflow Example for ASR

tensorflow-ctc-speech-recognition

speechT

end2endASR

NADU

DTW (Dynamic Time Warping) python module

Various scripts and tools for speech recognition model building

Deep-learning-based Chinese speech recognition system implemented with CNN, LSTM, and CTC

tacotron_asr

ASR_Keras

Kaggle Tensorflow Speech Recognition Challenge

Speech recognition script for Asterisk that uses Google's speech engine

Libraries and scripts for manipulating and handling ASR output/n-bests/etc

Some scripts and commands for working with ASR

PySpeechGrammar

Python module for evaluating ASR hypotheses

edit-distance

dataset

VoxForge

ASR Audio Data Links

The CMU Pronouncing Dictionary

TIMIT

GlobalPhone Language Models

1 Billion Word Language Model Benchmark

DaCiDian-Develop

CC-CEDICT

TED-LIUM

open-asr-lexicon

Tutorial

University of Edinburgh ASR2017-18

Stanford CS224S

NYU asr12

Speech Recognition with Neural Networks

Speech Synthesis

page

CSTR-Edinburgh

open source library/toolbox

WORLD

HTS

Tacotron

Tacotron2

Merlin

Mozilla TTS

Flite

Speect

Festival

eSpeak

nnmnkwii

Ossian

gTTS

gnuspeech

supercollider

sc3-plugins

Neural_Network_Voices

pggan-pytorch

cainteoir-engine

loop

TTS and ASR

musa_tts

marytts (Java)

Speaker Recognition

open source library/toolbox

Alize

speaker-recognition-py3

openVP

Gender recognition by voice and speech analysis

Dialogue Systems

pages

NTU

Tsung-Hsien Wen

open source library/toolbox

PyDial

alex

ROS voice interaction system

Chinese voice interaction system built on the ROS framework

Front End

Speech Processing

madmom

pydub

kapre: Keras Audio Preprocessors

BTK

EspNet

Signal-Processing

pyroomacoustics

librosa

REAPER

MSD_split_for_tagging

VOICEBOX

liquid-dsp

ffts

mir_eval

aupyom

Pitch Detection

TFTB

maracas

SRMRpy

ssp

iss

asr_preprocessing

asrt

Audio super resolution using NN

RNN training for noise reduction in robust asr

RNN for audio noise reduction

muda

Efficient sample rate conversion in python

Smarc audio rate converter

Python scripts to compute the F0 of a wave file

Audio I/O

PortAudio

audiolab

pytorch audio

Digital Speech Decoder

audioread

audacity.py

Sound Source Separation

HARK

Deep RNN for Source Separation

nussl

DNN for Music Source Separation in Tensorflow

Alexey Ozerov

University of Surrey CVSSP

source separation using CNN

Feature Extraction

openSMILE

veles.sound_feature_extraction

vamp-plugin-sdk

Yaafe

py_bank

AuditoryFilterbanks

python_speech_features

VAD

rVAD

Aurora 2 VAD

IsraelCohen

Python interface to the WebRTC Voice Activity Detector

Resources

code/tool/data

cmusphinx

julius-speech

OpenSLR

List of speech recognition software

KTH

VERBIO

timeview

Speech at CMU Web Page

CMU Robust Speech Group

Speech Software at CMU

Aalto Speech Research

CMU Festvox Project

CSTR

Xiph

Brno University of Technology Speech Processing Group

SoX

STRAIGHT

Idiap Research Institute

Transcriber

Amirsina Torfi

The Speech Recognition Virtual Kitchen

Sparse Representation & Dictionary Learning Algorithms with Applications in Denoising, Separation, Localisation and Tracking

Audacity

beetbox

CAQE

UCL Speech Filing System

Ryuichi Yamamoto

Kyubyong Park

Hideyuki Tachibana

Colin Raffel

Paul Dixon

smacpy

c4dm

Matt Shannon

Keunwoo Choi

ADASP

uchicago Speech and Language @ TTIC

Justin Salamon

COLEA

openAUDIO

Praat

librosa

Essentia

timmahrt

Lefteris Zafiris

audio-to-audio and audio-to-midi alignment

DNN based hotword and wake word detection toolkit

free-spoken-digit-dataset

Chinese Linguistic Data Consortium

Institute of Formal and Applied Linguistics – Dialogue Systems Group

tutorial

DL for Computer Vision, Speech, and Language

Introduction to Digital Speech Processing (National Taiwan University)

IISc Speech Information Processing

paper

State of the art and recent results (bibliography) on speech recognition

Homepages

Dan Povey

cmusphinx

CMU Language Technologies Institute

CMU SPEECH@SV

Mitsubishi Electric Research Laboratories

MIT Spoken Language Systems

Brno University of Technology Speech Processing Group

IISc

uchicago Speech and Language @ TTIC

RWTH Aachen University

TOKUDA and NANKAKU LABORATORY

Institute of Formal and Applied Linguistics – Dialogue Systems Group

Ohio State University speech separation

LEAP Laboratory

Hainan Xu

Mark Gales

Karen Livescu

Shubham Toshniwal

Adrien Ycart

Ron Weiss

Yajie Miao

Scott T Wisdom

Alan W Black

Amirsina Torfi

Liang Lu

Zhizheng WU

Justin Salamon

Keith Vertanen

Aviv Gabbay

Mehryar Mohri

Jonathan LE ROUX

Suyoun Kim

DeepSound

Lei Xie

