Topic: speech-processing

Something interesting about speech-processing

👇 Here are 551 public repositories matching this topic...

akojimaslp / beamforming-for-speech-enhancement

speech-processing,Simple delay-and-sum, MVDR and CGMM-MVDR beamforming

User: akojimaslp

speech-enhancement beamforming delay-sum mvdr cgmm-mvdr speech-recognition speech-processing signal-processing python

arjo129 / uspeech

speech-processing,Speech recognition toolkit for the Arduino

User: arjo129

Home Page: https://arjo129.wordpress.com/experiments/%C2%B5speech/

arduino speech-recognition speech-processing signal

audio-westlakeu / fullsubnet

speech-processing,PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."

Organization: audio-westlakeu

Home Page: https://fullsubnet.readthedocs.io/en/latest/

speech-enhancement speech-processing speech-separation pytorch pretrained-model paper full-band sub-band single-channel noise-reduction

breizhn / dtln

speech-processing,TensorFlow 2.x implementation of the DTLN real-time speech denoising model, with TF-Lite, ONNX and real-time audio processing support.

User: breizhn

noise-reduction deep-learning audio real-time-audio audio-processing noise-suppression tensorflow dns-challenge dtln-model speech-denoising

coqui-ai / open-speech-corpora

speech-processing,💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

Organization: coqui-ai

tts stt speech-to-text text-to-speech speech-recognition speech-synthesis speech-processing voice-recognition voice-activity-detection voice-cloning

ddlbojack / speech-resources

speech-processing,Speech-related labs, companies, resources, internships, etc.; recommendations and self-nominations are welcome

User: ddlbojack

speech speech-processing

digitalphonetics / ims-toucan

speech-processing,Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Its development objectives are simplicity, modularity, controllability and multilinguality.

Organization: digitalphonetics

text-to-speech toolkit speech-synthesis deep-learning speech-processing tts pytorch speech

drethage / speech-denoising-wavenet

speech-processing,A neural network for end-to-end speech denoising

User: drethage

machine-learning deep-learning neural-networks speech-denoising speech wavenet end-to-end speech-processing

fgnt / pb_bss

speech-processing,Collection of EM algorithms for blind source separation of audio signals

Organization: fgnt

speech-processing speech-enhancement source-separation beamforming bss em-algorithm multi-channel

gemengtju / tutorial_separation

speech-processing,This repo summarizes the tutorials, datasets, papers, code and tools for the speech separation and speaker extraction tasks. Pull requests are welcome.

User: gemengtju

speech-separation speech-processing speech-analysis deep-learning deep-neural-networks signal-processing

gionanide / speech_signal_processing_and_classification

speech-processing,Front-end speech processing aims at extracting proper features from short-term segments of a speech utterance, known as frames. It is a prerequisite step toward any pattern recognition problem employing speech or audio (e.g., music). Here, we are interested in voice disorder classification, that is, in developing two-class classifiers that can discriminate between utterances of a subject suffering from, say, vocal fold paralysis and utterances of a healthy subject. The mathematical modeling of the speech production system in humans suggests that an all-pole system function is justified [1-3]. As a consequence, linear prediction coefficients (LPCs) constitute a first choice for modeling the magnitude of the short-term spectrum of speech. LPC-derived cepstral coefficients are guaranteed to discriminate between the system (e.g., vocal tract) contribution and that of the excitation. Taking into account the characteristics of the human ear, the mel-frequency cepstral coefficients (MFCCs) emerged as descriptive features of the speech spectral envelope. Similarly to MFCCs, the perceptual linear prediction coefficients (PLPs) could also be derived. These, so to speak, traditional features will be tested against agnostic features extracted by convolutional neural networks (CNNs) (e.g., auto-encoders) [4]. The pattern recognition step will be based on Gaussian Mixture Model based classifiers, K-nearest neighbor classifiers, Bayes classifiers, as well as Deep Neural Networks. The Massachusetts Eye and Ear Infirmary Dataset (MEEI-Dataset) [5] will be exploited. At the application level, a library for feature extraction and classification in Python will be developed. Credible publicly available resources, such as KALDI, will be used toward achieving our goal. Comparisons will be made against [6-8].

User: gionanide

speech-processing mfcc linear-prediction-coefficients classifier speech-utterance feature-extraction support-vector-machines gaussian-mixture-models long-short-term-memory principal-component-analysis
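
The framing and linear-prediction steps mentioned in the description above can be illustrated with a minimal sketch. This is not code from the gionanide repository; the function names `frame_signal` and `lpc`, the frame length, hop size and model order are all assumptions chosen for illustration. It uses the textbook autocorrelation method with the Levinson-Durbin recursion and only NumPy.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping short-term frames (one per row)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def lpc(frame, order):
    """Estimate LPC coefficients with the autocorrelation method.

    Returns (a, err), where a[0] == 1 and the all-pole model is 1 / A(z)
    with A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order, and err is the
    final prediction error energy.
    """
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for model order i.
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        prev = a.copy()
        # Levinson-Durbin update of the lower-order coefficients.
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


if __name__ == "__main__":
    # Hypothetical input: one second of noise standing in for speech at 16 kHz.
    rng = np.random.default_rng(0)
    signal = rng.standard_normal(16000)
    frames = frame_signal(signal, frame_len=400, hop=160)  # 25 ms / 10 ms
    window = np.hamming(400)
    coeffs = np.array([lpc(f * window, order=12)[0] for f in frames])
    print(coeffs.shape)
```

Each row of `coeffs` holds the all-pole model of one frame; LPC-derived cepstral coefficients or a classifier would be built on top of such per-frame vectors.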

haoheliu / voicefixer

speech-processing,General Speech Restoration

User: haoheliu

Home Page: https://haoheliu.github.io/demopage-voicefixer/

speech-processing speech-synthesis speech-enhancement speech-analysis speech tts declipping dereverberation denoise super-resolution

haoheliu / voicefixer_main

speech-processing,General Speech Restoration

User: haoheliu

Home Page: https://haoheliu.github.io/demopage-voicefixer/

speech-processing speech-enhancement speech-analysis speech-synthesis machine-learning tts speech-to-text speech

haoxiangsnr / a-convolutional-recurrent-neural-network-for-real-time-speech-enhancement

speech-processing,A minimal unofficial implementation of "A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement" (CRN) using PyTorch

User: haoxiangsnr

cnn rnn pytorch speech-enhancement speech-processing cnn-rnn real-time

haoxiangsnr / wave-u-net-for-speech-enhancement

speech-processing,Wave-U-Net implemented in PyTorch and adapted to speech enhancement.

User: haoxiangsnr

Home Page: https://arxiv.org/abs/1806.03185

speech-enhancement unet pytorch speechenhancement speech-processing wave-u-net wave-unet

huawei-noah / speech-backbones

speech-processing,This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.

Organization: huawei-noah

speech-synthesis speech-recognition speech-processing

jtkim-kaist / speech-enhancement

speech-processing,Deep neural network based speech enhancement toolkit

User: jtkim-kaist

speech-enhancement speech-processing

kahne / nonautoreggenprogress

speech-processing,Tracking the progress in non-autoregressive generation (translation, transcription, etc.)

User: kahne

natural-language-processing natural-language-generation artificial-intelligence machine-translation speech-recognition speech-processing

kahne / speechtransprogress

speech-processing,Tracking the progress in end-to-end speech translation

User: kahne

natural-language-generation speech-translation artificial-intelligence spoken-language-processing machine-translation natural-language-processing speech-processing spoken-language-translation

linto-ai / whisper-timestamped

speech-processing,Multilingual Automatic Speech Recognition with word-level timestamps and confidence

Organization: linto-ai

deep-learning speech speech-recognition speech-to-text asr machine-learning python python3 pytorch attention-is-all-you-need

microsoft / torchscale

speech-processing,Foundation Architecture for (M)LLMs

Organization: microsoft

Home Page: https://aka.ms/GeneralAI

computer-vision machine-learning multimodal natural-language-processing pretrained-language-model speech-processing transformer translation

microsoft / unispeech

speech-processing,UniSpeech - Large Scale Self-Supervised Learning for Speech

Organization: microsoft

pytorch speech-recognition speech-processing speech diarization speech-separation speech-diarization speaker-verification

midas-research / audino

speech-processing,Open source audio annotation tool for humans

Organization: midas-research

audio-processing speech-processing machine-learning annotation-tool audio-annotation python datasets

mravanelli / sincnet

speech-processing,SincNet is a neural architecture for efficiently processing raw audio samples.

User: mravanelli

deep-learning audio waveform filtering cnn convolutional-neural-networks speaker-recognition speaker-verification speaker-identification speech-recognition

nanahou / awesome-speech-enhancement

speech-processing,A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.

User: nanahou

speech-enhancement speech-processing signal-processing deep-neural-networks machine-learning-algorithms

novoic / surfboard

speech-processing,Novoic's audio feature extraction library

Organization: novoic

Home Page: https://novoic.com

feature-extraction audio machine-learning audio-processing python speech-processing healthcare signal-processing alzheimers-disease parkinsons-disease

nvidia / cleanunet

speech-processing,Official PyTorch Implementation of CleanUNet (ICASSP 2022)

Organization: nvidia

speech-denoising speech-enchacement noise-reduction speech-processing

pliang279 / awesome-multimodal-ml

speech-processing,Reading list for research topics in multimodal machine learning

User: pliang279

multimodal-learning machine-learning representation-learning natural-language-processing computer-vision speech-processing robotics healthcare reading-list deep-learning

pliang279 / multibench

speech-processing,[NeurIPS 2021] Multiscale Benchmarks for Multimodal Representation Learning

User: pliang279

machine-learning multimodal-learning robotics natural-language-processing computer-vision deep-learning healthcare representation-learning speech-processing

pyannote / pyannote-audio

speech-processing,Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Organization: pyannote

Home Page: http://pyannote.github.io

pytorch speech-processing speaker-diarization speech-activity-detection speaker-change-detection speaker-embedding voice-activity-detection pretrained-models overlapped-speech-detection speaker-recognition

r9y9 / deepvoice3_pytorch

speech-processing,PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models

User: r9y9

Home Page: https://r9y9.github.io/deepvoice3_pytorch/

tts speech-synthesis end-to-end speech-processing machine-learning pytorch python multi-speaker

r9y9 / nnmnkwii

speech-processing,Library to build speech synthesis systems designed for easy and fast prototyping.

User: r9y9

Home Page: https://r9y9.github.io/nnmnkwii/latest/

machine-learning speech-synthesis voice-conversion python text-to-speech speech-processing

r9y9 / pysptk

speech-processing,A python wrapper for Speech Signal Processing Toolkit (SPTK).

User: r9y9

Home Page: http://pysptk.readthedocs.io/en/latest/

python-wrapper speech-processing python speech-synthesis speech dsp sptk digital-signal-processing

r9y9 / ttslearn

speech-processing,ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)

User: r9y9

Home Page: https://r9y9.github.io/ttslearn/

tts text-to-speech python speech-processing deep-learning neural-networks dnn digital-signal-processing book python-tts

r9y9 / wavenet_vocoder

speech-processing,WaveNet vocoder

User: r9y9

Home Page: https://r9y9.github.io/wavenet_vocoder/

wavenet speech-synthesis speech-processing pytorch python wavenet-vocoder neural-vocoder speech

resemble-ai / resemble-enhance

speech-processing,AI powered speech denoising and enhancement

Organization: resemble-ai

Home Page: https://huggingface.co/spaces/ResembleAI/resemble-enhance

denoise speech-denoising speech-enhancement speech-processing

rishikksh20 / vocgan

speech-processing,VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network

User: rishikksh20

vocoder gan melgan vocgan speech-synthesis text-to-speech speech-processing

ryuk17 / speechalgorithms

speech-processing,Speech Algorithms

User: ryuk17

speech-processing

santi-pdp / pase

speech-processing,Problem Agnostic Speech Encoder

User: santi-pdp

deep-learning waveform-analysis pytorch unsupervised-learning multi-task-learning speech-processing self-supervised-learning

seanwood / gcc-nmf

speech-processing,Real-time GCC-NMF Blind Speech Separation and Enhancement

User: seanwood

speech-separation speech-enhancement gcc-nmf nmf real-time real-time-processing speech speech-processing cross-correlation generalized-cross-correlation

sforaidl / neural-voice-cloning-with-few-samples

speech-processing,This repository contains an implementation of "Neural Voice Cloning With Few Samples"

Organization: sforaidl

deep-learning mel-spectogram saidl speaker-adaptation speaker-encodings speech-processing tts voice voice-cloning voice-synthesis

sharad24 / neural-voice-cloning-with-few-samples

speech-processing,Implementation of the research paper "Neural Voice Cloning with Few Samples" by Baidu

User: sharad24

voice-cloning speech-synthesis speech-processing speaker-encodings encodings speech speaker-embeddings mel-spectrogram

sp-nitech / sptk

speech-processing,A suite of speech signal processing tools

Organization: sp-nitech

Home Page: http://sp-tk.sourceforge.net

sptk cpp unix-command dsp audio-processing cepstrum lpc lsp mfcc speech

speechbrain / speechbrain

speech-processing,A PyTorch-based Speech Toolkit

Organization: speechbrain

Home Page: http://speechbrain.github.io

asr audio audio-processing deep-learning huggingface language-model pytorch speaker-diarization speaker-recognition speaker-verification speech-enhancement speech-processing speech-recognition speech-separation speech-to-text speech-toolkit speechrecognition spoken-language-understanding transformers voice-recognition

speechbrain / speechbrain.github.io

speech-processing,The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.

Organization: speechbrain

beamforming deep-learning deeplearning librispeech neural-network neural-networks speaker-identification speaker-recognition speaker-verification speech speech-analysis speech-api speech-emotion-recognition speech-processing speech-recognition speech-recognizer speech-separation speech-to-text speechrecognition timit

superkogito / spafe

speech-processing,🔊 spafe: Simplified Python Audio Features Extraction

User: superkogito

Home Page: https://superkogito.github.io/spafe/

python dsp audio music audio-analysis music-information-retrieval features-extraction mfcc filterbank signal-processing

swasun / vq-vae-speech

speech-processing,PyTorch implementation of VQ-VAE + WaveNet by [Chorowski et al., 2019] and VQ-VAE on speech signals by [van den Oord et al., 2017]

User: swasun

vq-vae vq-vae-wavenet wavenet speech speech-processing pytorch

wq2012 / awesome-diarization

speech-processing,A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

User: wq2012

Home Page: https://wq2012.github.io/awesome-diarization/

speaker-diarization awesome awesome-list machine-learning speech-recognition speech-processing deep-learning

yuan-manx / audio-development-tools

speech-processing,This is a list of sound, audio and music development tools which contains machine learning, audio generation, audio signal processing, sound synthesis, spatial audio, music information retrieval, music generation, speech recognition, speech synthesis, singing voice synthesis and more.

User: yuan-manx

audio audio-processing music signal-processing speech-processing deep-learning dsp speech artificial-intelligence audio-generation

zycv / awesome-keyword-spotting

speech-processing,This repository is a curated list of awesome Speech Keyword Spotting (Wake-Up Word Detection).

User: zycv

speech-recognition keyword-spotting awesome-lists speech-processing hotword-detection
