Topic: speech-processing
Something interesting about speech-processing
speech-processing,Simple delay-and-sum, MVDR, and CGMM-MVDR beamformers
User: akojimaslp
speech-processing,Speech recognition toolkit for the Arduino
User: arjo129
Home Page: https://arjo129.wordpress.com/experiments/%C2%B5speech/
speech-processing,PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
Organization: audio-westlakeu
Home Page: https://fullsubnet.readthedocs.io/en/latest/
speech-processing,TensorFlow 2.x implementation of the DTLN real-time speech denoising model, with TF-Lite, ONNX, and real-time audio processing support.
User: breizhn
speech-processing,💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
Organization: coqui-ai
speech-processing,Speech-related labs, companies, resources, internships, and more; recommendations and self-nominations are welcome
User: ddlbojack
speech-processing,Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.
Organization: digitalphonetics
speech-processing,A neural network for end-to-end speech denoising
User: drethage
speech-processing,Collection of EM algorithms for blind source separation of audio signals
Organization: fgnt
speech-processing,This repo summarizes the tutorials, datasets, papers, code, and tools for the speech separation and speaker extraction tasks. Pull requests are welcome.
User: gemengtju
speech-processing,Front-end speech processing aims at extracting proper features from short-term segments of a speech utterance, known as frames. It is a prerequisite step for any pattern recognition problem employing speech or audio (e.g., music). Here, we are interested in voice disorder classification, that is, in developing two-class classifiers that can discriminate between utterances of a subject suffering from, say, vocal fold paralysis and utterances of a healthy subject. The mathematical modeling of the human speech production system suggests that an all-pole system function is justified [1-3]. As a consequence, linear prediction coefficients (LPCs) constitute a first choice for modeling the magnitude of the short-term spectrum of speech. LPC-derived cepstral coefficients are guaranteed to discriminate between the contribution of the system (e.g., the vocal tract) and that of the excitation. Taking into account the characteristics of the human ear, the mel-frequency cepstral coefficients (MFCCs) emerged as descriptive features of the speech spectral envelope. Similarly to MFCCs, the perceptual linear prediction coefficients (PLPs) can also be derived. These so-called traditional features will be tested against agnostic features extracted by convolutional neural networks (CNNs) (e.g., auto-encoders) [4]. The pattern recognition step will be based on Gaussian mixture model classifiers, K-nearest neighbor classifiers, Bayes classifiers, and deep neural networks. The Massachusetts Eye and Ear Infirmary Dataset (MEEI-Dataset) [5] will be used. At the application level, a Python library for feature extraction and classification will be developed. Credible publicly available resources, such as KALDI, will be used toward achieving our goal. Comparisons will be made against [6-8].
User: gionanide
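The front-end steps described in the entry above (splitting an utterance into short-term frames and fitting an all-pole model to each frame) can be sketched as follows. This is a minimal illustration and not code from the listed repository: the function names frame_signal, hamming, autocorr, and lpc are hypothetical, and the autocorrelation method with the Levinson-Durbin recursion is assumed here as one common way to estimate LPCs, since the entry does not say which estimator it uses.

```python
import math


def frame_signal(signal, frame_len, hop):
    """Split a sequence into overlapping frames; a trailing partial frame is dropped."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def hamming(n):
    """Hamming window of length n (n must be at least 2)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def autocorr(frame, max_lag):
    """Autocorrelation values r[0..max_lag] of a single frame."""
    n = len(frame)
    return [sum(frame[t] * frame[t + k] for t in range(n - k))
            for k in range(max_lag + 1)]


def lpc(frame, order):
    """Estimate LPCs with the Levinson-Durbin recursion.

    Returns (a, err) where a[0] == 1 and the all-pole predictor is
    x[t] ~ -(a[1]*x[t-1] + ... + a[order]*x[t-order]); err is the
    residual prediction error energy.
    """
    r = autocorr(frame, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0:  # silent or perfectly predicted frame: stop early
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this order
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= (1.0 - k * k)
    return a, err


if __name__ == "__main__":
    # Illustrative input: a synthetic 200 Hz tone sampled at 8 kHz (assumed values).
    fs = 8000
    x = [math.sin(2 * math.pi * 200 * t / fs) for t in range(fs // 10)]
    window = hamming(240)
    for frame in frame_signal(x, frame_len=240, hop=80)[:3]:
        windowed = [s * w for s, w in zip(frame, window)]
        coeffs, residual = lpc(windowed, order=10)
        print([round(c, 3) for c in coeffs], residual)
```

The per-frame coefficient vectors produced this way are the kind of input a downstream classifier would consume; cepstral features such as MFCCs would need additional steps not shown here.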
speech-processing,General Speech Restoration
User: haoheliu
Home Page: https://haoheliu.github.io/demopage-voicefixer/
speech-processing,A minimal unofficial PyTorch implementation of "A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement" (CRN)
User: haoxiangsnr
speech-processing,Wave-U-Net implemented in PyTorch and adapted to speech enhancement.
User: haoxiangsnr
Home Page: https://arxiv.org/abs/1806.03185
speech-processing,This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.
Organization: huawei-noah
speech-processing,Deep neural network based speech enhancement toolkit
User: jtkim-kaist
speech-processing,Tracking the progress in non-autoregressive generation (translation, transcription, etc.)
User: kahne
speech-processing,Tracking the progress in end-to-end speech translation
User: kahne
speech-processing,Multilingual Automatic Speech Recognition with word-level timestamps and confidence
Organization: linto-ai
speech-processing,Foundation Architecture for (M)LLMs
Organization: microsoft
Home Page: https://aka.ms/GeneralAI
speech-processing,UniSpeech - Large Scale Self-Supervised Learning for Speech
Organization: microsoft
speech-processing,Open source audio annotation tool for humans
Organization: midas-research
speech-processing,SincNet is a neural architecture for efficiently processing raw audio samples.
User: mravanelli
speech-processing,A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.
User: nanahou
speech-processing,Novoic's audio feature extraction library
Organization: novoic
Home Page: https://novoic.com
speech-processing,Official PyTorch Implementation of CleanUNet (ICASSP 2022)
Organization: nvidia
speech-processing,Reading list for research topics in multimodal machine learning
User: pliang279
speech-processing,[NeurIPS 2021] Multiscale Benchmarks for Multimodal Representation Learning
User: pliang279
speech-processing,Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Organization: pyannote
Home Page: http://pyannote.github.io
speech-processing,PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models
User: r9y9
Home Page: https://r9y9.github.io/deepvoice3_pytorch/
speech-processing,Library to build speech synthesis systems designed for easy and fast prototyping.
User: r9y9
Home Page: https://r9y9.github.io/nnmnkwii/latest/
speech-processing,A python wrapper for Speech Signal Processing Toolkit (SPTK).
User: r9y9
Home Page: http://pysptk.readthedocs.io/en/latest/
speech-processing,ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)
User: r9y9
Home Page: https://r9y9.github.io/ttslearn/
speech-processing,WaveNet vocoder
User: r9y9
Home Page: https://r9y9.github.io/wavenet_vocoder/
speech-processing,AI powered speech denoising and enhancement
Organization: resemble-ai
Home Page: https://huggingface.co/spaces/ResembleAI/resemble-enhance
speech-processing,VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network
User: rishikksh20
speech-processing,Problem Agnostic Speech Encoder
User: santi-pdp
speech-processing,Real-time GCC-NMF Blind Speech Separation and Enhancement
User: seanwood
speech-processing,An implementation of "Neural Voice Cloning With Few Samples"
Organization: sforaidl
speech-processing,Implementation of the Baidu research paper "Neural Voice Cloning with Few Samples"
User: sharad24
speech-processing,A suite of speech signal processing tools
Organization: sp-nitech
Home Page: http://sp-tk.sourceforge.net
speech-processing,A PyTorch-based Speech Toolkit
Organization: speechbrain
Home Page: http://speechbrain.github.io
speech-processing,The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.
Organization: speechbrain
speech-processing,spafe: Simplified Python Audio Features Extraction
User: superkogito
Home Page: https://superkogito.github.io/spafe/
speech-processing,PyTorch implementation of VQ-VAE + WaveNet by [Chorowski et al., 2019] and VQ-VAE on speech signals by [van den Oord et al., 2017]
User: swasun
speech-processing,A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
User: wq2012
Home Page: https://wq2012.github.io/awesome-diarization/
speech-processing,This is a list of sound, audio and music development tools which contains machine learning, audio generation, audio signal processing, sound synthesis, spatial audio, music information retrieval, music generation, speech recognition, speech synthesis, singing voice synthesis and more.
User: yuan-manx
speech-processing,This repository is a curated list of awesome Speech Keyword Spotting (Wake-Up Word Detection).
User: zycv