dongsig
Name: dyang
Type: User
Company: Tencent
Bio: Speech
Location: Shanghai
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, speaker embedding
Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications
Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196
Quality-Net: An End-to-End Non-intrusive Speech Quality Assessment Model based on BLSTM. (Interspeech, 2018, with Travel Grants)
Real-ESRGAN aims at developing Practical Algorithms for General Image Restoration.
Detects fake voices in YouTube videos with 94% accuracy and alerts the user to prevent misinformation [Hack the North 2019]
Implementation of the "Reconstructing Faces from Voices" paper.
Recurrent neural network for audio noise reduction
Algorithms to align 1D signals via cross correlation and likelihood maximization.
Keyword Spotting using Sliding DTW
Some traditional algorithms for sound source localization and DOA estimation of speech signals
Baselines and Classifiers for speaker anti-spoofing detection
Speaker Identification System (up to 100% accuracy); built using Python 2.7 and the python_speech_features library
Estimating the Age, Height, and Gender of a speaker with their speech signal.
speech-aligner is a tool that generates phoneme-level time alignments between human speech and its transcription
The neural network model is capable of detecting five different male/female emotions from speech audio. (Deep Learning, NLP, Python)
Deep neural network based speech enhancement toolkit
Verifying Deep Keyword Spotting Detection with Acoustic Word Embeddings
Speech Enhancement based on DNN (Spectral-Mapping, TF-Masking), DNN-NMF, NMF
In this repository, we explore using a hybrid system consisting of a Convolutional Neural Network and a Support Vector Machine for Keyword Spotting task.
Classification of real speech versus speech played back through device speakers
Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
A Unified Speech Enhancement Front-End for Online Dereverberation, Acoustic Echo Cancellation, and Source Separation
Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.
Voice Activity Detector
Voice Activity Detector in Python
Utterance-level Aggregation For Speaker Recognition In The Wild