guoyang94

followers: 1 following: 1 repos: 40 gists: 0

Type: User

guoyang94's Projects

ai-research-code

athena

an open-source implementation of sequence-to-sequence based speech processing engine

attentions-in-tacotron

audiogpt

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

audiomentations

A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.

autovc-wavrnn

voice conversion system

awesome-diffusion-models

A collection of resources and papers on Diffusion Models and Score-based Models, a dark horse in the field of Generative Models

bark

🔊 Text-prompted Generative Audio Model

chinese_text_normalization

Chinese text normalization for speech processing

cross-lingual-voice-cloning

Tacotron 2 - PyTorch implementation with faster-than-realtime inference modified to enable cross lingual voice cloning.

cyclevae-vc-neuralvoco

deeplearningexamples

Deep Learning Examples

diffsinger

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

dns-challenge

This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.

espnet

End-to-End Speech Processing Toolkit

fastspeech2

An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"

flowtron

Flowtron is an auto-regressive flow-based generative network for text to speech synthesis with control over speech variation and style transfer

hifi-gan

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

lpcnet

Efficient neural speech synthesis

lpcnet_torch

torch version of LPCNet

mellotron

Mellotron: a multispeaker voice synthesis model based on Tacotron 2 GST that can make a voice emote and sing without emotive or singing training data

multilingual_text_to_speech

An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.

nemo

NeMo: a toolkit for conversational AI

neuralspeech

paralleltts

A fast parallel text-to-speech (TTS) model. Works well for English, Mandarin, Japanese, Korean, Russian and Tibetan (tested so far).

parallelwavegan

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN) with PyTorch

pits

PITS: Variational Pitch Inference for End-to-end Pitch-controllable TTS without External Pitch Predictor

pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

pytorchwavenetvocoder

WaveNet-Vocoder implementation with PyTorch.

resemblyzer

A Python package to analyze and compare voices with deep learning
