wanghelin1997 Goto Github PK

followers: 139 following: 50 repos: 82 gists: 0

Name: Helin Wang

Type: User

Company: THU & PKU & JHU

Bio: PhD student at Johns Hopkins University, interested in AI for Audio & Speech Processing.

Location: Baltimore, US

Blog: https://wanghelin1997.github.io/helinwang

Helin Wang's Projects

adamwr

Implements the AdamW optimizer (https://arxiv.org/abs/1711.05101), a cosine learning rate scheduler, and "Cyclical Learning Rates for Training Neural Networks" (https://arxiv.org/abs/1506.01186) for PyTorch

asc_triplet

Triplet loss for acoustic scene classification, in PyTorch

at-gcn

PyTorch implementation of the paper: Modeling Label Dependencies for Audio Tagging with Graph Convolutional Network

atresn-net

Capturing attentive temporal relations in semantic neighborhood for ASC

attention-based_atrous_cnn

PyTorch code for the paper 'Attention-based Atrous Convolutional Neural Networks: Visualisation and Understanding Perspectives of Acoustic Scenes', by Zhao Ren, Qiuqiang Kong, Jing Han, Mark Plumbley, Björn Schuller.

aty-tts

Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech

aty-tts-demo

audiocraft

audioset_raw

Download and create a tfreader for the audioset dataset

audioset_tagging_cnn

automatic_speech_annotator

Automatic speech annotator that processes speech with voice activity detection, overlapping speech detection, speaker diarization, and automatic speech recognition

babycry-sound-detection

PyTorch implementations of neural network models for Babycry sound detection, including training process and test demo. Based on DCASE2017 Task2: Detection of rare sound events.

blip

bnm

Code for "Towards Discriminability and Diversity: Batch Nuclear-norm Maximization under Label Insufficient Situations" (CVPR 2020 oral)

clip-multilingual

Multilingual CLIP - Semantic Image Search in 100 languages

cmu-thesis

Code for Yun Wang's PhD Thesis: Polyphonic Sound Event Detection with Weak Labeling

cnn-model-and-visualization

A CNN model (ResNet) for image classification (CIFAR-10), including visualization of filters and layer outputs.

commonvoice

consingan

Official PyTorch implementation of "Improved Techniques for Training Single-Image GANs"

dcase-2020-task1a-code

A PyTorch implementation of the paper: Acoustic Scene Classification with Multiple Decision Schemes.

dcase2019_1d

DCASE2019 Task 1a using an audio feature module.

dcase2019_task1

dcase2019_task1-1

dcase2019_task1_baseline

DCASE2019 Challenge Task 1 baseline system

dcase2019_task4

dcase2020-task6-pku

A PyTorch implementation of the PKU team's DCASE2020 Task 6 system: Automated Audio Captioning With Temporal Attention

dcase2021_task6_pku

Code of the PKU team for DCASE 2021 Task 6.

deepinversion

du-n2dvc-demo

duta-vc

Source code and demo for the INTERSPEECH 2023 paper: DuTa-VC: A Duration-aware Typical-to-atypical Voice Conversion Approach with Diffusion Probabilistic Model
