Name: Shammur Absar
Type: User
Company: QCRI
Bio: Interested in analyzing and understanding human conversation. Main focus: speech overlaps, turn-taking, speech discourse, code-switching, explainability.
Blog: http://shammur.one/
Shammur Absar's Projects
Dialect identification using Siamese network
The official repository of Dynamic-SUPERB.
Keras implementation of the dynamic memory networks from https://arxiv.org/pdf/1603.01417.pdf
E2E system with LF-MMI; word N-gram for Mandarin
End-to-End Neural Diarization
The phoneme classification code for the EUSIPCO 2017 paper: Timbre Analysis of Music Audio Signals with Convolutional Neural Networks
This repository contains the code to reproduce the core results from the paper "Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data"
A comparison and discussion of different NLP methods for 5-class sentiment classification on the SST-5 dataset.
Interactive multimedia captioning with Keras
Automatic Speech Recognition Dataset Generation
IWSLT 2022 Dialectal Speech Translation Shared Task
Evaluate your speech-to-text system with similarity measures such as word error rate (WER)
An Android app that offers speech-to-text services and user interfaces to other apps
This is the official location of the Kaldi project.
Directory of tutorials and open-source code repositories for working with Keras, the Python deep learning library
Copy of Lium Speaker Diarization project with a new build script.
Language identification using Siamese network based on i-vector
This repo is meant to serve as a guide for Machine Learning/AI technical interviews.
Models and examples built with TensorFlow
Phoneme recognition using the pre-trained models Wav2vec2, HuBERT and WavLM. This project compares three self-supervised models, Wav2vec (2019, 2020), HuBERT (2021) and WavLM (2022), each pretrained on a corpus of English speech, and uses them in various ways to perform phoneme recognition for different languages with a network trained with the Connectionist Temporal Classification (CTC) algorithm.
TensorFlow implementation of "Multimodal Speech Emotion Recognition using Audio and Text," IEEE SLT-18
Multimodal Sarcasm Detection Dataset
NeMo: a toolkit for conversational AI
End-to-end ASR/LM implementation with PyTorch
Data augmentation for NLP