Ankit Shah's Projects
Anticipating Accidents in Dashcam Videos (ACCV 2016)
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
An ns-2 Trace File Analyser.
An algebraic word problem dataset, with multiple-choice questions annotated with rationales.
Kaggle | 1st place solution for Freesound Audio Tagging 2019
Deep Learning Computer Vision Algorithms for Real-World Use
Repository for researching and sharing machine learning articles
asr2k
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
The PyTorch-based audio source separation toolkit for researchers
PyTorch implementation of the paper "Modeling Label Dependencies for Audio Tagging with Graph Convolutional Network"
Interspeech 2017 paper: Attention and Localization based on Convolutional Recurrent Model for Semi-supervised Audio Tagging
🖼️ Attend to You: Personalized Image Captioning with Context Sequence Memory Networks. In CVPR, 2017. Expanded: Towards Personalized Image Captioning via Multimodal Memory Networks. In IEEE TPAMI, 2018.
Source code for "On the Relationship between Self-Attention and Convolutional Layers"
Visual Question Answering project with state-of-the-art single-model performance.
Code/Model release for NIPS 2017 paper "Attentional Pooling for Action Recognition"
Landmark-based audio fingerprinting
A timeline of the latest AI models for audio generation, starting in 2023!
Temporal Sub-sampling of Audio Feature Sequences for Audio Captioning (DCASE 2020 challenge)
A module to classify audio samples.
🔊 Repository for our NAACL-HLT 2019 paper: AudioCaps
Dataset and baseline for the first Audiocaption task
Source code for models described in the paper "AudioCLIP: Extending CLIP to Image, Text and Audio" (https://arxiv.org/abs/2106.13043)
audioLIME: Listenable Explanations Using Source Separation
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in PyTorch
A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.
Collection of notebooks and scripts related to audio processing and machine learning.