Ankit Shah's Projects
Single and multichannel sound event detection using convolutional recurrent neural networks. DCASE 2017 real-life sound event detection winning method.
Official implementation of "SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers"
Implementation of SegFormer, an attention + MLP neural network for segmentation, in PyTorch
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Baseline method for the sound event localization task of the DCASE 2022 challenge
Sound event localization and detection of overlapping sources in three dimensions using a convolutional recurrent neural network
This is a collection of research papers for Self-Correcting Large Language Models with Automated Feedback.
The Udacity open source self-driving car project
An end-to-end CNN model that predicts the steering wheel angle from video or image input
A self-driving car simulator built with Unity
Self-labelling via simultaneous clustering and representation learning. (ICLR 2020)
Code for the paper "Learning Modality-Specific Representations with Self-Supervised Multi-Task Learning for Multimodal Sentiment Analysis"
Code for the NeurIPS 2019 paper "Self-Critical Reasoning for Robust Visual Question Answering"
Code for reproducing experiments in "How Useful is Self-Supervised Pretraining for Visual Tasks?"
http://www.ark.cs.cmu.edu/SEMAFOR
Automatically exported from code.google.com/p/semafor-semantic-parser
Nvidia Semantic Segmentation monorepo
Given a list of videos, output the semantic features for each video.
A Unified Semi-Supervised Learning Codebase (NeurIPS'22)
Implementations of different VAE-based semi-supervised and generative models in PyTorch
Transformer Network for Time-Series and Sensor Data
Repository for reproducing sequential social dilemmas
Efficient few-shot learning with Sentence Transformers
Code for the paper "Improving Sound Event Classification by Increasing Shift Invariance in Convolutional Neural Networks".
TensorFlow Implementation of "Show, Attend and Tell"
Siamese Mask R-CNN model for one-shot instance segmentation
Optimization Examples with SigOpt
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple