zhuxuhan
Type: User
Bio: Dream Chaser
Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning
Bottom-Up Top-Down Attention for VQA Implementation accompanying Annotated BUTD Blog Post
A neural network architecture (CNN + LSTM) that automatically generates captions from images. The model uses a ResNet architecture for the encoder, while the DecoderRNN is trained with our choice of trainable parameters. I trained the model on the Microsoft Common Objects in COntext (MS COCO) dataset and tested the network on fictitious images!
Semantic Segmentation on PyTorch (includes FCN, PSPNet, Deeplabv3, Deeplabv3+, DANet, DenseASPP, BiSeNet, EncNet, DUNet, ICNet, ENet, OCNet, CCNet, PSANet, CGNet, ESPNet, LEDNet, DFANet)
This repository focuses on Image Captioning, Video Captioning, Seq-to-Seq Learning, and NLP
A scientific and useful toolbox, which contains practical and effective long-tail related tricks with extensive experimental results
CVPR 2020 oral paper: Overcoming Classifier Imbalance for Long-tail Object Detection with Balanced Group Softmax.
An efficient PyTorch implementation of the winning entry of the 2017 VQA Challenge.
A PyTorch reimplementation of bottom-up-attention models
Bottom Up and Top Down Attention For Image Captioning
Cascaded Human-Object Interaction Recognition (CVPR2020)
Unofficial implementation of the paper "CBAM: Convolutional Block Attention Module"
[CVPR 2021] Counterfactual VQA: A Cause-Effect Look at Language Bias
This repository contains code for the paper "Decoupling Representation and Classifier for Long-Tailed Recognition", published at ICLR 2020
Just playing with getting CLIP Guided Diffusion running locally, rather than having to use colab.
Code for the ACM MM 2024 paper COC.
All-in-one Toolbox for Computer Vision Research.
A library containing both highly optimized building blocks and an execution engine for data pre-processing in deep learning applications
[TPAMI20] Learning 3D Human Shape and Pose from Dense Body Parts
DenseASPP for Semantic Segmentation in Street Scenes
A real-time approach for mapping all human pixels of 2D RGB images to a 3D surface-based model of the body
End-to-End Object Detection with Transformers
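Feature extraction / dimensionality reduction: Python implementations of PCA, LDA, MDS, LLE, t-SNE, and other dimensionality-reduction algorithms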