Giter VIP home page Giter VIP logo

swag's Introduction

SWAG: Supervised Weakly from hashtAGs

This repository contains SWAG models from the paper Revisiting Weakly Supervised Pre-Training of Visual Perception Models.

PWC
PWC
PWC
PWC
PWC

Requirements

This code has been tested to work with Python 3.8, PyTorch 1.10.1 and torchvision 0.11.2.

Note that CUDA support is not required for the tutorials.

To setup PyTorch and torchvision, please follow PyTorch's getting started instructions. If you are using conda on a linux machine, you can follow the following setup instructions -

conda create --name swag python=3.8
conda activate swag
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch

Model Zoo

We share checkpoints for all the pretrained models in the paper, and their ImageNet-1k finetuned counterparts. The models are available via torch.hub, and we also share URLs to all the checkpoints.

The details of the models, their torch.hub names / checkpoint links, and their performance on Imagenet-1k (IN-1K) are listed below.

Model Pretrain Resolution Pretrained Model Finetune Resolution IN-1K Finetuned Model IN-1K Top-1 IN-1K Top-5
RegNetY 16GF 224 x 224 regnety_16gf 384 x 384 regnety_16gf_in1k 86.02% 98.05%
RegNetY 32GF 224 x 224 regnety_32gf 384 x 384 regnety_32gf_in1k 86.83% 98.36%
RegNetY 128GF 224 x 224 regnety_128gf 384 x 384 regnety_128gf_in1k 88.23% 98.69%
ViT B/16 224 x 224 vit_b16 384 x 384 vit_b16_in1k 85.29% 97.65%
ViT L/16 224 x 224 vit_l16 512 x 512 vit_l16_in1k 88.07% 98.51%
ViT H/14 224 x 224 vit_h14 518 x 518 vit_h14_in1k 88.55% 98.69%

The models can be loaded via torch hub using the following command -

model = torch.hub.load("facebookresearch/swag", model="vit_b16_in1k")

Inference Tutorial

For a tutorial with step-by-step instructions to perform inference, follow our inference tutorial and run it locally, or Google Colab.

Replicate Demo and Docker Image

SWAG web demo and docker image is on Replicate. You can try out demo with all the checkpoints here Replicate.

Live Demo

SWAG has been integrated into Huggingface Spaces ๐Ÿค— using Gradio. Try out the web demo on Hugging Face Spaces.

Credits: AK391

ImageNet 1K Evaluation

We also provide a script to evaluate the accuracy of our models on ImageNet 1K, imagenet_1k_eval.py. This script is a slightly modified version of the PyTorch ImageNet example which supports our models.

To evaluate the RegNetY 16GF IN1K model on a single node (one or more GPUs), one can simply run the following command -

python imagenet_1k_eval.py -m regnety_16gf_in1k -r 384 -b 400 /path/to/imagenet_1k/root/

Note that we specify a 384 x 384 resolution since that was the model's training resolution, and also specify a mini-batch size of 400, which is distributed over all the GPUs in the node. For larger models or with fewer GPUs, the batch size will need to be reduced. See the PyTorch ImageNet example README for more details.

Citation

If you use the SWAG models or if the work is useful in your research, please give us a star and cite:

@inproceedings{singh2022revisiting,
      title={{Revisiting Weakly Supervised Pre-Training of Visual Perception Models}}, 
      author={Singh, Mannat and Gustafson, Laura and Adcock, Aaron and Reis, Vinicius de Freitas and Gedik, Bugra and Kosaraju, Raj Prateek and Mahajan, Dhruv and Girshick, Ross and Doll{\'a}r, Piotr and van der Maaten, Laurens},
      booktitle={CVPR},
      year={2022}
}

License

SWAG models are released under the CC-BY-NC 4.0 license. See LICENSE for additional details.

swag's People

Contributors

ak391 avatar chenxwh avatar mannatsingh avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.