
revisiting-at's Introduction

Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models

Naman D Singh, Francesco Croce, Matthias Hein

University of Tübingen

NeurIPS 2023

Abstract

While adversarial training has been extensively studied for ResNet architectures and low resolution datasets like CIFAR, much less is known for ImageNet. Given the recent debate about whether transformers are more robust than convnets, we revisit adversarial training on ImageNet comparing ViTs and ConvNeXts. Extensive experiments show that minor changes in architecture, most notably replacing PatchStem with ConvStem, and training scheme have a significant impact on the achieved robustness. These changes not only increase robustness in the seen $\ell_\infty$-threat model, but even more so improve generalization to unseen $\ell_1/\ell_2$-robustness.


Code

Requirements (specific versions tested on):
fastargs-1.2.0, autoattack-0.1, pytorch-1.13.1, torchvision-0.14.1, robustbench-1.1, timm-0.8.0.dev0, GPUtil

Training

The bash script run_train.sh trains the model specified by model.arch. For clean training set adv.attack none; for adversarial training set adv.attack apgd.
For the standard setting used in the paper (heavy augmentations), set data.augmentations 1, model.model_ema 1 and training.label_smoothing 1.
To train models with a Convolution-Stem (CvSt), set model.not_original 1.
The code performs standard APGD adversarial training.
The file utils_architecture.py contains the model definitions for the new CvSt models; all models are built on top of timm imports.
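
For reference, adversarial training is usually written as the min-max problem below. This is the textbook formulation, given here as an assumption about what "standard" means in this section and not taken from the repository code. The symbols $f_\theta$ (model), $\mathcal{L}$ (loss) and $\mathcal{D}$ (data distribution) are notation introduced for this illustration, and $\epsilon = 4/255$ is the radius of the $\ell_\infty$ checkpoints listed in this README.

```latex
\min_{\theta} \; \mathbb{E}_{(x,y)\sim\mathcal{D}}
\Big[ \max_{\|\delta\|_\infty \le \epsilon} \mathcal{L}\big(f_\theta(x+\delta),\, y\big) \Big],
\qquad \epsilon = \tfrac{4}{255}
```

With adv.attack apgd the inner maximization is approximated by APGD; with adv.attack none the perturbation is dropped ($\delta = 0$), which recovers clean training.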

Evaluating a model

The file runner_aa_eval runs AutoAttack (AA). Passing fullaa 1 runs the complete AA, whereas fullaa 0 runs only the first two attacks in AA (APGD-CE and APGD-T).
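
A minimal sketch of how a fullaa-style switch could map onto the autoattack package. It is not the contents of runner_aa_eval: attacks_for and build_adversary are hypothetical names, and the sketch assumes a PyTorch model that maps images in [0, 1] to logits. The attack names are the ones the autoattack package uses for its standard version.

```python
# Attack names used by the `autoattack` package for version="standard".
STANDARD_ATTACKS = ["apgd-ce", "apgd-t", "fab-t", "square"]


def attacks_for(fullaa: int) -> list:
    """Return the attack list for a fullaa-style switch (hypothetical helper)."""
    return list(STANDARD_ATTACKS) if fullaa else STANDARD_ATTACKS[:2]


def build_adversary(model, eps=4 / 255, norm="Linf", fullaa=1):
    """Build an AutoAttack object (assumption: `model` returns logits)."""
    # Imported lazily so the helper above works without torch/autoattack installed.
    from autoattack import AutoAttack

    if fullaa:
        return AutoAttack(model, norm=norm, eps=eps, version="standard")
    # version="custom" lets the caller choose a subset of the attacks.
    adversary = AutoAttack(model, norm=norm, eps=eps, version="custom")
    adversary.attacks_to_run = attacks_for(0)
    return adversary
```

The returned object would then be used as adversary.run_standard_evaluation(x, y, bs=batch_size), with x and y being image and label tensors.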

Checkpoints - ImageNet $\ell_{\infty} = 4/255$ robust models.

The link location includes weights for the clean model (the one used as initialization for Adversarial Training (AT)), the robust model, and the full-AA log for $\ell_{\infty}, \ell_2$ and $\ell_1$ attacks.
Note: the higher resolution numbers use the same checkpoint as for the standard resolution of 224 - only evaluation is done at the higher resolution mentioned.

| Model-Name | epochs | res. | Clean acc. | AA - $\ell_{\infty}$ acc. | Checkpoint (clean-init and robust) |
|---|---|---|---|---|---|
| ConvNext-iso-CvSt | 300 | 224 | 70.2 | 45.9 | Link |
| ViT-S | 300 | 224 | 69.2 | 44.0 | Link |
| ViT-S-CvSt | 300 | 224 | 72.5 | 48.1 | Link |
| ConvNext-T | 300 | 224 | 72.4 | 48.6 | Link |
| ConvNext-T-CvSt | 300 | 224 | 72.7 | 49.5 | Link |
| ViT-M-CvSt | 50 | 224 | 72.4 | 48.8 | Link |
| ConvNext-S-CvSt | 50 | 224 | 74.1 | 52.4 | Link |
| ViT-B | 50 | 224 | 73.3 | 50.0 | Link |
| ConvNext-B | 50 | 224 | 75.6 | 54.3 | Link |
| ViT-B-CvSt | 250 | 224 | 76.3 | 54.7 | Link |
| ConvNext-B-CvSt | 250 | 224 | 75.9 | 56.1 | Link |
| ConvNext-B-CvSt* | --- | 256 | 76.9 | 57.3 | Link |
| ConvNext-L-CvSt | 100 | 224 | 77.0 | 57.7 | Link |
| ConvNext-L-CvSt* | --- | 320 | 78.2 | 59.4 | Link |
*: increased resolution (only for evaluation) also leads to increased FLOPs.
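
As a small worked example, the effect of the Convolution-Stem can be read off the two pairs in the table that share the same training length (300 epochs) and resolution (224). The numbers below are copied from the table; cvst_gain is a hypothetical helper written only for this illustration.

```python
# (clean acc., AA l_inf acc.) copied from the checkpoint table: 300 epochs, res. 224.
RESULTS = {
    "ViT-S": (69.2, 44.0),
    "ViT-S-CvSt": (72.5, 48.1),
    "ConvNext-T": (72.4, 48.6),
    "ConvNext-T-CvSt": (72.7, 49.5),
}


def cvst_gain(base: str) -> tuple:
    """Return (clean gain, robust gain) in percentage points for base -> base-CvSt."""
    clean, robust = RESULTS[base]
    clean_cvst, robust_cvst = RESULTS[base + "-CvSt"]
    # Round to one decimal to match the precision reported in the table.
    return (round(clean_cvst - clean, 1), round(robust_cvst - robust, 1))


for name in ("ViT-S", "ConvNext-T"):
    print(name, cvst_gain(name))
```

The other CvSt rows are not compared here because their non-CvSt counterparts were trained for a different number of epochs or are not listed.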

Checkpoints along with accuracy and robustness logs for ImageNet models finetuned to be robust at $\ell_\infty = 8/255$ are available here: Link


Citation

If you use our code/models, cite our work using the following BibTeX entry:

@inproceedings{singh2023revisiting,
  title={Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models},
  author={Singh, Naman D and Croce, Francesco and Hein, Matthias},
  booktitle={NeurIPS},
  year={2023}}

revisiting-at's People

Contributors

nmndeep

revisiting-at's Issues

adversarial finetuning recipe for downstream datasets?

Can you please provide more detail on the training recipe used for adversarial finetuning on the downstream datasets (CIFAR-10, CIFAR-100, Flowers)? Which optimizer and augmentations are used? Is the adversarial training done the same way as on ImageNet, or is the TRADES framework used?

loading a pretrained checkpoint

Hello,

First of all - thanks for the great work!
After downloading a checkpoint from the link, I am trying to load it.
However, the load command
torch.load('convnext_b_cvst_robust.pt', map_location='cpu')
fails with the following error:
RuntimeError: Expected hasRecord("version") to be true, but got false.
More details: I use the required torch version, and I have tried several different checkpoints.
How can I resolve this?

Thanks!
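
One thing that can be checked independently of PyTorch is whether the downloaded file is an intact zip archive, since torch.save has written zip archives by default since PyTorch 1.6. The sketch below is a diagnostic under that assumption, not a confirmed explanation of the error in this issue. describe_checkpoint is a hypothetical helper that uses only the Python standard library.

```python
import zipfile


def describe_checkpoint(path: str) -> str:
    """Classify a downloaded file as 'html', 'zip-ok', 'zip-corrupt' or 'not-zip'."""
    with open(path, "rb") as f:
        head = f.read(256).lstrip().lower()
    if head.startswith((b"<!doctype", b"<html")):
        return "html"  # a web page was saved instead of the weights
    if not zipfile.is_zipfile(path):
        return "not-zip"  # e.g. a truncated download or a non-zip format
    with zipfile.ZipFile(path) as zf:
        # testzip() re-reads every member and returns the first bad name, or None.
        return "zip-ok" if zf.testzip() is None else "zip-corrupt"
```

For example, describe_checkpoint('convnext_b_cvst_robust.pt') returning 'html' or 'not-zip' would point to a problem with the download and not with the torch version.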

Different results between Table 1 and 2

Thank you for the great work. Could you explain the reason behind the difference between Table 1 and Table 2? For example, in Table 1 the performance of ViT-S is (60.3, 30.4), while in Table 2 it is (61.5, 31.8), where random init and basic augmentation are used. I think the difference is that the models in Table 1 are pretrained with standard training for 100 epochs, while those in Table 2 use random init. But if that is the case, why are the Table 1 results worse than those in Table 2? Thank you very much!

imagenet1k and imagenet21k pre-train

Hi,
Thank you for your nice work; it's really enlightening, especially the use of clean pre-trained checkpoints on ImageNet-1K and -21K.

I found that you use:
elif modelname == 'vit_s':
    model = create_model('vit_small_patch16_224', pretrained=pretrained)

and
elif modelname == 'vit_s_21k':
    model = create_model('deit3_small_patch16_224_in21ft1k', pretrained=pretrained)

to load pre-trained checkpoints.

Do you mean "vit_small_patch16_224" for ImageNet-1K pretraining? When I print model.default_cfg, it shows the model is timm/vit_small_patch16_224.augreg_in21k_ft_in1k. So does it actually load a checkpoint pre-trained on ImageNet-21K?
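
The identifier quoted in this question has the form timm/arch.tag, where the part after the dot names the pretrained weights. The sketch below splits such a string so the tag can be inspected on its own. parse_timm_id is a hypothetical helper and not part of timm, and reading the tag augreg_in21k_ft_in1k as "pre-trained on ImageNet-21K, fine-tuned on ImageNet-1K" is an interpretation of the naming that has not been confirmed in this thread.

```python
def parse_timm_id(hub_id: str) -> dict:
    """Split 'timm/<arch>.<tag>' into its parts (hypothetical helper)."""
    name = hub_id.split("/", 1)[-1]     # drop the optional 'timm/' prefix
    arch, _, tag = name.partition(".")  # tag is '' when there is no '.'
    return {"arch": arch, "tag": tag, "in21k_in_tag": "in21k" in tag}


info = parse_timm_id("timm/vit_small_patch16_224.augreg_in21k_ft_in1k")
print(info)
```

The check only looks at the substring in21k in the tag, so older names without a tag, such as deit3_small_patch16_224_in21ft1k, are not covered by it.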

RobustAcc for L∞=8/255 models.

Could the authors publish the robust accuracy of the L∞=8/255 ImageNet models? I tried to test the robust accuracy of these models, but the values I get are very low.
