
fast_adversarial's Introduction

Fast adversarial training using FGSM

A repository that implements fast adversarial training using an FGSM adversary, capable of training a robust CIFAR10 classifier in 6 minutes and a robust ImageNet classifier in 12 hours. Created by Eric Wong, Leslie Rice, and Zico Kolter. See our paper on arXiv here, which was inspired by the free adversarial training paper here by Shafahi et al. (2019).

News

  • 12/19/2019 - Accepted to ICLR 2020
  • 1/14/2020 - arXiv posted and repository release

What is in this repository?

  • An implementation of the FGSM adversarial training method with randomized initialization for MNIST, CIFAR10, and ImageNet
  • Cyclic learning rates and mixed precision training using the apex library to achieve DAWNBench-like speedups
  • Pre-trained models using this code base
  • The ImageNet code is mostly forked from the free adversarial training repository, with the corresponding modifications for fast FGSM adversarial training

Installation and usage

  • All examples can be run without mixed-precision with PyTorch v1.0 or higher
  • To use mixed-precision training, follow the apex installation instructions here

But wait, I thought FGSM training didn't work!

As one of the earliest methods for generating adversarial examples, the Fast Gradient Sign Method (FGSM) is also known to be one of the weakest. It has largely been replaced by the PGD-based attack, and its use as an attack has become highly discouraged when evaluating adversarial robustness. After all, early attempts at using FGSM adversarial training (including variants of randomized FGSM) were unsuccessful, and this was largely attributed to the weakness of the attack.

However, we discovered that a fairly minor modification to the random initialization for FGSM adversarial training allows it to perform as well as the much more expensive PGD adversarial training. This was quite surprising to us, and suggests that one does not need very strong adversaries to learn robust models! As a result, we pushed FGSM adversarial training to the limit, and found that by incorporating various techniques for fast training used in the DAWNBench competition, we could learn robust architectures an order of magnitude faster than before, while achieving the same degree of robustness. A couple of results from the paper are highlighted in the tables below.

Method   CIFAR10 Acc   CIFAR10 Adv Acc (eps=8/255)   Time (minutes)
FGSM     86.06%        46.06%                        12
Free     85.96%        46.33%                        785
PGD      87.30%        45.80%                        4966

Method   ImageNet Acc   ImageNet Adv Acc (eps=2/255)   Time (hours)
FGSM     60.90%         43.46%                         12
Free     64.37%         43.31%                         52
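For readers who want the gist of the method in code, here is a minimal sketch of one FGSM training step with a random start, following the description above. The model, optimizer, and data are placeholders, inputs are assumed to lie in [0, 1], and epsilon/alpha are assumed to be scalars; the repository's CIFAR code additionally rescales epsilon per channel for normalized inputs.

    import torch
    import torch.nn.functional as F

    def fgsm_train_step(model, opt, X, y, epsilon, alpha):
        # Random initialization inside the l-inf ball of radius epsilon
        delta = torch.zeros_like(X).uniform_(-epsilon, epsilon)
        delta.requires_grad = True

        # One FGSM step starting from the random point
        loss = F.cross_entropy(model(X + delta), y)
        loss.backward()
        delta = (delta + alpha * delta.grad.sign()).clamp(-epsilon, epsilon)
        delta = ((X + delta).clamp(0, 1) - X).detach()  # keep X + delta a valid image

        # Update the model on the perturbed examples
        opt.zero_grad()
        loss = F.cross_entropy(model(X + delta), y)
        loss.backward()
        opt.step()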

But I've tried FGSM adversarial training before, and it didn't work!

In our experiments, we discovered several failure modes that cause FGSM adversarial training to "catastrophically fail", as in the plot below.

[Figure: catastrophic overfitting]

If FGSM adversarial training hasn't worked for you in the past, then it may be because of one of the following reasons (which we present as a non-exhaustive list of ways to fail):

  • FGSM step size is too large, forcing the adversarial examples to cluster near the boundary
  • Random initialization that covers only a small subset of the threat model
  • Long training with many epochs and fine-tuning with very small learning rates

All of these pitfalls can be avoided simply by using early stopping: evaluate the PGD robust accuracy on a small subset of the training data, since the failure mode of FGSM adversarial training occurs quite rapidly (robust accuracy drops to 0% within the span of a couple of epochs).
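A rough sketch of such an early-stopping check follows. The pgd_attack helper, the loader names, and the 0.2 threshold are hypothetical placeholders, not the repository's API:

    import torch

    def robust_acc_on_subset(model, subset_loader, epsilon, alpha, iters):
        # PGD robust accuracy on a small, fixed subset of the training data
        correct, total = 0, 0
        for X, y in subset_loader:
            delta = pgd_attack(model, X, y, epsilon, alpha, iters)  # hypothetical helper
            with torch.no_grad():
                correct += (model(X + delta).max(1)[1] == y).sum().item()
            total += y.size(0)
        return correct / total

    # At the end of each epoch: if the robust accuracy collapses, catastrophic
    # overfitting has started, so stop (or roll back to the last checkpoint).
    robust_acc = robust_acc_on_subset(model, subset_loader, epsilon, pgd_alpha, iters=5)
    if robust_acc < prev_robust_acc - 0.2:   # illustrative threshold
        print("catastrophic overfitting detected, stopping early")
    prev_robust_acc = robust_acc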

Why does this matter if I still want to use PGD adversarial training in my experiments?

The speedups gained from using mixed-precision arithmetic and cyclic learning rates can still be reaped regardless of what training regimen you end up using! For example, these techniques can speed up CIFAR10 PGD adversarial training by almost two orders of magnitude, reducing training time from about 3.5 days to just over 1 hour. The engineering costs of installing the apex library and changing the learning rate schedule are minuscule compared to the time saved by these two techniques, so even if you don't use FGSM adversarial training, you can still benefit from faster experimentation with the DAWNBench improvements.
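For reference, here is a hedged sketch of how these two DAWNBench-style pieces typically fit into a PyTorch training loop; the model, data loader, epoch count, lr_max value, and opt_level are placeholders or illustrative choices, not the repository's exact settings:

    import torch
    import torch.nn.functional as F
    from apex import amp  # mixed precision; see the apex installation instructions

    opt = torch.optim.SGD(model.parameters(), lr=0.0, momentum=0.9, weight_decay=5e-4)

    # Mixed-precision training
    model, opt = amp.initialize(model, opt, opt_level="O1")

    # Cyclic learning rate: ramp up to lr_max for half of training, then back down
    lr_max = 0.2  # illustrative; tune as large as possible without divergence
    steps = epochs * len(train_loader)
    scheduler = torch.optim.lr_scheduler.CyclicLR(
        opt, base_lr=0.0, max_lr=lr_max,
        step_size_up=steps // 2, step_size_down=steps // 2)

    for epoch in range(epochs):
        for X, y in train_loader:
            X, y = X.cuda(), y.cuda()
            loss = F.cross_entropy(model(X), y)  # or the adversarial loss
            opt.zero_grad()
            with amp.scale_loss(loss, opt) as scaled_loss:
                scaled_loss.backward()
            opt.step()
            scheduler.step()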

fast_adversarial's People

Contributors

leslierice1, riceric22


fast_adversarial's Issues

indices

Can anyone please let me know whether it is necessary to update only the \deltas of those images that are not misclassified? Can't we just update all \deltas, which would also ensure the maximization?

I = output.max(1)[1] == y
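For context, here is a hedged sketch of the two variants being contrasted, with scalar epsilon/alpha and hypothetical tensor names (not a claim about which variant the repository uses in which file):

    # Boolean mask of examples the model still classifies correctly
    I = output.max(1)[1] == y
    grad = delta.grad.detach()

    # Variant 1: update only the perturbations of still-correct examples
    delta.data[I] = torch.clamp(delta[I] + alpha * torch.sign(grad[I]),
                                -epsilon, epsilon)

    # Variant 2 (what the question proposes): update every perturbation, which
    # keeps pushing the loss up even for already-misclassified examples
    delta.data = torch.clamp(delta + alpha * torch.sign(grad), -epsilon, epsilon)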

About low and high value of uniform distribution in PGD attack (CIFAR-10)

Hi Eric,

Thank you for the code; it's awesome with all the efficient training tricks.
I would like to ask you to confirm the low and high values used in CIFAR10/utils.py line 61, where delta is initialized from a uniform distribution on each normalized channel.

delta[:, i, :, :].uniform_(-epsilon[i][0][0].item(), epsilon[0][0][0].item())

The high value is epsilon[0][0][0]; shouldn't it be epsilon[i][0][0]?
I am new to this, so can you please confirm? If you fixed the high value deliberately, could you please explain why? Thank you again for your valuable work.
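For reference, the per-channel initialization written the way the question suggests, with the upper bound also indexed by channel (a sketch assuming epsilon has shape (3, 1, 1); this is the questioner's proposed reading, not a statement about the repository's intent):

    # Initialize each normalized channel uniformly in [-epsilon_i, +epsilon_i]
    for i in range(len(epsilon)):
        delta[:, i, :, :].uniform_(-epsilon[i][0][0].item(), epsilon[i][0][0].item())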

invalid key "/xff" when loading model.

Thank you for opening your technology to the open source. Bug when loading the imagenet model, error occurs.
"_pickle.UnpicklingError: invalid load key, '\xff'."
The loading method in your code cannot load the model correctly.
How should I load your model correctly?
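For reference, a minimal sketch of the usual way to load a saved state dict; the checkpoint path and model constructor below are placeholders, not the repository's files or API. An invalid load key often means the downloaded file itself is not a valid PyTorch checkpoint (for example an incomplete or wrong download), though that is only a guess here.

    import torch

    model = build_model()  # hypothetical constructor for the matching architecture
    checkpoint = torch.load("path/to/downloaded_checkpoint.pth", map_location="cpu")
    # Some checkpoints store the weights under a key such as "state_dict"
    if isinstance(checkpoint, dict) and "state_dict" in checkpoint:
        checkpoint = checkpoint["state_dict"]
    model.load_state_dict(checkpoint)
    model.eval()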

About PGD evaluation

Hi, thank you for the great work and opening the code.

However I have a question about PGD evaluation.

In the code, when attack_pgd is called, it seems that for some images in a batch the adversarial perturbation is obtained with fewer steps than attack_iters.

During the iteration, updates to the perturbation delta are applied only to the images that are still classified correctly (index is the variable that indicates which images are classified correctly, and only delta[index[0]] is updated in the loop for _ in range(attack_iters):).

I understand that an image which is still classified correctly is not yet an adversarial example, so more search in the l-inf ball should be performed to find an adversarial perturbation.

However, I don't understand why the search should stop for the images that become misclassified early in the PGD iteration.

I would expect that stronger adversarial perturbations can be found by taking more gradient steps even if the images are already adversarial. In other words, I suspect that the PGD evaluation is performed with relatively weak adversarial examples.

These may be adversarial examples that are (approximately, if not exactly) closer to the original image, but they are not strong adversarial examples. And I think the strength of the adversarial examples is crucial, because the main claim of the paper is that training with FGSM can build models that are robust to strong attacks such as PGD.

I think that something like max_delta[all_loss >= max_loss] = delta.detach()[all_loss >= max_loss], which currently appears in the loop for zz in range(restarts):, should also be performed inside the loop for _ in range(attack_iters): to find the strongest adversarial example achievable within attack_iters steps.

But of course, I may be missing something. So can you tell me the underlying idea behind stopping the iteration for an image once it is classified wrongly while building the PGD perturbation?
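For concreteness, here is a hedged sketch of the variant proposed above, where the best-loss perturbation is tracked inside the iteration loop rather than only once per restart. Scalar epsilon/alpha and inputs in [0, 1] are assumed; this is not the repository's attack_pgd.

    import torch
    import torch.nn.functional as F

    def attack_pgd_track_best(model, X, y, epsilon, alpha, attack_iters, restarts):
        max_loss = torch.zeros(y.shape[0], device=y.device)
        max_delta = torch.zeros_like(X)
        for _ in range(restarts):
            delta = torch.zeros_like(X).uniform_(-epsilon, epsilon)
            delta = ((X + delta).clamp(0, 1) - X).requires_grad_(True)
            for _ in range(attack_iters):
                loss = F.cross_entropy(model(X + delta), y)
                loss.backward()
                g = delta.grad.detach()
                d = (delta.detach() + alpha * g.sign()).clamp(-epsilon, epsilon)
                delta.data = (X + d).clamp(0, 1) - X
                delta.grad.zero_()
                # Track the strongest perturbation per example at every step,
                # regardless of whether the example is already misclassified
                with torch.no_grad():
                    all_loss = F.cross_entropy(model(X + delta), y, reduction='none')
                    better = all_loss >= max_loss
                    max_delta[better] = delta.detach()[better]
                    max_loss[better] = all_loss[better]
        return max_delta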

Reproduce the results of Free adversarial training.

Hi. I find that free adversarial training in the original paper uses a multistep learning rate.
I trained for 96/8 epochs of free adversarial training with a multistep learning rate on 1 GPU and only got 40.8% accuracy against PGD-20 (eps=8). Then I trained for 205/8 -> 26 epochs of free adversarial training and only got 42.01% accuracy against PGD-20 (eps=8). My initial learning rate is 0.1 and it decays at [1/2 * lr_steps, 3/4 * lr_steps]. The model is WRN34.
Could you please help me figure out what's wrong? I also noticed that cifar10_std = [0.2471, 0.2435, 0.2616] in your settings. Why not cifar10_std = [0.2023, 0.1994, 0.2010]?

torch.where API in MNIST and CIFAR10, ImageNet configuration files

Hi,

When we tried to run the code for MNIST and CIFAR10, it throws an error like this:

index = torch.where(output.max(1)[1] == y)[0]
TypeError: where() missing 2 required positional argument: "input", "other"

We have checked the API docs for PyTorch 1.3, PyTorch 1.0, and PyTorch 0.4.1, and it seems this usage is not standard. We also tried to run the experiment in the ImageNet folder, but the configuration files used in the code are not in the GitHub repository. Do you know how to fix this? Thank you very much.
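One possible workaround, assuming the intent is to get the indices of correctly classified examples, is to avoid single-argument torch.where, which is not available in older PyTorch releases:

    # Equivalent to torch.where(output.max(1)[1] == y)[0] on newer PyTorch:
    # the indices of the examples the model currently classifies correctly.
    index = (output.max(1)[1] == y).nonzero().view(-1)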

Include python/pytorch version for MNIST reproducibility

Hi! I am having a hard time reproducing the results (on MNIST, for example) and I have found that they differ when I change the pytorch version. I observe the following:

pytorch 1.12: when training with MNIST, training accuracy of 0.98 is achieved, but robust test accuracy is zero
pytorch 1.4: when training with MNIST, training accuracy of 0.95 is achieved, robust test accuracy is 0.88

I think the code was originally run with PyTorch 1.0; I am trying to find out what breaks the code in PyTorch 1.12. It would be great to make it clearer which versions should be used to reproduce the results.

Reproduce the result of CIFAR-10 from the default setting

Hi,
I'm running the repo with the default configuration for CIFAR-10; however, here is the accuracy I got from the trained model after 15 epochs:

Total train time: 6.7291 minutes
Test Loss   Test Acc   PGD Loss   PGD Acc
0.9252      0.7003     1.2217     0.3784

so the Accuracy is 70% and PGD Accuracy is only 37.84%?
Am I missing any detailed configurations?

Reproduce results

Hello,
thanks for the great work and open-sourcing the repository.
I reran the CIFAR10 experiments with the unmodified code (without arguments) provided and I got the following results:
python train_fgsm.py:

        Test Loss   Test Acc   PGD Loss   PGD Acc
My      0.6739      0.7930     1.0310     0.4531
Paper   -           0.8381     -          0.4606

python train_free.py:

        Test Loss   Test Acc   PGD Loss   PGD Acc
My      0.7544      0.7695     1.0670     0.4598
Paper   -           0.7838     -          0.4618

python train_pgd.py:

        Test Loss   Test Acc   PGD Loss   PGD Acc
My      0.7657      0.7664     1.0657     0.4725
Paper   -           0.8246     -          0.5069

Any hint on how to close the performance gap between the reported results and the ones obtained with the code (especially for train_fgsm.py)?

I also have an additional question about Table 3 in the paper. Why is the seconds/epoch of PGD-7 (1456.22) so much greater than that of DAWNBench + PGD-7 (104.94)? From what I read online, the speed improvements from mixed precision are usually in the range of 20% to 30%, but here it seems to increase the speed much more drastically.

Thanks for your help

Some questions about the robustness under other attacks

Hi, thanks for your code and idea. The results are very surprising and appealing.

I adopted your techniques (cyclic LR and FGSM with random initialization) in my method (not AT but very similar to AT), and it worked very well when the attack is 'FGSM-type', including FGSM, PGD, and MI-FGSM. However, the adversarial robustness degrades sharply compared with the corresponding model trained with PGD when I evaluate under other types of attacks (e.g., CW and JSMA) on the MNIST dataset. Have you tried those attacks in your evaluation? Have you encountered the same problem?

Thanks again for your work, and I'm looking forward to your reply.

Yiming Li

l2 norm PGD attack

Hi, does FGSM perform as well as PGD for adversarial training with an l2 perturbation instead of l-infinity?

Parameters of training

Hello,

Thanks for your valuable work.

I would like to understand the reasoning behind dividing the epsilon and alpha values by the standard deviation.

    epsilon = (args.epsilon / 255.) / std
    alpha = (args.alpha / 255.) / std
    pgd_alpha = (2 / 255.) / std
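The usual reasoning (offered here as a hedged explanation, not an authoritative answer from the authors) is that the network operates on inputs normalized as (x - mean) / std, so an epsilon-sized perturbation defined in the original [0, 1] pixel space corresponds to epsilon / std in the normalized space where the perturbation is actually added. A small self-contained check, using the commonly quoted CIFAR10 statistics:

    import torch

    mean = torch.tensor([0.4914, 0.4822, 0.4465]).view(3, 1, 1)
    std = torch.tensor([0.2471, 0.2435, 0.2616]).view(3, 1, 1)

    x = torch.rand(3, 32, 32)   # an image in [0, 1] pixel space
    eps_pixel = 8 / 255.        # threat model defined in pixel space

    # Normalizing (x + eps) is the same as adding eps / std to the normalized
    # input, which is why epsilon and alpha are divided by std in the code.
    lhs = ((x + eps_pixel) - mean) / std
    rhs = (x - mean) / std + eps_pixel / std
    assert torch.allclose(lhs, rhs)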

Can't reproduce MNIST results using current codes

I just cloned this repo and tried to run the code following the provided instructions (the code is not modified).
Environment: cuda 11.3, python 3.9.6, pytorch 1.9.0, torchvision 0.10.0, installed via miniconda.

I ran python train_mnist.py --fname ./new_result.pth to get a model,
then ran python evaluate_mnist.py --fname ./new_result.pth to evaluate the robustness,
and ran python evaluate_mnist.py --fname ./new_result.pth --attack none to evaluate the clean accuracy.
The result shows robustness = 0.00% and accuracy = 97.71%, meaning the trained model is not robust at all.

However, using your pretrained model in models/fgsm.pth gives a robust model (robustness = 88.38% and accuracy = 98.50%).

Could you provide any comment on how to reproduce your pretrained results?

Inconsistent clamping behaviour between the CIFAR and MNIST FGSM implementations

In the implementation of FGSM for MNIST, you do not clamp the initial perturbation, meaning you calculate the gradient based on out-of-bounds data points:

delta = torch.zeros_like(X).uniform_(-args.epsilon, args.epsilon).cuda()
delta.requires_grad = True
output = model(X + delta)
loss = F.cross_entropy(output, y)

This contrasts with the CIFAR implementation, where this clamping is done:

for j in range(len(epsilon)):
    delta[:, j, :, :].uniform_(-epsilon[j][0][0].item(), epsilon[j][0][0].item())
delta.data = clamp(delta, lower_limit - X, upper_limit - X)

Is this intended? Why was this choice made?
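For comparison, adding the same clamping to the MNIST initialization would look roughly like this (a hedged sketch assuming MNIST pixels lie in [0, 1]; this is the change the question implies, not the repository's code):

    delta = torch.zeros_like(X).uniform_(-args.epsilon, args.epsilon).cuda()
    # Clamp the random start so that X + delta stays inside [0, 1],
    # mirroring what the CIFAR code does with its clamp helper
    delta.data = torch.max(torch.min(delta, 1 - X), 0 - X)
    delta.requires_grad = True
    output = model(X + delta)
    loss = F.cross_entropy(output, y)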

Model overfits with low test accuracy for higher epsilon values

I'm using the FGSM approach to train a ResNet18 model on CIFAR10.

Using the values in the paper for epsilon=8/255 and alpha=10/255 works fine. But when I try to extend to an epsilon of 12/255 (and an alpha of 1.25*epsilon as outlined in the paper, so 15/255) to compare to other robust models, the model catastrophically overfits relatively early, with very low clean accuracy (50 to 60%). Has anyone had success using this approach with an epsilon higher than 8/255? Does alpha=1.25*epsilon not apply to other values of epsilon?

Thanks in advance for any help you can provide.

Parameter settings on CIFAR-100

Hi,

I tried to use this method on CIFAR-100 with the same parameter settings as CIFAR-10, but the results are terrible: the test adversarial accuracies are less than 2%. Do you have any suggestions on how to set the parameters (epochs, learning rate, and batch size) for CIFAR-100? Also, auxiliary losses are widely used in natural training; do you think they would be helpful in fast adversarial training?

Best wishes,
Jia

facing "nan" values during training the model

Hi, during training with my custom objective loss, I noticed that the model sometimes goes wrong, produces "nan" values, and becomes invalid, which I did not encounter with other training methods. Is this because the maximum learning rate of the cyclic schedule is too large, causing the loss to diverge, as mentioned in the paper ("For each method, we individually tune λ to be as large as possible without causing the training loss to diverge"), or is it a bug?

I ran the original code again with epochs=30 and faced the same issue.

When computing the perturbation, do we need to set model.eval()?

Hello Leslie Rice and Eric Wong,

Congratulations on your significant work!!

I found that the model is always kept in training mode during adversarial training. However, I think that when we compute the adversarial perturbation, we should set model.eval() to prevent randomness, such as dropout, from affecting the estimated gradients. So a correct approach would be to add model.eval() before this line.
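For concreteness, a hedged sketch of the pattern described above (assuming scalar epsilon and alpha and a standard training loop; this is not the repository's code): switch to eval mode while computing the perturbation, then back to train mode for the weight update.

    # Compute the perturbation with deterministic layers
    # (dropout disabled, batch norm using running statistics)
    model.eval()
    delta = torch.zeros_like(X).uniform_(-epsilon, epsilon)
    delta.requires_grad = True
    loss = F.cross_entropy(model(X + delta), y)
    loss.backward()
    delta = (delta + alpha * delta.grad.sign()).clamp(-epsilon, epsilon).detach()

    # Switch back to training mode for the parameter update
    model.train()
    opt.zero_grad()
    loss = F.cross_entropy(model(X + delta), y)
    loss.backward()
    opt.step()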

I'm curious why you did not set model.eval() in your code. Does amp cause the gradients to overflow in eval mode? And what is the performance gap between these two approaches?

Looking forward to your reply. Thank you!
