This repository is the official implementation of DiffAttack. If you have any questions, please feel free to contact us: open an issue or email me at [email protected]. Ideas, exchanges, and discussion are always welcome.
- [05/16/2023] Code is public.
- [04/30/2023] Code cleanup done. Waiting to be made public.
- Abstract
- Requirements
- Crafting Adversarial Examples
- Evaluation
- Results
- Citation & Acknowledgments
- License
## Abstract

Many existing adversarial attacks generate $L_p$-norm perturbations on image RGB space.
## Requirements

### Hardware Requirements
- GPU: 1x high-end NVIDIA GPU with at least 16GB memory
### Software Requirements
- Python: 3.8
- CUDA: 11.3
- cuDNN: 8.4.1
To install other requirements:
```
pip install -r requirements.txt
```
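As a quick sanity check, here is a minimal sketch for verifying the environment (it assumes PyTorch is installed via `requirements.txt`, which is an assumption on our part):

```python
# Hypothetical environment check -- assumes PyTorch is among the requirements.
import torch

print("CUDA available:", torch.cuda.is_available())
print("CUDA version:", torch.version.cuda)               # expected: 11.3
print("cuDNN version:", torch.backends.cudnn.version())  # expected: 8401 (i.e., 8.4.1)
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, {props.total_memory / 1024**3:.1f} GB")  # want >= 16 GB
```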
### Datasets
### Pre-trained Models
- We adopt [Stable Diffusion 2.0](https://huggingface.co/stabilityai/stable-diffusion-2-base) as our diffusion model. You can load its pretrained weights by setting `pretrained_diffusion_path="stabilityai/stable-diffusion-2-base"` in `main.py`.
- For the pretrained weights of the adversarially trained models (Adv-Inc-v3, Inc-v3ens3, Inc-v3ens4, IncRes-v2ens) used in Section 4.2.2 of our paper, you can download them from here and place them into the directory `pretrained_models`.
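As a rough illustration of what that setting points to (this is not the repository's exact loading code, just a sketch using the Hugging Face `diffusers` API):

```python
# Illustrative sketch -- loads the same checkpoint that
# pretrained_diffusion_path refers to in main.py.
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-base")
pipe = pipe.to("cuda")

# Components a latent-space attack typically operates on:
unet, vae, text_encoder = pipe.unet, pipe.vae, pipe.text_encoder
```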
## Crafting Adversarial Examples

To craft adversarial examples, run this command:

```
python main.py --model_name <surrogate model> --save_dir <save path> --images_root <clean images' path> --label_path <clean images' label.txt>
```
The supported surrogate models can be found in the `model_selection` function in `other_attacks.py`.
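For example, a hypothetical invocation (the model name and paths below are placeholders, not values we have verified against `model_selection`):

```
python main.py --model_name resnet50 --save_dir ./adv_outputs --images_root ./clean_images --label_path ./clean_images/label.txt
```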
The results will be saved in the directory `<save path>`, including adversarial examples, perturbations, original images, and logs.
For some images that are distorted too much, consider weakening the inversion strength by setting `start_step` to a larger value, or leveraging pseudo masks by setting `is_apply_mask=True`.
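For instance, assuming `start_step` and `is_apply_mask` are exposed as command-line arguments in `main.py` (if they are not, adjust their defaults in the script instead), such a run might look like:

```
python main.py --model_name resnet50 --save_dir ./adv_outputs --images_root ./clean_images --label_path ./clean_images/label.txt --start_step 25 --is_apply_mask True
```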
## Evaluation

To evaluate the crafted adversarial examples on other black-box models, run:

```
python main.py --is_test True --save_dir <save path> --images_root <outputs' path> --label_path <clean images' label.txt>
```
Here, `save_dir` denotes the path where only the logs are saved, while `images_root` should be set to the `save_dir` used in Crafting Adversarial Examples above.
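Continuing the hypothetical example from above, the evaluation run would read the crafted images back from the crafting run's `save_dir`:

```
python main.py --is_test True --save_dir ./eval_logs --images_root ./adv_outputs --label_path ./clean_images/label.txt
```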
Apart from the adversarially trained models, we also evaluate our attack's ability to deceive other defensive approaches, as presented in Section 4.2.2 of our paper. Their setups are as follows:
- Adversarially trained models (Adv-Inc-v3, Inc-v3ens3, Inc-v3ens4, IncRes-v2ens): Run the code as in *Robustness on other normally trained models*.
- HGD: Change the input size to 224, and then directly run the original code.
- R&P: Since our target size is 224, we rescale the image-scale augmentation proportionally (232~248), then run the original code.
- NIPS-r3: Since its ensembled models cannot process inputs of size 224, we run its original code, which resizes the inputs to 299.
- RS: Change the input size to 224 and set `sigma=0.25, skip=1, max=-1, N0=100, N=100, alpha=0.001`, then run the original code (see the smoothing sketch after this list).
- NRP: Change the input size to 224 and set `purifier=NRP, dynamic=True`, then run the original code.
- DiffPure: Modify the original code so that it evaluates the already-crafted adversarial examples rather than crafting new ones.
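To make the RS setting above concrete, here is a minimal sketch of what randomized-smoothing prediction does with those parameters (`sigma=0.25`, `N=100`); it illustrates the technique only, omits the statistical abstention test of the original defense, and `classifier` is a hypothetical stand-in:

```python
# Minimal randomized-smoothing prediction sketch (not the RS repository's code).
import torch

def smoothed_predict(classifier, x, sigma=0.25, n_samples=100, num_classes=1000):
    """Majority vote over Gaussian-noised copies of x (shape [1, 3, 224, 224])."""
    counts = torch.zeros(num_classes)
    with torch.no_grad():
        for _ in range(n_samples):
            noisy = x + sigma * torch.randn_like(x)  # add Gaussian noise, std = sigma
            pred = classifier(noisy).argmax(dim=1)   # hard prediction on the noisy copy
            counts[pred.item()] += 1
    return counts.argmax().item()  # class predicted by the smoothed classifier
```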
## Citation & Acknowledgments

If you find this paper useful in your research, please consider citing:
```
@article{chen2023diffusion,
  title={Diffusion Models for Imperceptible and Transferable Adversarial Attack},
  author={Chen, Jianqi and Chen, Hao and Chen, Keyan and Zhang, Yilan and Zou, Zhengxia and Shi, Zhenwei},
  journal={arXiv preprint arXiv:2305.08192},
  year={2023}
}
```
Thanks also to the authors of Prompt-to-Prompt for open-sourcing their code; part of our code builds on it.
## License

This project is licensed under the Apache-2.0 license. See LICENSE for details.