Dent: Dynamic Defenses against Adversarial Attacks

This is the official project repository for Fighting Gradients with Gradients: Dynamic Defenses against Adversarial Attacks by Dequan Wang, An Ju, Evan Shelhamer, David Wagner, and Trevor Darrell.

Abstract

Adversarial attacks optimize against models to defeat defenses. We argue that models should fight back, and optimize their defenses against attacks at test-time. Existing defenses are static, and stay the same once trained, even while attacks change. We propose a dynamic defense, defensive entropy minimization (dent), to adapt the model and input during testing by gradient optimization. Our dynamic defense adapts fully at test-time, without altering training, which makes it compatible with existing models and standard defenses. Dent improves robustness to attack by 20+ points (absolute) for state-of-the-art static defenses against AutoAttack on CIFAR-10 at epsilon (L_infinity) = 8/255.
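To make the core idea concrete, the toy sketch below minimizes the entropy of a single prediction by gradient descent at test time. It is an illustration under stated assumptions, not the repository's implementation: dent updates model and input parameters, whereas this sketch adapts a hypothetical additive offset on the logits. The names `adapt_offset`, `steps`, and `lr` are invented for the example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_grad_wrt_logits(probs):
    """Analytic gradient of the softmax entropy: dH/dz_i = -p_i * (log p_i + H)."""
    h = entropy(probs)
    return [-p * (math.log(p) + h) if p > 0.0 else 0.0 for p in probs]


def adapt_offset(logits, steps=10, lr=0.5):
    """Hypothetical per-sample adaptation: learn an additive logit offset
    by gradient descent on the prediction entropy."""
    offset = [0.0] * len(logits)
    history = []
    for _ in range(steps):
        probs = softmax([z + o for z, o in zip(logits, offset)])
        history.append(entropy(probs))
        grad = entropy_grad_wrt_logits(probs)
        offset = [o - lr * g for o, g in zip(offset, grad)]
    final = softmax([z + o for z, o in zip(logits, offset)])
    history.append(entropy(final))
    return final, history
```

Calling `adapt_offset([2.0, 1.0, 0.5])` returns the adapted probabilities together with the entropy recorded at each step, so the decrease in entropy can be inspected directly. The offset is reset for every input here, which mirrors the per-sample flavor of a dynamic defense only loosely.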

Example: Model Adaptation for Defense against AutoAttack on CIFAR-10

This example compares state-of-the-art adversarial training defenses, which are static, with and without our method for defensive entropy minimization (dent), which is dynamic. We evaluate against white-box and black-box attacks and report the worst-case accuracy across attack types.

Result:

Dent improves adversarial/robust accuracy (%) by more than 30 percent (relative) against AutoAttack on CIFAR-10 while largely preserving natural/clean accuracy. Our dynamic defense brings adversarial accuracy to roughly 90% of natural accuracy for the three most accurate methods tested (Wu 2020, Carmon 2019, Sehwag 2020). The static defenses alter training while dent alters testing, and this separation of concerns makes dent compatible with many existing models and defenses.

| Model ID | Paper | Natural (static) | Natural (dent) | Adversarial (static) | Adversarial (dent) | Venue |
|---|---|---|---|---|---|---|
| Wu2020Adversarial_extra | Adversarial Weight Perturbation Helps Robust Generalization | 88.25 | 87.65 | 60.04 | 80.33 | NeurIPS 2020 |
| Carmon2019Unlabeled | Unlabeled Data Improves Adversarial Robustness | 89.69 | 89.32 | 59.53 | 82.28 | NeurIPS 2019 |
| Sehwag2020Hydra | HYDRA: Pruning Adversarially Robust Neural Networks | 88.98 | 88.60 | 57.14 | 78.09 | NeurIPS 2020 |
| Wang2020Improving | Improving Adversarial Robustness Requires Revisiting Misclassified Examples | 87.50 | 86.32 | 56.29 | 77.31 | ICLR 2020 |
| Hendrycks2019Using | Using Pre-Training Can Improve Model Robustness and Uncertainty | 87.11 | 87.04 | 54.92 | 79.62 | ICML 2019 |
| Wong2020Fast | Fast is better than free: Revisiting adversarial training | 83.34 | 82.34 | 43.21 | 71.82 | ICLR 2020 |
| Ding2020MMA | MMA Training: Direct Input Space Margin Maximization through Adversarial Training | 84.36 | 84.68 | 41.44 | 64.35 | ICLR 2020 |
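The absolute and relative improvements quoted above can be recomputed from the adversarial accuracy columns of the table. The short sketch below does this arithmetic; the accuracy values are copied from the table, while the helper names `gains` and `summarize` are hypothetical and not part of the repository.

```python
# (adversarial static, adversarial dent) accuracy in %, copied from the table above.
RESULTS = {
    "Wu2020Adversarial_extra": (60.04, 80.33),
    "Carmon2019Unlabeled": (59.53, 82.28),
    "Sehwag2020Hydra": (57.14, 78.09),
    "Wang2020Improving": (56.29, 77.31),
    "Hendrycks2019Using": (54.92, 79.62),
    "Wong2020Fast": (43.21, 71.82),
    "Ding2020MMA": (41.44, 64.35),
}


def gains(static, dent):
    """Return (absolute gain in points, relative gain in percent)."""
    absolute = dent - static
    relative = 100.0 * absolute / static
    return absolute, relative


def summarize(results):
    """Map each model ID to its (absolute, relative) adversarial accuracy gain."""
    return {model: gains(s, d) for model, (s, d) in results.items()}


for model, (absolute, relative) in summarize(RESULTS).items():
    print(f"{model}: +{absolute:.2f} points absolute, +{relative:.1f}% relative")
```

Every model in the table gains more than 20 points in absolute terms and more than 30 percent in relative terms, which is consistent with the figures stated in the abstract and in the result summary.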

Usage:

python cifar10a.py --cfg cfgs/dent.yaml MODEL.ARCH Wu2020Adversarial_extra
python cifar10a.py --cfg cfgs/dent.yaml MODEL.ARCH Carmon2019Unlabeled
python cifar10a.py --cfg cfgs/dent.yaml MODEL.ARCH Sehwag2020Hydra
python cifar10a.py --cfg cfgs/dent.yaml MODEL.ARCH Wang2020Improving
python cifar10a.py --cfg cfgs/dent.yaml MODEL.ARCH Hendrycks2019Using
python cifar10a.py --cfg cfgs/dent.yaml MODEL.ARCH Wong2020Fast
python cifar10a.py --cfg cfgs/dent.yaml MODEL.ARCH Ding2020MMA
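To run all seven evaluations in sequence, a small driver script along the following lines may be convenient. This is a sketch and not part of the repository: it only assembles the commands listed above, and the helper names `build_commands` and `run_all` are hypothetical. It assumes that `python` on the path is the interpreter set up for this project and that the script is started from the repository root.

```python
import subprocess

# Model IDs taken from the usage commands above.
MODELS = (
    "Wu2020Adversarial_extra",
    "Carmon2019Unlabeled",
    "Sehwag2020Hydra",
    "Wang2020Improving",
    "Hendrycks2019Using",
    "Wong2020Fast",
    "Ding2020MMA",
)


def build_commands(models=MODELS, cfg="cfgs/dent.yaml"):
    """Build one argument list per model, mirroring the commands above."""
    return [
        ["python", "cifar10a.py", "--cfg", cfg, "MODEL.ARCH", model]
        for model in models
    ]


def run_all(dry_run=True):
    """Print each command, or execute it when dry_run is False."""
    for cmd in build_commands():
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first evaluation that fails.
            subprocess.run(cmd, check=True)
```

Calling `run_all()` prints the commands without executing anything, and `run_all(dry_run=False)` executes them one after another.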

Correspondence

Please contact Dequan Wang, An Ju, and Evan Shelhamer at dqwang AT eecs.berkeley.edu, an_ju AT berkeley.edu, and shelhamer AT deepmind.com.

Citation

If the dent method or dynamic defense setting is helpful in your research, please consider citing our paper:

@article{wang2021fighting,
  title={Fighting Gradients with Gradients: Dynamic Defenses against Adversarial Attacks},
  author={Wang, Dequan and Ju, An and Shelhamer, Evan and Wagner, David and Darrell, Trevor},
  journal={arXiv preprint arXiv:2105.08714},
  year={2021}
}

Note: a workshop edition of this project was presented at the ICLR'21 Workshop on Security and Safety in Machine Learning Systems.
