Giter VIP home page Giter VIP logo

nad's Introduction

Neural Attention Distillation

This is an implementation demo of the ICLR 2021 paper Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks in PyTorch.

Python 3.6 Pytorch 1.10 CUDA 10.0 License CC BY-NC

NAD: Quick start with pretrained model

We have already uploaded the all2one pretrained backdoor student model(i.e. gridTrigger WRN-16-1, target label 5) and the clean teacher model(i.e. WRN-16-1) in the path of ./weight/s_net and ./weight/t_net respectively.

For evaluating the performance of NAD, you can easily run command:

$ python main.py 

where the default parameters are shown in config.py.

The trained model will be saved at the path weight/erasing_net/<s_name>.tar

Please carefully read the main.py and configs.py, then change the parameters for your experiment.

Erasing Results on BadNets

Dataset Baseline ACC Baseline ASR NAD ACC NAD ASR
CIFAR-10 85.65 100.0 82.12 3.57

Training your own backdoored model

We have provided a DatasetBD Class in data_loader.py for generating training set of different backdoor attacks.

For implementing backdoor attack(e.g. GridTrigger attack), you can run the below command:

$ python train_badnet.py

This command will train the backdoored model and print clean accuracies and attack rate. You can also select the other backdoor triggers reported in the paper.

Please carefully read the train_badnet.py and configs.py, then change the parameters for your experiment.

Other source of backdoor attacks

Attack

CL: Clean-label backdoor attacks

SIG: A New Backdoor Attack in CNNS by Training Set Corruption Without Label Poisoning

## reference code
def plant_sin_trigger(img, delta=20, f=6, debug=False):
    """
    Implement paper:
    > Barni, M., Kallas, K., & Tondi, B. (2019).
    > A new Backdoor Attack in CNNs by training set corruption without label poisoning.
    > arXiv preprint arXiv:1902.11237
    superimposed sinusoidal backdoor signal with default parameters
    """
    alpha = 0.2
    img = np.float32(img)
    pattern = np.zeros_like(img)
    m = pattern.shape[1]
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            for k in range(img.shape[2]):
                pattern[i, j] = delta * np.sin(2 * np.pi * j * f / m)

    img = alpha * np.uint32(img) + (1 - alpha) * pattern
    img = np.uint8(np.clip(img, 0, 255))

    #     if debug:
    #         cv2.imshow('planted image', img)
    #         cv2.waitKey()

    return img

Refool: Reflection Backdoor: A Natural Backdoor Attack on Deep Neural Networks

Defense

MCR: Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness

Fine-tuning & Fine-Pruning: Defending Against Backdooring Attacks on Deep Neural Networks

Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks

STRIP: A Defence Against Trojan Attacks on Deep Neural Networks

Library

Note: TrojanZoo provides a universal pytorch platform to conduct security researches (especially backdoor attacks/defenses) of image classification in deep learning.

Backdoors 101 โ€” is a PyTorch framework for state-of-the-art backdoor defenses and attacks on deep learning models.

References

If you find this code is useful for your research, please cite our paper

@inproceedings{li2021neural,
  title={Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks},
  author={Li, Yige and Lyu, Xixiang and Koren, Nodens and Lyu, Lingjuan and Li, Bo and Ma, Xingjun},
  booktitle={ICLR},
  year={2021}
}

Contacts

If you have any questions, leave a message below with GitHub.

nad's People

Contributors

bboylyg avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.