ME-Net: Towards Effective Adversarial Robustness with Matrix Estimation

This repository contains the implementation code for the paper ME-Net: Towards Effective Adversarial Robustness with Matrix Estimation (ICML 2019).

ME-Net is a preprocessing-based defense method against adversarial examples, which is both model-agnostic and attack-agnostic. Being model-agnostic means ME-Net can easily be embedded into existing networks, and being attack-agnostic means ME-Net can improve adversarial robustness against a wide range of black-box and white-box attacks. Specifically, we focus on the intrinsic global structures (e.g., low-rank) within images, and leverage matrix estimation (ME) to exploit such underlying structures for better adversarial robustness.

overview

Dependencies

The current code has been tested on Ubuntu 16.04. You can install the dependencies using

pip install -r requirements.txt

Main Files

The code provided in this repository is able to do the following tasks:

  • train_pure.py: Train an ME-Net model with standard SGD.
  • train_adv.py: Train an ME-Net model with adversarial training. We mainly focus on PGD-based adversarial training under an L_infinity perturbation bound.
  • attack_blackbox.py: Perform black-box attacks on a trained ME-Net model. We provide three kinds of black-box attacks: transfer-based attacks (using FGSM, PGD, and CW), a decision-based attack (Boundary attack), and a score-based attack (SPSA).
  • attack_whitebox.py: Perform white-box attacks on a trained ME-Net model. We mainly focus on white-box adversarial robustness against the L_infinity bounded PGD attack.

Note: The current release is for the CIFAR-10 dataset. We also test ME-Net on the MNIST, SVHN, and Tiny-ImageNet datasets. The main code framework is the same across datasets; the only difference is the dataloader. We will update the code for the remaining datasets soon.

Train ME-Net

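Matrix estimation is a well-studied topic with a number of established ME techniques. We mainly focus on three different ME methods throughout our study:

  • Nuclear norm minimization (nucnorm)
  • Soft-Impute (softimp)
  • Universal singular value thresholding, USVT (usvt)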

Note that one could either view the three RGB channels separately as independent matrices or jointly by concatenating them into one matrix. While the main paper follows the latter approach, we provide the --me-channel argument here to choose how the channels are handled for ME. We provide a comparison between the two methods later.
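The two channel views can be illustrated with a minimal NumPy sketch. This is not the repository's code: the helper names `to_matrices` and `from_matrices`, the (H, W, 3) image layout, and the choice to concatenate channels side by side (giving an H x 3W matrix) are assumptions made for illustration.

```python
import numpy as np

def to_matrices(img, me_channel="concat"):
    """Turn an (H, W, 3) image into the 2-D matrices that ME would operate on.

    Hypothetical helper: the layout and concatenation axis are assumptions.
    """
    channels = [img[:, :, c] for c in range(3)]
    if me_channel == "separate":
        return channels                              # three H x W matrices
    if me_channel == "concat":
        return [np.concatenate(channels, axis=1)]    # one H x 3W matrix
    raise ValueError("me_channel must be 'separate' or 'concat'")

def from_matrices(mats, me_channel="concat"):
    """Inverse of to_matrices: rebuild the (H, W, 3) image."""
    if me_channel == "separate":
        return np.stack(mats, axis=2)
    if me_channel == "concat":
        return np.stack(np.split(mats[0], 3, axis=1), axis=2)
    raise ValueError("me_channel must be 'separate' or 'concat'")

img = np.random.default_rng(0).uniform(size=(32, 32, 3))
print(to_matrices(img, "separate")[0].shape)   # shape of one per-channel matrix
print(to_matrices(img, "concat")[0].shape)     # shape of the joint matrix
```

In the concat view a single low-rank estimate is fitted to all three channels at once, so structure shared across channels can be exploited; in the separate view each channel is estimated on its own.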

As ME-Net uses different masked realizations of each image during training, we use the following method to generate masks with different observing probabilities: for each image, we select --mask-num masks in total, with observing probabilities ranging from --startp to --endp at equal intervals.
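The mask sampling described above can be sketched as follows. This is a minimal illustration, not the repository's implementation: the function name `sample_masks`, the use of `np.linspace` (which includes both end points), and the fixed seed are assumptions.

```python
import numpy as np

def sample_masks(shape, startp, endp, mask_num, seed=0):
    """Sample mask_num binary masks with evenly spaced observing probabilities.

    Assumption: the probabilities include both startp and endp.
    """
    rng = np.random.default_rng(seed)
    probs = np.linspace(startp, endp, mask_num)
    # Each entry is observed (1) with probability p and hidden (0) otherwise.
    masks = [rng.binomial(1, p, size=shape) for p in probs]
    return probs, masks

# Example: ten masks for a 32 x 96 matrix (a 32 x 32 RGB image in the concat view).
probs, masks = sample_masks((32, 96), startp=0.4, endp=0.6, mask_num=10)
masked_views = [np.ones((32, 96)) * m for m in masks]  # stand-in for image * mask
```

Each masked view would then be passed through the chosen ME method, which is why the generated training set is --mask-num times larger than the original one.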

Common Arguments

The following arguments are used by the scripts for training ME-Net, train_pure.py and train_adv.py:

Paths

  • --data-dir: directory path to read data.
  • --save-dir: directory path to store/load models.

Hyper-parameters

  • --model: choose which model to use (default: ResNet18).
  • --mu: the nuclear norm minimization algorithm hyper-parameter (default: 1).
  • --svdprob: the USVT approach hyper-parameter (default: 0.8).
  • --startp: the start probability of mask sampling.
  • --endp: the end probability of mask sampling.
  • --batch-size: the mini-batch size for training (default: 256).
  • --mask-num: the number of sampled masks (default: 10).

ME parameters

  • --me-channel: view RGB channels separately as independent matrices, or jointly by concatenating: separate | concat (default: concat).
  • --me-type: choose which method to use for matrix estimation: usvt | softimp | nucnorm (default: usvt).

Train ME-Net with Standard SGD

To train a pure ME-Net model with SGD, for example, using nucnorm with probability from 0.8 -> 1 with concat channels:

python train_pure.py --data-dir <path> \
    --save-dir <path> \
    --startp 0.8 \
    --endp 1 \
    --me-channel concat \
    --me-type nucnorm \
    <optional-arguments>

Train ME-Net with Adversarial Training

To adversarially train an ME-Net model, for example, using usvt with probability from 0.4 -> 0.6 with concat channels, under 7-step PGD attacks:

python train_adv.py --data-dir <path> \
    --save-dir <path> \
    --startp 0.4 \
    --endp 0.6 \
    --me-channel concat \
    --me-type usvt \
    --attack \
    --iter 7 \
    <optional-arguments>

Pre-generated Datasets

The first step in training a pure ME-Net model is to generate a new dataset (--mask-num times larger), which can be time-consuming for certain ME methods. We therefore provide several pre-generated datasets with different observing probabilities and different ME methods (will update soon):

An example to load such pre-generated datasets:

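import numpy as np
import torch.utils.data as Data

# Note: get_data is not defined in this snippet. It is assumed to be a loader
# that returns (data, labels) for the requested split.

class CIFAR10_Dataset(Data.Dataset):

    def __init__(self, train=True, target_transform=None):
        self.target_transform = target_transform
        self.train = train

        # Loading training data: labels come from get_data, and the images
        # are then replaced by the pre-generated ME data.
        if self.train:
            self.train_data, self.train_labels = get_data(train)
            self.train_data = np.load('/path/to/training/data/')
        # Loading testing data in the same way
        else:
            self.test_data, self.test_labels = get_data()
            self.test_data = np.load('/path/to/testing/data/')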

Evaluate ME-Net

Black-box Attacks

To perform a black-box attack on a trained ME-Net model, for example, using spsa attack with 2048 samples:

python attack_blackbox.py --data-dir <path> \
    --ckpt-dir <path> \
    --name <saved-ckpt-name> \
    --attack-type spsa \
    --spsa-sample 2048 \
    <optional-arguments>

The following arguments are commonly used to perform black-box attacks:

  • --data-dir: directory path to read data.
  • --ckpt-dir: directory path to load saved model checkpoints.
  • --name: the name of saved checkpoints.
  • --maskp: the probability of mask sampling (note that for ME-Net inference we simply use the average of masking probabilities during training; one can also play with other choices such as a randomly sampled one).
  • --source: the source model of transfer-based black-box attacks.
  • --attack-type: fgsm | pgd | cw | spsa | boundary.
  • --epsilon: the upper bound on the L-inf norm of the perturbation applied to input pixels (default: 8).
  • --iter: the number of iterations for iterative attacks (default: 1000).
  • --cw-conf: the confidence of adversarial examples for CW attack (default: 20).
  • --spsa-sample: the number of SPSA samples for SPSA attack (default: 2048).

White-box Attacks

To perform a white-box attack on a trained ME-Net model, for example, using a 1000-step PGD-based BPDA attack:

python attack_whitebox.py --data-dir <path> \
    --ckpt-dir <path> \
    --name <saved-ckpt-name> \
    --attack \
    --mode pgd \
    --iter 1000 \
    <optional-arguments>

The following arguments are commonly used to perform white-box attacks:

  • --data-dir: directory path to read data.
  • --ckpt-dir: directory path to load saved model checkpoints.
  • --name: the name of saved checkpoints.
  • --maskp: the probability of mask sampling (note that for ME-Net inference we simply use the average of masking probabilities during training; one can also play with other choices such as a randomly sampled one).
  • --attack: perform adversarial attacks (default: True).
  • --epsilon: the upper bound on the L-inf norm of the perturbation applied to input pixels (default: 8).
  • --iter: the number of iterations for iterative attacks (default: 1000).
  • --mode: use the toolbox or pgd implementation: toolbox | pgd (default: pgd). Note that we also provide an attack implementation using Foolbox; however, it does not achieve as high an attack success rate as the PGD implementation.
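To make the roles of --epsilon and --iter concrete, here is a minimal sketch of an L-inf bounded PGD attack on a toy logistic-regression model in NumPy. It is not the repository's attack and does not include the ME preprocessing or BPDA. The model, the step size, and reading an epsilon of 8 as 8/255 on a [0, 1] pixel scale are all assumptions made for illustration.

```python
import numpy as np

def logistic_loss(x, y, w, b):
    """Binary cross-entropy of a logistic-regression model, in a stable form."""
    z = x @ w + b
    return np.logaddexp(0.0, z) - y * z

def pgd_linf(x, y, w, b, epsilon, step_size, iters):
    """Projected gradient ascent on the loss inside an L-inf ball around x."""
    x_adv = x.copy()
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(x_adv @ w + b)))
        grad = (p - y) * w                                # d(loss)/d(x_adv)
        x_adv = x_adv + step_size * np.sign(grad)         # signed ascent step
        x_adv = np.clip(x_adv, x - epsilon, x + epsilon)  # project onto L-inf ball
        x_adv = np.clip(x_adv, 0.0, 1.0)                  # keep a valid pixel range
    return x_adv

rng = np.random.default_rng(0)
x = rng.uniform(0.2, 0.8, size=64)   # toy "image" with pixels in [0, 1]
w = rng.normal(size=64)              # toy model weights
b, y = 0.0, 1.0
x_adv = pgd_linf(x, y, w, b, epsilon=8 / 255, step_size=2 / 255, iters=7)
print(np.max(np.abs(x_adv - x)), logistic_loss(x, y, w, b), logistic_loss(x_adv, y, w, b))
```

The projection step is what enforces the epsilon bound regardless of how many iterations are run, so raising the iteration count strengthens the attack without enlarging the perturbation.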

Pre-trained Models

We provide several pre-trained ME-Net models (both purely and adversarially trained) on CIFAR-10 with the USVT method. Note that for different attacks, models trained with different p values can perform differently (more details can be found in our paper):

Since the saved model contains no information about the ME-Net preprocessing, one should wrap the loaded model with an ME layer. An example of loading pre-trained models:

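# black-box attacks
# `checkpoint` is assumed to have been loaded beforehand (e.g. with torch.load)
model = checkpoint['model']
menet_model = MENet(model)  # wrap the loaded model with the ME layer
menet_model.eval()

# white-box attacks
net = AttackPGD(menet_model, config)  # PGD attack wrapper around the ME-Net model
net.eval()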

Representative Results

Visualization of how ME affects the input images

menet_results_0

Images are approximately low-rank

menet_results_1

Qualitative and quantitative results against black-box attacks

menet_results_2

Adversarial robustness under PGD-based BPDA white-box attacks

menet_results_3

Acknowledgements

We use the implementation in the fancyimpute package for part of our matrix estimation algorithms. We use the standard adversarial attack packages Foolbox and CleverHans for evaluating our defense.

Citation

If you find the idea or code useful for your research, please cite our paper:

@inproceedings{yang2019menet,
  title={{ME-Net}: Towards Effective Adversarial Robustness with Matrix Estimation},
  author={Yang, Yuzhe and Zhang, Guo and Katabi, Dina and Xu, Zhi},
  booktitle={Proceedings of the 36th International Conference on Machine Learning (ICML)},
  year={2019},
}

me-net's Issues

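Several questions about this paper

Hello. A few months ago I read this paper and tried to make some improvements to it, but recently I suddenly realized that I had not even reproduced the results in the original paper. For example, at the time I tried some white-box attacks on CIFAR-10, and the model I obtained had an accuracy of 57.6 under 7-step PGD, which is fairly close to the 59.8 in Table 11, so I assumed the reproduction had succeeded. Recently, however, I happened to find that under 20-step PGD the model falls far short of the 52.6 in Table 11 and reaches only 42.1. So I tried white-box attacks on three datasets, MNIST, CIFAR-10, and SVHN, and found that the results differ greatly from the numbers given in the paper. I list them here and ask for your help in tracking down the problem:
MNIST, usvt, p=0.3, with hyper-parameters matching Table 8. Under white-box attack my results are: clean 96.6, PGD40 87.85, PGD100 81.57 (the corresponding numbers in Table 16 of the paper are 96.8, 86.5, 83.1). This set is still roughly consistent.
CIFAR-10, usvt, p=0.5. For this model I did not train my own but used the checkpoint you provide. My results are:
clean 87.18, PGD7 57.60, PGD20 42.1 (the corresponding numbers in Table 11 of the paper are clean -, PGD7 59.8, PGD20 52.6). PGD20 falls far short of what the paper states. This is the bigger problem, and since both the attack code and the model are provided by you, I am rather puzzled.
SVHN, usvt, p=0.3, with hyper-parameters matching Table 8. My results are clean 87.48, PGD7 78.00, PGD20 73.88 (the corresponding numbers in Table 19 are clean 88.3, PGD7 74.7, PGD20 61.4). Again there is a large difference at PGD20. I did not evaluate attacks with more iterations on the full test set because it is time-consuming, but judging from a small randomly chosen subset, accuracy on CIFAR drops further while SVHN converges at a little above 70.
The above is what I did and where my results differ greatly from the paper. I hope to get your reply, especially about the CIFAR results, which contain almost nothing of my own: both the model and the attack are the ones you provide, so such a large deviation is very strange. In addition, there are a few other points that puzzle me, which I hope you can also address:
1. In Table 5 the adaptive attack is even less effective than plain BPDA. Surely that is not a reasonable adaptive attack?
2. Since you know about the BPDA attack, you presumably also know about EOT. Why is there no evaluation with it? The masks in ME-Net clearly introduce randomness.
3. The numbers in Tables 6 and 7 indicate that ME-Net improves generalization on clean data, but I did not observe this. In fact, the accuracy of the checkpoint you provide (CIFAR pure usvt 0.5) appears to be only a little above 80, nowhere near 94.9. I suspect this might refer to top-5 accuracy, but the paper does not say so.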

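About the USVT implementation

Hello author,
I noticed that in the original USVT paper the singular values used for reconstruction are selected by the set $S := \{\, i : s_i \ge (2 + \eta)\sqrt{n\hat{p}} \,\}$, whereas the training code you provide seems to use a fixed value int(h*svdprob) to keep the high-energy part. Are the two equivalent, or does the code in effect just keep a fixed number of singular values and reconstruct with PCA?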
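For readers following this question, the two selection rules it contrasts can be sketched side by side in NumPy. This is an illustration only and not the repository's code: the function names, taking h as the number of matrix rows, taking n as the larger matrix dimension, and the value of eta are assumptions. The rescaling by the observed fraction and the clipping that a full USVT estimate would also apply are omitted.

```python
import numpy as np

def truncate_fixed_count(masked, svdprob):
    """Keep the largest int(h * svdprob) singular values (fixed-count rule)."""
    u, s, vt = np.linalg.svd(masked, full_matrices=False)
    k = int(masked.shape[0] * svdprob)
    s_kept = np.where(np.arange(s.size) < k, s, 0.0)
    return (u * s_kept) @ vt, k

def truncate_usvt_threshold(masked, p_hat, eta=0.01):
    """Keep singular values s_i >= (2 + eta) * sqrt(n * p_hat) (threshold rule)."""
    u, s, vt = np.linalg.svd(masked, full_matrices=False)
    n = max(masked.shape)                       # assumption: n is the larger dimension
    tau = (2.0 + eta) * np.sqrt(n * p_hat)
    keep = s >= tau
    return (u * np.where(keep, s, 0.0)) @ vt, int(keep.sum())

rng = np.random.default_rng(0)
low_rank = np.outer(rng.uniform(-1, 1, 32), rng.uniform(-1, 1, 96))  # entries in [-1, 1]
mask = rng.binomial(1, 0.5, size=low_rank.shape)
masked = low_rank * mask
p_hat = mask.mean()                             # observed fraction of entries

_, k_fixed = truncate_fixed_count(masked, svdprob=0.8)
_, k_usvt = truncate_usvt_threshold(masked, p_hat)
print(k_fixed, k_usvt)
```

The fixed-count rule always retains the same number of singular values for a given matrix height, while the threshold rule adapts the retained rank to the spectrum of each masked matrix, so the two generally do not coincide.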

question

Hello, can you share the defense results of ME-Net? Or are the results already displayed in the Representative Results?

Could you provide hyperparameter for MNIST?

I'm trying to reproduce this work but cannot get the results the paper reports. For example, clean images should have a top-1 accuracy of 96.8 on ME-Net (p: 0.2-0.4), but I only get 92.2. Since the configuration is not clear, I guess this is caused by my implementation. Could you provide the configuration for MNIST? Here is mine, in case it helps.

augment: True
batchsize: 200
optimizer: Adam(model.parameters(), lr=0.0001)
svdprob: 0.8
startp: 0.2
endp: 0.4
epoch: 100
mask-num: 10
me-type: usvt
model:
    self.conv1 = nn.Conv2d(1, 32, 5, padding=2)
    self.conv2 = nn.Conv2d(32, 64, 5, padding=2)
    self.fc1 = nn.Linear(64*7*7, 1024)
    self.fc2 = nn.Linear(1024, 10)

svd did not converge

Hi,
I have two new questions. First, when I use softimp as the ME method, it always raises "svd did not converge" while preparing the data, and I don't understand what happened or how to solve it. Second, when I use nucnorm as the ME method, the accuracy on MNIST under FGSM (0.3) is only about 20% after 20 epochs of training, whereas usvt achieves more than 80% after the same number of epochs. Is this a bug, or should I just train for more epochs?
