bethgelab / foolbox
A Python toolbox to create adversarial examples that fool neural networks in PyTorch, TensorFlow, and JAX
Home Page: https://foolbox.jonasrauber.de
License: MIT License
For the Resnet model, shouldn't the gradients be passed through the preprocessing_input?
Hi, I read through the docs and issues, and couldn't find any information about this.
I want to generate adversarial examples for some non-image-based problems. In my specific case, the inputs are fixed-length sequences of integers, which go into an embedding layer and then into the network.
Is there a way to use foolbox in this scenario at the moment?
Thanks for your time!
Close only if there are no more TODOs in the code
This preprint, published recently, describes an attack method for networks with an embedding layer. I was hoping to do something similar in issue #80, and so am making this feature request!
It would probably take more effort, but another thing the paper uses is attacking only a subset of the input. I think this could be interesting for a wider class of problems, and so would request that feature as well.
How do I create training samples from adversarially perturbed original training samples?
To keep it simple, suppose I have 100 training images and want to use DeepFool and FGSM to perturb them. I should end up with 200 adversarial samples and 100 originals to train on. What is the most efficient way to do this with this library?
Sample code very much appreciated! :D
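A minimal sketch of the bookkeeping, not the library's own API: it assumes each attack is a callable `attack(sample, label)` that returns a perturbed sample, or None when no adversarial was found (as described in other issues in this list). The function name `augment_with_adversarials` and the two stand-in attacks are hypothetical and only keep the example runnable without a trained model.

```python
def augment_with_adversarials(samples, labels, attacks):
    """Return the originals plus one adversarial per (sample, attack) pair.

    `attacks` is a list of callables `attack(sample, label)`; this interface
    is an assumption made for the sketch.
    """
    out_samples = list(samples)
    out_labels = list(labels)
    for attack in attacks:
        for sample, label in zip(samples, labels):
            adversarial = attack(sample, label)
            if adversarial is None:
                continue  # the attack failed for this sample, so skip it
            out_samples.append(adversarial)
            out_labels.append(label)  # adversarials keep the true label
    return out_samples, out_labels


# Hypothetical stand-in attacks (placeholders for e.g. DeepFool and FGSM).
def shift_up(sample, label):
    return [value + 0.1 for value in sample]


def shift_down(sample, label):
    return [value - 0.1 for value in sample]


originals = [[0.5, 0.5] for _ in range(100)]
targets = [0] * 100
train_x, train_y = augment_with_adversarials(originals, targets, [shift_up, shift_down])
print(len(train_x), len(train_y))
```

With real attacks the combined set can be smaller than 300, because samples for which an attack returns None are skipped.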
It is Python 2-compatible packages like these that effectively keep users from moving to Python 3. The unfortunate reality is that others won't move to Python 3 unless you force them to by breaking compatibility with Python 2. This will also let you expand your choice of APIs to include what is offered only in Python 3.
In this line, https://github.com/bethgelab/foolbox/blob/master/foolbox/adversarial.py#L226, you check whether the adversarial image is within bounds, but we should add the offset back first. Currently I constantly get an assertion error from "assert not strict or self.in_bounds(image)".
By the way, the TensorFlow tutorial preprocesses the input with
preprocessed = images - [123.68, 116.78, 103.94]
logits, _ = vgg.vgg_19(preprocessed, is_training=False)
Why don't we just use
preprocessing = (np.array([123.68, 116.78, 103.94]), 1)
model = foolbox.models.TensorFlowModel(inputs, logits, (0, 255), preprocessing=preprocessing)
instead? The second version does not work right now, though...
Currently foolbox.utils.imagenet_example() fails because example.png is missing from the PyPI package. For that reason the minimal example does not work!
Creating this issue for tracking progress and ideas related to adding support for batch_predictions, templated out in https://github.com/bethgelab/foolbox/blob/master/foolbox/models/base.py#L69
Hi, sorry to bombard you with this:)
I'm having trouble with specifying a target class with L-BFGS with the Resnet example.
It looks like FGSM is used to determine the closest class by default and then that class is chosen as the target class. This works well
But if the gradient attack fails, I get a random class, and this almost always fails to produce an adversarial with the targeted class. I get a zero (or nearly zero) image, without a warning.
Of course if I specify a target class, it also fails.
It looks like it's related to this piece of code in attacks/base.py, line 86:
find = Adversarial(model, criterion, image, label)
assert find is not None
adversarial = find
if adversarial.distance.value == 0.:
    warnings.warn('Not running the attack because the original image is already misclassified and the adversarial thus has a distance of 0.')
I don't understand what this means fundamentally, since no true label is passed in beyond the vanilla network prediction. Even if these images are misclassified, the image can still be moved past the value surface that it lies in. But I'm sure there is more here that I'm not seeing.
In the example code provided on GitHub the Foolbox model predicts class 282 for the example image using Keras and ResNet50.
If you let the underlying Keras model predict the class with
np.argmax(kmodel.predict(image[None, :]))
it returns 287. I think this is because of the preprocessing done in the Foolbox model. Without said preprocessing both models return class 287.
Why is this example image used if ResNet50 seems to be unable to classify it correctly without applying some seemingly arbitrary preprocessing?
from #45:
Before closing this issue though we should extend pytests to include the different Keras backends (maybe even CNTK?). @jonasrauber : Can we run pytests on specific files with different environmental variables (we'd have to set os.environ['KERAS_BACKEND'] = backend)?
I think I briefly tried to get this to work, but it's not as easy to have multiple backends in one test suite. Could be necessary to have separate travis runs to achieve this (as we have for different python versions).
Hi,
I am doing adversarial training using only two classes. I just found that I cannot apply the boundary attack to a model that has two classes instead of multiple classes.
Villa
In the following LBFGSAttack, the target class probability is not reached. The attack returns the following top 5 results when I specified class 781 (scoreboard) with 50% probability. Could be me using the attack incorrectly.
My code:
keras.backend.set_learning_phase(0)
kmodel = ResNet50(weights="imagenet")
preprocessing = (numpy.array([104, 116, 123]), 1)
fmodel = foolbox.models.KerasModel(kmodel, bounds=(0, 255), preprocessing=preprocessing)
image, label = foolbox.utils.imagenet_example()
label = numpy.argmax(fmodel.predictions(image[:,:,::-1]))
attack = LBFGSAttack(model=fmodel, criterion=TargetClassProbability(781, p=.5))
adversarial = attack(image[:,:,::-1], label)
adversarial = numpy.expand_dims(adversarial, axis=0)
preds = kmodel.predict(adversarial[:,:,::-1])
print "Top 5 predictions (adversarial: ", decode_predictions(preds, top=5)
Code returns:
Top 5 predictions (adversarial: [[(u'n02123159', u'tiger_cat', 0.20804052), (u'n02123394', u'Persian_cat', 0.16367689), (u'n02123045', u'tabby', 0.12409737), (u'n02127052', u'lynx', 0.11497518), (u'n02124075', u'Egyptian_cat', 0.047439657)]]
I spent a while trying to get foolbox to produce adversarial examples using a pytorch model, but it mostly returned adversaries that were still no problem for the network to solve. The solution was simply to put the model in eval mode instead of train mode, since otherwise dropout and batch norm will constantly change the network and make finding an adversary quite difficult (somewhat obvious in hindsight).
Would it be possible to do a check to make sure the model is in eval mode before trying to create adversarial examples, or at least make a note of it somewhere in the docs so that other people don't run into the same issue?
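A minimal sketch of such a check, not something the library is known to provide: it relies only on the `training` flag and `eval()` method that PyTorch modules expose. The helper name `ensure_eval_mode` and the `DummyModule` stand-in are hypothetical; the stand-in lets the example run without PyTorch installed.

```python
import warnings


def ensure_eval_mode(model):
    """Switch a PyTorch-style module to eval mode, warning if it was in train mode."""
    if getattr(model, "training", False):
        warnings.warn("Model was in train mode; switching to eval mode before attacking.")
        model.eval()
    return model


class DummyModule:
    """Stand-in that mimics the `training` flag and `eval()` of torch.nn.Module."""

    def __init__(self):
        self.training = True

    def eval(self):
        self.training = False
        return self


net = ensure_eval_mode(DummyModule())
print(net.training)
```

Calling the helper once before constructing the attack keeps dropout and batch norm fixed while the adversarial is being searched for.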
We implemented all attacks to perform some type of internal hyperparameter-tuning. For example, FGSM loops over many step-sizes to find the minimum perturbation. At the same time, some people might want to run the attack with fixed hyperparameters (as has been customary). Should we allow this scenario as well? How difficult/cumbersome would it be to support that?
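To make the two scenarios concrete, here is a toy sketch on a hand-written linear classifier. It is not Foolbox's FGSM implementation: the model, the `fgsm` helper, and the list of step sizes are all assumptions chosen for illustration, and clipping to input bounds is left out. Passing a single epsilon reproduces the fixed-hyperparameter setting, while passing an increasing list returns the smallest step size that flips the prediction.

```python
def sign(value):
    return (value > 0) - (value < 0)


def predict(weights, bias, x):
    """Class 1 if the linear score is positive, else class 0."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else 0


def fgsm_step(weights, x, label, epsilon):
    """One sign-gradient step that pushes the score away from `label`."""
    direction = -1.0 if label == 1 else 1.0
    return [xi + epsilon * direction * sign(w) for w, xi in zip(weights, x)]


def fgsm(weights, bias, x, label, epsilons):
    """Try the step sizes in order and return the first one that succeeds."""
    for epsilon in epsilons:
        candidate = fgsm_step(weights, x, label, epsilon)
        if predict(weights, bias, candidate) != label:
            return epsilon, candidate
    return None, None


weights, bias = [1.0, -2.0], 0.0
x = [0.5, 0.12]
label = predict(weights, bias, x)

# Fixed hyperparameter: a single step size.
fixed_eps, _ = fgsm(weights, bias, x, label, [0.3])
# Internal tuning: scan increasing step sizes and keep the smallest that works.
tuned_eps, _ = fgsm(weights, bias, x, label, [i / 100 for i in range(1, 101)])
print(fixed_eps, tuned_eps)
```

Under this design, supporting fixed hyperparameters costs nothing extra: the fixed case is just a one-element search.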
I wonder why from PIL import Image is not used in https://github.com/bethgelab/foolbox/blob/master/foolbox/utils.py#L76-L78. Major open source projects, such as the official Pillow documentation and pytorch/torchvision, use from PIL import Image.
In addition, I encountered an error when using foolbox.utils.imagenet_example, caused by PIL.Image.open.
Cleverhans directly searches for the logits output by tracking the node before the softmax:
https://github.com/tensorflow/cleverhans/blob/master/cleverhans/utils_keras.py
I am not sure whether the same trick works regardless of the backend (doubt it) but it would be interesting to take a look.
I'm wondering about the default parameters for the lbfgs attack.
When I use the example code to attack an image with lbfgs, I get much worse results in terms of distance from adversarial to real, compared to the images given in Exploring the space of adversarial images
I tried changing the initial C value within the optimization routine but it didn't help. Is the LBFGS attack supposed to make convincing adversarial images? I've also tried different methods of preprocessing, and also using a pytorch model.
Here is the image comparison: Images
For the paper's L-BFGS implementation I'm using this repo mentioned in the paper. It's just not very flexible and uses torch7.
Here's the test code I'm using for foolbox.
import ...
keras.backend.set_learning_phase(0)
kmodel = ResNet50(weights='imagenet')
preprocessing = (numpy.array([104, 116, 123]), 1)
fmodel = foolbox.models.KerasModel(kmodel, bounds=(0, 255), preprocessing=preprocessing)
image = scipy.misc.imread('bee.jpg')
image = scipy.misc.imresize(image, (224, 224))
label = numpy.argmax(fmodel.predictions(image))
# apply attack on source image
attack = foolbox.attacks.LBFGSAttack(fmodel)
adversarial = attack(image[:,:,::-1], label)
I'm running the latest master
Hi!
I have just started working with your program.
Can you help me with the following question? (I have been trying to solve it for a week.)
I want to see how the LocalSearch attack works with a bunch of images (model/images) and the Inception_v3 model.
Can you show me how to do this?
Everything I tried has failed.
Many thanks in advance!
When the attack function can't find an adversarial, None is returned.
But I think the original image should be returned instead of None, since sometimes we want to get an image anyway.
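Until the library decides on this, a caller-side wrapper can give the requested behaviour. The sketch below is an assumption, not part of Foolbox: `attack_with_fallback` and `never_succeeds` are hypothetical names, and the attack is assumed to be a callable that returns None on failure. It also returns a flag so the caller can still tell whether an adversarial was actually found.

```python
def attack_with_fallback(attack, image, label):
    """Return (result, found): the adversarial if found, else the original image."""
    adversarial = attack(image, label)
    if adversarial is None:  # identity check, which is also safe for array inputs
        return image, False
    return adversarial, True


# Hypothetical stand-in for an attack that fails on every input.
def never_succeeds(image, label):
    return None


image = [0.1, 0.2, 0.3]
result, found = attack_with_fallback(never_succeeds, image, 7)
print(found)
```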
From what I have seen, the KerasModel wrapper does not support models with different behaviors at training and test time, such as dropout or batch normalization. This is because there is no opportunity to pass in keras.backend.learning_phase() as a parameter to the prediction functions (defined in line 70, 72 of foolbox/models/keras.py). Trying to predict using a model that depends on the learning phase yields an error such as
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'dropout_1/keras_learning_phase' with dtype bool
Code to reproduce error:
from keras.models import Sequential
from keras.layers import Dropout
import numpy as np
from foolbox.models import KerasModel
model = Sequential([Dropout(.1, input_shape = (1,))])
fmodel = KerasModel(model, (0,1))
fmodel.batch_predictions(np.ones((1,1)))
In the documentation we quite frequently refer to "images" but adversarials can be generated for any type of input. We should thus replace "images" with a more generic term like "input".
Right now LBFGS and SaliencyMapAttack use the GradientAttack to initialise the best target class in a misclassification setting. Due to the internal optimisation of the adversarial object it can happen that the best adversarial is actually found by the GradientAttack and not by LBFGS or SaliencyMap. We should reset the adversarial object to avoid this confound. I am currently testing a PR.
foolbox/foolbox/models/pytorch.py
Line 114 in 021f02a
Why do we need to have the grad divided by the std? I do not think it is necessary.
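One way to look at this, assuming the preprocessing has the form (x - mean) / std (as the preprocessing=(mean, std) tuples elsewhere in these issues suggest) and that the returned gradient is meant to be with respect to the original, unpreprocessed input: by the chain rule, the gradient with respect to the input equals the gradient with respect to the preprocessed input divided by std. The sketch below checks this numerically on a made-up quadratic loss; the loss and all function names are hypothetical and stand in for a real network.

```python
def preprocess(x, mean, std):
    return [(xi - m) / std for xi, m in zip(x, mean)]


def loss(p):
    """Made-up loss on the preprocessed input (stand-in for a network)."""
    return sum((i + 1) * pi ** 2 for i, pi in enumerate(p))


def grad_wrt_preprocessed(p):
    return [2 * (i + 1) * pi for i, pi in enumerate(p)]


def grad_wrt_input(x, mean, std):
    """Chain rule: d loss / d x = (d loss / d p) * (d p / d x) = (d loss / d p) / std."""
    p = preprocess(x, mean, std)
    return [g / std for g in grad_wrt_preprocessed(p)]


def numerical_grad(x, mean, std, h=1e-6):
    """Central finite differences with respect to the original input."""
    grads = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        diff = loss(preprocess(up, mean, std)) - loss(preprocess(down, mean, std))
        grads.append(diff / (2 * h))
    return grads


x = [120.0, 130.0, 90.0]
mean = [104.0, 116.0, 123.0]
std = 58.0
analytic = grad_wrt_input(x, mean, std)
numeric = numerical_grad(x, mean, std)
print(analytic)
print(numeric)
```

With std equal to 1, as in the ResNet examples above, the division has no effect, which may be why it looks unnecessary.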
Paper link: https://arxiv.org/pdf/1706.06083.pdf
As I understood from the code (https://github.com/bethgelab/foolbox/blob/master/foolbox/models/tensorflow.py), the constructor of the TensorFlowModel tries to use the default session, otherwise it creates a new one, where all variables need to be reinitialised. That means that if there is a running session with initialised variables in it, then the TensorFlowModel should be created with this session being a default one:
with sess.as_default():
    model = foolbox.models.TensorFlowModel(...)
Maybe that is obvious, but it took me some time to figure what was going on, so you guys may consider adding a note on that to the documentation :-)
I've been using foolbox for quite some time and to be thorough I've been doing comparisons with cleverhans, since that's what a lot of people in the community are going to be testing with.
I've noticed an issue, particularly with FGSM where cleverhans is somehow mounting a much stronger attack. On reviewing both implementations I can't find a suitable difference.
If I load identical models and run seemingly identical FGSM attack scripts on MNIST with an epsilon of 0.3, I get these results:
test_foolbox.py
Clean accuracy: 0.998%
Attack success: 0.477%
Average Linf norm: 0.300000011921
test_tf.py
Clean accuracy: 0.9980
Adversarial accuracy: 0.2160
I created a test repo with two files, one for cleverhans and one for foolbox, using the same Keras model definition and weights, here: https://github.com/neale/fgsm_test. The cleverhans test script is just a shortened version of one of their tutorials without the adversarial training.
Hi, guys! Your Foolbox is a very cool tool. But I have a question:
Is it possible to attack an already trained model?
I have a Keras model saved as "model.h5" and the dataset it was trained on in the form of "data.csv". Or a model such as "model.ckpt".
So, can I load my model and attack it with your tool?
Thanks in advance!
Running a single-pixel attack on a pretrained Keras ResNet model (ImageNet dataset) yields the following warning:
/home/carnd/anaconda3/envs/carnd-term1/lib/python3.5/site-packages/foolbox/attacks/base.py:102: UserWarning: SinglePixelAttack did not find an adversarial, maybe the model or the criterion is not supported by this attack. warnings.warn('{} did not find an adversarial, maybe the model or the criterion is not supported by this attack.'.format(self.name())) # noqa: E501
I am not sure whether this is a bug, whether I am using Foolbox incorrectly, or whether I am restoring the checkpoint incorrectly. When I try to restore the checkpoint file I get various errors that change depending on how I try to solve the previous one. I've been trying to follow the code in #12. What I don't understand is what "replace with restorer" means in issue 12. I have read the tutorial but I still don't quite understand, because the context seems to be different. Any help is greatly appreciated!
The error occurs around this line:
model.session.run( saver.restore( tf.Session(), "/tmp/models/convnet_maxpool.ckpt" ) )
This is the rest of the code:
import foolbox
import tensorflow as tf
import numpy as np
inputs = tf.placeholder(tf.float32, shape=(None, 784))
logits = tf.layers.dense(inputs, 10)
init_op = tf.global_variables_initializer()
np.random.seed(2)
W = np.random.rand(784, 10).astype(np.float32)
assign_op = tf.assign(tf.global_variables()[0], W)
np.random.seed(22)
example_input = np.random.rand(784)
with foolbox.models.TensorFlowModel(inputs, logits, (0, 1)) as model:
    attack = foolbox.attacks.FGSM(model)
    saver = tf.train.import_meta_graph('/tmp/models/convnet_maxpool.ckpt.meta')
    model.session.run(init_op)
    model.session.run( saver.restore(tf.Session(), "/tmp/models/convnet_maxpool.ckpt" ) ) # replace with restorer
    example_label = np.argmax(model.predictions(example_input))
    print(example_label)
    adversarial = attack(example_input, example_label, unpack=False)
    print(np.argmax(model.predictions(adversarial.image)))
    print(adversarial.distance)
As the title says, the success rate of DeepFoolAttack on ImageNet examples is very low. I tested it with the following code:
#get the dict of label
ff = open("./ILSVRC2015/ilsvrc_2012_val.txt")
mm = {}
for i in ff:
    pt = i.strip().split(' ')
    mm[pt[0]] = int(pt[1])
ff.close()
keras.backend.set_learning_phase(0)
kmodel = ResNet50(weights='imagenet')
preprocessing = (numpy.array([104, 116, 123]), 1)
fmodel = foolbox.models.KerasModel(kmodel, bounds=(0, 255), preprocessing=preprocessing)
attack = foolbox.attacks.DeepFoolAttack(fmodel)
valpath = r'./ILSVRC2015/rgbVal/' #the original image
imList = os.listdir(valpath)[:40]
width = 224
adv = np.zeros((len(imList),width,width,3)) #for saving the adv images
src = np.zeros((len(imList),width,width,3)) #for saving the original images
srcLabel = np.zeros(len(imList)) #for saving the label
for j in range(len(imList)):
    image = Image.open(valpath + imList[j])
    image = image.resize((width,width))
    image = np.asarray(image, dtype="float32")
    label = mm[imList[j]] #get the label
    srcLabel[j] = label
    src[j] = image
    ans = attack(image[:,:,::-1], label)
    if ans is None: #identity check; == None is ambiguous for arrays
        adv[j] = image
    else:
        adv[j] = ans[:,:,::-1]
    x = np.expand_dims(adv[j], axis=0)
    x = preprocess_input(x)
    preds = kmodel.predict(x)
    print('pre-label,',preds.argmax(),label)
num_classes = 1000
y_test = keras.utils.to_categorical(srcLabel, num_classes)
kmodel.compile(loss=keras.losses.categorical_crossentropy,
               optimizer=keras.optimizers.Adadelta(),
               metrics=['accuracy'])
score = kmodel.evaluate(src, y_test, verbose=0) #get the accuracy of original images
print('Test accuracy:', score[1])
score = kmodel.evaluate(adv, y_test, verbose=0) #get the accuracy of adv images
print('Test accuracy:', score[1])
#Finally, the accuracy of original images has no difference with the adv images.
The reference implementation (Python, PyTorch) is available at
Hello, I ran into a problem when installing the package on Windows 10.
"running install
running bdist_egg
running egg_info
creating UNKNOWN.egg-info
writing UNKNOWN.egg-info\PKG-INFO
writing top-level names to UNKNOWN.egg-info\top_level.txt
writing dependency_links to UNKNOWN.egg-info\dependency_links.txt
writing manifest file 'UNKNOWN.egg-info\SOURCES.txt'
reading manifest file 'UNKNOWN.egg-info\SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'UNKNOWN.egg-info\SOURCES.txt'
installing library code to build\bdist.win-amd64\egg
running install_lib
warning: install_lib: 'build\lib' does not exist -- no Python modules to install
creating build
creating build\bdist.win-amd64
creating build\bdist.win-amd64\egg
creating build\bdist.win-amd64\egg\EGG-INFO
copying UNKNOWN.egg-info\PKG-INFO -> build\bdist.win-amd64\egg\EGG-INFO
copying UNKNOWN.egg-info\SOURCES.txt -> build\bdist.win-amd64\egg\EGG-INFO
copying UNKNOWN.egg-info\dependency_links.txt -> build\bdist.win-amd64\egg\EGG-INFO
copying UNKNOWN.egg-info\top_level.txt -> build\bdist.win-amd64\egg\EGG-INFO
zip_safe flag not set; analyzing archive contents...
creating dist
creating 'dist\UNKNOWN-0.6.0-py3.5.egg' and adding 'build\bdist.win-amd64\egg' to it
removing 'build\bdist.win-amd64\egg' (and everything under it)
Processing UNKNOWN-0.6.0-py3.5.egg
Removing c:\python35\lib\site-packages\UNKNOWN-0.6.0-py3.5.egg
Copying UNKNOWN-0.6.0-py3.5.egg to c:\python35\lib\site-packages
UNKNOWN 0.6.0 is already the active version in easy-install.pth
Installed c:\python35\lib\site-packages\unknown-0.6.0-py3.5.egg
Processing dependencies for UNKNOWN==0.6.0
Finished processing dependencies for UNKNOWN==0.6.0"
Hi,
In the example in tutorial.rst, there is from foolbox.criteria import TopKMisclassification, but two lines later TargetClassProbability is used instead.
Can we simplify the model testing such that for each model we only instantiate a toy example but then test all models in the same way? Right now the testing code (e.g. gradient testing) is duplicated across all model tests.
The name LBFGSBAttack appears in several places in the doc, while it should be LBFGSAttack (see https://github.com/bethgelab/foolbox/search?utf8=%E2%9C%93&q=LBFGSBAttack&type=).
Hi, after running an example in readme I get:
AttributeError Traceback (most recent call last)
<ipython-input-19-74364fe58d77> in <module>()
8 kmodel = ResNet50(weights='imagenet')
9 preprocessing = (numpy.array([104, 116, 123]), 1)
---> 10 fmodel = foolbox.models.KerasModel(kmodel, bounds=(0, 255), preprocessing=preprocessing)
11
12 # get source image and label
/usr/local/lib/python2.7/dist-packages/foolbox/models/keras.pyc in __init__(self, model, bounds, channel_axis, preprocessing, predicts)
59 predictions_are_logits = True
60
---> 61 shape = predictions.get_shape().as_list()
62 _, num_classes = shape
63 assert num_classes is not None
AttributeError: 'TensorVariable' object has no attribute 'get_shape'
My setup:
Keras==2.0.6
Theano==0.9.0
current foolbox from pip
When I used the Keras model to generate adversarial examples, an error occurred on line 96 of the file 'keras.py'.
The details are as follows:
fmodel = foolbox.models.KerasModel(kmodel, bounds=(0, 255)) #, preprocessing=preprocessing)
File "/usr/lib/python2.7/site-packages/foolbox/models/keras.py", line 96, in __init__
[predictions, grad])
TypeError: Can not convert a NoneType into a Tensor or Operation.
Can anyone give me some advice on this?
Thank you very much.
For each model the preferred input_shape should be specified (e.g. on ImageNet that tends to vary between 224 and 299).
I get the following error when just trying to run the sample code. Any thoughts on how this can be fixed?
RuntimeWarning: overflow encountered in long_scalars
f = n * (max_ - min_)**2
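One plausible cause, stated here as an assumption rather than a confirmed diagnosis: if n is the number of entries of the 224 x 224 x 3 example image and the bounds are the integers (0, 255), the product no longer fits into a 32-bit integer, which is what NumPy's default integer type was on some platforms such as Windows. The plain-Python arithmetic below only illustrates the magnitude; the variable names follow the line in the warning and the values are assumed.

```python
# Assumed values: a 224 x 224 x 3 image and integer bounds (0, 255).
n = 224 * 224 * 3
min_, max_ = 0, 255

f = n * (max_ - min_) ** 2   # exact in Python, which has arbitrary-precision ints
INT32_MAX = 2 ** 31 - 1

print(f, f > INT32_MAX)

# Converting to float before squaring avoids fixed-width integer arithmetic.
f_float = n * float(max_ - min_) ** 2
print(f_float)
```

If this is indeed the cause, passing the bounds as floats or casting before the multiplication should make the warning disappear.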
Hi,
I tried following the instructions in the tutorial and examples sections to create an attack for a VGG19 model. I downloaded the model's checkpoint from here
The code "runs" however it doesn't seem to be able to generate an adversarial example - the process never ends... What am I doing wrong?
Please advise
import tensorflow as tf
from tensorflow.contrib.slim.nets import vgg
import numpy as np
import foolbox
import matplotlib.pyplot as plt
from foolbox.attacks import LBFGSAttack
from foolbox.criteria import TargetClassProbability
images = tf.placeholder(tf.float32, shape=(None, 224, 224, 3))
preprocessed = images - [123.68, 116.78, 103.94]
logits, _ = vgg.vgg_19(images, is_training=False)
restorer = tf.train.Saver(tf.trainable_variables())
image, _ = foolbox.utils.imagenet_example()
with foolbox.models.TensorFlowModel(images, logits, (0, 255)) as model:
    restorer.restore(model.session, "./vgg_19.ckpt")
    print(np.argmax(model.predictions(image)))
    target_class = 22
    criterion = TargetClassProbability(target_class, p=0.01)
    attack = LBFGSAttack(model, criterion)
    label = np.argmax(model.predictions(image))
    adversarial = attack(image=image, label=label)
plt.subplot(1, 3, 1)
plt.imshow(image)
plt.subplot(1, 3, 2)
plt.imshow(adversarial)
plt.subplot(1, 3, 3)
plt.imshow(adversarial - image)
Foolbox currently only supports finding adversarial examples on samples that are correctly classified to begin with. If one tries to run an attack on an incorrectly classified sample, the following warning results:
Not running the attack because the original image is already misclassified and the adversarial thus has a distance of 0.
However, in some contexts (such as Virtual Adversarial Training), it is useful to find adversarial examples on incorrectly classified inputs as well. That is, the training label is completely discarded and one instead only focuses on the label assigned by the classifier. These examples are called "virtual adversarial examples". Would it be possible to add support for them?
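As a possible caller-side workaround, sketched under assumptions and not an existing Foolbox feature: use the model's own prediction as the label, so the input is by construction "correctly classified" relative to that label. The helper `virtual_adversarial` and the toy model and attack below are hypothetical stand-ins that keep the example runnable.

```python
def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=lambda i: values[i])


def virtual_adversarial(predict, attack, x):
    """Attack `x` relative to the model's own prediction, ignoring any true label."""
    virtual_label = argmax(predict(x))
    return attack(x, virtual_label), virtual_label


# Hypothetical stand-ins: a two-class linear "model" and a fixed-step "attack".
def toy_predict(x):
    score = x[0] - x[1]
    return [-score, score]  # logits for class 0 and class 1


def toy_attack(x, label):
    step = 0.5 if label == 0 else -0.5
    candidate = [x[0] + step, x[1] - step]
    return candidate if argmax(toy_predict(candidate)) != label else None


sample = [0.6, 0.4]
adversarial, virtual_label = virtual_adversarial(toy_predict, toy_attack, sample)
print(virtual_label, adversarial)
```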
old-ufo@oldufo-UX303UB:~/dev$ sudo pip install foolbox
[sudo] password for old-ufo:
The directory '/home/old-ufo/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/old-ufo/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Collecting foolbox
Downloading foolbox-0.6.0.tar.gz (202kB)
100% |████████████████████████████████| 204kB 362kB/s
Running setup.py (path:/tmp/pip-build-32ku0y/foolbox/setup.py) egg_info for package foolbox produced metadata for project name unknown. Fix your #egg=foolbox fragments.
Installing collected packages: unknown
Running setup.py install for unknown ... done
Successfully installed unknown-0.6.0
old-ufo@oldufo-UX303UB:~/dev$ python
Python 2.7.12 (default, Nov 19 2016, 06:48:10)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import foolbox
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: No module named foolbox