
robustness's Introduction

I'm Dan, a PhD student in ML at UC Berkeley.

See my webpage for my research.

See below for my code.

robustness's People

Contributors

crisnvg, haozheliu-st, hendrycks, normster, nottombrown


robustness's Issues

Frost missing after pip install

After installing via pip, the frost corruption is unavailable:

  File "/root/code/robustness/ImageNet-C/imagenet_c/imagenet_c/__init__.py", line 29, in corrupt
    x_corrupted = corruption_dict[corruption_name](Image.fromarray(x), severity)
  File "/root/code/robustness/ImageNet-C/imagenet_c/imagenet_c/corruptions.py", line 259, in frost
    x_start, y_start = np.random.randint(0, frost.shape[0] - 224), np.random.randint(0, frost.shape[1] - 224)
AttributeError: 'NoneType' object has no attribute 'shape'

Likely a path error?
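If it helps triage, a quick check (assuming the imagenet_c package layout from the traceback) of whether the frost assets were installed at all — an empty list would mean the frost*.png files did not ship with the pip package:

    import glob
    import os

    import imagenet_c

    # the frost corruption loads its frost image assets relative to the package,
    # so a missing file makes cv2.imread return None (hence the NoneType error)
    pkg_dir = os.path.dirname(imagenet_c.__file__)
    print(glob.glob(os.path.join(pkg_dir, 'frost*')))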

Different Image Size

The reference test script operates on an image size of 224.
Models like EfficientNet scale the input resolution as the network is scaled. If I want to test the EfficientNet models against corruption, am I limited to an input resolution of 224, or can I scale the input resolution to the standard input size used by each EfficientNet variant?

Note: there seems to be a discrepancy in how "clean" vs. "distorted" images are loaded.
The clean images are resized to 256, then center-cropped to 224.
The distorted images are only center-cropped to 224, regardless of the input image size, unless all the images were resized to 256 when the dataset was created?
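To make the suspected discrepancy concrete, a sketch of the two pipelines as described above (torchvision, PIL inputs assumed):

    from torchvision import transforms

    # clean images: resize the short side to 256, then take a 224 center crop
    clean_tf = transforms.Compose([transforms.Resize(256),
                                   transforms.CenterCrop(224)])

    # distorted images: a bare 224 center crop, with no preceding resize
    distorted_tf = transforms.CenterCrop(224)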

Crops are not consistent across corruptions

We realized that the crops across different corruptions in the ImageNet-C dataset are not the same.

For example contrast/1/n016443737ILSVRC2012_val_00023855.JPEG vs impulse_noise/1/n016443737ILSVRC2012_val_00023855.png

Apart from the different file types, the impulse_noise image seems to be only resized rather than cropped.


CIFAR10-P label problem

Thanks for the nice work.
I am trying to use your CIFAR-10-C and -P data, but the -P data has no 'label.npy' file.

Could you address this?
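If it's useful in the meantime, a possible workaround, under the (unverified) assumption that the -P perturbation sequences follow the CIFAR-10 test-set order:

    import numpy as np
    from torchvision.datasets import CIFAR10

    # reuse the standard CIFAR-10 test labels for the -P sequences
    test_set = CIFAR10(root='./data', train=False, download=True)
    labels = np.array(test_set.targets)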

Error when reading the ImageNet-P videos

Error information:
VIDIOC_REQBUFS: Inappropriate ioctl for device

It seems to be a problem with the FFmpeg video reader.
Oddly, it occurs only some of the time.

Frost images are missing

Hey,

thank you for your work.
The ./frost*.png files referenced by the frost distortion in make_imagenet_c.py are not provided in this repo. Could you add them?

TypeError: endswith first arg must be str or a tuple of str, not list

File"/mnt/lustre/share/spring/conda_envs/miniconda3/envs/s0.3.3/lib/python3.6/sitepackages/torchvision/datasets/folder.py", line 19, in has_file_allowed_extension return filename.lower().endswith(extensions) TypeError: endswith first arg must be str or a tuple of str, not list

I ran into this error and fixed it by changing line 18 of video_loader.py
from
super(VideoFolder, self).__init__(root, loader, ['.mp4'], transform=transform, target_transform=target_transform)
to
super(VideoFolder, self).__init__(root, loader, '.mp4', transform=transform, target_transform=target_transform)

Then the error no longer occurs and everything works well.
I think it may be a torchvision version issue.
What version of torchvision are you using?
For me it is '0.6.0+cu90'.
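A variant of the same line that should satisfy both old and new torchvision, since the check accepts a str or a tuple of str, is to pass a one-element tuple (a sketch, not verified against every version):

    # a tuple of extensions works with both the list-era and str/tuple-era APIs
    super(VideoFolder, self).__init__(root, loader, ('.mp4',), transform=transform, target_transform=target_transform)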

Do you have clean images corresponding to the ImageNet-C data, or the transform method?

I want to compute PSNR between the corrupted images and their clean counterparts (a minimal PSNR sketch is included below).
However, the ILSVRC2012/val images differ from ImageNet-C in scale and cropping.

  1. Could you describe the transformation (scale/crop) used to go from ImageNet to ImageNet-C?
    (Simply applying resize(256) and center crop(224) does not match ImageNet-C :( )
  2. Or could you provide the validation set (clean images) corresponding to ImageNet-C?
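For reference, a minimal PSNR sketch (assuming aligned uint8 arrays of the same shape, which is exactly what the crop mismatch prevents):

    import numpy as np

    def psnr(clean, corrupted, max_val=255.0):
        # compute MSE in float to avoid uint8 overflow
        mse = np.mean((clean.astype(np.float64) - corrupted.astype(np.float64)) ** 2)
        return float('inf') if mse == 0 else 10 * np.log10(max_val ** 2 / mse)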

Package Versions Unclear

Do you know what package versions you were using for this?
I can't get the make_cifar_c code to run with Python 3.9.9 and PyTorch 1.10. Different corruptions currently output different data types (e.g. uint8 vs. float64), which I am struggling to convert back into a reasonable image.
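One possible normalization step, assuming the corruption functions all return values in the 0-255 range (some as floats):

    import numpy as np
    from PIL import Image

    def to_uint8(img):
        # clip to [0, 255] and cast so every corruption yields a saveable image
        return np.uint8(np.clip(np.asarray(img), 0, 255))

    # e.g. Image.fromarray(to_uint8(corrupted)).save('out.png')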

Snow corruption issue in python 2.7

I get the following error when testing in Python 2.7:

File "/root/anaconda3/envs/py27/lib/python2.7/site-packages/imagenet_c/__init__.py", line 29, in corrupt
    x_corrupted = corruption_dict[corruption_name](Image.fromarray(x), severity)
  File "/root/anaconda3/envs/py27/lib/python2.7/site-packages/imagenet_c/corruptions.py", line 278, in snow
    snow_layer = PILImage.fromarray((np.clip(snow_layer.squeeze(), 0, 1) * 255).astype(np.uint8), mode='L')
  File "/root/anaconda3/envs/py27/lib/python2.7/site-packages/PIL/Image.py", line 2504, in fromarray
    size = shape[1], shape[0]
IndexError: tuple index out of range

How to use CLAHE in preprocessing?

Hey,

In the ROBUSTNESS ENHANCEMENTS section of your paper: 'Histogram Equalization. Histogram equalization successfully standardizes speech data for robust speech recognition (Torre et al., 2005; Harvilla & Stern, 2012). For images, we find that preprocessing with Contrast Limited Adaptive Histogram Equalization (Pizer et al., 1987) is quite effective. Unlike our image denoising attempt (Appendix F), CLAHE reduces the effect of some corruptions while not worsening performance on most others, thereby improving the mCE. We demonstrate CLAHE's net improvement by taking a pre-trained ResNet-50 and fine-tuning the whole model for five epochs on images processed with CLAHE. The ResNet-50 has a 23.87% error rate, but ResNet-50 with CLAHE has an error rate of 23.55%. On nearly all corruptions, CLAHE slightly decreases the Corruption Error. The ResNet-50 without CLAHE preprocessing has an mCE of 76.7%, while with CLAHE the ResNet-50's mCE decreases to 74.5%.'

I can't find CLAHE in this repo, and I'd like to know how to use it. I tried two variants:
    import cv2

    # Variant 1: apply CLAHE independently to each RGB channel (img: uint8 HxWx3)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    for i in range(3):
        img[:, :, i] = clahe.apply(img[:, :, i])

    # Variant 2: apply CLAHE to the lightness channel only, in LAB space (frame: uint8 BGR)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2LAB)
    frame[:, :, 0] = clahe.apply(frame[:, :, 0])
    frame = cv2.cvtColor(frame, cv2.COLOR_LAB2BGR)

  • For one thing, I don't know which variant is correct;
  • for another, I don't know what values clipLimit and tileGridSize should be set to.

Could you help me with the answer?
Thanks

What is the fgsm function in corruptions.py?

The documentation of the ImageNet-C corruption functions lists 19 functions, but the implementation contains an additional function called fgsm with a curious implementation. Could anyone explain what it is?

Thanks.
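For context: FGSM is the Fast Gradient Sign Method of Goodfellow et al., an adversarial attack rather than a common corruption. A minimal PyTorch sketch of the technique (not the repo's implementation):

    import torch
    import torch.nn.functional as F

    def fgsm(model, x, y, eps):
        # perturb x by a step of size eps along the sign of the loss gradient
        x = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        return (x + eps * x.grad.sign()).clamp(0, 1).detach()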

Not all corruption methods are available in make_imagenet_64_c.py

Hello. Thank you for providing the code for corruption.

Are there any specific reasons why some corruption methods are commented out in make_imagenet_64_c.py while others are left in?

For example:

# d['Motion Blur'] = motion_blur
# d['Zoom Blur'] = zoom_blur
# d['Snow'] = snow
d['Frost'] = frost
d['Fog'] = fog
d['Brightness'] = brightness
d['Contrast'] = contrast

Thank you so much!

Do you have clean images corresponding to the ImageNet-C data?

Thank you for the awesome dataset !

I want to find the clean images corresponding to ImageNet-C. I have downloaded ILSVRC2012_img_val, but images with the same ID in ILSVRC2012_img_val differ from those in ImageNet-C. For example, the image Tiny-ImageNet-C/brightness/1/ is a fish, but ILSVRC2012_img_val/ILSVRC2012_val_00001103.JPEG is a tower.

Code to generate noise?

Hi,
Thanks for the very interesting dataset! May I know if you could release the code used to generate the different kinds of corruptions? I would like to try it on some other datasets. Thanks a lot!

The baseline results of AlexNet are not optimal

Hello,

I also trained a baseline AlexNet and used it to evaluate corrupted images, but my results differ from the AlexNet results in "BENCHMARKING NEURAL NETWORK ROBUSTNESS TO COMMON CORRUPTIONS AND PERTURBATIONS."

We have similar performance on clean images, but I obtain better results than yours on corrupted images, which is strange.

CIFAR-100-P dataset?

Dear authors,

I can't seem to find CIFAR-100-P in this repository.
If it is already available here, could you point me to its location?
If not, do you plan to release it anytime soon?
I would very much like to use it for evaluation.

Thank you.

Differences between compressed images vs recreating the corrupted dataset

Hi there,

I notice considerable differences in accuracy on Tiny-ImageNet when using the provided JPEGs vs. recreating the corrupted dataset with the script make_tinyimagenet_c.py. I ran make_tinyimagenet_c.py on the validation set of Tiny ImageNet and tested the accuracy of a trained model on Gaussian noise. With the provided compressed images at severity=3, I get an accuracy of around 20%; if I recreate the dataset, the accuracy is around 31%. Since recreating the dataset saves the new corrupted images as JPEGs, compression should not be the issue.

I am using a resnet-18 for my evaluation.

Do you know how that might happen?

Question about Relative Corruption Error

Hi,

May I ask a question regarding the Relative Corruption Error?

The formula in question: Relative CE^f_c = (Σ_{s=1}^{5} E^f_{s,c} − E^f_{clean}) / (Σ_{s=1}^{5} E^{AlexNet}_{s,c} − E^{AlexNet}_{clean})

Do I understand correctly that we sum the errors over all severity levels first and then subtract E^{Network}_{Clean}? Or do we subtract the clean error from each level's error and then sum?

Thank you in advance for your answer!

Testing images that are generated on the fly

Hi, I wonder: if we do not generate the corrupted images offline, how can we avoid variation in the evaluation results? In other words, if we corrupt the images during evaluation (e.g. in the dataloader, similar to data augmentation), will the accuracy of the same model differ much between evaluations?
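One way to make on-the-fly corruption deterministic is to seed the RNG per sample, e.g. with the sample index. A sketch, assuming the imagenet_c corrupt interface from this repo (with x a uint8 HxWxC array):

    import numpy as np
    from imagenet_c import corrupt

    def corrupt_deterministic(x, idx, corruption_name, severity):
        # fixing the seed per image index makes repeated evaluations identical
        np.random.seed(idx)
        return corrupt(x, corruption_name=corruption_name, severity=severity)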

Access to Imagenet-P dataset

Hi, thanks for sharing the code and project. Downloading the datasets through the URLs in the readme is very slow; could you share them via Google Drive or similar?

The subset 'Frost' of TinyImageNet-C has 1000 classes

Dear Hendrycks,

First I would like to thank you for the great contribution to the OOD datasets!

As stated in the title, the 'Frost' corruption in TinyImageNet-C has more than 200 classes; in fact, there are 1000 classes.

About 'difficulty' in imagenet-p

Looking over the code https://github.com/hendrycks/robustness/blob/master/ImageNet-P/create_p/make_imagenet_p.py, I came up with some questions.

  1. In make_imagenet_p.py, Gaussian noise, shot noise and speckle noise have three different 'levels', which I guess correspond to the 'difficulty' mentioned in the paper. When calculating mFR or mT5D for ImageNet-P, should the three difficulties for these three noises be taken into account as well?

  2. I guess the weights used for the noise difficulty levels vary from dataset to dataset.
    For example, make_imagenet_p.py uses 0.03, 0.05 and 0.08 for the Gaussian noise weights, whereas make_cifar_p.py uses 0.02, 0.04 and 0.06. Were these values determined by any rule?
    If I want to make -P data for some new dataset, can I just use the ImageNet-P weight values?

Better ResNet50 Baseline

Hi, using this code I get a 61% mCE with the same clean error for the ResNet-50 ImageNet-C baseline, much better than the reported 76.7%. Is that right?

What is the *-P class of datasets?

Hi, thanks for publishing the original datasets.

I noticed that the dataset links here also include a -P class of datasets. I am wondering what this is, because I don't remember it from the original paper and I can't find an exact reference to it.

Thanks

CIFAR-10-C Train or Test Set?

Thank you for making this repository available! I'm hoping to work with the CIFAR-10-C set and just downloaded the .tar available here (https://zenodo.org/record/2535967#.YrMov5DMLX0).

The docs note this is the CIFAR-10 test set. However, when I read in the labels.npy file (or the other *.npy files, such as the Gaussian blur one), there appear to be 50,000 images, which I would think is the train set instead.

The docs imply there would only be 20,000 images in each .npy (and 10,000 labels) if a different degree of corruption were used for the test set.

I just wanted to check whether the .tar contains the CIFAR-10 train set, and/or how to get access to the corrupted test set. Thank you, and apologies if I've misinterpreted any of this!
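For what it's worth, 50,000 is consistent with the 10,000 test images stacked at severities 1 through 5; under that assumption, one severity can be sliced out like this:

    import numpy as np

    data = np.load('CIFAR-10-C/gaussian_noise.npy')   # assumed shape (50000, 32, 32, 3)
    labels = np.load('CIFAR-10-C/labels.npy')         # assumed: test labels tiled 5 times
    s = 3                                             # severity in 1..5
    data_s = data[(s - 1) * 10000 : s * 10000]
    labels_s = labels[(s - 1) * 10000 : s * 10000]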

Reason behind the different values of c for different datasets?

    import numpy as np

    def gaussian_noise(x, severity=1):
        c = [.08, .12, 0.18, 0.26, 0.38][severity - 1]
        x = np.array(x) / 255.
        return np.clip(x + np.random.normal(size=x.shape, scale=c), 0, 1) * 255

A similar function is defined in multiple files, but with different values of c. What is the reason for this, and how were the values chosen?
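For illustration, a usage sketch of the function above (assumes the imports shown there; 'example.jpg' is a placeholder path):

    from PIL import Image

    img = Image.open('example.jpg')            # any RGB image
    noisy = gaussian_noise(img, severity=3)    # float array with values in [0, 255]
    Image.fromarray(np.uint8(noisy)).save('noisy.png')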

CIFAR-100-C?

Do you also have a copy of CIFAR-100-C that we could download? I didn't see one linked in the readme file. Thanks!

Test inception v3 model

Is there a reason why the ImageNet-C Inception dataset (3x299x299) is available on Google Drive, but no Inception model is available in test.py? Would it be possible to include it? :)

About the label

Hi, we downloaded Tiny ImageNet-C, but its labels do not correspond to the original ImageNet labels. For example, the label of goldfish is n01443537, but in ImageNet it is 916.
Would you please release the label mapping?

Thanks!

Cifar10 Corrupted downloaded from tfds does not drop accuracy

The dataset provided in https://www.tensorflow.org/datasets/catalog/cifar10_corrupted seems to have two variants:

  • Downloading the dataset directly without specifying the corruption e.g. tfds.load("cifar10_corrupted")
  • Downloading a specific corruption e.g. tfds.load("cifar10_corrupted/gaussian_noise_5")

The former variant does not seem to be affected by any corruption, as it does not lead to a drop in accuracy. Please confirm.

On a side note, I would also like to know how you calculated the classification error for the CIFAR datasets reported in Table 1 of the AugMix paper. Is it averaged over all the corruptions, or is there a separate CIFAR-10-C dataset created by combining all the corruptions at random?
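For comparison, loading one explicit corruption/severity config (the tfds catalog lists named configs such as gaussian_noise_5; only a sketch):

    import tensorflow_datasets as tfds

    # the base 'cifar10_corrupted' name is what the issue asks about;
    # a named config selects a single corruption and severity
    ds = tfds.load('cifar10_corrupted/gaussian_noise_5', split='test')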

cannot import wand.image and wand.api

Hi, I ran into some errors when trying to use create_imagenet_c.py on my own dataset. The traceback follows; it seems the MagickWand shared library was not found.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/fuyi02/anaconda3/envs/vos-temp/lib/python2.7/site-packages/wand/image.py", line 20, in <module>
    from .api import MagickPixelPacket, libc, libmagick, library
  File "/home/fuyi02/anaconda3/envs/vos-temp/lib/python2.7/site-packages/wand/api.py", line 1394, in <module>
    traceback.format_exc())
ImportError: MagickWand shared library not found or incompatible
Original exception was raised in:
Traceback (most recent call last):
  File "/home/fuyi02/anaconda3/envs/vos-temp/lib/python2.7/site-packages/wand/api.py", line 817, in <module>
    ctypes.c_void_p]  # PixelWand color
  File "/home/fuyi02/anaconda3/envs/vos-temp/lib/python2.7/ctypes/__init__.py", line 379, in __getattr__
    func = self.__getitem__(name)
  File "/home/fuyi02/anaconda3/envs/vos-temp/lib/python2.7/ctypes/__init__.py", line 384, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: libMagickWand.so.2: undefined symbol: DrawSetBorderColor

My system is CentOS 6.4.
I downloaded ImageMagick6 from https://github.com/ImageMagick/ImageMagick6 and installed it with

    cd ImageMagick-6
    ./configure --prefix=/usr --with-modules --with-perl=/usr/bin/perl --with-jp2 --enable-shared --disable-static --without-magick-plus-plus
    make
    make install

and then ran 'yum install ImageMagick-devel'.

Can you tell me how to fix this? Thank you!

Folders missing from the Zenodo files

Hi @hendrycks.

I downloaded the four different tars as provided here: https://zenodo.org/record/3565846.

I think I am missing four folders here, although I verified that the tars are not corrupted.

As per ImageNet-P/test.py I expected to see all the perturbations:

['gaussian_noise', 'shot_noise', 'motion_blur', 'zoom_blur',
 'spatter', 'brightness', 'translate', 'rotate', 'tilt', 'scale',
 'speckle_noise', 'gaussian_blur', 'snow', 'shear']

Also, inside the noise perturbations, I did not see separate difficulties as indicated inside the code.

Am I missing out on something?

Question about the paper formula

Hello. Thank you for the great work.

Should the equation defining corruption robustness be E_{c~C}[P_{(x,y)~D}(f(c(x)) = y)] instead of E_{c~C}[P_{(x,y)~D}(f(c(x) = y))]?

If it should not, would you mind explaining what c(x) = y means?

Thank you so much!

Maybe more detailed robustness results on AlexNet?

We're interested in studying the behavior of DNNs on ImageNet-C in more detail. However, we could not find AlexNet's robustness results at each severity level, even in Table 2. Could you provide these results at your convenience? Thanks!
