amiratag / ace

Towards Automatic Concept-based Explanations

License: MIT License

Python 100.00%

ace's Introduction

ACE

ACE: Towards Automatic Concept Based Explanations

Please cite the following work if you use this benchmark or the provided tools or implementations:

@inproceedings{ghorbani2019towards,
  title={Towards automatic concept-based explanations},
  author={Ghorbani, Amirata and Wexler, James and Zou, James Y and Kim, Been},
  booktitle={Advances in Neural Information Processing Systems},
  pages={9273--9282},
  year={2019}
}

Getting Started

This is the TensorFlow implementation of the paper Towards Automatic Concept-based Explanations, presented at NeurIPS 2019.

Ghorbani, Amirata, James Wexler, James Y. Zou, and Been Kim. 
"Towards Automatic Concept-based Explanations." 
Advances in Neural Information Processing Systems. 2019.

Prerequisites

Required Python libraries:

  Scikit-image: https://scikit-image.org/
  Tensorflow: https://www.tensorflow.org/
  TCAV: https://github.com/tensorflow/tcav

Installing

An example run command:

python3 ace_run.py --num_parallel_runs 0 --target_class zebra --source_dir SOURCE_DIR --working_dir SAVE_DIR --model_to_run GoogleNet --model_path ./tensorflow_inception_graph.pb --labels_path ./imagenet_labels.txt --bottlenecks mixed4c --num_random_exp 40 --max_imgs 50 --min_imgs 30

where:

num_random_exp: Number of random concepts against which concept activation vectors (CAVs) are computed when calculating the TCAV score of a discovered concept (recommended >20).

For example, if you set num_random_exp=20, you need to create the folders random500_0, random500_1, ..., random500_19 and put them in SOURCE_DIR, where each folder contains 50-500 randomly selected images from the dataset (ImageNet in our case).

target_class: Name of the class whose prediction is to be explained.

SOURCE_DIR: Directory where the discovery images (refer to the paper) are saved.
It should contain (at least) num_random_exp + 2 folders:
1- "target_class", which contains images of the class to be explained (in this example the folder should be named zebra).
2- "random_discovery", which contains randomly selected images from the same dataset (at least $max_imgs images).
3- "random500_0", ..., "random500_${num_random_exp - 1}", where each folder contains 500 randomly selected images from the dataset.

num_parallel_runs: Number of parallel jobs (loading images, etc). If 0, parallel processing is deactivated.

SAVE_DIR: Where the experiment results (both text report and the discovered concept examples) are saved.

model_to_run: Either InceptionV3 or GoogleNet (weights are provided for GoogleNet). To use your own model, change the "make_model" function in ace_helpers.py.
model_path: Path to the model's saved graph.
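
The SOURCE_DIR layout described above is easy to get wrong, so it can help to check it before starting a run. The following is a minimal sketch, not part of ACE: the helper names expected_folders and missing_folders are hypothetical, and it only checks that each folder exists and is non-empty, not how many images it holds.

```python
from pathlib import Path
import tempfile


def expected_folders(target_class, num_random_exp):
    """Folder names the README asks for inside SOURCE_DIR."""
    randoms = [f"random500_{i}" for i in range(num_random_exp)]
    return [target_class, "random_discovery"] + randoms


def missing_folders(source_dir, target_class, num_random_exp):
    """Return the expected folder names that are absent or empty."""
    missing = []
    for name in expected_folders(target_class, num_random_exp):
        folder = Path(source_dir) / name
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Build a throwaway layout with 3 random folders, then empty one of them.
    with tempfile.TemporaryDirectory() as tmp:
        for name in expected_folders("zebra", 3):
            folder = Path(tmp) / name
            folder.mkdir()
            (folder / "placeholder.jpg").touch()
        (Path(tmp) / "random500_2" / "placeholder.jpg").unlink()
        print(missing_folders(tmp, "zebra", 3))  # ['random500_2']
```

An empty list from missing_folders means all num_random_exp + 2 folders are present and contain at least one file.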

If you are using a custom model, you should write a wrapper for it containing the following methods:

run_examples(images, BOTTLENECK_LAYER): returns the activations of the images in the BOTTLENECK_LAYER. 'images' are the original images without preprocessing (floats between 0 and 1).
get_image_shape(): returns the shape of the model's input
label_to_id(CLASS_NAME): returns the id of the given class name.
get_gradient(activations, CLASS_ID, BOTTLENECK_LAYER): computes the gradient of the CLASS_ID logit in the logit layer with respect to activations in the BOTTLENECK_LAYER.
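
The four methods above can be sketched as an interface. This is an illustrative skeleton only: AceModelWrapper and ToyWrapper are hypothetical names, and the toy "model" (per-channel pixel means followed by a linear logit layer) stands in for a real network so that the example runs without TensorFlow. The exact signatures ACE and TCAV expect may differ between versions, so check them against the code you have installed.

```python
from abc import ABC, abstractmethod


class AceModelWrapper(ABC):
    """Hypothetical base class listing the four methods named in the README."""

    @abstractmethod
    def run_examples(self, images, bottleneck_layer):
        """Return activations of unpreprocessed images (floats in [0, 1])."""

    @abstractmethod
    def get_image_shape(self):
        """Return the shape of the model's input."""

    @abstractmethod
    def label_to_id(self, class_name):
        """Return the integer id of the given class name."""

    @abstractmethod
    def get_gradient(self, activations, class_id, bottleneck_layer):
        """Return d(logit[class_id]) / d(activations) at the bottleneck."""


class ToyWrapper(AceModelWrapper):
    """Toy stand-in: activation = per-channel mean, logit = weights . activation."""

    def __init__(self, labels, logit_weights, image_shape=(2, 2, 3)):
        self.labels = list(labels)
        self.logit_weights = logit_weights  # one weight row per class
        self.image_shape = image_shape

    def run_examples(self, images, bottleneck_layer):
        acts = []
        for img in images:
            pixels = [px for row in img for px in row]
            acts.append([sum(px[c] for px in pixels) / len(pixels) for c in range(3)])
        return acts

    def get_image_shape(self):
        return self.image_shape

    def label_to_id(self, class_name):
        return self.labels.index(class_name)

    def get_gradient(self, activations, class_id, bottleneck_layer):
        # For a linear logit layer the gradient is the class weight row,
        # identical for every example.
        return [list(self.logit_weights[class_id]) for _ in activations]


toy = ToyWrapper(["zebra", "horse"], [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]])
image = [[[0.0, 0.5, 1.0], [1.0, 0.5, 0.0]], [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]]
toy_acts = toy.run_examples([image], "toy_layer")
toy_grads = toy.get_gradient(toy_acts, toy.label_to_id("zebra"), "toy_layer")
```

A real wrapper would replace the toy arithmetic with calls into the model's graph while keeping the same four entry points.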

Authors

Amirata Ghorbani - Website
James Wexler - Website
James Zou - Website
Been Kim - Website

License

This project is licensed under the MIT License - see the LICENSE.md file for details

Acknowledgments

This work was done as part of a Google Brain internship.

ace's Issues

save_dir subdirectory explanations?

I did a small test with some random images and was wondering if you had some explanations for the directories within save_dir. I would also like to understand the annotations on the save_dir/results/*.png images (the images are annotated with some numbers). This would be really helpful for understanding the results!

Thank you

Run ACE with Inception-V3 model

How might I use an Inception-V3 model pre-trained on the ImageNet dataset with the ACE library? Is it possible to just replace the provided tensorflow_inception_graph.pb file? If so, is there a graph definition file (i.e. a .pb file) I can use?

Otherwise, how might I do this using Keras etc.?

Cheers

Using the Xception model

Can this method be used with the Xception model? I'm trying to write a wrapper for it, but I'm not sure what the bottleneck layers are. I can get all the tensors from graph.get_operations(), but I'm not sure how to determine which tensors would represent the bottleneck layers. If you have any suggestions, that would be great.

what's the random500 directory?

Hi, what's the random500 directory?
Could you upload all files and directories?

about random500_0

What images should be put in the directory named ./SOURCE_DIR/random500_i?
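
Going by the README, each random500_i folder holds randomly selected images from the same dataset. A minimal sketch of how such folders could be filled is shown below. The function name make_random_folders, the list of image suffixes, and the choice to draw each folder independently (so folders may share images) are all assumptions, not something the repository prescribes.

```python
import random
import shutil
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumption: adjust to your dataset


def make_random_folders(dataset_dir, source_dir, num_random_exp,
                        imgs_per_folder=500, seed=0):
    """Copy randomly chosen dataset images into random500_0 ... random500_{n-1}."""
    rng = random.Random(seed)
    images = sorted(
        p for p in Path(dataset_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    if len(images) < imgs_per_folder:
        raise ValueError(
            f"need at least {imgs_per_folder} images, found {len(images)}")
    for i in range(num_random_exp):
        folder = Path(source_dir) / f"random500_{i}"
        folder.mkdir(parents=True, exist_ok=True)
        # Each folder is sampled independently, without replacement inside it.
        for j, src in enumerate(rng.sample(images, imgs_per_folder)):
            # The index prefix avoids collisions between same-named files.
            shutil.copy(src, folder / f"{j:04d}_{src.name}")


if __name__ == "__main__":
    # Tiny demonstration on empty placeholder files.
    with tempfile.TemporaryDirectory() as data, tempfile.TemporaryDirectory() as out:
        for k in range(10):
            (Path(data) / f"img_{k}.jpg").touch()
        make_random_folders(data, out, num_random_exp=2, imgs_per_folder=4)
        print(sorted(p.name for p in Path(out).iterdir()))
```

Keep source_dir outside dataset_dir, otherwise a second run would also sample the copies made by the first.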

Which tensorflow version to use?

I'm trying to run ace_run.py, and I get an AttributeError: module 'tensorflow' has no attribute 'gfile' error, which seems to be caused by an incorrect tensorflow version.

google.protobuf.message.DecodeError: Error parsing message

File "./tcav/tcav/model.py", line 318, in import_graph
graph_def = tf.compat.v1.GraphDef.FromString(tf.compat.v1.gfile.Open(saved_path, 'rb').read())
google.protobuf.message.DecodeError: Error parsing message

Do you know how I could resolve this problem? Thank you

problem with get_gradients

Is anyone else facing a problem with get_gradient?
Traceback (most recent call last):
File "ace_run.py", line 112, in <module>
main(parse_arguments(sys.argv[1:]))
File "ace_run.py", line 68, in main
scores = cd.tcavs(test=False)
File "/gstore/home/baners20/GA_progression/interpretability/ACE/ace.py", line 667, in tcavs
gradients = self._return_gradients(tcav_score_images)
File "/gstore/home/baners20/GA_progression/interpretability/ACE/ace.py", line 622, in _return_gradients
acts[i:i+1], [class_id], bn).reshape(-1)
TypeError: get_gradient() missing 1 required positional argument: 'example'

ModuleNotFoundError: No module named 'model'

Hi,
I am trying to run the file ace_run.py with

python3 ace_run.py --num_parallel_runs 0 --target_class Zebra --source_dir ./source ---working_dir ./save --model_to_run InceptionV3 --bottlenecks mixed_8 --num_test 20 --num_random_exp 40 --max_imgs 50 --min_imgs 30 --test_dir ./test
but I am getting an error message:

Traceback (most recent call last):
File "ace_run.py", line 11, in <module>
import ace_helpers
File "/home/andi/Git/ACE/ace_helpers.py", line 8, in <module>
import model
ModuleNotFoundError: No module named 'model'

Is there a file missing?

Cheers,
Andi

CAV computation in context of ACE

Hi,

I had a question about the CAV computation. I read the original paper and it looks like a CAV is computed by calculating the linear decision boundary between the concept and completely random images. For this implementation of ACE, won't using randomly chosen images from the dataset, as written in the README, make it difficult to find a linear boundary separating the concepts from the random dataset images since the concepts are derived from the dataset and will thus be represented in the random dataset images?

In the end, I'm unsure if I should collect random images (from google or imagenet) for the CAV computation or the random images from the dataset. Do you know whether the random dataset images do get separated well from concept images?

Thanks
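
For readers following this question, it may help to see where the CAV enters the score. In the TCAV paper, the TCAV score of a concept is the fraction of class examples whose directional derivative along the CAV is positive. The sketch below illustrates that definition only. The function names are hypothetical, the gradients and CAV are made-up numbers, and real implementations may use a different sign convention (for example when the gradient is taken of a loss rather than a logit).

```python
def directional_derivative(gradient, cav):
    """Dot product of one example's gradient with the CAV direction."""
    return sum(g * c for g, c in zip(gradient, cav))


def tcav_score(gradients, cav):
    """Fraction of examples whose class logit increases along the CAV."""
    if not gradients:
        raise ValueError("need at least one gradient")
    positive = sum(1 for g in gradients if directional_derivative(g, cav) > 0)
    return positive / len(gradients)


# Four made-up gradients at a 2-dimensional bottleneck and a made-up CAV.
example_gradients = [[1.0, 0.0], [0.5, -0.2], [-1.0, 0.0], [0.0, 1.0]]
example_cav = [1.0, 0.0]
example_score = tcav_score(example_gradients, example_cav)  # 0.5
```

If the random images often contain the concept itself, the linear classifier behind the CAV has a harder time separating the two sets, which is the concern raised above. The score is then computed the same way but from a noisier direction.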

A spelling error in the source code

if model_to_run == 'InceptionV3':
    mymodel = model.InceptionV3Wrapper_public(
        sess, model_saved_path=model_path, labels_path=labels_path)
elif model_to_run == 'GoogleNet':
    # common_typos_disable
    mymodel = model.GoolgeNetWrapper_public(
        sess, model_saved_path=model_path, labels_path=labels_path)

In ace_helpers.py, line 38, mymodel = model.GoolgeNetWrapper_public
should be mymodel = model.GoogleNetWrapper_public (note the spelling of Google).

Otherwise you may get an error like:
AttributeError: module 'tcav.model' has no attribute 'GoolgeNetWrapper_public'

Shape Error

I am getting this error,

ValueError: Cannot feed value of shape (0,) for Tensor 'Placeholder:0', which has shape '(?, ?, ?, 3)'

Could you please help me?
I will appreciate your response
Thanks,

ResourceExhaustedError

I was wondering if this type of error was ever seen with using ACE:

Caused by op 'import/xception/block2_sepconv1/separable_conv2d', defined at:
File "./ACE/ace_run.py", line 127, in <module>
main(parse_arguments(sys.argv[1:]))
File "./ACE/ace_run.py", line 40, in main
sess, str(args.model_to_run), args.model_path, args.labels_path)
File "/home/hjcho/projects/hnsc/histoXai/ACE/ace_helpers.py", line 45, in make_model
mymodel= model.XceptionHPVWrapper_public(sess, model_path, labels_path)
File "/home/hjcho/projects/hnsc/histoXai/tcav/tcav/model.py", line 477, in __init__
super(XceptionHPVWrapper_public, self).__init__(sess, model_path, labels_path, image_shape, endpoints_xc, 'import')
File "/home/hjcho/projects/hnsc/histoXai/tcav/tcav/model.py", line 251, in __init__
scope=scope)
File "/home/hjcho/projects/hnsc/histoXai/tcav/tcav/model.py", line 324, in import_graph
input_graph_def, graph_inputs, list(endpoints.values()), name=sc)
File "/home/hjcho/anaconda3/envs/env_tf113/lib/python3.7/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/home/hjcho/anaconda3/envs/env_tf113/lib/python3.7/site-packages/tensorflow/python/framework/importer.py", line 442, in import_graph_def
_ProcessNewOps(graph)
File "/home/hjcho/anaconda3/envs/env_tf113/lib/python3.7/site-packages/tensorflow/python/framework/importer.py", line 235, in _ProcessNewOps
for new_op in graph._add_new_tf_operations(compute_devices=False): # pylint: disable=protected-access
File "/home/hjcho/anaconda3/envs/env_tf113/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 3433, in _add_new_tf_operations
for c_op in c_api_util.new_tf_operations(self)
File "/home/hjcho/anaconda3/envs/env_tf113/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 3433, in <listcomp>
for c_op in c_api_util.new_tf_operations(self)
File "/home/hjcho/anaconda3/envs/env_tf113/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 3325, in _create_op_from_tf_operation
ret = Operation(c_op, self)
File "/home/hjcho/anaconda3/envs/env_tf113/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
self._traceback = tf_stack.extract_stack()

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[100,296,296,128] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocato
[[node import/xception/block2_sepconv1/separable_conv2d (defined at /home/hjcho/projects/hnsc/histoXai/tcav/tcav/model.py:324) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

[[node import/xception/add_9/add (defined at /home/hjcho/projects/hnsc/histoXai/tcav/tcav/model.py:324) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

If you have seen it, do you know how it can be resolved?

SOURCE_DIR clarification

SOURCE_DIR: Directory where the discovery images (refer to the paper) are saved.
It should contain (at least) num_random_exp + 2 folders:
1- "target_class", which contains images of the class to be explained.
2- "random_discovery", which contains randomly selected images from the same dataset (at least $max_imgs images).
3- "random500_0", ..., "random500_${num_random_exp - 1}", where each folder contains 500 randomly selected images from the dataset.

So I have a dataset with images belonging to either class A or B, and I want to explain class A. The target_class directory should contain class A images. random_discovery should contain random images from the dataset, which can be either class A or B. The random500_x directories should also contain images from the dataset, which can be either class A or B. All the images for each of these folders come from the same dataset. Is that correct?

tcav python module updated recently making implementation incompatible

Traceback (most recent call last):
File "/gdrive/My Drive/braxai/ACE-master/ace_run.py", line 12, in <module>
from ace import ConceptDiscovery
File "/gdrive/My Drive/braxai/ACE-master/ace.py", line 18, in <module>
from tcav import cav, tcav_helpers
ImportError: cannot import name 'tcav_helpers'

parameter param_dict of create_patches doesn't work

If I pass a custom parameter like this:
cd.create_patches(param_dict={'n_segments': [5,10,15]})
the default parameters are used from the second discovery image onward,
because _return_superpixels uses "pop", which removes the key after the first call:
param_dict.pop('n_segments', [15, 50, 80])
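
The behaviour reported here can be reproduced in isolation. The sketch below does not use ACE's code: segment_buggy and segment_fixed are hypothetical stand-ins for a function that reads n_segments from a shared dictionary once per image. It contrasts dict.pop, which mutates the caller's dictionary, with dict.get, which does not.

```python
DEFAULT_SEGMENTS = [15, 50, 80]


def segment_buggy(param_dict):
    # pop() removes the key from the caller's dict, so only the first call
    # sees the custom value.
    return param_dict.pop("n_segments", DEFAULT_SEGMENTS)


def segment_fixed(param_dict):
    # get() reads the value and leaves the caller's dict untouched.
    return param_dict.get("n_segments", DEFAULT_SEGMENTS)


buggy_params = {"n_segments": [5, 10, 15]}
buggy_first = segment_buggy(buggy_params)    # [5, 10, 15]
buggy_second = segment_buggy(buggy_params)   # [15, 50, 80]

fixed_params = {"n_segments": [5, 10, 15]}
fixed_first = segment_fixed(fixed_params)    # [5, 10, 15]
fixed_second = segment_fixed(fixed_params)   # [5, 10, 15]
```

Copying the dictionary at the start of the function (dict(param_dict)) and popping from the copy would work as well, if the remaining keys need to be forwarded elsewhere.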

support for resnet and custom models

Is there any extra code that needs to be written to use ACE with custom models such as ResNet?

random_discovery?

What is the random_discovery folder used for? Are these supposed to be the images that are segmented for concept discovery?

modulenotfound

Hi,
I didn't know how to contact the owner of this repository so I will submit some bugs I have run into so far.

Will you be updating the code to make it compatible with tf 2.0?

Also, in the description for running ace, there's a typo. 1) I think the file to run is ace_run.py and 2) I think ---working_dir is supposed to be --working_dir and -bottlenecks to --bottlenecks.

I'm going to be using this tool for my project with histology slides. I've also never posted issues, so I hope this is the right place to bring some bugs to your attention.

Thank you!