
Synthesizing preferred inputs via deep generator networks

This repository contains source code necessary to reproduce some of the main results in the paper:

Nguyen A, Dosovitskiy A, Yosinski J, Brox T, Clune J. (2016). "Synthesizing the preferred inputs for neurons in neural networks via deep generator networks". NIPS 29.

If you use this software in an academic article, please cite:

@article{nguyen2016synthesizing,
  title={Synthesizing the preferred inputs for neurons in neural networks via deep generator networks},
  author={Nguyen, Anh and Dosovitskiy, Alexey and Yosinski, Jason and Brox, Thomas and Clune, Jeff},
  journal={NIPS 29},
  year={2016}
}

For more information regarding the paper, please visit www.evolvingai.org/synthesizing

Setup

Installing software

This code is built on top of Caffe. You'll need to install the following:

  • Install Caffe; follow the official installation instructions.
  • Build the Python bindings for Caffe
  • If you have an NVIDIA GPU, you can optionally build Caffe with the GPU option to make it run faster
  • Make sure the path to your caffe/python folder in settings.py is correct (a quick import check is sketched below)
  • Install the ImageMagick command-line tools on your system.
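
A quick way to verify the Caffe Python bindings are reachable (the path below is a placeholder; use the same one you put in settings.py):

import sys
sys.path.insert(0, '/path/to/caffe/python')  # must match the path in settings.py
import caffe

caffe.set_mode_cpu()  # or caffe.set_mode_gpu() if Caffe was built with CUDA support
print('Caffe Python bindings loaded from ' + caffe.__file__)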

Downloading models

You will need to download a few models. There are download.sh scripts provided for your convenience.

  • The image generation network (upconvolutional network) from [3]. You can download it directly from the authors' website or via the provided script: cd nets/upconv && ./download.sh
  • A network to be visualized (e.g. from the Caffe software package or the Caffe Model Zoo). The provided examples use CaffeNet and GoogLeNet trained on ImageNet, and AlexNet-based DNNs trained on MIT Places (see the example scripts below).

Settings:

  • Paths to the downloaded models are set in settings.py. They are relative paths and should work if the download.sh scripts ran correctly (a sketch of these entries is shown below).
  • The paths to the model being visualized can be overridden by passing the net_weights and net_definition arguments to act_max.py.
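
For illustration, the relevant entries in settings.py might look like the sketch below; the variable names are assumptions (check the actual file), but the paths match where the download.sh scripts place the models:

# settings.py (sketch; variable names are illustrative)
caffe_root = '/path/to/caffe/python'                              # Caffe Python bindings
generator_weights = 'nets/upconv/generator.caffemodel'            # image generator from [3]
net_weights = 'nets/caffenet/bvlc_reference_caffenet.caffemodel'  # network being visualized
net_definition = 'nets/caffenet/caffenet.prototxt'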

Usage

The main algorithm is in act_max.py, a standalone Python script; you can pass various command-line arguments to run different experiments. Basically, to synthesize a preferred input for a target neuron h (e.g. the "candle" class output neuron), we optimize the hidden code input of a deep image generator network to produce an image that highly activates h. A sketch of this optimization loop follows.
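
The pycaffe sketch below is purely for orientation, not the script's exact code: the blob/layer names 'feat', 'deconv0', 'fc8' and the generator prototxt path are assumptions, and the real act_max.py additionally handles mean subtraction, clipping, L2 decay, and learning-rate scheduling.

import numpy as np
import caffe

# A sketch of the core activation-maximization loop (assumed names/paths).
generator = caffe.Net('nets/upconv/generator.prototxt',
                      'nets/upconv/generator.caffemodel', caffe.TEST)
net = caffe.Net('nets/caffenet/caffenet.prototxt',
                'nets/caffenet/bvlc_reference_caffenet.caffemodel', caffe.TEST)

unit, lr = 643, 8.0  # target neuron (e.g. "candle") and step size
code = np.random.normal(0, 1, generator.blobs['feat'].data.shape)

for _ in range(200):
    # Forward: code -> image (generator), then image -> activations (classifier)
    generator.blobs['feat'].data[...] = code
    image = generator.forward()['deconv0']

    # Center-crop the generated image to the classifier's input size
    h, w = net.blobs['data'].data.shape[2:]
    ih, iw = image.shape[2:]
    top, left = (ih - h) // 2, (iw - w) // 2
    net.blobs['data'].data[...] = image[:, :, top:top + h, left:left + w]
    net.forward(end='fc8')

    # Backward: gradient of the target activation w.r.t. the input image
    net.blobs['fc8'].diff[...] = 0.
    net.blobs['fc8'].diff.flat[unit] = 1.
    net.backward(start='fc8')

    # Route the image gradient back through the generator to the code
    generator.blobs['deconv0'].diff[...] = 0.
    generator.blobs['deconv0'].diff[:, :, top:top + h, left:left + w] = net.blobs['data'].diff
    generator.backward(start='deconv0')

    code += lr * generator.blobs['feat'].diff  # gradient ascent on the code

In words: each iteration decodes the code into an image, measures how strongly the image activates the target unit, backpropagates that signal through both networks, and nudges the code uphill.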

Examples

We provide here four different examples as a starting point. Feel free to be creative and fork away to produce even cooler results!

1_activate_output.sh: Optimizing codes to activate output neurons of the CaffeNet DNN trained on the ImageNet dataset. This script synthesizes images for 5 example neurons.

  • Running ./1_activate_output.sh produces this result:

2_activate_output_placesCNN.sh: Optimizing codes to activate output neurons of a different network, here an AlexNet DNN trained on the MIT Places205 dataset. The same prior (the generator trained to invert CaffeNet features) produces the best images for AlexNet architectures trained on different datasets. It also works on other architectures, but image quality might degrade (see Sec. 3.3 in our paper).

  • Running ./2_activate_output_placesCNN.sh produces this result:

3_start_from_real_image.sh: Instead of starting from a random code, this example starts from the code of a real image (here, an image of a red bell pepper) and optimizes it to increase the activation of the "bell pepper" neuron. A sketch of how such an initialization code can be computed appears after this example.

  • Depending on the hyperparameter settings, one can produce images near or far from the initialization code (e.g. ending up with a green pepper when starting from a red pepper).
  • The debug option in the script is enabled, allowing one to visualize the activations of intermediate images.
  • Running ./3_start_from_real_image.sh produces this result:

Optimization adds more green leaves and a surface below the initial pepper
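
For orientation, here is a minimal sketch of encoding a real image into an initialization code, assuming CaffeNet as the encoder and fc6 as the optimization layer; the image path is hypothetical, and act_max.py's --init_file option does this with its own preprocessing (mean subtraction is omitted here):

import numpy as np
import caffe

encoder = caffe.Net('nets/caffenet/caffenet.prototxt',
                    'nets/caffenet/bvlc_reference_caffenet.caffemodel', caffe.TEST)

# Standard pycaffe preprocessing: resize, HWC -> CHW, RGB -> BGR, [0,1] -> [0,255]
transformer = caffe.io.Transformer({'data': encoder.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))
transformer.set_channel_swap('data', (2, 1, 0))
transformer.set_raw_scale('data', 255.0)

img = caffe.io.load_image('images/red_pepper.jpg')  # hypothetical input image
encoder.blobs['data'].data[...] = transformer.preprocess('data', img)
encoder.forward(end='fc6')
init_code = np.copy(encoder.blobs['fc6'].data)  # starting point for the optimization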

4_activate_hidden.sh: Optimizing codes to activate hidden neurons at layer 5 of the DeepScene DNN trained on MIT Places dataset. This script synthesizes images for 5 example neurons.

  • Running ./4_activate_hidden.sh produces this result:

From left to right are units that are semantically labeled by humans in [2] as:
lighthouse, building, bookcase, food, and painting

  • This result matches the conclusion that object detectors automatically emerge in a DNN trained to classify images of places [2]. See Fig. 6 in our paper for more comparisons between these images and the visualizations produced by [2].

5_activate_output_GoogLeNet.sh: An example of activating the output neurons of a different architecture, GoogLeNet, trained on ImageNet. Note that the learning rate used in this example differs from that in examples 1 and 2 above.

  • Running ./5_activate_output_GoogLeNet.sh produces this result:

Visualizing your own models

  • To visualize your own model, you should search for the hyperparameter settings that produce the best images for that model. One simple way to do this is to sweep across different parameters (see the setup in the provided example bash scripts; a minimal sweep sketch follows this list).
  • For even better results, one can train an image generator network to invert features of the model being visualized instead of using the provided generator (which is trained to invert CaffeNet features). However, training such a generator may not be easy for many reasons (e.g. inverting very deep nets like ResNet).
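
As an illustration, a sweep could be driven from Python like the sketch below. The flag names come from the provided example scripts; the value grids are arbitrary, and any remaining required flags should be copied from 1_activate_output.sh:

import itertools
import subprocess

# Sweep a small grid of learning rates and L2 weights; each run writes to its
# own output directory.
for lr, l2 in itertools.product([1.0, 4.0, 8.0], [0.99, 0.999]):
    subprocess.check_call([
        'python', './act_max.py',
        '--unit', '643',        # target neuron
        '--n_iters', '200',
        '--start_lr', str(lr),
        '--L2', str(l2),
        '--output_dir', 'output/sweep_lr{}_L2{}'.format(lr, l2),
        # ...plus the remaining flags from 1_activate_output.sh
        # (--act_layer, --opt_layer, --xy, --end_lr, --seed, --clip,
        #  --bound, --debug, --init_file)
    ])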

Licenses

Note that the code in this repository is licensed under the MIT License, but the pre-trained models used by the code have their own licenses. Please check them carefully before use.

Questions?

Please feel free to drop me a line or create a GitHub issue if you have questions or suggestions.

References

[1] Yosinski J, Clune J, Nguyen A, Fuchs T, Lipson H. "Understanding Neural Networks Through Deep Visualization". ICML 2015 Deep Learning workshop.

[2] Zhou B, Khosla A, Lapedriza A, Oliva A, Torralba A. "Object detectors emerge in deep scene CNNs". ICLR 2015.

[3] Dosovitskiy A, Brox T. "Generating images with perceptual similarity metrics based on deep networks". arXiv preprint arXiv:1602.02644, 2016.


Issues

Details about training the generator

Hello,

I'm trying to adapt your code for a different network (a regression model), and I'm having trouble training the generator network. I'm using http://arxiv.org/abs/1602.02644 as a reference, but I have trouble finding a good combination of generator architecture, discriminator architecture, and hyperparameters.

Would you be open to sharing some details of how you trained the generator?

Thanks,

Activating 2 neurons at the same time (feature)

This is not an issue but a feature request: is there code available for visualizing 2 (or more) neurons at the same time?

I'd like to explore the section of your paper about using 2 or more neurons to generate a single image, for example volcano (n09472597 volcano) + candle (n02948072 candle, taper, wax light).

Very cool project.

Weights link not available

Hello,

I am trying to run the code but I can't download the weights. The link seems to be deactivated.

Neither cd nets/upconv && ./download.sh nor the direct link responds.

Can you help with this problem?

Thank you in advance.

Output size?

Hello! I totally love this code, thank you so much for putting this together. I just have one question: how do I increase the output size? I just bought a very fast computer and I would like to get high-res results!

That would be amazing, thanks in advance !

abort trap 6

Thanks for posting the code; it looks really interesting! When I run any of the examples I get this sort of error:

Starting optimizing
WARNING: Logging before InitGoogleLogging() is written to STDERR
F1227 00:45:44.057626 2037047296 inner_product_layer.cpp:144] Cannot use GPU in CPU-only Caffe: check mode.
*** Check failure stack trace: ***
./1_activate_output.sh: line 57: 14235 Abort trap: 6 python ./act_max.py --act_layer ${act_layer} --opt_layer ${opt_layer} --unit ${unit} --xy ${xy} --n_iters ${n_iters} --start_lr ${lr} --end_lr ${end_lr} --L2 ${L2} --seed ${seed} --clip ${clip} --bound ${bound_file} --debug ${debug} --output_dir ${output_dir} --init_file ${init_file}
convert: unable to open image 'output/fc8_0643_200_0.99_8.0__0.jpg': No such file or directory @ error/blob.c/OpenBlob/2701.
convert: no images defined 'output/fc8_0643_200_0.99_8.0__0.jpg' @ error/convert.c/ConvertImageCommand/3258.
convert: unable to open image 'output/fc8_0643_200_0.99_8.0__0.jpg': No such file or directory @ error/blob.c/OpenBlob/2701.

unit: 624 xy: 0
n_iters: 200
L2: 0.99
start learning rate: 8.0
end learning rate: 1e-10
seed: 0
opt_layer: fc6
act_layer: fc8
init_file: None
clip: 0
bound: act_range/3x/fc6.txt

debug: 0
output dir: output
net weights: nets/caffenet/bvlc_reference_caffenet.caffemodel
net definition: nets/caffenet/caffenet.prototxt

Any idea what could be causing it?
Thanks so much!

generator.caffemodel

I want to visualize my own models, but I don't know how to train generator.caffemodel.
Can you help me? Thank you!
Can you help me? Thank you~

Handling of xy in convolutional layers

Looking at the code in act_max.py, it appears that the way neurons get activated in convolutional layers takes a shortcut and will miss most of them: only the diagonal units are accessible, since the same xy value is used for both x and y:

  # Move in the direction of increasing activation of the given neuron
  if end in fc_layers:
    one_hot.flat[unit] = 1.
  elif end in conv_layers:
    one_hot[:, unit, xy, xy] = 1.
  else:
    raise Exception("Invalid layer type!")

I made a quick fix that seems to work: it treats xy as a flat index and converts it to column/row coordinates. I'm not sure if this is the intended ordering, but it does return results that look reasonable:

  # Move in the direction of increasing activation of the given neuron
  if end in fc_layers:
    one_hot.flat[unit] = 1.
  elif end in conv_layers:
    one_hot[:, unit, xy % one_hot.shape[2], int(xy/one_hot.shape[2])] = 1.
  else:
    raise Exception("Invalid layer type!")

Further down in the same function, a matching fix is required:

  # Check the activations
  if end in fc_layers:
    fc = acts[end][0]
    best_unit = fc.argmax()
    obj_act = fc[unit]

  elif end in conv_layers:
    fc = acts[end][0, :, xy % acts[end].shape[2], int(xy/acts[end].shape[2])]
    best_unit = fc.argmax()
    obj_act = fc[unit]

A question about model input

Hello, thanks a lot for sharing the code. I tried to run the program following your description. A question for you: the scripts 1_activate_output.sh, 2_activate_output_placesCNN.sh, 4_activate_hidden.sh, and 5_activate_output_GoogLeNet.sh don't need an input image, right? My understanding is that you train the model only to classify the type (e.g. mask, library, and so on) and collect features of those images; then, given a type, the model generates an image of that type. Is that right?

Online Demo

This code looks cool, but is there an online demo so I can experiment with it first?
