
neural-processes's Introduction

The Neural Process Family

This repository contains notebook implementations of the following Neural Process variants:

  • Conditional Neural Processes (CNPs)
  • Neural Processes (NPs)
  • Attentive Neural Processes (ANPs)

The code for CNPs can be found in conditional_neural_process.ipynb while the code for both NPs and ANPs is located in attentive_neural_process.ipynb.

The notebooks include an overview of the different building blocks of the models as well as the code to run each model in the browser. Any further details can be found in the CNP paper, the NP paper and the ANP paper.

Quick run

The easiest way to run the code is in the browser on Colab; the repository provides a Colab link for each notebook.

Colaboratory is a free Jupyter notebook environment provided by Google that requires no setup and runs entirely in the cloud. The hosted runtime already includes the following dependencies (tested versions in brackets):

  • NumPy (1.14.6)
  • TensorFlow (1.13.1)
  • Matplotlib (2.2.4)

which are all we need to run the code in this repository.
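If you want to confirm that a local runtime matches the tested versions, a small check like the following works (a sketch, not part of the repository; `importlib.metadata` requires Python 3.8+):

```python
# Sketch: compare installed package versions against the versions
# this repository was tested with. Missing packages are reported too.
from importlib import metadata

TESTED = {"numpy": "1.14.6", "tensorflow": "1.13.1", "matplotlib": "2.2.4"}

report = {}
for name in TESTED:
    try:
        report[name] = metadata.version(name)
    except metadata.PackageNotFoundError:
        report[name] = "not installed"

for name, installed in report.items():
    print(f"{name}: installed {installed}, tested {TESTED[name]}")
```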

Alternatively, you can open the .ipynb files locally using Jupyter notebook. If you do this you will also have to set up a local kernel that includes TensorFlow.

Citing CNPs

If you like our work and end up using neural processes for your research, give us a shout-out:

  1. Conditional Neural Processes: Garnelo, M., Rosenbaum, D., Maddison, C.J., Ramalho, T., Saxton, D., Shanahan, M., Teh, Y.W., Rezende, D.J. and Eslami, S.M. Conditional Neural Processes. In International Conference on Machine Learning 2018.

  2. Neural Processes: Garnelo, M., Schwarz, J., Rosenbaum, D., Viola, F., Rezende, D.J., Eslami, S.M. and Teh, Y.W. Neural processes. ICML Workshop on Theoretical Foundations and Applications of Deep Generative Models 2018.

  3. Attentive Neural Processes: Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O. and Teh, Y.W. Attentive Neural Processes. In International Conference on Learning Representations 2019.

Contact

Any feedback is much appreciated! Drop us a line at [email protected] (Conditional Neural Process) or [email protected] ((Attentive) Neural Process).

Disclaimer

This is not an official Google product.

neural-processes's People

Contributors

christabella, hyunjik11


neural-processes's Issues

Demo for classification experiment

Hello,
I don't know if this is the right forum to ask...

Is there an available demo of the one-shot classification on the Omniglot dataset presented in section 4.3 of the paper? I am unable to understand how you combined the different channels of the convolutional output to compute the statistics, and how you combine them with the test image to perform classification.

Lack of variability of ANP result

So here are two results (two columns) of ANP after 50,000+ iterations.
The first row is the default plot, and the plots in the second row contain 16 samples of the curve (given the same context x & y); the red curve is the mean of these 16 curves. As we can see, there is very limited variation.
image

This is true also for NP. However, if I set use_deterministic_path=False, the variation starts to emerge:
image

My guess is that during training the decoder comes to prefer the deterministic path and ignores the latent path, so no matter what latent code is delivered, the output won't change much. What's your opinion?

Question for latent encoder of attentive neural processes script

Hi,

First of all, thank you for opening your great script!

I have a question about latent encoder.

In your code, the posterior and the prior use the same latent encoder. Sharing the (x, y) pair encoder itself is fine, but I think the linear layers used to sample the latent variable should be different functions for the prior and the posterior.
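The shared-encoder design the question refers to can be sketched as follows (purely illustrative plain Python, not the notebook's code; the `latent_encoder` stand-in replaces the per-pair MLP, mean-pooling, and linear output layers with toy arithmetic):

```python
# Illustrative sketch: one latent encoder (shared weights) produces
# (mu, sigma); the prior applies it to context pairs only, the posterior
# to context plus target pairs.
import math

def latent_encoder(pairs):
    # Stand-in for: per-pair MLP, order-invariant mean-pooling, then
    # linear layers mapping the pooled representation to mu and sigma.
    pooled = sum(x + y for x, y in pairs) / len(pairs)
    mu = pooled
    sigma = 0.1 + 0.9 / (1.0 + math.exp(-pooled))  # bounded in (0.1, 1.0)
    return mu, sigma

context = [(0.0, 1.0), (1.0, 2.0)]
target = context + [(2.0, 3.0)]

prior = latent_encoder(context)     # q(z | context)
posterior = latent_encoder(target)  # q(z | context, target) -- same function
print(prior, posterior)
```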

If my understanding of the code is wrong, could you point out what I missed?

Thanks.

Best Regards,
Jaesik Yoon.

Reuse error

When I run attentive_neural_process.ipynb, I get the following error:

    ValueError: Variable layer_3/kernel already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at:

    File "", line 28, in batch_mlp
      output, output_sizes[-1], name="layer_{}".format(i + 1))
    File "", line 31, in call
      hidden = batch_mlp(encoder_input, self._output_sizes, "latent_encoder")
    File "", line 58, in call
      prior = self._latent_encoder(context_x, context_y)

Latent Encoder and Decoder: log_sigma transformation

Hi,

I was wondering what the reasoning behind the specific parameterizations of the standard deviation in the latent encoder is, i.e. why is it bounded?
I usually just use a softplus in such settings.

Also, they differ between the encoder and the decoder; I am not sure whether that is intentional.
I'd assume both should use tf.sigmoid?

(Latent) Encoder: Bounds SD between 0.1 and 1

    # Compute sigma
    sigma = 0.1 + 0.9 * tf.sigmoid(log_sigma)

Decoder: Bounds SD to be higher than 0.1 and...?

    # Bound the variance
    sigma = 0.1 + 0.9 * tf.nn.softplus(log_sigma)
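The practical difference between the two transforms shows up at the top end: the sigmoid version bounds sigma in (0.1, 1.0), while the softplus version bounds it below by 0.1 but lets it grow without limit. A quick numeric check in plain Python (mirroring the two notebook lines above, with `math` stand-ins for the TF ops):

```python
# Compare the encoder's and decoder's sigma parameterizations numerically.
import math

def encoder_sigma(log_sigma):
    # 0.1 + 0.9 * sigmoid(log_sigma): bounded in (0.1, 1.0)
    return 0.1 + 0.9 / (1.0 + math.exp(-log_sigma))

def decoder_sigma(log_sigma):
    # 0.1 + 0.9 * softplus(log_sigma): bounded below by 0.1, unbounded above
    return 0.1 + 0.9 * math.log1p(math.exp(log_sigma))

for ls in (-5.0, 0.0, 5.0):
    print(f"log_sigma={ls:+.1f}  encoder={encoder_sigma(ls):.3f}  "
          f"decoder={decoder_sigma(ls):.3f}")
```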

And thank you for that well-documented repository :).

package version control

  1. For the latest version of Matplotlib, should set_facecolor rather than set_axis_bgcolor be used in plot_functions?
  2. Should TensorFlow Distributions be migrated to TensorFlow Probability?
  3. It might be better to add information about package versions to the documentation.
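For point 1, a version-tolerant sketch (assuming the plotting code has an `ax` Axes object in scope; this is not the repository's `plot_functions` itself) would be:

```python
# Sketch: set the axis background in a way that works on both old
# (< 2.0, set_axis_bgcolor) and new (>= 2.0, set_facecolor) Matplotlib.
import matplotlib
matplotlib.use("Agg")  # headless backend, safe for scripts and CI
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
if hasattr(ax, "set_facecolor"):   # Matplotlib >= 2.0
    ax.set_facecolor("white")
else:                              # Matplotlib < 2.0
    ax.set_axis_bgcolor("white")
print("background set")
```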

NP implementation in ANP

I think the NP implementation in ANP is a little different from the NP paper, right?

In the original NP, there is only a latent encoder, and it uses both context and target data, while the implementation concatenates the latent and deterministic features, and the latent encoder uses only the context data.
