Comments (5)

stratisMarkou commented on May 21, 2024

Hi @CBird210, and thanks for bringing this up here. From looking at the code, it seems that for MC dropout the get_weight_samples method does not sample the weights but instead returns the raw weight values without turning any of them off. For bayes-by-backprop, the weights are in fact sampled. Any ideas on what's happening in MC dropout, @JavierAntoran?

JavierAntoran commented on May 21, 2024

Hi @CBird210 @stratisMarkou,

As @stratisMarkou said, for MC dropout we return the raw weight values. The MC dropout posterior is a mixture of delta functions at the learned parameter values and delta functions at 0, so sampling the weights would randomly return some weight values and some zeros.
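
For illustration, sampling from that posterior would look roughly like this (a sketch, not code from the repo; in practice dropout masks whole units, so entire rows or columns of a weight matrix are zeroed together):

import torch

def sample_mc_dropout_weights(weight, p=0.5):
    # A draw from the MC dropout posterior: each weight keeps its
    # learned value with probability 1 - p and is set to 0 with
    # probability p (a mixture of two delta functions per weight).
    mask = torch.bernoulli(torch.full_like(weight, 1.0 - p))
    return weight * mask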

The get_weight_samples function was written to give insight into approximate inference behaviour by letting us plot a histogram of weight values (see the top right plot in https://javierantoran.github.io/assets/poster_advml.pdf). For bayes-by-backprop we actually sample the weights, as this allows us to represent the weight posterior variance in that histogram. For MC dropout, sampling would not tell us much about the range of the learned weights, since dropout probabilities are fixed rather than learned. Perhaps get_weight_samples is a poor naming choice; I chose it because all of the other approximate inference methods have a function with that exact name, which allows easy plug-in replacement of approximate inference methods in experiments.
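
For reference, bayes-by-backprop weight sampling is roughly the following reparameterisation (a sketch; the parameter names mu and rho are assumptions, see sample_weights in the repo for the exact code):

import torch

def sample_bbp_weight(mu, rho):
    # Reparameterisation trick: w = mu + sigma * eps with eps ~ N(0, I)
    # and sigma = log(1 + exp(rho)) to keep the std. deviation positive.
    # Each call returns a different draw, so repeated calls reflect the
    # posterior variance in the histogram.
    sigma = torch.log1p(torch.exp(rho))
    eps = torch.randn_like(mu)
    return mu + sigma * eps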

@CBird210 if you call the all_sample_eval function, specifying the parameter "Nsamples", you will get a vector of Nsamples different predictions from the model.
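
The call might look like the following (a sketch: x and y are a batch of test inputs and targets, net is a trained model wrapper; check the exact signature in the repo):

# One prediction per Monte Carlo weight draw.
preds = net.all_sample_eval(x, y, Nsamples=100)
# The spread of the Nsamples predictions reflects the approximate
# posterior predictive for this batch.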

CBird210 commented on May 21, 2024

Hi,

Thank you so much for getting back to me so quickly!

I noticed that get_weight_samples also seems to give me the exact same numbers if I train the same network twice using Bayes By Backprop. This confuses me because it uses the sample_weights function, which looks like it should give different answers each time. I'm sorry if this is a mistake on my end; could you help me with some clarification?

all_sample_eval looks like it is doing exactly what I needed. However, I noticed that when I use all_sample_eval and just specify Nsamples, the MC Dropout code gives me results over a group of 16 MNIST examples, while Bayes By Backprop gives me results over a group of 100. Do you have an idea of how I could get results from all_sample_eval for the two methods on the same group of data? (I'm trying to do a direct comparison of the posterior predictive distributions computed by both.)

Also, when trying to draw parallels between the code and the source material, I'm having a little trouble with parts of the Bayes by Backprop paper. Could you point me to where in the code steps 4-7 of their algorithm (section 3.2) take place? Once again, I'm new to Python, so apologies if this is really obvious.

Thanks again for your help!

JavierAntoran commented on May 21, 2024

Hi,

I noticed that get_weight_samples also seems to give me the exact same numbers if I train the same network twice using Bayes By Backprop.

This should not happen. You have probably fixed a random seed somewhere in your code, or you may be mistakenly loading the same saved model for both runs.
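
For example, if something like the following runs at the top of your script, two training runs will produce identical numbers:

import random
import numpy as np
import torch

# Fixing all the relevant seeds makes training runs reproducible.
seed = 0
random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)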

I noticed that when I use all_sample_eval and just specify Nsamples, the MC Dropout code gives me results over a group of 16 MNIST examples, while Bayes By Backprop gives me results over a group of 100.

Nsamples controls how many Monte Carlo samples are drawn when approximating the posterior predictive. To control which data are being evaluated, you need to ensure that your inputs (x, y) are the same. From your comment, it sounds like you are running with different batch sizes.
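
A sketch of how you could evaluate both methods on the exact same batch (test_set, mc_net and bbp_net are placeholder names):

from torch.utils.data import DataLoader

# Fix the batch size and disable shuffling so both models see the
# same (x, y) batch.
loader = DataLoader(test_set, batch_size=100, shuffle=False)
x, y = next(iter(loader))
mc_preds = mc_net.all_sample_eval(x, y, Nsamples=100)
bbp_preds = bbp_net.all_sample_eval(x, y, Nsamples=100)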

Could you point me to where in the code steps 4-7 of their algorithm (section 3.2) take place?

Sure. Note that step 4 is written in a somewhat strange way in the paper; for me, that step is clearer in equation 8. In our code, it occurs in lines 198-208:

mlpdw_cum = 0  # accumulates the negative log-likelihood term
Edkl_cum = 0   # accumulates the minibatch-scaled KL term

for i in range(samples):
    # Draw a fresh set of weights; the model also returns
    # log q(w|theta) and log p(w) for the sampled weights.
    out, tlqw, tlpw = self.model(x, sample=True)
    # Negative log-likelihood of the batch under the sampled weights
    mlpdw_i = F.cross_entropy(out, y, reduction='sum')
    # KL contribution, scaled by the number of minibatches per epoch
    Edkl_i = (tlqw - tlpw) / self.Nbatches
    mlpdw_cum = mlpdw_cum + mlpdw_i
    Edkl_cum = Edkl_cum + Edkl_i

# Average the Monte Carlo estimates over the drawn samples
mlpdw = mlpdw_cum / samples
Edkl = Edkl_cum / samples

loss = Edkl + mlpdw

Note that there is a sign discrepancy between their algorithm and our optimisation, as PyTorch minimises a loss rather than maximising a value function.
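
Written out, the loss above is a Monte Carlo estimate of the minibatch-scaled negative ELBO, with S = samples and M = Nbatches:

\mathcal{L}(\theta) = \frac{1}{S} \sum_{i=1}^{S} \left[ \frac{\log q(\mathbf{w}^{(i)} \mid \theta) - \log p(\mathbf{w}^{(i)})}{M} - \log p(\mathbf{y} \mid \mathbf{x}, \mathbf{w}^{(i)}) \right], \quad \mathbf{w}^{(i)} \sim q(\mathbf{w} \mid \theta)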

Steps 5-7 occur through automatic differentiation with:

loss.backward()        # steps 5-6: gradients w.r.t. the variational parameters
self.optimizer.step()  # step 7: gradient update of mu and rho

Hope this helps!
Javier

CBird210 commented on May 21, 2024

Sorry for the late reply! This was very useful - thank you!
