Hi,
thanks for the easy-to-use code!
I had a question about the use of priors on the weights when computing the log probability inside the Stein gradient update: is the prior something without which things would not work, in your experience? I ask because SVGD only requires us to specify a log_prob that we want to maximize, so running SVGD with just the model likelihood (with respect to the ground-truth data) as the log_p should also be correct, right?
I just wanted to clarify whether the weight priors are something desirable that we can choose to add to the log_p term for regularisation (as described in your accompanying paper), or whether they are essential, such that omitting them renders the math/theory wrong.
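To make the question concrete, here is a minimal toy sketch of what I mean (hypothetical names and a made-up scalar Gaussian model, not your repo's code): SVGD where log_p is the data log-likelihood only, with no prior term on the weight theta.

```python
import numpy as np

# Hypothetical sketch, not the repo's code: SVGD on a single scalar "weight"
# theta, assuming the model y_i ~ N(theta, 1) and using only the likelihood
# (no log-prior term) as log_p.

def svgd_step(theta, grad_logp, eps=0.01, h=1.0):
    """One SVGD update with an RBF kernel of bandwidth h:
    phi(x_i) = (1/n) sum_j [ k(x_j, x_i) grad_logp(x_j)
                             + grad_{x_j} k(x_j, x_i) ]."""
    diff = theta[:, None, :] - theta[None, :, :]     # (n, n, d): x_j - x_i
    K = np.exp(-(diff ** 2).sum(-1) / (2 * h ** 2))  # (n, n) kernel matrix
    grad_K = -(K[..., None] * diff) / h ** 2         # grad of k wrt x_j
    phi = (K @ grad_logp + grad_K.sum(axis=0)) / theta.shape[0]
    return theta + eps * phi

rng = np.random.default_rng(0)
y = rng.normal(2.0, 1.0, size=20)                    # synthetic "ground truth" data
theta = rng.normal(0.0, 1.0, size=(10, 1))           # 10 particles

for _ in range(500):
    # log_p = sum_i log N(y_i | theta, 1), so grad log_p = sum_i (y_i - theta).
    grad_loglik = y.sum() - y.size * theta
    theta = svgd_step(theta, grad_loglik)

# The particles settle around the MLE (the data mean). If a log-prior were
# included, its gradient would simply be added to grad_loglik above and
# would pull the particles toward the prior's mode as well.
```

My understanding is that the update above is still a valid SVGD step targeting the distribution proportional to exp(log_p); the prior just changes which distribution that is. Please correct me if that reading is wrong.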
Thanks,
Gunshi