cdoersch / vae_tutorial
Caffe code to accompany my Tutorial on Variational Autoencoders
License: MIT License
I'm having trouble with loss=nan, and I'm confused about why it happens with my own data.
I changed the batch size from 100 to 1, and correspondingly changed the DummyData layer's shape dim from 100 to 1. But I don't know whether I should also change the Reduction layer's loss_weight. Is that the key factor behind the loss=nan result?
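A minimal sketch of the scaling concern, assuming the Reduction layer sums its per-example losses over the batch and loss_weight is meant to turn that sum into a per-example average (these specifics are guesses, not read from the prototxt):

import numpy as np

batch = 1                                # was 100
per_example = np.random.rand(batch)      # stand-in for per-example loss terms

# With operation: SUM, the reduction output scales with the batch size,
# so a loss_weight tuned for batch = 100 no longer yields the same
# per-example loss scale at batch = 1. Keeping loss_weight = 1/batch
# keeps the loss (and gradient) magnitudes comparable across batch sizes.
loss_weight = 1.0 / batch
loss = loss_weight * per_example.sum()
print(loss)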
When I run python mnist_vae.py, I get: "Check failed: target_blobs.size() == source_layer.blobs_size() (2 vs. 0) Incompatible number of blobs for layer decode4". The error comes from net = caffe.Net('mnist_vae.prototxt', 'snapshots/mnist_vae_iter_60000.caffemodel', caffe.TEST).
I have checked the code but found nothing wrong. Hoping for your help!
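A hypothetical diagnostic sketch (nothing here is read from the repo beyond the file names in the error): loading the prototxt without the snapshot skips the blob-count check, so you can see how many parameter blobs each layer declares and compare against what the snapshot stored under the same name.

import caffe
caffe.set_mode_cpu()

# Load the architecture WITHOUT weights; this avoids the check that fails
# when the snapshot's "decode4" carries a different number of parameter
# blobs (e.g. 0) than the prototxt's layer of the same name expects
# (e.g. 2 for weights + bias).
net = caffe.Net('mnist_vae.prototxt', caffe.TEST)
for name, blobs in net.params.items():
    print(name, [b.data.shape for b in blobs])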
Hi, I think there may be an issue in the model prototxt with the KL-divergence loss between Q(z|X) and P(z).
In the paper, the KL-divergence of Equation 7 is:

D[N(mu(X), Sigma(X)) || N(0, I)] = (1/2) * ( tr(Sigma(X)) + mu(X)^T mu(X) - k - log det(Sigma(X)) )

The first term is the trace of a diagonal matrix and should be the sum of all diagonal elements, e.g. x1 + x2 + x3.
But in the model file, the implementation is a sum of squares of the diagonal:
layer {
  name: "var"
  type: "Eltwise"
  bottom: "sd"
  bottom: "sd"
  top: "var"
  eltwise_param {
    operation: PROD
  }
  include {
    phase: TRAIN
  }
}
layer {
  name: "kldiv_plus_half"
  type: "Eltwise"
  bottom: "meansq"
  bottom: "var"
  bottom: "logsd"
  top: "kldiv_plus_half"
  eltwise_param {
    operation: SUM
    coeff: 0.5
    coeff: 0.5
    coeff: -1.0
  }
  include {
    phase: TRAIN
  }
}
This confuses me a bit.
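A minimal NumPy sketch of what these two layers compute, under the assumption that the "sd" blob holds the standard deviation of Q(z|X) (so var = sd*sd is the diagonal of Sigma, and summing it gives tr(Sigma) = sum of sigma_i^2, the "sum of squares" above):

import numpy as np

mu = np.random.randn(20)        # stand-in for the "mu" blob
sd = np.random.rand(20) + 0.1   # stand-in for the "sd" blob (std. dev.)

meansq = mu * mu                # elementwise square of the mean
var = sd * sd                   # Eltwise PROD of "sd" with itself
logsd = np.log(sd)

# Eltwise SUM with coeffs 0.5, 0.5, -1.0, as in the prototxt:
kldiv_plus_half = 0.5 * meansq + 0.5 * var - logsd
# Per-dimension KL to N(0, 1), i.e. 0.5*(mu^2 + sd^2 - 1 - log sd^2);
# the constant 1/2 per dimension is subtracted downstream.
kldiv = kldiv_plus_half - 0.5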
Hi, thanks for the great tutorial. I have trouble understanding the math. What is the reason for passing encode3 to logsd before the nonlinearity is applied? Why not feed encode3neur to both mu and logsd? I would ask if it's a typo, but the reference prototxt converges when I run it.
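A minimal sketch of the wiring being asked about, with made-up shapes and weights (the names follow the prototxt; everything else is illustrative):

import numpy as np

x = np.random.randn(4)                          # input to the layer
W3 = np.random.randn(3, 4)                      # "encode3" inner product
W_mu = np.random.randn(2, 3)
W_logsd = np.random.randn(2, 3)

encode3 = W3 @ x                                # pre-nonlinearity output
encode3neur = np.maximum(encode3, 0)            # ReLU -> "encode3neur"

mu = W_mu @ encode3neur      # mu reads the post-ReLU activations
logsd = W_logsd @ encode3    # logsd reads the pre-ReLU values, per the prototxt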
I have combined the VAE layers with convolution and deconvolution layers, and I'm having trouble training on MNIST with this new architecture (using Sigmoid neurons instead of ReLU, if that matters).
There is a log of the determinant of the covariance term in the KLD equation; however, I see that you have a sum instead of a product, even though the determinant of a diagonal matrix is the product of its diagonal entries. Could you provide some insight into why this is?
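For what it's worth, the log turns that determinant's product into a sum, which is presumably why the model can sum logsd terms instead of multiplying:

log det(Sigma) = log prod_i sigma_i^2
               = sum_i log sigma_i^2
               = 2 * sum_i log sigma_i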
Hi, thanks for a great tutorial on VAEs!
I have a quick question about the implementation. In the tutorial, the reconstruction loss is L2 (as I thought it should be).
However, in the Caffe implementation, there is what seems to be an additional cross-entropy reconstruction loss.
What is the purpose of this loss? Or am I missing something?
I realise cross-entropy loss is often better for less blurry images, but since we parametrize P(X|z) as a Gaussian with mean f(z), I thought the log-likelihood should be proportional to ||X - f(z)||^2.
Thank you!
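For reference, the correspondence this question leans on: the log-likelihood of a Gaussian observation model with fixed spherical variance reduces to the L2 distance up to constants:

log P(X|z) = log N(X; f(z), sigma^2 * I)
           = -||X - f(z)||^2 / (2 * sigma^2) + const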
Thank you for this great tutorial and the code. But I have trouble understanding the kldiv (Power) layer in mnist_vae. According to the tutorial, the loss function should contain D[Q(z|X)||P(z)], not exp(D[Q(z|X)||P(z)]). So why do we need this layer?
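One possibly relevant detail, sketched below: Caffe's Power layer computes y = (shift + scale * x) ^ power, so with power: 1 it performs only an affine shift/scale, not an exponential. Whether the kldiv layer here uses power: 1 with shift: -0.5 (to strip the extra 1/2 from kldiv_plus_half) is a guess from the layer names, not something read from the prototxt:

def power_layer(x, power=1.0, scale=1.0, shift=0.0):
    # Caffe's PowerLayer: y = (shift + scale * x) ** power
    return (shift + scale * x) ** power

kldiv_plus_half = 1.3                               # stand-in value
kldiv = power_layer(kldiv_plus_half, shift=-0.5)    # just subtracts 0.5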