jxhe / vae-lagging-encoder
PyTorch implementation of "Lagging Inference Networks and Posterior Collapse in Variational Autoencoders" (ICLR 2019)
License: MIT License
I was wondering if you have a script for sampling or generating sentences. I can see there is a function sample_sentences in text.py, but I am not able to run it properly. Can you help me use that function to generate sentences so that I can see some samples?
Thanks
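For reference, prior sampling from a trained text VAE usually looks like the sketch below; the names model.nz, model.decoder.greedy_decode, and vocab.id2word are assumptions for illustration, not the repo's actual API.

import torch

def sample_from_prior(model, vocab, device, num_samples=10):
    # Draw z ~ p(z) = N(0, I) and decode each latent code to text.
    # A sketch only: the decoder/vocab method names are hypothetical.
    model.eval()
    with torch.no_grad():
        z = torch.randn(num_samples, model.nz, device=device)
        for zi in z.split(1):
            ids = model.decoder.greedy_decode(zi)  # hypothetical method
            print(" ".join(vocab.id2word(i) for i in ids[0]))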
I trained the model with my own data (dataset size: 5M). After training, I ran the reconstruction code:
python text.py --dataset [dataset] --decode_from [pretrained model path] --decode_input [a text file for reconstruction]
I end up getting the same sentences; a diagnostic sketch follows the config below.
My config:
params = {
    'enc_type': 'lstm',
    'dec_type': 'lstm',
    'nz': 32,
    'ni': 512,
    'enc_nh': 1024,
    'dec_nh': 1024,
    'dec_dropout_in': 0.5,
    'dec_dropout_out': 0.5,
    'batch_size': 32,
    'epochs': 100,
    'test_nepoch': 5,
    'train_data': 'datasets/bj/train.txt',
    'val_data': 'datasets/bj/valid.txt',
    'test_data': 'datasets/bj/test.txt'
}
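Identical reconstructions across different inputs are a classic symptom of posterior collapse. A quick check, sketched below, is to look at how much the posterior means vary across the data; the encoder call signature here is an assumption:

import torch

def posterior_mean_spread(model, batches, device):
    # Collect posterior means over the data. A near-zero spread in every
    # dimension means the encoder maps all inputs to (almost) the same z,
    # so the decoder emits the same high-probability sentence each time.
    mus = []
    with torch.no_grad():
        for x in batches:
            mu, logvar = model.encoder(x.to(device))  # hypothetical signature
            mus.append(mu)
    return torch.cat(mus, dim=0).std(dim=0)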
Hello,
Thanks for the code. I was wondering what value of beta should be used at test time for the baseline beta-VAE models in your paper: is it always one, or the fixed beta?
Also, in the case of VAE + annealing, how is model selection done during training: by evaluating NLL at the current value of beta, or at beta = 1?
Thanks
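For context, the quantity being asked about is the generic beta-VAE objective; a minimal sketch (not necessarily the repo's exact code):

def beta_vae_loss(rec_loss, kl_loss, beta):
    # beta scales only the KL term. The standard ELBO corresponds to
    # beta = 1, and only at beta = 1 is rec + KL a valid upper bound on
    # the true NLL, which is what test-time NLL reporting relies on.
    return rec_loss + beta * kl_loss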
Hello,
Thanks for putting up this code! I was curious which part of the code you used to generate the mutual information plots shown in Figure 5 of the paper. Can you point me to it?
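For reference, the usual Monte Carlo estimator in this literature is I(x; z) = E_x[KL(q(z|x) || q(z))] = E[log q(z|x)] - E[log q(z)]. A minimal sketch for a diagonal-Gaussian encoder follows; the repo's own implementation may differ in its details:

import math
import torch

def estimate_mi(mu, logvar, z):
    # mu, logvar: [batch, nz] Gaussian parameters from the encoder;
    # z: [batch, nz] samples drawn from q(z|x).
    batch, nz = mu.size()
    # E[log q(z|x)] = negative encoder entropy, averaged over the batch
    neg_entropy = (-0.5 * nz * math.log(2 * math.pi)
                   - 0.5 * (1 + logvar).sum(-1)).mean()
    # log q(z_i) ~= log (1/batch) * sum_j q(z_i | x_j)
    var = logvar.exp()
    dev = z.unsqueeze(1) - mu.unsqueeze(0)  # [batch, batch, nz]
    log_density = (-0.5 * (dev ** 2 / var.unsqueeze(0)).sum(-1)
                   - 0.5 * (nz * math.log(2 * math.pi)
                            + logvar.sum(-1)).unsqueeze(0))
    log_qz = torch.logsumexp(log_density, dim=1) - math.log(batch)
    return (neg_entropy - log_qz.mean()).item()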
Hi,
Thanks for the code and the very nice work! I was trying to run your code on the Yahoo dataset on a GPU, and it seems to be taking longer than reported in the paper. The paper reports 11 hours, but I have been running for 2 days and it is still on epoch 3. Am I doing something wrong? Any information on this would be great.
Hi, thanks for your code, it helps me a lot.
I have been studying latent variable models for a few weeks and feel puzzled about this field. Since the reconstruction loss (i.e., the decoder) is expensive to compute and prone to collapse, I am wondering whether the reconstruction procedure is indispensable for a latent model to capture useful information.
For example, in a supervised multi-task setting, if I want the latent space to capture a domain-specific signal, how can I achieve this using only classification labels and domain labels, without a reconstruction loss? Is there any relevant literature?
I am stuck on this question and hope you can point me in the right direction.
Hi,
From your experiments, does a vanilla VAE show posterior collapse on, say, the Yahoo dataset? I ran
python text.py --dataset yahoo --kl_start 1.0
and found that the KL term remains non-zero while MI and AU go to zero (in fact, MI is negative). Is this the expected behaviour?
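For reference, AU is typically computed as in Burda et al. (2016): a latent dimension counts as active if the variance of its posterior mean across the data exceeds a small threshold. A minimal sketch:

import torch

def count_active_units(all_mu, threshold=0.01):
    # all_mu: [num_examples, nz] posterior means E_q[z|x] over the dataset.
    var = all_mu.var(dim=0)  # per-dimension variance across examples
    return (var > threshold).sum().item()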
Hi, how are you?
Thanks for a great paper and git repository.
Regarding line 433 in d78230a, which divides by report_num_sents: why do it this way, compared to returning the mean from both loss_rc and loss_kl? Thanks.
Matan.
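For context, summing losses over batches and dividing once by report_num_sents gives the exact per-sentence average; averaging per-batch means is only equivalent when every batch has the same size. A toy illustration:

# Two batches of different sizes, with toy loss sums:
total_loss, total_sents = 0.0, 0
for batch_loss_sum, batch_size in [(12.0, 4), (5.0, 2)]:
    total_loss += batch_loss_sum
    total_sents += batch_size
print(total_loss / total_sents)   # 17/6 ~= 2.833 (exact per-sentence average)
print((12.0 / 4 + 5.0 / 2) / 2)   # 2.75 (mean of per-batch means differs)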
Hi Junxian,
Thank you so much for your paper and code, it helps me a lot.
But I am a little confused by the loss calculation here in the test method in test.py:
test_loss = (report_rec_loss + report_kl_loss) / report_num_sents
nll = (report_kl_loss + report_rec_loss) / report_num_sents
kl = report_kl_loss / report_num_sents
ppl = np.exp(nll * report_num_sents / report_num_words)
What I have seen and done before is to treat rec_loss as the NLL (it is computed with nll_loss) and to compute PPL purely from rec_loss. But I see you include the KL term here, so I wonder what the intuition is; also, it seems test_loss is the same as nll?
Any explanation would be much appreciated.
Thank you in advance.
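For context, a common reading of those lines (a sketch of the standard reasoning, not the author's reply): the negative ELBO, rec + KL, upper-bounds the true NLL -log p(x); rec alone does not bound it, which is why the KL term enters the perplexity. And from the snippet itself, test_loss and nll are indeed the same expression.

import numpy as np

def eval_bounds(report_rec_loss, report_kl_loss, report_num_sents, report_num_words):
    # rec + KL >= -log p(x) (the ELBO inequality), so nll is an upper
    # bound on the true NLL and ppl an upper bound on true perplexity.
    nll = (report_rec_loss + report_kl_loss) / report_num_sents
    ppl = np.exp(nll * report_num_sents / report_num_words)
    return nll, ppl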
Hi,
In the paper (Appendix A), it says that the true posterior mean can be computed as sum(z * p(z|x)). However, in the code it seems that you are calculating the joint likelihood p(x, z). Why?
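For context, on a discretized grid the two quantities are linked by Bayes' rule: p(z|x) = p(x, z) / sum_z' p(x, z'), so computing the joint is the natural first step toward the posterior and its mean. A minimal sketch, assuming a one-dimensional grid:

import torch

def true_posterior_mean(log_joint, z_grid):
    # log_joint[k] = log p(x, z_k) evaluated on the grid z_grid.
    # Normalizing the joint over the grid yields p(z_k | x), whose mean
    # is sum_k z_k * p(z_k | x).
    log_post = log_joint - torch.logsumexp(log_joint, dim=0)
    return (z_grid * log_post.exp()).sum()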
Hello,
I just want to ask whether you have ever run your aggressive VAE training on a large-scale dataset such as ImageNet 32x32 or CIFAR-10?