
vae-lagging-encoder's People

Contributors

dspoka, jxhe

vae-lagging-encoder's Issues

generating/sampling sentences

I was wondering if you have a script for sampling or generating sentences. I can see there is a function sample_sentences in text.py, but I am not able to run it properly. Could you help me use that function to generate sentences so that I can see some samples?

Thanks
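For anyone finding this later: the first step of sampling is just drawing z from the standard normal prior and handing it to the decoder. A minimal sketch of that step (NumPy for illustration, the repo itself is PyTorch; the decode call in the comment is a hypothetical stand-in for what sample_sentences in text.py does):

```python
import numpy as np

rng = np.random.default_rng(0)

nz, num_samples = 32, 5
# Draw latent codes from the prior p(z) = N(0, I); nz should match the
# --nz value used at training time.
z = rng.standard_normal((num_samples, nz))

# A real run would now decode each z into a sentence, e.g.
#   sentences = decoder.greedy_decode(z)   # hypothetical call
print(z.shape)  # (5, 32)
```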

Prior and posterior sentence generators produce the same sentences

I trained the model with my own data. My dataset size is 5M. After I trained the model, I ran the reconstruction code.

python text.py --dataset [dataset] --decode_from [pretrained model path] --decode_input [a text file for reconstruction]

I end up getting the same sentences.

My config:


params={
   'enc_type': 'lstm',
   'dec_type': 'lstm',
   'nz': 32,
   'ni': 512,
   'enc_nh': 1024,
   'dec_nh': 1024,
   'dec_dropout_in': 0.5,
   'dec_dropout_out': 0.5,
   'batch_size': 32,
   'epochs': 100,
   'test_nepoch': 5,
   'train_data': 'datasets/bj/train.txt',
   'val_data': 'datasets/bj/valid.txt',
   'test_data': 'datasets/bj/test.txt'
}

Evaluation of beta vae model in the paper

Hello,
Thanks for the code. I was wondering what value of beta should be used during testing for the baseline beta-VAE models in your paper. Is it always one, or the fixed beta?

Also, in the case of VAE + annealing, how is model selection done during training? By evaluating NLL at the current value of beta, or at beta = 1?

Thanks

Code for mutual information test?

Hello,

Thanks for putting up this code! I was curious to know what part of the code you used to generate the mutual information plots shown in Figure 5 of the paper. Can you point me to that?

Training time for text data

Hi,

Thanks for the code and the very nice work! I was trying to run your code on the yahoo dataset on a GPU, and it seems to be taking longer than what is reported in the paper. The paper reports 11 hours, but I have been running for 2 days and it is still on epoch 3. Am I doing something wrong? Any information on this would be great.

Is the reconstruction procedure indispensable for latent models?

Hi, thanks for your code, it has helped me a lot.

I have been learning about latent variable models for a few weeks and feel puzzled by this field. Since the reconstruction loss (i.e., the decoder) is expensive to compute and prone to collapse, I'm wondering: is the reconstruction procedure indispensable for a latent model to capture useful information?

For example, in a supervised multi-task setting, if I want the latent space to capture a domain-specific signal, can I achieve this using only classification labels and domain labels, without a reconstruction loss? Is there any relevant literature?

I am stuck on this question; I hope you can point me in the right direction.

Does a vanilla VAE show posterior collapse on text?

Hi,

From your experiments, does a vanilla VAE show posterior collapse on, say, the yahoo dataset? I ran
python text.py --dataset yahoo --kl_start 1.0 and found that the KL term remains non-zero while MI and AU go to zero (in fact, MI is negative). Is this the expected behaviour?
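For what it's worth, the AU (active units) metric is typically computed as the number of latent dimensions whose posterior mean varies across the data, with the 0.01 variance threshold from Burda et al. A small self-contained sketch with synthetic posterior means (the data here is made up for illustration):

```python
import numpy as np

def active_units(mu, threshold=0.01):
    """mu: (num_examples, nz) array of posterior means E_q[z|x].
    A dimension counts as active if its variance across the data
    exceeds the threshold (Burda et al., 2016)."""
    return int((mu.var(axis=0) > threshold).sum())

rng = np.random.default_rng(0)
mu = np.zeros((1000, 32))
mu[:, :4] = rng.normal(size=(1000, 4))   # only 4 dims vary with x
print(active_units(mu))  # 4
```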

Question regarding the NLL term

Hi, how are you?
Thanks for the great paper and repository.

report_rec_loss / report_num_sents, time.time() - start))

What is the upside of dividing the losses by report_num_sents compared to returning the mean of both loss_rc and loss_kl?
From what I understand, these are the same except when batches have different sizes (say, at the end of an epoch).
I believe a micro average makes more sense here than a macro average.
Am I missing something? What do you think?

Thanks.
Matan.
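To make the micro vs. macro distinction above concrete (toy numbers, not taken from the repo):

```python
import numpy as np

# Two batches of unequal size, as at the end of an epoch.
batch_losses = [np.array([2.0, 4.0, 6.0]),   # batch of 3 sentences
                np.array([10.0])]            # final batch of 1 sentence

# Micro average: accumulate summed loss and sentence count, divide once.
# This is what dividing by report_num_sents does.
total = sum(b.sum() for b in batch_losses)
n = sum(len(b) for b in batch_losses)
micro = total / n                                    # 22 / 4 = 5.5

# Macro average: mean of per-batch means; weights the small batch more.
macro = np.mean([b.mean() for b in batch_losses])    # (4 + 10) / 2 = 7.0

print(micro, macro)  # 5.5 7.0
```

The two coincide only when all batches have the same size, which is why accumulating sums and dividing by the total sentence count is the safer choice.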

Regarding NLL loss, Recon loss and PPL.

Hi Junxian,

Thank you so much for your paper and code, it helps me a lot.

But I am a little confused by the loss calculation in the test method in test.py.

test_loss = (report_rec_loss + report_kl_loss) / report_num_sents

nll = (report_kl_loss + report_rec_loss) / report_num_sents
kl = report_kl_loss / report_num_sents
ppl = np.exp(nll * report_num_sents / report_num_words)

What I saw and did before was to treat rec_loss as the NLL (it is computed with nll_loss) and to compute PPL purely from rec_loss.

But I see you include the KL term here, so I wonder what the intuition is. Also, it seems test_loss is the same as nll?

Any explanations would be much appreciated.

Thank you in advance.
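Not the authors, but for reference, here is the arithmetic from that snippet with toy numbers. It shows that test_loss and nll are indeed the same quantity, and that PPL exponentiates the total (rec + KL) loss per word. Since the negative ELBO (rec + KL) upper-bounds the true NLL, the resulting PPL is an upper bound on the true perplexity, which is the usual way VAE language models are evaluated:

```python
import numpy as np

# Toy accumulated sums over a test set (invented for illustration).
report_rec_loss  = 4000.0   # sum of reconstruction losses (nats)
report_kl_loss   = 500.0    # sum of KL terms (nats)
report_num_sents = 100
report_num_words = 900

nll = (report_rec_loss + report_kl_loss) / report_num_sents  # 45.0 per sentence
kl  = report_kl_loss / report_num_sents                      # 5.0 per sentence
# PPL exponentiates the total loss per *word*, KL included:
ppl = np.exp(nll * report_num_sents / report_num_words)      # exp(4500/900) = exp(5)
print(round(ppl, 2))  # 148.41
```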

Code for computing true posterior

Hi,

In Appendix A of the paper, it says that the true posterior can be computed via sum(z * p(z|x)). However, in the code it seems to be calculating the joint likelihood p(z, x). Why?

Testing on large-scale dataset

Hello,
I just want to ask whether you have ever run your aggressive VAE training on a large-scale dataset such as ImageNet 32x32 or CIFAR-10?
