pointer-networks's People

Contributors

jingxil

pointer-networks's Issues

The input to the decoder is strange

Hello, thanks for sharing your code. I have a question after reading it.

Why do you use the alignment scores rather than the alignment probabilities?
In the paper, the authors use the softmax of the alignment scores, which gives probabilities.
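
For readers following this question, the difference between scores and probabilities can be sketched in plain Python. This is only an illustration with made-up score values, not code from the repository: the softmax rescales the raw scores so they sum to 1, while the position with the highest value stays the same.

```python
import math

def softmax(scores):
    """Convert raw alignment scores into probabilities that sum to 1."""
    # Subtracting the maximum improves numerical stability
    # and does not change the result.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical alignment scores over four input positions.
scores = [2.0, 0.5, -1.0, 0.5]
probs = softmax(scores)

# The pointed-to position (argmax) is identical for scores and probabilities.
pointed_by_scores = max(range(len(scores)), key=scores.__getitem__)
pointed_by_probs = max(range(len(probs)), key=probs.__getitem__)
```

So the choice matters for what is fed to the next decoder step or to the loss, but not for which input position is selected greedily.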

Initial state of the decoder

Hi,

Firstly, thanks a lot for sharing your code.

I have a question regarding the initial state of the decoder. I see that it is currently initialized with zero_state. Shouldn't the decoder's initial state be the final state of the encoder? Please correct me if I am wrong.

Removing the last value of output for decoder_input_ids

Hi,
Thanks a lot for the code. I have a question. decoder_input_ids is built by removing the last value of the output list, which is meant to be the end token. But outputs shorter than the maximum length are padded with a pad ID of 1, so the last value of such an output is the pad ID, not the end token. In that case, truncating the last value only removes the pad ID and leaves the end token in place. Is that how it should be? I hope I have understood this correctly.
Thanks a lot.
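
The situation described above can be reproduced with a small sketch. The token IDs other than the pad ID of 1 are hypothetical, and the weighting scheme shown (zero weight wherever the target is padding) is a common convention assumed for illustration, not something confirmed for this repository.

```python
START_ID = 0  # hypothetical start token id
PAD_ID = 1    # pad id mentioned in the question
END_ID = 2    # hypothetical end token id

def pad(seq, length, pad_id=PAD_ID):
    """Right-pad seq with pad_id up to the given length."""
    return seq + [pad_id] * (length - len(seq))

# Two target sequences of different lengths, each ending with END_ID.
raw_targets = [[5, 6, 7, END_ID], [8, END_ID]]
max_len = max(len(t) for t in raw_targets)
targets = [pad(t, max_len) for t in raw_targets]

# Dropping the last column removes END for the full-length row,
# but only a PAD for the shorter row.
truncated = [row[:-1] for row in targets]

# Decoder inputs are the truncated targets shifted right by a start token.
decoder_inputs = [[START_ID] + row for row in truncated]

# Assumed convention: a step contributes to the loss only if its target is real.
weights = [[0 if t == PAD_ID else 1 for t in row] for row in targets]

# Steps where END is fed as input, paired with their loss weight.
end_input_weights = [
    w
    for inp_row, w_row in zip(decoder_inputs, weights)
    for inp, w in zip(inp_row, w_row)
    if inp == END_ID
]
```

Under that assumed masking, the END token that survives truncation in the shorter row is only ever fed at a step whose weight is 0, so it would not affect the loss.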

Details of the evaluation method

Hello jingxil,

Thank you for sharing this!
This is one of the best, most understandable implementations of Pointer Networks I have come across. 😃

I was able to train the model and achieve some good training results:
[Image: ch-5]

Could you please help me with the following questions?

  1. Is the purpose of the Forward Only mode to only switch to evaluation mode, or is there something more?

  2. During prediction how are encoder and decoder weights decided? During training we maintain an array of weights for the encoder (shape: [batch_size,max_input_sequence_len,2]) and decoder (shape: [batch_size,max_output_sequence_len])

    What happens when we see a new input during prediction? Do we set the encoder weights and decoder weights to '1' with zero padding?
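
To make the second question concrete, here is a minimal sketch of the "ones with zero padding" idea for encoder weights on a single new input. The helper name pad_with_weights and the sample points are hypothetical; how the repository actually builds its weights at prediction time is not stated in this thread.

```python
def pad_with_weights(points, max_len):
    """Pad a list of (x, y) points to max_len and build a matching 0/1 mask."""
    if len(points) > max_len:
        raise ValueError("input is longer than max_len")
    n_pad = max_len - len(points)
    padded = [list(p) for p in points] + [[0.0, 0.0] for _ in range(n_pad)]
    # Weight 1.0 marks a real coordinate, 0.0 marks padding.
    weights = [[1.0, 1.0] for _ in points] + [[0.0, 0.0] for _ in range(n_pad)]
    return padded, weights

# A hypothetical new input with three points, padded to a maximum length of 5.
new_input = [(0.1, 0.2), (0.4, 0.9), (0.7, 0.3)]
padded, weights = pad_with_weights(new_input, max_len=5)
```

Each example then has shape [max_input_sequence_len, 2] for both the padded input and its weights, matching the per-example slice of the encoder weight shape quoted above.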

Thanks again for the amazing work!

Evaluating test data while training, not in another process

Hello jingxil,

Thank you for sharing this wonderful code :)
I'm trying to use it for a sentence reordering task, and I've run into some trouble.

You use the forward_only option to separate the "train" scope and the "test" (or inference) scope into different processes.
However, I want to run the test during training to see whether the model overfits, but that's not easy because the attention mechanism makes training and testing run differently :(

Is there a good way to solve this problem?

Even if you don't code it, I'd really appreciate it if you could share reference sites or hints I could refer to.

Thank you
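
One general pattern for this, sketched below in plain Python, is to share a single decoder step function between two loops: one that feeds the ground-truth token at each step (training) and one that feeds back the model's own prediction (evaluation). The names decode and toy_step are hypothetical stand-ins, and this is not how the repository is structured; it only illustrates why the two modes differ and how they can coexist in one process.

```python
def decode(step_fn, start_token, num_steps, targets=None):
    """Run a decoder loop.

    With targets, the ground truth is fed at each step (teacher forcing).
    Without targets, the previous prediction is fed back (greedy inference).
    """
    outputs = []
    prev = start_token
    for t in range(num_steps):
        pred = step_fn(prev)
        outputs.append(pred)
        prev = targets[t] if targets is not None else pred
    return outputs

def toy_step(prev):
    """Hypothetical stand-in for a trained decoder step."""
    return (prev + 1) % 5

# Same step function, two modes, one process.
train_out = decode(toy_step, start_token=0, num_steps=4, targets=[1, 3, 4, 0])
eval_out = decode(toy_step, start_token=0, num_steps=4)
```

In a graph-based framework the equivalent idea would be to build both loops over shared weights and run the evaluation one every few training steps on held-out data.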

The real length of the encoder inputs

Hello, I have another question and hope you could help me out.

In the model code, from line 109 to line 125:

... other code ...

# Shape: [batch_size, max_input_sequence_len + 1, 2]
encoder_inputs = tf.stack(encoder_inputs, axis=0)

... other code ...

# Encode input to obtain memory for later queries
memory, _ = tf.nn.bidirectional_dynamic_rnn(fw_enc_cell, bw_enc_cell, encoder_inputs, enc_input_lens, dtype=tf.float32)

... other code ...

If I understand correctly, the encoder inputs include the END token, so the real length of the encoder inputs should be enc_input_lens + 1, where the extra 1 accounts for END.

Then why isn't enc_input_lens changed in this code?

Thank you :) @jingxil
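
The concern can be illustrated without TensorFlow. A length-aware RNN such as tf.nn.bidirectional_dynamic_rnn treats time steps beyond the given sequence length as padding, so the sketch below mimics that by slicing. It assumes the extra token is appended after the real points, which the excerpt above does not confirm; the names are hypothetical.

```python
END = "<END>"  # hypothetical marker for the extra token

def consumed_steps(sequence, seq_len):
    """Mimic a length-aware RNN: only the first seq_len steps are read."""
    return sequence[:seq_len]

points = ["p1", "p2", "p3"]
enc_inputs = points + [END]   # assumption: the extra token comes last
enc_input_len = len(points)   # length counted before the token was added

seen_with_n = consumed_steps(enc_inputs, enc_input_len)
seen_with_n_plus_1 = consumed_steps(enc_inputs, enc_input_len + 1)
```

With the unadjusted length the final position is never read; passing the length plus one covers every position, including the extra token.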
