Comments (6)

Duum commented on May 12, 2024

I have the same question; I don't think the mask works!

mingxiansen commented on May 12, 2024

I tested the model, and you are right: none of the masking works! @Kyubyong

bobobe commented on May 12, 2024

I found that there are two ways the position embedding is implemented. The key mask works on a PE whose parameters are learned during training, but it does not work on the sinusoidal PE from the paper, because the padding embeddings are not 0! Who can fix it? @Kyubyong
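
A minimal sketch of what such a fix might look like, in the repo's TF 1.x style (inputs, position_enc, maxlen, and num_units are hypothetical names here, not the repo's actual variables): zero the sinusoidal encoding at padded positions, so those time steps stay all-zero and the key mask can still detect them.

import tensorflow as tf  # TF 1.x, as used by the repo

# Assumed inputs (hypothetical names):
#   inputs: int32 token ids, shape [batch, maxlen]; id 0 is assumed to be <pad>
#   position_enc: sinusoidal table from the paper, shape [maxlen, num_units]
pos_ind = tf.tile(tf.expand_dims(tf.range(maxlen), 0), [tf.shape(inputs)[0], 1])  # [batch, maxlen]
pos_emb = tf.nn.embedding_lookup(position_enc, pos_ind)  # [batch, maxlen, num_units]

# Zero the encoding wherever the token is padding, so padded positions
# remain all-zero and the downstream key mask still works.
pad_mask = tf.cast(tf.not_equal(inputs, 0), tf.float32)  # [batch, maxlen]
pos_emb *= tf.expand_dims(pad_mask, -1)  # broadcast over the unit dimension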

shaunzhuyw commented on May 12, 2024

I also found the same problem: if the raw code is used, the mask doesn't work at all. I worked around it like this:

self.length_mask = tf.cast(tf.sequence_mask(length_batch, maxlen), tf.int32)  # [batch, maxlen]: 1 for real tokens, 0 for padding
# Lookup table: row 0 is all zeros (padding), rows 1..maxlen-1 are all ones.
length_embedding = tf.Variable(tf.concat([tf.zeros(shape=(1, num_units)), tf.ones(shape=(maxlen - 1, num_units))], 0), trainable=False)
self.length_mask_embedding = tf.nn.embedding_lookup(length_embedding, self.length_mask)  # [batch, maxlen, num_units]
self.dec_position_embedding *= self.length_mask_embedding  # zero the positional embedding at padded positions
self.dec += self.dec_position_embedding

where length_batch is the length of each sentence in the batch.

Yang-Charles commented on May 12, 2024

I recently ran this model and found that it doesn't appear to run: the tqdm progress bar never advances.

zsgchinese commented on May 12, 2024

@shaunzhuyw I ran into the same problem and used the same workaround as in your comment above. One question about it: in length_embedding, shouldn't the ones part have shape [1, num_units] rather than [maxlen - 1, num_units]? The mask only takes the values 0 and 1, so a two-row table should be enough.
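
For what it's worth, a minimal sketch of that two-row variant (reusing the names from the snippet above; untested against the repo):

# Two-row table: index 0 -> zeros (padding), index 1 -> ones (real token).
length_embedding = tf.Variable(
    tf.concat([tf.zeros(shape=(1, num_units)),
               tf.ones(shape=(1, num_units))], 0),
    trainable=False)
# The int mask only ever indexes rows 0 and 1, so this behaves exactly
# like the [maxlen, num_units] table above while being smaller.
self.length_mask_embedding = tf.nn.embedding_lookup(length_embedding, self.length_mask)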
