Comments (8)

ZhitingHu commented on July 23, 2024

In my experience, the transformer decoder works well on other tasks. I personally haven't tried BERT as an encoder for generation.

What summarization data are you using?

Have you tried replacing the transformer decoder with an RNN decoder for a quick comparison?

from texar.

santhoshkolloju commented on July 23, 2024

I am using the CNN data to build the model. I have not tried an RNN decoder yet; I will try that.
Could you please check the code at the given link to see whether something is wrong?

santhoshkolloju commented on July 23, 2024

I was able to debug the problem. The BERT weights were being disturbed during training, so the encoder produced similar embeddings regardless of the example I passed in, and the context of the sentence was not captured correctly. Is there a way to set trainable to false for the BERT-based encoder directly?

ZhitingHu commented on July 23, 2024

You can use tf.stop_gradient to block gradient backpropagation into BERT, or pass the variables argument when calling tf.contrib.layers.optimize_loss so that the BERT variables are excluded from the update.

Vibha111094 commented on July 23, 2024

I am still facing the same issue.
Can you please elaborate on how you solved it?

Vibha111094 commented on July 23, 2024

Is this the correct way?
hparams=tf.stop_gradient(hparams)

Vibha111094 commented on July 23, 2024

This seems to be working:
encoder_output = tf.stop_gradient(encoder_output)
Is this the right way?

santhoshkolloju commented on July 23, 2024

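import tensorflow as tf
import texar as tx  # `import texar.tf as tx` in later Texar-TF releases

# Collect every trainable variable, then drop those whose name contains 'bert'.
allvars = tf.trainable_variables()
nonBert = [v for v in allvars if 'bert' not in v.name]

# Only the non-BERT variables are handed to the optimizer, so BERT stays frozen.
train_op = tx.core.get_train_op(
    mle_loss,
    learning_rate=learning_rate,
    variables=nonBert,
    global_step=global_step,
    hparams=opt)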
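
To make the name-based filter above easier to check without building a graph, here is a minimal sketch in plain Python. The `Var` stand-in, the `exclude_by_scope` helper, and the variable names are all hypothetical and only illustrate the idea; real TensorFlow variables expose the same `name` attribute, which is why the filter has to test `v.name` rather than `v` itself.

```python
from collections import namedtuple

# Hypothetical stand-in for a TensorFlow variable; only `name` matters here.
Var = namedtuple("Var", ["name"])


def exclude_by_scope(variables, scope="bert"):
    """Return the variables whose name does not contain `scope`."""
    return [v for v in variables if scope not in v.name]


# Illustrative names only, not taken from a real checkpoint.
all_vars = [
    Var("bert/embeddings/word_embeddings:0"),
    Var("bert/encoder/layer_0/attention/self/query/kernel:0"),
    Var("decoder/layer_0/self_attention/kernel:0"),
    Var("decoder/output_layer/bias:0"),
]

non_bert = exclude_by_scope(all_vars)
print([v.name for v in non_bert])
```

Note that this is a substring match, so any variable with "bert" anywhere in its name is excluded; matching on a prefix such as "bert/" would be stricter if the decoder ever has similarly named variables.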
