
Comments (6)

mandarjoshi90 avatar mandarjoshi90 commented on August 20, 2024

Hi Shubham,
Merging with transformers isn't a top priority for now. I'm not sure I understand the second part of the post. Are you finetuning SpanBERT and seeing variance across seeds? If so, how much?

from spanbert.

shtoshni avatar shtoshni commented on August 20, 2024

Hi Mandar,
I am not finetuning, just calling BertModel.from_pretrained('spanbert-base-cased'). Because the current URL points to a .tar.gz, I have to use your codebase, which is based on pytorch_pretrained_bert (passing the tar URL to the current transformers API makes it think it's a TF checkpoint).
For tokenization, though, I'm using the BertTokenizer from the transformers API, because it provides a sort of unified interface for many models and, from what I could understand, the SpanBERT model uses the same tokenizer.
For some reason, the combination leads to results that vary across runs (no finetuning involved!).

BTW, I understand that merging with transformers is not a priority, but just to clarify: do you think that for non-finetuning results it's just a matter of calling the BERT-based APIs with the SpanBERT model's initialization? Also, what might change for finetuning?


shtoshni avatar shtoshni commented on August 20, 2024

Ah sorry, I figured out the issue. transformers puts the pretrained model in eval() mode by default, but that's not the case with the SpanBERT codebase. With the model left in training mode, dropout stays active, and its randomization causes the results to vary across runs :)
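
For anyone hitting the same thing, here is a minimal sketch of the effect, assuming PyTorch is installed. It uses a small stand-in network with a dropout layer rather than SpanBERT itself (the layer sizes and the `run_twice` helper are made up for illustration), so it only demonstrates the train/eval behaviour, not the actual loading code.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-in for an encoder that contains dropout (not SpanBERT).
model = nn.Sequential(
    nn.Linear(16, 256),
    nn.Dropout(p=0.5),
    nn.Linear(256, 4),
)
x = torch.randn(2, 16)


def run_twice(module, inputs):
    """Run two forward passes without gradients and report whether they match."""
    with torch.no_grad():
        first = module(inputs)
        second = module(inputs)
    return torch.equal(first, second)


# A freshly constructed nn.Module is in training mode, so dropout is active
# and every forward pass samples a new random mask.
model.train()
train_outputs_match = run_twice(model, x)

# eval() disables dropout, so repeated passes over the same input agree.
model.eval()
eval_outputs_match = run_twice(model, x)

print("train mode, outputs match:", train_outputs_match)
print("eval mode, outputs match:", eval_outputs_match)
```

So when extracting features without finetuning, calling `model.eval()` once after loading should be enough to make the outputs repeatable.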


phosseini avatar phosseini commented on August 20, 2024

Ah sorry, I figured out the issue. transformers puts the pretrained model in eval() mode by default, but that's not the case with the SpanBERT codebase. With the model left in training mode, dropout stays active, and its randomization causes the results to vary across runs :)

Did you manage to convert the code to a transformers-friendly version? I am trying to convert the run_*.py scripts to a transformers-based format, but I'm running into some trouble: issue


shtoshni avatar shtoshni commented on August 20, 2024

No, I just used the SpanBERT codebase to load the pretrained model.
Regarding your issue, I think the model rarely returns just the logits. Typically the output is a list or a tuple, or, I think, in the latest codebase they made it a class - see here. I think you should inspect the model output, check its type, etc.
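
A rough sketch of that kind of inspection, in plain Python. The `extract_logits` helper is hypothetical, and the stand-in outputs are fake objects rather than real model outputs. It also assumes that the logits come first in a tuple or list, which may not hold for every model or call (for example, a loss can come first when labels are passed), so check the actual output before relying on it.

```python
from types import SimpleNamespace


def extract_logits(output):
    """Return the logits from an output object, a tuple/list, or a bare value."""
    # Output classes: assumed to expose the logits as an attribute.
    if hasattr(output, "logits"):
        return output.logits
    # Tuples/lists: ASSUMPTION that the logits are the first element.
    if isinstance(output, (tuple, list)):
        return output[0]
    # Anything else is treated as the logits themselves.
    return output


# Fake outputs standing in for the different return styles.
class_style = SimpleNamespace(logits="LOGITS", hidden_states=None)
tuple_style = ("LOGITS", "HIDDEN_STATES")

for out in (class_style, tuple_style):
    # Print the type first; this is usually what reveals the mismatch.
    print(type(out).__name__, "->", extract_logits(out))
```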


phosseini avatar phosseini commented on August 20, 2024

No, I just used the SpanBERT codebase to load the pretrained model.
Regarding your issue, I think the model rarely returns just the logits. Typically the output is a list or a tuple, or, I think, in the latest codebase they made it a class - see here. I think you should inspect the model output, check its type, etc.

Thanks for the feedback. I finally managed to write a transformers-compatible version - see here. It still needs some more testing for TACRED, but I could run it on my relation extraction task/data, which follows the same format as TACRED.

