
Comments (22)

kamalkraj commented on May 19, 2024

Label 'X' is not considered in the F1 metrics:

if m and label_map[label_ids[i][j]] != "X":

Label 'X' is not equal to label 'O'.

from bert-ner.

kugwzk commented on May 19, 2024

But did you use it during training?

kamalkraj commented on May 19, 2024

For training I used "X".
During inference and in the F1 metrics, only the output label of each word's first sub-token is considered, the same as in the BERT paper.

kugwzk commented on May 19, 2024

I think adding an extra label to the standard CoNLL-2003 NER dataset makes the results not directly comparable with previous work. Could you remove the 'X' label during training and still get a similar result?

kamalkraj commented on May 19, 2024

If you remove "X" during training, or replace "X" with "O", model performance drops to ~89 F1.

kugwzk commented on May 19, 2024

That is my point: using the 'X' label inflates the F1 score, which is not a fair comparison. I get a similar result of about 91.3 F1 without it. I also believe the original BERT paper removed the 'X' label, and that its higher score comes from using document-level context. In short, the 'X' label doesn't carry any signal.

kamalkraj commented on May 19, 2024

91.3 without using "X"??

kugwzk commented on May 19, 2024

Yes. I take the wordpiece outputs from the BERT model and keep only each word's first sub-token vector, so I end up with the same number of vectors as there are words in the standard dataset. A softmax layer on top then gives the final result. I only use BertModel from pytorch_pretrained_bert.
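The first-subtoken mapping described above can be sketched as follows. This is a toy illustration with plain lists standing in for BERT hidden-state vectors; `pool_first_subtokens` is a hypothetical helper, not a function from pytorch_pretrained_bert:

```python
def pool_first_subtokens(hidden_states, wordpieces):
    """Keep only the hidden state of the first wordpiece of each word.

    Continuation pieces carry the '##' prefix in BERT's WordPiece
    tokenizer, so dropping them leaves one vector per original word.
    """
    return [h for h, wp in zip(hidden_states, wordpieces)
            if not wp.startswith("##")]

pieces = ["Jim", "Hen", "##son", "was", "a", "puppet", "##eer"]
states = [[float(i)] for i in range(len(pieces))]  # stand-in vectors
print(pool_first_subtokens(states, pieces))
# [[0.0], [1.0], [3.0], [4.0], [5.0]] -> one vector per word, ready
# for a linear + softmax classification layer
```

The pooled sequence then has exactly one vector per gold label, so the classifier never sees (and never predicts) 'X'.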

kamalkraj commented on May 19, 2024

For example, after extracting features for the sentence below:
Jim Hen ##son was a puppet ##eer
You're giving only the [Jim, Hen, was, a, puppet] hidden states to the linear classification layer?

kugwzk commented on May 19, 2024

Yes, because I think the fine-tuned BERT can learn this pattern.

kamalkraj commented on May 19, 2024

I will try this approach and let you know.

kamalkraj commented on May 19, 2024

@kugwzk
Can you share your code?
How are you handling padding after extracting the first sub-token hidden states from BERT?

kugwzk commented on May 19, 2024

I just record the original word positions in a Python dict. For example, [Jim Hen ##son was a puppet ##eer] maps to [0, 1, 3, 4, 5], and then I pad the original word sequence again before the classifier layer. It may be slow :).
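The record-positions-then-repad step could look roughly like this. A minimal sketch under the '##' continuation-prefix assumption; the function names are illustrative, not from the repo:

```python
def first_word_positions(wordpieces):
    """Index of the first wordpiece of each original word."""
    return [i for i, wp in enumerate(wordpieces) if not wp.startswith("##")]

def repad(vectors, max_words, pad_vector):
    """Pad per-word vectors back to a fixed length for batching.

    After first-subtoken pooling, sentences in a batch end up with
    different word counts, so they must be padded again before the
    classifier layer.
    """
    return vectors + [pad_vector] * (max_words - len(vectors))

pieces = ["Jim", "Hen", "##son", "was", "a", "puppet", "##eer"]
print(first_word_positions(pieces))  # [0, 1, 3, 4, 5]
```

Doing this gather-and-repad per sentence in Python is what makes the approach slow; vectorizing the index selection over the batch would remove most of that cost.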

alphanlp commented on May 19, 2024

I think we can mask out the X labels when computing the training loss.
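That masking idea could be sketched like this (illustrative only; in practice this mask would be multiplied into the per-token cross-entropy, e.g. via an `ignore_index` on the loss):

```python
def loss_mask(labels, ignore_label="X"):
    """1.0 where a position contributes to the loss, 0.0 for 'X' subtokens."""
    return [0.0 if y == ignore_label else 1.0 for y in labels]

labels = ["B-PER", "X", "O", "B-LOC", "X"]
print(loss_mask(labels))  # [1.0, 0.0, 1.0, 1.0, 0.0]
```

With this mask the continuation sub-tokens still pass through BERT, but neither training nor evaluation ever scores them.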

ereday commented on May 19, 2024

I agree with @kugwzk about the misuse of the X label. @tkukurin In the latest version, are you still using X during training (or evaluation), or have you already removed it as suggested?

kamalkraj commented on May 19, 2024

@ereday
The latest version still uses X. I have code that works without the X label, but I need to clean it up a bit. I will try to push the code by Monday.

Nic-Ma commented on May 19, 2024

Hi @kamalkraj,

I am Nic from NVIDIA; thanks for your contribution to this project!
I tried replacing [CLS] and X with O directly, and removed [SEP].
I think you were supposed to release the new code today; have you finished?
Thanks.

kamalkraj commented on May 19, 2024

@toxic2m
Check out the experiment branch.

Nic-Ma commented on May 19, 2024

Hi @kamalkraj,

Actually, I have already done this part locally, and I suggest you map [CLS] and [SEP] to O directly.
Then your FC layer only outputs the real number of classes, which gives better performance.
Thanks.

sbmaruf commented on May 19, 2024

Has anyone been able to reproduce the BERT paper's NER results (92.4 F1 for BERT-Base)?
The experiment on the master branch supports the result given in the BERT paper, but @kugwzk has raised a problem with it. Has anyone reproduced the results without the inferred X tags?

kugwzk commented on May 19, 2024

@sbmaruf The CoNLL-2003 NER result reported in the original BERT paper used document context, which differs from the standard sentence-based evaluation. You can see more about that here: allenai/allennlp#2067 (comment)

sbmaruf commented on May 19, 2024

@kugwzk Thanks for your reply, but how do you add document-level context for NER?
Any ideas or code repos?
