Comments (7)

mukesh-mehta avatar mukesh-mehta commented on June 19, 2024 3

There are a few other tags present in the data that are not BIO tags, i.e. they lack the B- or I- prefix:

“	data/source_txt/train/t1_biology_1_404.txt	 22015	 22016	 I-Qualifier	 T221	 T220	 Supplements
seahorse”)—a	data/source_txt/train/t1_biology_1_404.txt	 22016	 22028	 Qualifier	 T221	 T220	 Supplements
transfer	data/source_txt/train/t1_biology_1_404.txt	 26543	 26551	 B-Qualifier	 T253	 T251	 Supplements
steady	data/source_txt/train/t1_biology_1_0.txt	 2683	 2689	 I-Alias-Term	 T211	 T210	 AKA
state”)—the	data/source_txt/train/t1_biology_1_0.txt	 2690	 2701	 Alias-Term	 T211	 T210	 AKA
DNA	data/source_txt/train/t1_biology_1_0.txt	 3323	 3326	 B-Alias-Term	 T221	 T220	 AKA
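
A minimal sketch of how rows like these could be flagged automatically. It assumes the tag sits in the fifth tab-separated column, as the sample rows above suggest; the function name `find_non_bio_rows` is hypothetical and not part of the deft_corpus code.

```python
def find_non_bio_rows(tsv_text):
    """Return (token, tag) pairs whose tag is neither 'O' nor prefixed with B-/I-."""
    flagged = []
    for line in tsv_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 5:
            # Blank lines (sentence separators) or malformed rows are skipped.
            continue
        # Assumption: column 1 is the token, column 5 is the tag.
        token, tag = fields[0].strip(), fields[4].strip()
        if tag != "O" and not tag.startswith(("B-", "I-")):
            flagged.append((token, tag))
    return flagged


if __name__ == "__main__":
    path = "data/source_txt/train/t1_biology_1_0.txt"
    sample = "\n".join([
        f"steady\t{path}\t 2683\t 2689\t I-Alias-Term\t T211\t T210\t AKA",
        f"state”)—the\t{path}\t 2690\t 2701\t Alias-Term\t T211\t T210\t AKA",
        f"DNA\t{path}\t 3323\t 3326\t B-Alias-Term\t T221\t T220\t AKA",
    ])
    for token, tag in find_non_bio_rows(sample):
        print(f"{token!r} has non-BIO tag {tag!r}")
```

Running the same function over every file in the train and dev folders would give a full list of offending rows to report.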

I found the following list of tags in the data:

  • 'O',
  • 'I-Definition',
  • 'I-Term',
  • 'I-Secondary-Definition',
  • 'B-Term',
  • 'B-Definition',
  • 'I-Definiti-frag',
  • 'I-Qualifier',
  • 'I-Alias-Term',
  • 'B-Alias-Term',
  • 'B-Secondary-Definition',
  • 'I-Referential-Definition',
  • 'B-Referential-Definition',
  • 'B-Qualifier',
  • 'B-Referential-Term',
  • 'I-Referential-Term',
  • 'B-Definiti-frag',
  • 'I-Ordered-Definition',
  • 'Definition',
  • 'Term',
  • 'I-Ordered-Term',
  • 'Alias-Term',
  • 'B-Te-frag',
  • 'B-Ordered-Definition',
  • 'B-Ordered-Term',
  • 'Secondary-Definition',
  • 'I-Te-frag',
  • 'Qualifier',
  • 'Referential-Definition',
  • 'B-Alias-Te-frag'

from deft_corpus.

mukesh-mehta avatar mukesh-mehta commented on June 19, 2024 1

Thanks @sashaspala for resolving the issue

sashaspala avatar sashaspala commented on June 19, 2024

You're right - we don't really have a discussion about fragments in the paper. They're not super common in the textbooks case, but they do come up occasionally when the definition or term phrase is non-contiguous (often when the term is plopped in the definition phrase). I'll add this to the to-do list.

Franck-Dernoncourt avatar Franck-Dernoncourt commented on June 19, 2024

Thanks, FYI this is the tag list I had generated two months ago, but I am not sure if it is still up-to-date:

image

mukesh-mehta avatar mukesh-mehta commented on June 19, 2024

Any update on this issue?

Franck-Dernoncourt avatar Franck-Dernoncourt commented on June 19, 2024

Thanks @sashaspala, this fixes the tagging issue raised by Mukesh Mehta, but was any documentation added to explain all the tags, such as B-Definiti-frag?

mukesh-mehta avatar mukesh-mehta commented on June 19, 2024

Updated list of BIO tags from the train and dev sets:

['B-Referential-Definition',
 'I-Referential-Term',
 'I-Alias-Term',
 'I-Qualifier',
 'B-Ordered-Term',
 'B-Ordered-Definition',
 'B-Referential-Term',
 'O',
 'B-Qualifier',
 'I-Term-frag',
 'I-Definition',
 'I-Definition-frag',
 'I-Referential-Definition',
 'I-Term',
 'B-Secondary-Definition',
 'I-Ordered-Definition',
 'B-Alias-Term',
 'I-Ordered-Term',
 'B-Definition',
 'B-Term-frag',
 'B-Definition-frag',
 'I-Secondary-Definition',
 'B-Term',
 'B-Alias-Term-frag']
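
For comparison, here is a small sketch that diffs the two tag lists posted in this thread and checks that every tag in the updated list is 'O' or carries a B-/I- prefix. The tag names are copied from the comments above; the variable and function names are only illustrative.

```python
# Tags from the first list in this thread (before the fix).
OLD_TAGS = set("""
O I-Definition I-Term I-Secondary-Definition B-Term B-Definition
I-Definiti-frag I-Qualifier I-Alias-Term B-Alias-Term B-Secondary-Definition
I-Referential-Definition B-Referential-Definition B-Qualifier
B-Referential-Term I-Referential-Term B-Definiti-frag I-Ordered-Definition
Definition Term I-Ordered-Term Alias-Term B-Te-frag B-Ordered-Definition
B-Ordered-Term Secondary-Definition I-Te-frag Qualifier
Referential-Definition B-Alias-Te-frag
""".split())

# Tags from the updated list (train and dev sets).
NEW_TAGS = set("""
B-Referential-Definition I-Referential-Term I-Alias-Term I-Qualifier
B-Ordered-Term B-Ordered-Definition B-Referential-Term O B-Qualifier
I-Term-frag I-Definition I-Definition-frag I-Referential-Definition I-Term
B-Secondary-Definition I-Ordered-Definition B-Alias-Term I-Ordered-Term
B-Definition B-Term-frag B-Definition-frag I-Secondary-Definition B-Term
B-Alias-Term-frag
""".split())


def is_bio(tag):
    """True for 'O' or any tag that starts with a B- or I- prefix."""
    return tag == "O" or tag.startswith(("B-", "I-"))


removed = sorted(OLD_TAGS - NEW_TAGS)
added = sorted(NEW_TAGS - OLD_TAGS)

print("removed:", removed)
print("added:", added)
print("all updated tags are BIO:", all(is_bio(t) for t in NEW_TAGS))
```

The diff suggests that the unprefixed tags are gone and that the truncated-looking labels (Definiti-frag, Te-frag, Alias-Te-frag) appear to have been replaced by full -frag names, although the thread does not confirm that mapping explicitly.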
