named-entity-recognition-bidirectionallstm-cnn-conll's People

Contributors

mxhofer, shuttle1987


named-entity-recognition-bidirectionallstm-cnn-conll's Issues

Cannot import compute_f1 from validation

I installed the validation library from pip and tried to run the package.
It seems that compute_f1 is no longer available.
Would scikit-learn's f1_score be a better choice?
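
For context on the scikit-learn question: applied to flattened tag sequences, `sklearn.metrics.f1_score` scores each token label on its own, whereas NER on CoNLL data is usually scored per entity span. Below is a minimal sketch of span-level F1 over BIO tags, assuming tags of the form `B-PER` / `I-PER` / `O`. The names `extract_spans` and `span_f1` are hypothetical and are not the repo's `compute_f1`.

```python
def extract_spans(tags):
    """Return the set of (start, end, type) entity spans in one BIO-tagged sentence.

    Assumption: a stray I- tag (no matching B- before it) starts a new span.
    """
    spans = set()
    start, etype = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel "O" closes the last span
        inside = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not inside:
            spans.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, etype = i, tag[2:]
    return spans


def span_f1(true_sentences, pred_sentences):
    """Precision, recall and F1 where a span counts only if boundaries and type match."""
    tp = n_pred = n_true = 0
    for true_tags, pred_tags in zip(true_sentences, pred_sentences):
        gold, pred = extract_spans(true_tags), extract_spans(pred_tags)
        tp += len(gold & pred)
        n_pred += len(pred)
        n_true += len(gold)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_true if n_true else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


gold = [["B-PER", "I-PER", "O", "B-LOC"]]
pred = [["B-PER", "I-PER", "O", "O"]]
precision, recall, f1 = span_f1(gold, pred)
```

In this example one of the two gold entities is found and nothing spurious is predicted, so precision is 1.0 and recall is 0.5.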

One question regarding padding

Hi,

I see that you pad the inputs to a fixed length of 52. However, the padding seems to be applied only to the character inputs, not to the word inputs.

 # 0-pads all words
 def padding(Sentences):
     maxlen = 52
     for sentence in Sentences:
         char = sentence[2]
         for x in char:
             maxlen = max(maxlen, len(x))
     for i, sentence in enumerate(Sentences):
         Sentences[i][2] = pad_sequences(Sentences[i][2], 52, padding='post')
     return Sentences

Each entry in Sentences is built as follows:

         dataset.append([wordIndices, caseIndices, charIndices, labelIndices]) 
     return dataset

I see that you build batches from sentences that have the same number of words. Is this the correct approach?
Could you please let me know?
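
The batching described in the question, grouping sentences by word count so that no word-level padding is needed inside a batch, can be sketched as below. This is a minimal illustration that assumes each sentence is a `[wordIndices, caseIndices, charIndices, labelIndices]` list as in the snippet above. `group_by_length` is a hypothetical helper and may differ from what the repo actually does.

```python
from collections import defaultdict


def group_by_length(sentences):
    """Group sentences so that every batch holds sentences with the same word count."""
    buckets = defaultdict(list)
    for sentence in sentences:
        word_indices = sentence[0]  # position 0 holds the word indices
        buckets[len(word_indices)].append(sentence)
    # One batch per distinct sentence length, shortest first.
    return [buckets[length] for length in sorted(buckets)]


# Toy data with made-up indices: two 2-word sentences and one 3-word sentence.
s0 = [[4, 9], [1, 0], [[5, 2], [7]], [0, 3]]
s1 = [[8, 1, 6], [0, 0, 1], [[3], [4, 4], [2]], [0, 0, 1]]
s2 = [[2, 2], [1, 1], [[6], [6]], [3, 3]]
batches = group_by_length([s0, s1, s2])
```

Within each batch the word, case, and label arrays already share one length, so only the per-word character lists still need padding.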

Dataset - Ontonotes

Hi,

I have the OntoNotes dataset as a set of folders, and I would like to convert it to the same format as the CoNLL-2003 .txt files provided in this repo. Could the authors share some details on how to preprocess OntoNotes so that the model can be trained on it as well?

Thanks!

Question about padding

 def padding(Sentences):
     maxlen = 52
     for sentence in Sentences:
         char = sentence[2]
         for x in char:
             maxlen = max(maxlen, len(x))
     for i, sentence in enumerate(Sentences):
         Sentences[i][2] = pad_sequences(Sentences[i][2], 52, padding='post')
     return Sentences

Please explain what is actually happening here. It would help me if you could explain with an example.
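
As a rough illustration of the function quoted above: `sentence[2]` holds one list of character indices per word, and `pad_sequences(..., 52, padding='post')` appends zeros to each of those lists until it has 52 entries. The first loop computes `maxlen`, but the value is never used, because the call hard-codes 52. The sketch below is a pure-Python stand-in for the Keras call, not the Keras implementation itself. `pad_post` is a hypothetical name, and a length of 6 replaces 52 to keep the example short.

```python
def pad_post(sequences, maxlen, value=0):
    """Stand-in for pad_sequences(sequences, maxlen, padding='post')."""
    padded = []
    for seq in sequences:
        # Keras truncates from the front by default (truncating='pre'),
        # so an over-long word keeps only its last `maxlen` characters.
        seq = list(seq)[-maxlen:]
        padded.append(seq + [value] * (maxlen - len(seq)))
    return padded


# One sentence with two words, given as character indices (made-up values).
char_indices = [[5, 12, 7], [9, 3]]
padded = pad_post(char_indices, maxlen=6)
# padded == [[5, 12, 7, 0, 0, 0], [9, 3, 0, 0, 0, 0]]
```

Every word ends up with the same number of character slots, so the character input of a sentence becomes a rectangular array of shape (number of words, 52).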

Question about CoNLL

Hi, I would like to ask another question, as I want to prepare my own data to test the NER model.
Do the other columns, namely the POS tag and the chunk tag, matter? Or is it sufficient to provide only the word and the NER tag?
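
A minimal sketch of reading CoNLL-style lines when only the word and the NER tag are needed. It assumes, without confirmation from the repo, that the loader uses only the first column (the word) and the last column (the NER tag) of each line. Under that assumption a two-column file parses the same as the four-column CoNLL-2003 layout. `read_conll` is a hypothetical helper and the sample lines are illustrative.

```python
def read_conll(lines):
    """Parse CoNLL-style lines into sentences of (word, ner_tag) pairs."""
    sentences, current = [], []
    for line in lines:
        line = line.strip()
        # A blank line or a -DOCSTART- marker ends the current sentence.
        if not line or line.startswith("-DOCSTART-"):
            if current:
                sentences.append(current)
                current = []
            continue
        columns = line.split()
        # Assumption: first column is the word, last column is the NER tag;
        # any POS or chunk columns in between are ignored.
        current.append((columns[0], columns[-1]))
    if current:
        sentences.append(current)
    return sentences


four_col = ["EU NNP B-NP B-ORG", "rejects VBZ B-VP O", ""]
two_col = ["EU B-ORG", "rejects O", ""]
parsed_four = read_conll(four_col)
parsed_two = read_conll(two_col)
```

Both inputs yield the same list of (word, tag) pairs here, which is why the middle columns would not matter for a loader written this way.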
