Keras implementation of "Few-shot Learning for Named Entity Recognition in Medical Text"

Home Page: https://arxiv.org/abs/1811.05468

License: MIT License

Languages: Jupyter Notebook 83.45%, Python 16.55%

Topics: bidirectional-lstm, cnn-keras, conll-2003, keras


named-entity-recognition-bidirectionallstm-cnn-conll's Issues

Cannot import compute_f1 from validation

I installed the validation library from pip and tried to run the code.
It seems that compute_f1 is no longer available in that package.
Would the scikit-learn F1 score be a better choice?
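
For context on the question above: scikit-learn's f1_score scores each token label independently, whereas NER results are usually reported per entity span. The following is a minimal sketch of a span-level F1 in plain Python, assuming BIO-style string tags such as "B-PER" and "I-PER". The names extract_spans and entity_f1 are hypothetical helpers written for this illustration; they are not the repo's compute_f1 and not part of any pip package.

```python
def extract_spans(tags):
    """Return the set of (start, end, type) entity spans in a BIO tag sequence."""
    spans = set()
    start, ent_type = None, None
    # A trailing "O" sentinel closes a span that runs to the end of the sentence.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and tag[2:] == ent_type
        if not continues:
            # Any tag that does not continue the open span closes it.
            if start is not None:
                spans.add((start, i, ent_type))
            start, ent_type = None, None
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            # "B-" always opens a span; a stray "I-" is treated as an opening too.
            start, ent_type = i, tag[2:]
    return spans


def entity_f1(true_seqs, pred_seqs):
    """Span-level precision, recall and F1 over parallel lists of tag sequences."""
    tp = n_pred = n_true = 0
    for true_tags, pred_tags in zip(true_seqs, pred_seqs):
        true_spans = extract_spans(true_tags)
        pred_spans = extract_spans(pred_tags)
        tp += len(true_spans & pred_spans)  # exact boundary and type match
        n_pred += len(pred_spans)
        n_true += len(true_spans)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_true if n_true else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


true = [["B-PER", "I-PER", "O", "B-LOC"]]
pred = [["B-PER", "I-PER", "O", "O"]]
precision, recall, f1 = entity_f1(true, pred)
```

A predicted span counts as correct here only if its boundaries and type both match exactly, which is stricter than token-level scoring.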

One question regarding padding

Hi,

I see that you pad the inputs to an equal length of 52. However, the padding seems to be applied only to the character inputs, not to the words.

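 from keras.preprocessing.sequence import pad_sequences

 # 0-pads the character indices of every word to a fixed length of 52
 def padding(Sentences):
     maxlen = 52
     # maxlen tracks the longest word seen, but it is never used below
     for sentence in Sentences:
         char = sentence[2]  # per-word character indices
         for x in char:
             maxlen = max(maxlen, len(x))
     for i, sentence in enumerate(Sentences):
         # pads only the character input; words, casing and labels stay unpadded
         Sentences[i][2] = pad_sequences(Sentences[i][2], 52, padding='post')
     return Sentences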

Each entry in Sentences is built as follows:

         dataset.append([wordIndices, caseIndices, charIndices, labelIndices]) 
     return dataset

I see that you build batches from inputs of equal length instead of padding the words. Is this the correct approach?
Could you please let me know?
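
To illustrate the approach described in this question, here is a minimal sketch in plain Python: each word's character indices are padded to a fixed length, and sentences are bucketed by token count so that every batch holds sentences of one length and the word-level inputs need no padding. The names pad_chars and group_by_sentence_length are hypothetical; pad_chars is a stand-in for Keras's pad_sequences, and its right-side truncation is an assumption that may differ from the library's default.

```python
from collections import defaultdict

CHAR_LEN = 52  # fixed per-word character length, taken from the quoted snippet


def pad_chars(char_indices, length=CHAR_LEN):
    """Right-pad with 0 (or cut off on the right) each word's character indices."""
    padded = []
    for word in char_indices:
        kept = list(word[:length])
        padded.append(kept + [0] * (length - len(kept)))
    return padded


def group_by_sentence_length(dataset):
    """Bucket [words, cases, chars, labels] entries by the number of tokens."""
    buckets = defaultdict(list)
    for words, cases, chars, labels in dataset:
        buckets[len(words)].append((words, cases, pad_chars(chars), labels))
    return buckets


# Toy data: two sentences with two tokens, one sentence with a single token.
dataset = [
    [[1, 2], [0, 1], [[3, 4], [5]], [0, 1]],
    [[7], [0], [[8, 9, 10]], [2]],
    [[4, 5], [1, 1], [[1], [2, 3]], [0, 0]],
]
buckets = group_by_sentence_length(dataset)
```

Every bucket can then be turned into one or more batches, since all word, casing and label sequences inside it share the same length.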

Dataset - Ontonotes

Hi,

I have the OntoNotes dataset in folders, and I would like to convert it to the same format as the CoNLL-2003 .txt files provided in this repo. Could the authors share some details on how to preprocess OntoNotes so that the model can be trained on it as well?

Thanks!
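
As a starting point for the conversion asked about above, here is a rough sketch of the output side only. It assumes the sentences have already been extracted from OntoNotes as lists of (token, BIO tag) pairs, and that the target files use the CoNLL-2003 layout of four space-separated columns (token, POS tag, chunk tag, NER tag) with a blank line after each sentence. The function to_conll and the "-X-" placeholders for the POS and chunk columns are hypothetical choices for this illustration; whether the repo's loader accepts placeholder values has not been checked.

```python
def to_conll(sentences, pos_placeholder="-X-", chunk_placeholder="-X-"):
    """Render (token, tag) sentences as CoNLL-2003-style text, one token per line."""
    lines = []
    for sentence in sentences:
        for token, tag in sentence:
            # Columns: token, POS (placeholder), chunk (placeholder), NER tag.
            lines.append(f"{token} {pos_placeholder} {chunk_placeholder} {tag}")
        lines.append("")  # a blank line marks the end of the sentence
    return "\n".join(lines)


example = [
    [("John", "B-PER"), ("lives", "O"), ("in", "O"), ("Paris", "B-LOC")],
    [("Hello", "O")],
]
text = to_conll(example)
```

Reading the OntoNotes annotation files themselves is not covered here, and the entity label set would also need to be mapped or extended, since it differs from the CoNLL-2003 one.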

Repository: mxhofer / named-entity-recognition-bidirectionallstm-cnn-conll


Question about padding

Question about CoNLL

How to prepare a custom training data ?

TensorFlow version?

testing for new sentences
