
bert-ner's Introduction

For better performance, you can try NLPGNN; see the NLPGNN repository for more details.

BERT-NER Version 2

Use Google's BERT for named entity recognition (CoNLL-2003 as the dataset).

The original version (see old_version for more detail) contains some hard-coded values and lacks corresponding annotations, which makes it inconvenient to understand. So in this updated version there are some new ideas and tricks (in data preprocessing and layer design) that can help you quickly implement the fine-tuning model (you just need to modify crf_layer or softmax_layer).
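
As a rough illustration of what swapping between those two layers looks like, here is a minimal TensorFlow 1.x sketch. It is only a sketch: it assumes bert_output holds the final-layer token embeddings and that labels, input_mask and seq_lengths come from the feature pipeline; the function names are illustrative rather than the exact ones in BERT_NER.py.

import tensorflow as tf

def softmax_layer(bert_output, labels, input_mask, num_labels):
    # Per-token classification: a dense projection plus masked cross-entropy.
    logits = tf.layers.dense(bert_output, num_labels)   # [batch, seq_len, num_labels]
    log_probs = tf.nn.log_softmax(logits, axis=-1)
    one_hot = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)
    per_token_loss = -tf.reduce_sum(one_hot * log_probs, axis=-1)
    mask = tf.cast(input_mask, tf.float32)              # ignore padded positions
    loss = tf.reduce_sum(per_token_loss * mask) / tf.reduce_sum(mask)
    return loss, tf.argmax(logits, axis=-1)

def crf_layer(bert_output, labels, seq_lengths, num_labels):
    # Drop-in alternative: score whole tag sequences with a linear-chain CRF.
    logits = tf.layers.dense(bert_output, num_labels)
    log_likelihood, transitions = tf.contrib.crf.crf_log_likelihood(
        logits, labels, seq_lengths)
    predictions, _ = tf.contrib.crf.crf_decode(logits, transitions, seq_lengths)
    return -tf.reduce_mean(log_likelihood), predictions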

Folder Description:

BERT-NER
|____ bert                          # needs to be cloned from [here](https://github.com/google-research/bert)
|____ cased_L-12_H-768_A-12	    # needs to be downloaded from [here](https://storage.googleapis.com/bert_models/2018_10_18/cased_L-12_H-768_A-12.zip)
|____ data		            # train data
|____ middle_data	            # middle data (label id map)
|____ output			    # output (final model, predict results)
|____ BERT_NER.py		    # main code
|____ conlleval.pl		    # eval code
|____ run_ner.sh    		    # run model and eval result

Usage:

bash run_ner.sh

What's in run_ner.sh:

python BERT_NER.py\
    --task_name="NER"  \
    --do_lower_case=False \
    --crf=False \
    --do_train=True   \
    --do_eval=True   \
    --do_predict=True \
    --data_dir=data   \
    --vocab_file=cased_L-12_H-768_A-12/vocab.txt  \
    --bert_config_file=cased_L-12_H-768_A-12/bert_config.json \
    --init_checkpoint=cased_L-12_H-768_A-12/bert_model.ckpt   \
    --max_seq_length=128   \
    --train_batch_size=32   \
    --learning_rate=2e-5   \
    --num_train_epochs=3.0   \
    --output_dir=./output/result_dir

perl conlleval.pl -d '\t' < ./output/result_dir/label_test.txt

Notice: the cased model is recommended, according to this paper. The CoNLL-2003 dataset and perl script come from here.

RESULTS (on test set):

Parameter setting:

  • do_lower_case=False
  • num_train_epochs=4.0
  • crf=False
accuracy:  98.15%; precision:  90.61%; recall:  88.85%; FB1:  89.72
              LOC: precision:  91.93%; recall:  91.79%; FB1:  91.86  1387
             MISC: precision:  83.83%; recall:  78.43%; FB1:  81.04  668
              ORG: precision:  87.83%; recall:  85.18%; FB1:  86.48  1191
              PER: precision:  95.19%; recall:  94.83%; FB1:  95.01  1311

Result description:

Here I just used the default parameters, but as Google's paper says, a 0.2% error is reasonable (they reported 92.4%). Maybe some tricks need to be added to the above model.

References:

[1] https://arxiv.org/abs/1810.04805

[2] https://github.com/google-research/bert

bert-ner's People

Contributors

kyzhouhzau, lapolonio


bert-ner's Issues

How to reproduce your results.

I used the same run command as yours, but I get worse results on the dev dataset.

eval_f = 0.89656204
eval_precision = 0.90508
eval_recall = 0.88843685
global_step = 653
loss = 17.190592

I use "BERT-Base, Multilingual Cased: 104 languages, 12-layer, 768-hidden, 12-heads, 110M parameters" as checkpoint, which is public by google at November 23rd, 2018.

prediction tool

Awesome work, man. Respect for your inspiring work.

I'd like to implement a simple command-line program so that more people can play around with this. I would like to refer to your work here.

Result of NER

Your final result seems to be:

accuracy:  98.07%; precision:  90.65%; recall:  88.29%; FB1:  89.45
              LOC: precision:  92.50%; recall:  91.71%; FB1:  92.10  1387
             MISC: precision:  82.63%; recall:  76.99%; FB1:  79.71  668
              ORG: precision:  88.75%; recall:  84.22%; FB1:  86.43  1191
              PER: precision:  94.51%; recall:  94.72%; FB1:  94.62  1311

Result description:
As Google's paper says a 0.2% error is reasonable(reported 92.4%).

How can this result be comparable to Google's? Google's result was 92.4 for BERT base and 92.8 for BERT large, while this result is 89.45.

Question: training the model without init_checkpoint

INFO:tensorflow:Error recorded from training_loop: local variable 'initialized_variable_names' referenced before assignment
INFO:tensorflow:training_loop marked as finished
WARNING:tensorflow:Reraising captured error
Traceback (most recent call last):
  File "BERT_NER.py", line 612, in <module>
    tf.app.run()
  File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\platform\app.py", line 125, in run
    _sys.exit(main(argv))
  File "BERT_NER.py", line 545, in main
    estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
  File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\contrib\tpu\python\tpu\tpu_estimator.py", line 2409, in train
    rendezvous.raise_errors()
  File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\contrib\tpu\python\tpu\error_handling.py", line 128, in raise_errors
    six.reraise(typ, value, traceback)
  File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\six.py", line 693, in reraise
    raise value
  File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\contrib\tpu\python\tpu\tpu_estimator.py", line 2403, in train
    saving_listeners=saving_listeners
  File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\estimator\estimator.py", line 354, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\estimator\estimator.py", line 1207, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\estimator\estimator.py", line 1237, in _train_model_default
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\contrib\tpu\python\tpu\tpu_estimator.py", line 2195, in _call_model_fn
    features, labels, mode, config)
  File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\estimator\estimator.py", line 1195, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\contrib\tpu\python\tpu\tpu_estimator.py", line 2479, in _model_fn
    features, labels, is_export_mode=is_export_mode)
  File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\contrib\tpu\python\tpu\tpu_estimator.py", line 1259, in call_without_tpu
    return self._call_model_fn(features, labels, is_export_mode=is_export_mode)
  File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\contrib\tpu\python\tpu\tpu_estimator.py", line 1533, in _call_model_fn
    estimator_spec = self._model_fn(features=features, **kwargs)
  File "BERT_NER.py", line 419, in model_fn
    if var.name in initialized_variable_names:
UnboundLocalError: local variable 'initialized_variable_names' referenced before assignment

Training the model without using the init_checkpoint flag returns this error
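
Judging from the traceback, a likely cause is that initialized_variable_names is only assigned inside the if init_checkpoint: branch of model_fn, so it is unbound when no checkpoint is given. A minimal sketch of the guard (illustrative, not necessarily the exact fix for BERT_NER.py):

initialized_variable_names = {}  # default when no checkpoint is supplied
if init_checkpoint:
    (assignment_map, initialized_variable_names) = \
        modeling.get_assignment_map_from_checkpoint(tvars, init_checkpoint)
    tf.train.init_from_checkpoint(init_checkpoint, assignment_map)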

No Output is generated

Hi,
When I run BERT-NER in evaluate and predict mode, no output is generated. The sizes of eval.tf_record, predict.tf_record and label_test.txt remain 0. What is the expected output of running this?

Below is the command I am running:
python BERT_NER.py --task_name="NER" --do_train=False --do_eval=True --do_predict=True --data_dir=NERdata --vocab_file=/home/local/BSSTEST/sandeep.a.bhutani/bucket_copy/uncased_L-12_H-768_A-12/vocab.txt --bert_config_file=/home/local/BSSTEST/sandeep.a.bhutani/bucket_copy/uncased_L-12_H-768_A-12/bert_config.json --init_checkpoint=/home/local/BSSTEST/sandeep.a.bhutani/bucket_copy/uncased_L-12_H-768_A-12/bert_model.ckpt --max_seq_length=16 --train_batch_size=32 --learning_rate=2e-5 --num_train_epochs=3.0 --output_dir=./output/result_dir/

Below is a sample tail of the output:

....
....
INFO:tensorflow:  name = bert/encoder/layer_11/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_11/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_11/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_11/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/pooler/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/pooler/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = output_weights:0, shape = (13, 768)
INFO:tensorflow:  name = output_bias:0, shape = (13,)
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from ./output/result_dir/model.ckpt-653
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:prediction_loop marked as finished
INFO:tensorflow:prediction_loop marked as finished

On the design of labels and label_id

Hello,

Your experiments with BERT-NER are very inspiring. I have a question I'd like to ask: the original BERT paper mentions "where no prediction is made for X". Should we still add "X" to the labels? And if we don't add "X", how do we reflect that "no prediction is made"?
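
For context, a common way to realize the "X" label is during WordPiece alignment: the first sub-token of each word keeps the real tag and the remaining sub-tokens get "X", which is then excluded from evaluation. A sketch of that alignment step (assuming a WordPiece tokenizer and word-level labels; this is not the exact code in BERT_NER.py):

def align_labels(words, labels, tokenizer):
    # The first sub-token of each word keeps its tag; the rest are marked "X".
    tokens, token_labels = [], []
    for word, label in zip(words, labels):
        pieces = tokenizer.tokenize(word)
        tokens.extend(pieces)
        token_labels.extend([label] + ["X"] * (len(pieces) - 1))
    return tokens, token_labels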

Training hangs

The training step gets stuck at the following point:
....
INFO:tensorflow: name = output_bias:0, shape = (13,)
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
2019-01-05 10:45:57.943879: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into /tmp/NER/model.ckpt.
(stuck for 1+hr)

I am wondering whether it is because I am using a different version of tensorflow (1.12)?
(I did not modify any script file, command as below:)

CUDA_VISIBLE_DEVICES=1 python BERT_NER.py \
    --task_name="NER" \
    --vocab_file=$BERT_DIR/vocab.txt \
    --bert_config_file=$BERT_DIR/bert_config.json \
    --init_checkpoint=$BERT_DIR/bert_model.ckpt \
    --do_train=True \
    --do_eval=True \
    --do_predict=True \
    --data_dir=NERdata \
    --train_batch_size=32 \
    --learning_rate=3e-5 \
    --num_train_epochs=3.0 \
    --max_seq_length=128 \
    --output_dir=/tmp/NER/

How to get the NER results?

I want to print the actual NER output, not only the precision, recall and so on. How can I get the NER results themselves?
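
One way to do this is to parse the predicted tag column of label_test.txt into entity spans. A sketch (assuming one token per line with blank lines as sentence boundaries; adjust the column index to your file's layout):

def extract_entities(path):
    # Collect (entity_text, entity_type) spans from the predicted BIO tags.
    spans, tokens, tag = [], [], None

    def flush():
        if tokens:
            spans.append((" ".join(tokens), tag))

    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) < 3:                  # blank line = sentence boundary
                flush()
                tokens, tag = [], None
                continue
            token, pred = parts[0], parts[-1]   # pred: predicted tag column
            if pred.startswith("B-"):
                flush()
                tokens, tag = [token], pred[2:]
            elif pred.startswith("I-") and tag == pred[2:]:
                tokens.append(token)
            else:
                flush()
                tokens, tag = [], None
    flush()
    return spans

print(extract_entities("./output/result_dir/label_test.txt"))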

How to integrate this in RASA

I want to integrate this BERT-based NER into RASA. But the training data formats for RASA entity recognition and this project are different. Can you please advise on how to integrate BERT-based entity recognition into RASA? I have integrated the intent classification that BERT offers into RASA, but I am having difficulties integrating the entity recognition.

INFO: tensorflow: Saving checkpoints for 0 into

INFO: tensorflow: Saving checkpoints for 0 into ./output/result_dir/model.ckpt
Hello, I am stuck on the line above, and I don't understand what it is doing.
It seems to be writing a file, and my CPU usage is indeed very high, but the file size does not change.
May I ask: when this happens, should I wait for it to continue, or is something wrong with my settings and I should quit and start over?
Stuck here for 2 hrs....

Doubts on the evaluation.

Hello Zhou. Thanks a lot for your contribution on this fine-tuning work. But I have a question about the evaluation metrics: it seems that your evaluation computes precision and recall separately for each tag class (B-person, I-person, B-MISC, I-MISC, ...). If so, might the results not be accurate enough? Thanks a lot!

Key output_bias not found in checkpoint

It was running fine at first, but then it threw this error: Key output_bias not found in checkpoint. It seems the error occurs when the model is loaded.

Urgently hoping someone can explain.

The BERT model is the base uncased_L-12_H-768_A-12 model.

Used other NER labels, and have some problems

I changed the NER labels in BERT_NER.py line 227, and I get the following problem:

InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Assign requires shapes of both tensors to match. lhs shape= [29,29] rhs shape= [13,13]

serving model

Hi, could you explain how to save the model using estimator.export_saved_model?
Thanks
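
For reference, the usual Estimator export pattern looks roughly like this. It is only a sketch: the feature names and shapes are assumptions based on the features built in BERT_NER.py, and the sequence length must match the one used in training:

def serving_input_receiver_fn():
    features = {
        "input_ids":   tf.placeholder(tf.int32, [None, 128], name="input_ids"),
        "input_mask":  tf.placeholder(tf.int32, [None, 128], name="input_mask"),
        "segment_ids": tf.placeholder(tf.int32, [None, 128], name="segment_ids"),
        "label_ids":   tf.placeholder(tf.int32, [None, 128], name="label_ids"),
    }
    return tf.estimator.export.ServingInputReceiver(features, features)

estimator.export_saved_model("./export", serving_input_receiver_fn)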

conll2003 run on gpu devF=0.90-0.91?

Thanks for your timely work!
When running on GPU, CoNLL-2003 doesn't perform as well as your result or the paper's; by the way, I tried several times, and the dev F score wanders in [0.89, 0.912].
Does your work run on TPU ?

Why does `tf.train.init_from_checkpoint` in func `model_fn` run twice?

In function model_fn:

        if init_checkpoint:
            (assignment_map, initialized_variable_names) = modeling.get_assignment_map_from_checkpoint(tvars,init_checkpoint)
            tf.train.init_from_checkpoint(init_checkpoint, assignment_map)
            if use_tpu:
                def tpu_scaffold():
                    tf.train.init_from_checkpoint(init_checkpoint, assignment_map)
                    return tf.train.Scaffold()
                scaffold_fn = tpu_scaffold
            else:
                tf.train.init_from_checkpoint(init_checkpoint, assignment_map)

Why should tf.train.init_from_checkpoint run twice?
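
It does look redundant: the first call before the if/else already covers both branches, so the calls inside the branches repeat it. A deduplicated sketch with the same behavior (keeping the TPU requirement that the restore happen inside the scaffold function):

        if init_checkpoint:
            (assignment_map, initialized_variable_names) = modeling.get_assignment_map_from_checkpoint(tvars,init_checkpoint)
            if use_tpu:
                def tpu_scaffold():
                    # On TPU the restore must run inside the scaffold fn.
                    tf.train.init_from_checkpoint(init_checkpoint, assignment_map)
                    return tf.train.Scaffold()
                scaffold_fn = tpu_scaffold
            else:
                tf.train.init_from_checkpoint(init_checkpoint, assignment_map)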

How to use conlleval.pl

How do I use the script conlleval.pl for evaluation?
I set up the environment to support perl, but I still cannot run the script correctly.

TPUEstimatorSpec.predictions must be dict of Tensors

When running predict on Google Colab (to use TPU) the code crashes with the following error:

TPUEstimatorSpec.predictions must be dict of Tensors.

To solve it, one can place the following code in create_model:

predict = tf.argmax(probabilities, axis=-1)
predict_dict = {'predictions': predict}  # this way it is not shot down by check in TPUEstimatorSpec
return loss, per_example_loss, logits, predict_dict

This of course also means changing the interpretation of the result:

result = estimator.predict(input_fn=predict_input_fn)
result = list(result)
result = [pred['predictions'] for pred in result]

Currently I'm unable to open a pull request, since that would mean verifying whether this really is a solution. I'm just posting it here for anyone who has the same problem.

label of "[SEP]", "[PAD]" and "[CLS]"

The idea of not using "[SEP]" in the NER task seems great, but why would it cause a problem in the CRF?

As for "[CLS]", have you tried replacing it with "[PAD]"?

BertModel is_training should be set to False?

Thanks a lot for the amazing work!
However, in BERT_NER.py line 348 you let BertModel's is_training parameter follow the fine-tuning flag, but the BERT model itself is not integrated into the real training model (you use the output of BertModel's final layer as the real input). So, as far as I'm concerned, this parameter should be set to False to get more reasonable results.

how to use GPU for training

Hello Zhou, how do I use a GPU to train? When use_tpu is set to false, training defaults to the CPU.

How can I fine-tune on my own data?

Hey,
is it possible to fine-tune on another dataset? I fed in my own data with different labels, but this causes a tensor shape mismatch. Of course I could rename my labels to match (it's just naming), but that's not how it should work, right?

Uncased or Cased?

Thanks for sharing the code for the NER task! May I know which model you used, cased or uncased? I am getting F1-dev 88.8 using the cased model and F1-dev 92.6 using the uncased model.

Problem: runs very slowly when converting single examples to features

I found that it costs too much time when running the convert_single_example function.

time0 = time.time()
feature = convert_single_example(ex_index, example, label_list, max_seq_length, tokenizer,mode)
time1 = time.time()
print("time cost:", time1-time0)

the cost is up to a few seconds!
time cost: 4.020495414733887

In the convert_single_example function we can fix it by guarding the pickle dump, so the label map is written only once:

if not os.path.exists('./output/label2id.pkl'):
    with open('./output/label2id.pkl', 'wb') as w:
        pickle.dump(label_map, w)

Question: loss in padding sequence

Hi! Thank you so much for releasing BERT-NER codes.
I have a question: the sentences are padded to max_sequence_length and the labels in the padded part are set to 0. But when computing the total loss in your code, I found that both the real input and the padded parts are included:
one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)
per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
loss = tf.reduce_sum(per_example_loss)

I am wondering whether this is reasonable. Should we mask the padded part?
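
For reference, a masked variant of the quoted loss (a sketch; input_mask is the 0/1 padding mask already built during feature conversion):

one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)
per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
mask = tf.cast(input_mask, tf.float32)
loss = tf.reduce_sum(per_example_loss * mask) / tf.reduce_sum(mask)  # padded positions excluded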

Wrong output format for evaluation script

The conlleval.pl script accepts input file format like <span> <groundtruth> <prediction>, but the model's output is <span> <prediction> <groundtruth>. Although it doesn't affect the F1 score, it swaps the values of precision and recall.
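
A quick workaround is to swap the last two columns before scoring (a sketch, assuming the tab-separated layout that run_ner.sh pipes into conlleval.pl):

# Rewrite "token<TAB>pred<TAB>gold" lines as "token<TAB>gold<TAB>pred".
with open("output/result_dir/label_test.txt") as fin, \
     open("output/result_dir/label_test_fixed.txt", "w") as fout:
    for line in fin:
        parts = line.rstrip("\n").split("\t")
        if len(parts) == 3:
            token, pred, gold = parts
            fout.write("\t".join([token, gold, pred]) + "\n")
        else:
            fout.write(line)  # keep sentence boundaries untouched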

grpc error

When I use tensorflow-serving gRPC, the call response = stub.Predict(request, timeout) gives the following error message:
status = StatusCode.FAILED_PRECONDITION
details = "Batched output tensor's 0th dimension does not equal the sum of the 0th dimension sizes of the input tensors"
debug_error_string = "{"created":"@1558057036.708000000","description":"Error received from peer","file":"src/core/lib/surface/call.cc","file_line":1017,"grpc_message":"Batched output tensor's 0th dimension does not equal the sum of the 0th dimension sizes of the input tensors","grpc_status":9}"

New sota on conll03 data?

Hi, so the F1 of your system is 0.9326?
Is that comparable with the numbers reported in the literature?

CUDA_ERROR_OUT_OF_MEMORY: out of memory; total memory reported:

I am getting an out-of-memory error.
I am running in prediction mode. I even tried reducing max_seq_length, but I still get the same error.

python BERT_NER.py --task_name="NER" --do_train=False --do_eval=False --do_predict=True --data_dir=NERdata --vocab_file=/home/local/BSSTEST/sandeep.a.bhutani/bucket_copy/uncased_L-12_H-768_A-12/vocab.txt --bert_config_file=/home/local/BSSTEST/sandeep.a.bhutani/bucket_copy/uncased_L-12_H-768_A-12/bert_config.json --init_checkpoint=/home/local/BSSTEST/sandeep.a.bhutani/bucket_copy/uncased_L-12_H-768_A-12/bert_model.ckpt --max_seq_length=16 --train_batch_size=32 --learning_rate=2e-5 --num_train_epochs=3.0 --output_dir=./output/result_dir/


INFO:tensorflow:prediction_loop marked as finished
WARNING:tensorflow:Reraising captured error
Traceback (most recent call last):
  File "BERT_NER.py", line 613, in <module>
    tf.app.run()
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "BERT_NER.py", line 603, in main
    for prediction in result:
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2446, in predict
    rendezvous.raise_errors()
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/error_handling.py", line 128, in raise_errors
    six.reraise(typ, value, traceback)
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/six.py", line 693, in reraise
    raise value
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2440, in predict
    yield_single_examples=yield_single_examples):
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 593, in predict
    hooks=all_hooks) as mon_sess:
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 921, in __init__
    stop_grace_period_secs=stop_grace_period_secs)
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 643, in __init__
    self._sess = _RecoverableSession(self._coordinated_creator)
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1107, in __init__
    _WrappedSession.__init__(self, self._create_session())
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1112, in _create_session
    return self._sess_creator.create_session()
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 800, in create_session
    self.tf_sess = self._session_creator.create_session()
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 566, in create_session
    init_fn=self._scaffold.init_fn)
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/python/training/session_manager.py", line 288, in prepare_session
    config=config)
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/python/training/session_manager.py", line 185, in _restore_checkpoint
    sess = session.Session(self._target, graph=self._graph, config=config)
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1551, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 676, in __init__
    self._session = tf_session.TF_NewSessionRef(self._graph._c_graph, opts)
tensorflow.python.framework.errors_impl.InternalError: failed initializing StreamExecutor for CUDA device ordinal 0: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_OUT_OF_MEMORY: out of memory; total memory reported: 16914055168

Question: size of logits

hi kyzhouhzau~

thank you for this project
I have a question about the size of num_labels in BERT_NER.py line 470.

num_labels=len(label_list)+1,

Why did you make the size of the logits one bigger than the size of label_list?
thank you so much!
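
One plausible explanation, judging from how the label map is built in this repository: label ids start at 1 and id 0 is left for the padded positions, so the logits need len(label_list)+1 classes. A sketch of that mapping (the label list shown is illustrative):

label_list = ["B-PER", "I-PER", "B-LOC", "I-LOC", "O", "X", "[CLS]", "[SEP]"]
label_map = {label: i for i, label in enumerate(label_list, 1)}  # ids start at 1
# Padded positions get label id 0, which needs its own class,
# hence num_labels = len(label_list) + 1.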

test.txt and label_test.txt don't have the same number of lines

hi
I use the code (thanks for that!), but there is a problem: when the test predictions are written to "output/result_dir/label_test.txt", I thought that this file must match "data/test.txt", but it doesn't!

I know that this (BERT-NER) library removes empty lines in "output/result_dir/label_test.txt", but even after removing the empty lines in "data/test.txt" the problem still exists
(the number of lines in "output/result_dir/label_test.txt" is less than in "data/test.txt").

here links of those files:
"data/test.txt" : https://github.com/kyzhouhzau/BERT-NER/blob/master/data/test.txt

"output/result_dir/label_test.txt" : https://github.com/kyzhouhzau/BERT-NER/blob/master/output/result_dir/label_test.txt

thanks

--max_seq_length=128 -> 150

hi kyzhouhzau~

thank you for this project :)
there is a minor error which I'd like to report.

def convert_single_example(ex_index, example, label_list, max_seq_length, tokenizer):
...
    input_ids = tokenizer.convert_tokens_to_ids(ntokens)
    input_mask = [1] * len(input_ids)
    while len(input_ids) < max_seq_length:
        input_ids.append(0)
        input_mask.append(0)
        segment_ids.append(0)
        label_ids.append(0)
    print('length check', len(input_ids), max_seq_length)
    assert len(input_ids) == max_seq_length  <-- error
    assert len(input_mask) == max_seq_length
    assert len(segment_ids) == max_seq_length
    assert len(label_ids) == max_seq_length
...

tokenizer.convert_tokens_to_ids(ntokens) can generate a list longer than max_seq_length when we are using --max_seq_length=128.

So I ran with --max_seq_length=150, and it was fine.
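
An alternative to enlarging the flag is to truncate before building ntokens, as the original BERT examples do. A sketch (reserving two slots for [CLS] and [SEP]; the variable names are assumptions about convert_single_example's internals):

# Truncate the word-piece tokens and their labels so that, after adding
# [CLS] and [SEP], the sequence never exceeds max_seq_length.
if len(tokens) > max_seq_length - 2:
    tokens = tokens[: max_seq_length - 2]
    labels = labels[: max_seq_length - 2]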

Sequence index problem

As far as I know, the BERT model outputs a fixed-size matrix for sentences of unequal length. That means "我是**人" and "他也是个**人" both output a matrix of shape 1*768 (I use the chinese_L-12_H-768_A-12 model). Then how does this model deal with word indices?
