
bert-ner's Introduction

For better performance, you can try NLPGNN; see the NLPGNN repository for more details.

BERT-NER Version 2

Use Google's BERT for named entity recognition (CoNLL-2003 as the dataset).

The original version (see old_version for more detail) contains some hard-coded values and lacks corresponding annotations, which makes it inconvenient to understand. So in this updated version there are some new ideas and tricks (in data preprocessing and layer design) that can help you quickly implement the fine-tuning model (you just need to modify crf_layer or softmax_layer).
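
As a rough illustration of what swapping between those two layers looks like, here is a minimal TensorFlow 1.x sketch. It is only a sketch: it assumes bert_output holds the final-layer token embeddings and that labels, input_mask and seq_lengths come from the feature pipeline; the function names are illustrative rather than the exact ones in BERT_NER.py.

import tensorflow as tf

def softmax_layer(bert_output, labels, input_mask, num_labels):
    # Per-token classification: a dense projection plus masked cross-entropy.
    logits = tf.layers.dense(bert_output, num_labels)   # [batch, seq_len, num_labels]
    log_probs = tf.nn.log_softmax(logits, axis=-1)
    one_hot = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)
    per_token_loss = -tf.reduce_sum(one_hot * log_probs, axis=-1)
    mask = tf.cast(input_mask, tf.float32)              # ignore padded positions
    loss = tf.reduce_sum(per_token_loss * mask) / tf.reduce_sum(mask)
    return loss, tf.argmax(logits, axis=-1)

def crf_layer(bert_output, labels, seq_lengths, num_labels):
    # Drop-in alternative: score whole tag sequences with a linear-chain CRF.
    logits = tf.layers.dense(bert_output, num_labels)
    log_likelihood, transitions = tf.contrib.crf.crf_log_likelihood(
        logits, labels, seq_lengths)
    predictions, _ = tf.contrib.crf.crf_decode(logits, transitions, seq_lengths)
    return -tf.reduce_mean(log_likelihood), predictions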

Folder Description:

BERT-NER
|____ bert                          # needs to be cloned from [here](https://github.com/google-research/bert)
|____ cased_L-12_H-768_A-12	    # needs to be downloaded from [here](https://storage.googleapis.com/bert_models/2018_10_18/cased_L-12_H-768_A-12.zip)
|____ data		            # train data
|____ middle_data	            # middle data (label id map)
|____ output			    # output (final model, predict results)
|____ BERT_NER.py		    # main code
|____ conlleval.pl		    # eval code
|____ run_ner.sh    		    # run model and eval result

Usage:

bash run_ner.sh

What's in run_ner.sh:

python BERT_NER.py\
    --task_name="NER"  \
    --do_lower_case=False \
    --crf=False \
    --do_train=True   \
    --do_eval=True   \
    --do_predict=True \
    --data_dir=data   \
    --vocab_file=cased_L-12_H-768_A-12/vocab.txt  \
    --bert_config_file=cased_L-12_H-768_A-12/bert_config.json \
    --init_checkpoint=cased_L-12_H-768_A-12/bert_model.ckpt   \
    --max_seq_length=128   \
    --train_batch_size=32   \
    --learning_rate=2e-5   \
    --num_train_epochs=3.0   \
    --output_dir=./output/result_dir

perl conlleval.pl -d '\t' < ./output/result_dir/label_test.txt

Notice: the cased model is recommended, according to this paper. The CoNLL-2003 dataset and perl script come from here.

RESULTS (on test set):

Parameter setting:

  • do_lower_case=False
  • num_train_epochs=4.0
  • crf=False
accuracy:  98.15%; precision:  90.61%; recall:  88.85%; FB1:  89.72
              LOC: precision:  91.93%; recall:  91.79%; FB1:  91.86  1387
             MISC: precision:  83.83%; recall:  78.43%; FB1:  81.04  668
              ORG: precision:  87.83%; recall:  85.18%; FB1:  86.48  1191
              PER: precision:  95.19%; recall:  94.83%; FB1:  95.01  1311

Result description:

Here I just used the default parameters, but as Google's paper says, a 0.2% error is reasonable (they reported 92.4%). Maybe some tricks need to be added to the above model.

References:

[1] https://arxiv.org/abs/1810.04805

[2] https://github.com/google-research/bert

bert-ner's People

Contributors

kyzhouhzau, lapolonio


bert-ner's Issues

How to reproduce your results.

I used the same run command as yours, but I get worse results on the dev dataset.

eval_f = 0.89656204
eval_precision = 0.90508
eval_recall = 0.88843685
global_step = 653
loss = 17.190592

I use "BERT-Base, Multilingual Cased: 104 languages, 12-layer, 768-hidden, 12-heads, 110M parameters" as checkpoint, which is public by google at November 23rd, 2018.

prediction tool

Awesome work, man. Respect for your inspiring work.

I'd like to implement a simple command-line program so that more people can play around with this. I would like to refer to your work here.

Result of NER

Your final result seems to be:

accuracy:  98.07%; precision:  90.65%; recall:  88.29%; FB1:  89.45
              LOC: precision:  92.50%; recall:  91.71%; FB1:  92.10  1387
             MISC: precision:  82.63%; recall:  76.99%; FB1:  79.71  668
              ORG: precision:  88.75%; recall:  84.22%; FB1:  86.43  1191
              PER: precision:  94.51%; recall:  94.72%; FB1:  94.62  1311

Result description:
As Google's paper says a 0.2% error is reasonable(reported 92.4%).

How can this result be comparable to Google's? Google's result was 92.4 for BERT base and 92.8 for BERT large, while this result is 89.45.

Question: training the model without init_checkpoint

INFO:tensorflow:Error recorded from training_loop: local variable 'initialized_variable_names' referenced before assignment
INFO:tensorflow:training_loop marked as finished
WARNING:tensorflow:Reraising captured error
Traceback (most recent call last):
  File "BERT_NER.py", line 612, in <module>
    tf.app.run()
  File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\platform\app.py", line 125, in run
    _sys.exit(main(argv))
  File "BERT_NER.py", line 545, in main
    estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
  File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\contrib\tpu\python\tpu\tpu_estimator.py", line 2409, in train
    rendezvous.raise_errors()
  File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\contrib\tpu\python\tpu\error_handling.py", line 128, in raise_errors
    six.reraise(typ, value, traceback)
  File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\six.py", line 693, in reraise
    raise value
  File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\contrib\tpu\python\tpu\tpu_estimator.py", line 2403, in train
    saving_listeners=saving_listeners
  File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\estimator\estimator.py", line 354, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\estimator\estimator.py", line 1207, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\estimator\estimator.py", line 1237, in _train_model_default
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\contrib\tpu\python\tpu\tpu_estimator.py", line 2195, in _call_model_fn
    features, labels, mode, config)
  File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\estimator\estimator.py", line 1195, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\contrib\tpu\python\tpu\tpu_estimator.py", line 2479, in _model_fn
    features, labels, is_export_mode=is_export_mode)
  File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\contrib\tpu\python\tpu\tpu_estimator.py", line 1259, in call_without_tpu
    return self._call_model_fn(features, labels, is_export_mode=is_export_mode)
  File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\contrib\tpu\python\tpu\tpu_estimator.py", line 1533, in _call_model_fn
    estimator_spec = self._model_fn(features=features, **kwargs)
  File "BERT_NER.py", line 419, in model_fn
    if var.name in initialized_variable_names:
UnboundLocalError: local variable 'initialized_variable_names' referenced before assignment

Training the model without using the init_checkpoint flag returns this error
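
Judging from the traceback, a likely cause is that initialized_variable_names is only assigned inside the if init_checkpoint: branch of model_fn, so it is unbound when no checkpoint is given. A minimal sketch of the guard (illustrative, not necessarily the exact fix for BERT_NER.py):

initialized_variable_names = {}  # default when no checkpoint is supplied
if init_checkpoint:
    (assignment_map, initialized_variable_names) = \
        modeling.get_assignment_map_from_checkpoint(tvars, init_checkpoint)
    tf.train.init_from_checkpoint(init_checkpoint, assignment_map)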

No Output is generated

Hi,
When I run BERT-NER in evaluate and predict mode, no output is generated. The sizes of eval.tf_record, predict.tf_record and label_test.txt remain 0. What is the expected output of running this?

Below is the command I am running:
python BERT_NER.py --task_name="NER" --do_train=False --do_eval=True --do_predict=True --data_dir=NERdata --vocab_file=/home/local/BSSTEST/sandeep.a.bhutani/bucket_copy/uncased_L-12_H-768_A-12/vocab.txt --bert_config_file=/home/local/BSSTEST/sandeep.a.bhutani/bucket_copy/uncased_L-12_H-768_A-12/bert_config.json --init_checkpoint=/home/local/BSSTEST/sandeep.a.bhutani/bucket_copy/uncased_L-12_H-768_A-12/bert_model.ckpt --max_seq_length=16 --train_batch_size=32 --learning_rate=2e-5 --num_train_epochs=3.0 --output_dir=./output/result_dir/

Below is a sample tail of the output:

....
....
INFO:tensorflow:  name = bert/encoder/layer_11/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_11/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_11/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_11/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/pooler/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/pooler/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = output_weights:0, shape = (13, 768)
INFO:tensorflow:  name = output_bias:0, shape = (13,)
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from ./output/result_dir/model.ckpt-653
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:prediction_loop marked as finished
INFO:tensorflow:prediction_loop marked as finished

On the design of labels and label_id

Hello,

Your experiments with BERT-NER are very inspiring. I have a question I'd like to ask: the original BERT paper mentions "where no prediction is made for X". Should we still add "X" to the labels? And if we don't add "X", how do we reflect that "no prediction is made"?
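
For context, a common way to realize the "X" label is during WordPiece alignment: the first sub-token of each word keeps the real tag and the remaining sub-tokens get "X", which is then excluded from evaluation. A sketch of that alignment step (assuming a WordPiece tokenizer and word-level labels; this is not the exact code in BERT_NER.py):

def align_labels(words, labels, tokenizer):
    # The first sub-token of each word keeps its tag; the rest are marked "X".
    tokens, token_labels = [], []
    for word, label in zip(words, labels):
        pieces = tokenizer.tokenize(word)
        tokens.extend(pieces)
        token_labels.extend([label] + ["X"] * (len(pieces) - 1))
    return tokens, token_labels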

Training hangs

The training step gets stuck at the following point:
....
INFO:tensorflow: name = output_bias:0, shape = (13,)
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
2019-01-05 10:45:57.943879: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into /tmp/NER/model.ckpt.
(stuck for 1+hr)

I am wondering whether it is because I am using a different version of tensorflow (1.12)?
(I did not modify any script file, command as below:)

CUDA_VISIBLE_DEVICES=1 python BERT_NER.py \
    --task_name="NER" \
    --vocab_file=$BERT_DIR/vocab.txt \
    --bert_config_file=$BERT_DIR/bert_config.json \
    --init_checkpoint=$BERT_DIR/bert_model.ckpt \
    --do_train=True \
    --do_eval=True \
    --do_predict=True \
    --data_dir=NERdata \
    --train_batch_size=32 \
    --learning_rate=3e-5 \
    --num_train_epochs=3.0 \
    --max_seq_length=128 \
    --output_dir=/tmp/NER/

How to get the NER results?

I want to print the actual NER output, not only the precision, recall and so on. How can I get the NER results themselves?
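
One way to do this is to parse the predicted tag column of label_test.txt into entity spans. A sketch (assuming one token per line with blank lines as sentence boundaries; adjust the column index to your file's layout):

def extract_entities(path):
    # Collect (entity_text, entity_type) spans from the predicted BIO tags.
    spans, tokens, tag = [], [], None

    def flush():
        if tokens:
            spans.append((" ".join(tokens), tag))

    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) < 3:                  # blank line = sentence boundary
                flush()
                tokens, tag = [], None
                continue
            token, pred = parts[0], parts[-1]   # pred: predicted tag column
            if pred.startswith("B-"):
                flush()
                tokens, tag = [token], pred[2:]
            elif pred.startswith("I-") and tag == pred[2:]:
                tokens.append(token)
            else:
                flush()
                tokens, tag = [], None
    flush()
    return spans

print(extract_entities("./output/result_dir/label_test.txt"))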

How to integrate this in RASA

I want to integrate this BERT-based NER into RASA. But the training data formats for RASA entity recognition and this project are different. Can you please advise on how to integrate BERT-based entity recognition into RASA? I have integrated the intent classification that BERT offers into RASA, but I am having difficulties integrating the entity recognition.

INFO: tensorflow: Saving checkpoints for 0 into

INFO: tensorflow: Saving checkpoints for 0 into ./output/result_dir/model.ckpt
Hello, I am stuck on the line above, and I don't understand what it is doing.
It seems to be writing a file, and my CPU usage is indeed very high, but the file size does not change.
May I ask: when this happens, should I wait for it to continue, or is something wrong with my settings and I should quit and start over?
Stuck here for 2 hrs....

Doubts on the evaluation.

Hello Zhou. Thanks a lot for your contribution on this fine-tuning work. But I have a question about the evaluation metrics: it seems that your evaluation computes precision and recall separately for each tag class (B-person, I-person, B-MISC, I-MISC, ...). If so, might the results not be accurate enough? Thanks a lot!

Key output_bias not found in checkpoint

It was running fine at first, but then it threw this error: Key output_bias not found in checkpoint. It seems the error occurs when the model is loaded.

Urgently hoping someone can explain.

The BERT model is the base uncased_L-12_H-768_A-12 model.

Used other NER labels, and have some problems

I changed the NER labels in BERT_NER.py line 227, and I get the following problem:

InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Assign requires shapes of both tensors to match. lhs shape= [29,29] rhs shape= [13,13]

serving model

Hi, could you explain how to save the model using estimator.export_saved_model?
Thanks
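
For reference, the usual Estimator export pattern looks roughly like this. It is only a sketch: the feature names and shapes are assumptions based on the features built in BERT_NER.py, and the sequence length must match the one used in training:

def serving_input_receiver_fn():
    features = {
        "input_ids":   tf.placeholder(tf.int32, [None, 128], name="input_ids"),
        "input_mask":  tf.placeholder(tf.int32, [None, 128], name="input_mask"),
        "segment_ids": tf.placeholder(tf.int32, [None, 128], name="segment_ids"),
        "label_ids":   tf.placeholder(tf.int32, [None, 128], name="label_ids"),
    }
    return tf.estimator.export.ServingInputReceiver(features, features)

estimator.export_saved_model("./export", serving_input_receiver_fn)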

conll2003 run on gpu devF=0.90-0.91?

Thanks for your timely work!
When running on GPU, CoNLL-2003 doesn't perform as well as your result or the paper's; by the way, I tried several times, and the dev F score wanders in [0.89, 0.912].
Does your work run on TPU ?

Why does `tf.train.init_from_checkpoint` in func `model_fn` run twice?

In function model_fn:

        if init_checkpoint:
            (assignment_map, initialized_variable_names) = modeling.get_assignment_map_from_checkpoint(tvars,init_checkpoint)
            tf.train.init_from_checkpoint(init_checkpoint, assignment_map)
            if use_tpu:
                def tpu_scaffold():
                    tf.train.init_from_checkpoint(init_checkpoint, assignment_map)
                    return tf.train.Scaffold()
                scaffold_fn = tpu_scaffold
            else:
                tf.train.init_from_checkpoint(init_checkpoint, assignment_map)

Why should tf.train.init_from_checkpoint run twice?
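
It does look redundant: the first call before the if/else already covers both branches, so the calls inside the branches repeat it. A deduplicated sketch with the same behavior (keeping the TPU requirement that the restore happen inside the scaffold function):

        if init_checkpoint:
            (assignment_map, initialized_variable_names) = modeling.get_assignment_map_from_checkpoint(tvars,init_checkpoint)
            if use_tpu:
                def tpu_scaffold():
                    # On TPU the restore must run inside the scaffold fn.
                    tf.train.init_from_checkpoint(init_checkpoint, assignment_map)
                    return tf.train.Scaffold()
                scaffold_fn = tpu_scaffold
            else:
                tf.train.init_from_checkpoint(init_checkpoint, assignment_map)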

How to use conlleval.pl

How do I use the script conlleval.pl for evaluation?
I set up the environment to support perl, but I still cannot run the script correctly.

TPUEstimatorSpec.predictions must be dict of Tensors

When running predict on Google Colab (to use TPU) the code crashes with the following error:

TPUEstimatorSpec.predictions must be dict of Tensors.

To solve it, one can place the following code in create_model:

predict = tf.argmax(probabilities, axis=-1)
predict_dict = {'predictions': predict}  # this way it is not shot down by check in TPUEstimatorSpec
return loss, per_example_loss, logits, predict_dict

This of course also means changing the interpretation of the result:

result = estimator.predict(input_fn=predict_input_fn)
result = list(result)
result = [pred['predictions'] for pred in result]

Currently I'm unable to open a pull request, since that would mean verifying whether this really is a solution. I'm just posting it here for anyone who has the same problem.

label of "[SEP]", "[PAD]" and "[CLS]"

The idea of not using "[SEP]" in the NER task seems great, but why would it cause a problem in the CRF?

As for "[CLS]", have you tried replacing it with "[PAD]"?

BertModel is_training should be set to False?

Thanks a lot for the amazing work!
However, in BERT_NER.py line 348 you let BertModel's is_training parameter follow the fine-tuning flag, but the BERT model itself is not integrated into the real training model (you use the output of BertModel's final layer as the real input). So, as far as I'm concerned, this parameter should be set to False to get more reasonable results.

how to use GPU for training

Hello Zhou, how do I use a GPU to train? When use_tpu is set to false, training defaults to the CPU.

How can I fine-tune on my own data?

Hey,
is it possible to fine-tune on another dataset? I fed in my own data with different labels, but this causes a tensor shape mismatch. Of course I could rename my labels to match (it's just naming), but that's not how it should work, right?

Uncased or Cased?

Thanks for sharing the code for the NER task! May I know which model you used, cased or uncased? I am getting F1-dev 88.8 using the cased model and F1-dev 92.6 using the uncased model.

Problem: runs very slowly when converting single examples to features

I found that it costs too much time when running the convert_single_example function.

time0 = time.time()
feature = convert_single_example(ex_index, example, label_list, max_seq_length, tokenizer,mode)
time1 = time.time()
print("time cost:", time1-time0)

the cost is up to a few seconds!
time cost: 4.020495414733887

In the convert_single_example function we can fix it by guarding the pickle dump, so the label map is written only once:

if not os.path.exists('./output/label2id.pkl'):
    with open('./output/label2id.pkl', 'wb') as w:
        pickle.dump(label_map, w)

Question: loss in padding sequence

Hi! Thank you so much for releasing BERT-NER codes.
I have a question: the sentences are padded to max_sequence_length and the labels in the padded part are set to 0. But when computing the total loss in your code, I found that both the real input and the padded parts are included:
one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)
per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
loss = tf.reduce_sum(per_example_loss)

I am wondering whether this is reasonable. Should we mask the padded part?
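
For reference, a masked variant of the quoted loss (a sketch; input_mask is the 0/1 padding mask already built during feature conversion):

one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)
per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
mask = tf.cast(input_mask, tf.float32)
loss = tf.reduce_sum(per_example_loss * mask) / tf.reduce_sum(mask)  # padded positions excluded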

Wrong output format for evaluation script

The conlleval.pl script accepts input file format like <span> <groundtruth> <prediction>, but the model's output is <span> <prediction> <groundtruth>. Although it doesn't affect the F1 score, it swaps the values of precision and recall.
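
A quick workaround is to swap the last two columns before scoring (a sketch, assuming the tab-separated layout that run_ner.sh pipes into conlleval.pl):

# Rewrite "token<TAB>pred<TAB>gold" lines as "token<TAB>gold<TAB>pred".
with open("output/result_dir/label_test.txt") as fin, \
     open("output/result_dir/label_test_fixed.txt", "w") as fout:
    for line in fin:
        parts = line.rstrip("\n").split("\t")
        if len(parts) == 3:
            token, pred, gold = parts
            fout.write("\t".join([token, gold, pred]) + "\n")
        else:
            fout.write(line)  # keep sentence boundaries untouched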

grpc error

When I use tensorflow-serving gRPC, the call response = stub.Predict(request, timeout) gives the following error message:
status = StatusCode.FAILED_PRECONDITION
details = "Batched output tensor's 0th dimension does not equal the sum of the 0th dimension sizes of the input tensors"
debug_error_string = "{"created":"@1558057036.708000000","description":"Error received from peer","file":"src/core/lib/surface/call.cc","file_line":1017,"grpc_message":"Batched output tensor's 0th dimension does not equal the sum of the 0th dimension sizes of the input tensors","grpc_status":9}"

New sota on conll03 data?

Hi, so the F1 of your system is 0.9326?
Is that comparable with the numbers reported in the literature?

CUDA_ERROR_OUT_OF_MEMORY: out of memory; total memory reported:

I am getting an out-of-memory error.
I am running in prediction mode. I even tried reducing max_seq_length, but I still get the same error.

python BERT_NER.py --task_name="NER" --do_train=False --do_eval=False --do_predict=True --data_dir=NERdata --vocab_file=/home/local/BSSTEST/sandeep.a.bhutani/bucket_copy/uncased_L-12_H-768_A-12/vocab.txt --bert_config_file=/home/local/BSSTEST/sandeep.a.bhutani/bucket_copy/uncased_L-12_H-768_A-12/bert_config.json --init_checkpoint=/home/local/BSSTEST/sandeep.a.bhutani/bucket_copy/uncased_L-12_H-768_A-12/bert_model.ckpt --max_seq_length=16 --train_batch_size=32 --learning_rate=2e-5 --num_train_epochs=3.0 --output_dir=./output/result_dir/


INFO:tensorflow:prediction_loop marked as finished
WARNING:tensorflow:Reraising captured error
Traceback (most recent call last):
  File "BERT_NER.py", line 613, in <module>
    tf.app.run()
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "BERT_NER.py", line 603, in main
    for prediction in result:
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2446, in predict
    rendezvous.raise_errors()
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/error_handling.py", line 128, in raise_errors
    six.reraise(typ, value, traceback)
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/six.py", line 693, in reraise
    raise value
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2440, in predict
    yield_single_examples=yield_single_examples):
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 593, in predict
    hooks=all_hooks) as mon_sess:
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 921, in __init__
    stop_grace_period_secs=stop_grace_period_secs)
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 643, in __init__
    self._sess = _RecoverableSession(self._coordinated_creator)
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1107, in __init__
    _WrappedSession.__init__(self, self._create_session())
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1112, in _create_session
    return self._sess_creator.create_session()
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 800, in create_session
    self.tf_sess = self._session_creator.create_session()
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 566, in create_session
    init_fn=self._scaffold.init_fn)
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/python/training/session_manager.py", line 288, in prepare_session
    config=config)
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/python/training/session_manager.py", line 185, in _restore_checkpoint
    sess = session.Session(self._target, graph=self._graph, config=config)
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1551, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/anaconda3/envs/bertenv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 676, in __init__
    self._session = tf_session.TF_NewSessionRef(self._graph._c_graph, opts)
tensorflow.python.framework.errors_impl.InternalError: failed initializing StreamExecutor for CUDA device ordinal 0: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_OUT_OF_MEMORY: out of memory; total memory reported: 16914055168

Question: size of logits

hi kyzhouhzau~

thank you for this project
I have a question about the size of num_labels in BERT_NER.py line 470.

num_labels=len(label_list)+1,

Why did you make the size of the logits one bigger than the size of label_list?
thank you so much!
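
One plausible explanation, judging from how the label map is built in this repository: label ids start at 1 and id 0 is left for the padded positions, so the logits need len(label_list)+1 classes. A sketch of that mapping (the label list shown is illustrative):

label_list = ["B-PER", "I-PER", "B-LOC", "I-LOC", "O", "X", "[CLS]", "[SEP]"]
label_map = {label: i for i, label in enumerate(label_list, 1)}  # ids start at 1
# Padded positions get label id 0, which needs its own class,
# hence num_labels = len(label_list) + 1.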

test.txt and label_test.txt don't have the same number of lines

hi
I use the code (thanks for that!), but there is a problem: when the test predictions are written to "output/result_dir/label_test.txt", I thought that this file must match "data/test.txt", but it doesn't!

I know that this (BERT-NER) library removes empty lines in "output/result_dir/label_test.txt", but even after removing the empty lines in "data/test.txt" the problem still exists
(the number of lines in "output/result_dir/label_test.txt" is less than in "data/test.txt").

here links of those files:
"data/test.txt" : https://github.com/kyzhouhzau/BERT-NER/blob/master/data/test.txt

"output/result_dir/label_test.txt" : https://github.com/kyzhouhzau/BERT-NER/blob/master/output/result_dir/label_test.txt

thanks

--max_seq_length=128 -> 150

hi kyzhouhzau~

thank you for this project :)
there is a minor error which I'd like to report.

def convert_single_example(ex_index, example, label_list, max_seq_length, tokenizer):
...
    input_ids = tokenizer.convert_tokens_to_ids(ntokens)
    input_mask = [1] * len(input_ids)
    while len(input_ids) < max_seq_length:
        input_ids.append(0)
        input_mask.append(0)
        segment_ids.append(0)
        label_ids.append(0)
    print('length check', len(input_ids), max_seq_length)
    assert len(input_ids) == max_seq_length  <-- error
    assert len(input_mask) == max_seq_length
    assert len(segment_ids) == max_seq_length
    assert len(label_ids) == max_seq_length
...

tokenizer.convert_tokens_to_ids(ntokens) can generate a list longer than max_seq_length when we are using --max_seq_length=128.

So I ran with --max_seq_length=150, and it was fine.
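
An alternative to enlarging the flag is to truncate before building ntokens, as the original BERT examples do. A sketch (reserving two slots for [CLS] and [SEP]; the variable names are assumptions about convert_single_example's internals):

# Truncate the word-piece tokens and their labels so that, after adding
# [CLS] and [SEP], the sequence never exceeds max_seq_length.
if len(tokens) > max_seq_length - 2:
    tokens = tokens[: max_seq_length - 2]
    labels = labels[: max_seq_length - 2]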

Sequence index problem

As far as I know, the BERT model outputs a fixed-size matrix for sentences of unequal length. That means "我是**人" and "他也是个**人" both output a matrix of shape 1*768 (I use the chinese_L-12_H-768_A-12 model). Then how does this model deal with word indices?
