
XLNet Chinese NER

Chinese organization, person, and place name recognition based on Bi-LSTM + CRF, using the MSRA NER corpus with BIO tagging.
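In the BIO scheme, the first character of an entity is tagged B-TYPE, the remaining characters I-TYPE, and everything else O. A minimal illustration (not this repo's code) of converting entity spans to per-character tags:

```python
def spans_to_bio(text, spans):
    """Convert (begin, end, type) entity spans to per-character BIO tags."""
    tags = ["O"] * len(text)
    for begin, end, etype in spans:
        tags[begin] = "B-" + etype          # entity start
        for i in range(begin + 1, end):
            tags[i] = "I-" + etype          # entity continuation
    return tags

# "北京" is a LOC entity spanning characters 2..4
print(spans_to_bio("我爱北京", [(2, 4, "LOC")]))
```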

CMU XLNet

References:

https://github.com/yanwii/ChineseNER

https://github.com/macanv/BERT-BiLSTM-CRF-NER

https://github.com/zihangdai/xlnet

https://github.com/ymcui/Chinese-PreTrained-XLNet

Download the pretrained Chinese XLNet model

See https://github.com/ymcui/Chinese-PreTrained-XLNet

Place it in the project root under **chinese_xlnet_base_L-12_H-768_A-12**

Usage

# Train
python3 model.py --entry train

# Predict
python3 model.py --entry predict
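Training data for this style of MSRA NER setup is typically one character and its BIO tag per line, with a blank line separating sentences; the exact file layout used by model.py is an assumption here. A small parser sketch:

```python
def read_bio_file(lines):
    """Parse 'char tag' lines into (chars, tags) sentence pairs.

    A blank line ends the current sentence.
    """
    sentences, chars, tags = [], [], []
    for line in lines:
        line = line.strip()
        if not line:
            if chars:
                sentences.append((chars, tags))
                chars, tags = [], []
            continue
        ch, tag = line.split()
        chars.append(ch)
        tags.append(tag)
    if chars:  # flush the last sentence if the file lacks a trailing blank line
        sentences.append((chars, tags))
    return sentences
```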

Overview

Loading and using the XLNet model

def xlnet_layer(self):
    # Load the XLNet config file
    xlnet_config = xlnet.XLNetConfig(json_path=FLAGS.xlnet_config)
    run_config = xlnet.create_run_config(self.is_training, True, FLAGS)

    # Build the XLNet model
    xlnet_model = xlnet.XLNetModel(
        xlnet_config=xlnet_config,
        run_config=run_config,
        input_ids=self.input_ids,
        seg_ids=self.segment_ids,
        input_mask=self.input_mask)

    # Take the sequence output as per-token embeddings
    self.embedded = xlnet_model.get_sequence_output()
    self.model_inputs = tf.nn.dropout(
        self.embedded, self.dropout
    )
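Downstream of this layer, the per-token embeddings feed the Bi-LSTM + CRF head, and at predict time tag sequences are recovered with Viterbi decoding over per-token tag scores and a tag-transition matrix. A minimal pure-Python sketch of that decoding step (the repo itself would use TensorFlow's CRF ops; scores here are illustrative):

```python
def viterbi_decode(emissions, transitions):
    """Find the highest-scoring tag path.

    emissions: list of per-step score lists, shape [steps][n_tags]
    transitions: transitions[i][j] = score of moving from tag i to tag j
    """
    n_tags = len(emissions[0])
    score = list(emissions[0])   # best path score ending in each tag so far
    back = []                    # backpointers per step
    for emis in emissions[1:]:
        ptr, new_score = [], []
        for j in range(n_tags):
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            ptr.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + emis[j])
        back.append(ptr)
        score = new_score
    best = max(range(n_tags), key=lambda j: score[j])
    path = [best]
    for ptr in reversed(back):   # walk backpointers to recover the path
        best = ptr[best]
        path.append(best)
    return list(reversed(path))
```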

The XLNet optimizer

self.train_op, self.learning_rate, _ = model_utils.get_train_op(FLAGS, self.loss)
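get_train_op in XLNet's model_utils builds an Adam-based optimizer whose learning rate warms up and then decays over training. The shape of such a schedule can be sketched in plain Python (parameter names and the linear-decay choice are assumptions, not this repo's exact configuration):

```python
def learning_rate_at(step, base_lr=1e-5, warmup_steps=1000, total_steps=10000):
    """Linear warmup to base_lr, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    # fraction of the post-warmup phase already completed
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * (1.0 - progress)
```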


xlnet-chinesener's Issues

NER performance

How is the NER performance in your experiments? I tried the MSRA-NER dataset and the results were not good.

Why are the results so poor?

I ran it myself; why are the results so poor?
The final training log is as follows:
[->] step 696/792 loss 8.37 acc 0.37
[->] step 697/792 loss 17.91 acc 0.37
[->] step 698/792 loss 25.09 acc 0.37
[->] step 699/792 loss 13.42 acc 0.37
[->] step 700/792 loss 21.37 acc 0.40
ORG recall 0 precision 0 f1 0
PER recall 0.38461538461538464 precision 0.625 f1 0.4761904761904762
LOC recall 0.23809523809523808 precision 0.3448275862068966 f1 0.2816901408450704
[->] step 701/792 loss 16.74 acc 0.38
[->] step 702/792 loss 16.36 acc 0.39
[->] step 703/792 loss 16.23 acc 0.42
[->] step 704/792 loss 18.58 acc 0.42

The prediction output is as follows:
[->] restore model

美国的华莱士,我和他谈笑风生。
['B-LOC', 'I-LOC', 'O', 'B-LOC', 'I-LOC', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O']
[
{
"begin": 0,
"end": 2,
"entity": "美国",
"type": "LOC"
},
{
"begin": 3,
"end": 5,
"entity": "华莱",
"type": "LOC"
}
]
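The truncated "华莱" in this output is a faithful readout of the predicted tags: the model tagged only the first two characters of "华莱士" as B-LOC/I-LOC, so span decoding stops there. A minimal sketch of BIO span decoding (not the repo's exact implementation) reproduces the JSON above:

```python
def bio_to_entities(text, tags):
    """Group B-/I- tag runs into entity spans with begin/end character offsets."""
    entities = []
    i = 0
    while i < len(tags):
        if tags[i].startswith("B-"):
            etype = tags[i][2:]
            j = i + 1
            while j < len(tags) and tags[j] == "I-" + etype:
                j += 1  # extend the span while continuation tags match
            entities.append({"begin": i, "end": j, "entity": text[i:j], "type": etype})
            i = j
        else:
            i += 1
    return entities
```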

Memory error when training

```
2019-10-24 21:55:25.500656: W tensorflow/core/common_runtime/bfc_allocator.cc:279] ****************************************************************************************************
2019-10-24 21:55:25.500952: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at cwise_ops_common.cc:70 : Resource exhausted: OOM when allocating tensor with shape[8,509,3072] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1322, in _do_call
return fn(*args)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1307, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[8,509,3072] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: gradients/model/transformer/layer_20/ff/drop_1/dropout/mul_grad/Mul = Mul[T=DT_FLOAT, _grappler:ArithmeticOptimizer:MinimizeBroadcasts=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](ConstantFolding/gradients/model/transformer/dropout/dropout/div_grad/RealDiv_recip, model/transformer/layer_20/ff/drop_1/dropout/Floor)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

 [[Node: loss_layer/Mean/_4463 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_39982_loss_layer/Mean", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "model.py", line 429, in
tf.app.run()
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "model.py", line 424, in main
model.train()
File "model.py", line 329, in train
global_steps, loss, logits, acc, length = self.xlnet_step(sess, batch)
File "model.py", line 270, in xlnet_step
embedding, global_steps, loss, _, logits, acc, length = sess.run([self.embedded, self.global_steps, self.loss, self.train_op, self.logits, self.accuracy, self.length], feed_dict=feed)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
run_metadata_ptr)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1135, in _run
feed_dict_tensor, options, run_metadata)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
run_metadata)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[8,509,3072] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: gradients/model/transformer/layer_20/ff/drop_1/dropout/mul_grad/Mul = Mul[T=DT_FLOAT, _grappler:ArithmeticOptimizer:MinimizeBroadcasts=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](ConstantFolding/gradients/model/transformer/dropout/dropout/div_grad/RealDiv_recip, model/transformer/layer_20/ff/drop_1/dropout/Floor)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

 [[Node: loss_layer/Mean/_4463 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_39982_loss_layer/Mean", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Caused by op 'gradients/model/transformer/layer_20/ff/drop_1/dropout/mul_grad/Mul', defined at:
File "model.py", line 429, in
tf.app.run()
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "model.py", line 424, in main
model.train()
File "model.py", line 302, in train
self.__creat_model()
File "model.py", line 110, in __creat_model
self.xlnet_optimizer_layer()
File "model.py", line 253, in xlnet_optimizer_layer
self.train_op, self.learning_rate, _ = model_utils.get_train_op(FLAGS, self.loss)
File "/home/eric/Documents/projects/XLNet-ChineseNER/model_utils.py", line 145, in get_train_op
grads_and_vars = optimizer.compute_gradients(total_loss)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 511, in compute_gradients
colocate_gradients_with_ops=colocate_gradients_with_ops)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gradients_impl.py", line 532, in gradients
gate_gradients, aggregation_method, stop_gradients)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gradients_impl.py", line 701, in _GradientsHelper
lambda: grad_fn(op, *out_grads))
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gradients_impl.py", line 396, in _MaybeCompile
return grad_fn() # Exit early
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gradients_impl.py", line 701, in
lambda: grad_fn(op, *out_grads))
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/math_grad.py", line 880, in _MulGrad
math_ops.reduce_sum(gen_math_ops.mul(grad, y), rx), sx),
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_math_ops.py", line 4759, in mul
"Mul", x=x, y=y, name=name)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3414, in create_op
op_def=op_def)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1740, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

...which was originally created as op 'model/transformer/layer_20/ff/drop_1/dropout/mul', defined at:
File "model.py", line 429, in
tf.app.run()
[elided 2 identical lines from previous traceback]
File "model.py", line 302, in train
self.__creat_model()
File "model.py", line 95, in __creat_model
self.xlnet_layer()
File "model.py", line 153, in xlnet_layer
input_mask = self.input_mask)
File "/home/eric/Documents/projects/XLNet-ChineseNER/xlnet.py", line 222, in init
) = modeling.transformer_xl(**tfm_args)
File "/home/eric/Documents/projects/XLNet-ChineseNER/modeling.py", line 649, in transformer_xl
reuse=reuse)
File "/home/eric/Documents/projects/XLNet-ChineseNER/modeling.py", line 69, in positionwise_ffn
name='drop_1')
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/layers/core.py", line 271, in dropout
return layer.apply(inputs, training=training)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 774, in apply
return self.call(inputs, *args, **kwargs)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 329, in call
outputs = super(Layer, self).call(inputs, *args, **kwargs)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 703, in call
outputs = self.call(inputs, *args, **kwargs)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/layers/core.py", line 229, in call
return super(Dropout, self).call(inputs, training=training)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/keras/layers/core.py", line 144, in call
lambda: array_ops.identity(inputs))
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/keras/utils/tf_utils.py", line 51, in smart_cond
pred, true_fn=true_fn, false_fn=false_fn, name=name)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/smart_cond.py", line 54, in smart_cond
return true_fn()

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[8,509,3072] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: gradients/model/transformer/layer_20/ff/drop_1/dropout/mul_grad/Mul = Mul[T=DT_FLOAT, _grappler:ArithmeticOptimizer:MinimizeBroadcasts=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](ConstantFolding/gradients/model/transformer/dropout/dropout/div_grad/RealDiv_recip, model/transformer/layer_20/ff/drop_1/dropout/Floor)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

 [[Node: loss_layer/Mean/_4463 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_39982_loss_layer/Mean", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

```
