
XLNet Chinese NER

Chinese organization, person, and place name recognition based on Bi-LSTM + CRF, using the MSRA NER corpus with BIO tagging.
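In the BIO scheme, the first character of an entity is tagged B-TYPE, the remaining characters I-TYPE, and everything else O. A minimal illustration (not this repo's code) of converting entity spans to per-character tags:

```python
def spans_to_bio(text, spans):
    """Convert (begin, end, type) entity spans to per-character BIO tags."""
    tags = ["O"] * len(text)
    for begin, end, etype in spans:
        tags[begin] = "B-" + etype          # entity start
        for i in range(begin + 1, end):
            tags[i] = "I-" + etype          # entity continuation
    return tags

# "北京" is a LOC entity spanning characters 2..4
print(spans_to_bio("我爱北京", [(2, 4, "LOC")]))
```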

CMU XLNet

References:

https://github.com/yanwii/ChineseNER

https://github.com/macanv/BERT-BiLSTM-CRF-NER

https://github.com/zihangdai/xlnet

https://github.com/ymcui/Chinese-PreTrained-XLNet

Download the pretrained Chinese XLNet model

See https://github.com/ymcui/Chinese-PreTrained-XLNet

Place it in the project root under **chinese_xlnet_base_L-12_H-768_A-12**

Usage

# Train
python3 model.py --entry train

# Predict
python3 model.py --entry predict
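Training data for this style of MSRA NER setup is typically one character and its BIO tag per line, with a blank line separating sentences; the exact file layout used by model.py is an assumption here. A small parser sketch:

```python
def read_bio_file(lines):
    """Parse 'char tag' lines into (chars, tags) sentence pairs.

    A blank line ends the current sentence.
    """
    sentences, chars, tags = [], [], []
    for line in lines:
        line = line.strip()
        if not line:
            if chars:
                sentences.append((chars, tags))
                chars, tags = [], []
            continue
        ch, tag = line.split()
        chars.append(ch)
        tags.append(tag)
    if chars:  # flush the last sentence if the file lacks a trailing blank line
        sentences.append((chars, tags))
    return sentences
```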

Overview

Loading and using the XLNet model

def xlnet_layer(self):
    # Load the XLNet config file
    xlnet_config = xlnet.XLNetConfig(json_path=FLAGS.xlnet_config)
    run_config = xlnet.create_run_config(self.is_training, True, FLAGS)

    # Build the XLNet model
    xlnet_model = xlnet.XLNetModel(
        xlnet_config=xlnet_config,
        run_config=run_config,
        input_ids=self.input_ids,
        seg_ids=self.segment_ids,
        input_mask=self.input_mask)

    # Take the sequence output as per-token embeddings
    self.embedded = xlnet_model.get_sequence_output()
    self.model_inputs = tf.nn.dropout(
        self.embedded, self.dropout
    )
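Downstream of this layer, the per-token embeddings feed the Bi-LSTM + CRF head, and at predict time tag sequences are recovered with Viterbi decoding over per-token tag scores and a tag-transition matrix. A minimal pure-Python sketch of that decoding step (the repo itself would use TensorFlow's CRF ops; scores here are illustrative):

```python
def viterbi_decode(emissions, transitions):
    """Find the highest-scoring tag path.

    emissions: list of per-step score lists, shape [steps][n_tags]
    transitions: transitions[i][j] = score of moving from tag i to tag j
    """
    n_tags = len(emissions[0])
    score = list(emissions[0])   # best path score ending in each tag so far
    back = []                    # backpointers per step
    for emis in emissions[1:]:
        ptr, new_score = [], []
        for j in range(n_tags):
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            ptr.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + emis[j])
        back.append(ptr)
        score = new_score
    best = max(range(n_tags), key=lambda j: score[j])
    path = [best]
    for ptr in reversed(back):   # walk backpointers to recover the path
        best = ptr[best]
        path.append(best)
    return list(reversed(path))
```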

The XLNet optimizer

self.train_op, self.learning_rate, _ = model_utils.get_train_op(FLAGS, self.loss)
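get_train_op in XLNet's model_utils builds an Adam-based optimizer whose learning rate warms up and then decays over training. The shape of such a schedule can be sketched in plain Python (parameter names and the linear-decay choice are assumptions, not this repo's exact configuration):

```python
def learning_rate_at(step, base_lr=1e-5, warmup_steps=1000, total_steps=10000):
    """Linear warmup to base_lr, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    # fraction of the post-warmup phase already completed
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * (1.0 - progress)
```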


xlnet-chinesener's Issues

NER performance

How is the NER performance in your experiments? I tried the MSRA-NER dataset and the results were not good.

Why are the results so poor?

I ran it myself; why are the results so poor?
The final training log is as follows:
[->] step 696/792 loss 8.37 acc 0.37
[->] step 697/792 loss 17.91 acc 0.37
[->] step 698/792 loss 25.09 acc 0.37
[->] step 699/792 loss 13.42 acc 0.37
[->] step 700/792 loss 21.37 acc 0.40
ORG recall 0 precision 0 f1 0
PER recall 0.38461538461538464 precision 0.625 f1 0.4761904761904762
LOC recall 0.23809523809523808 precision 0.3448275862068966 f1 0.2816901408450704
[->] step 701/792 loss 16.74 acc 0.38
[->] step 702/792 loss 16.36 acc 0.39
[->] step 703/792 loss 16.23 acc 0.42
[->] step 704/792 loss 18.58 acc 0.42

The prediction output is as follows:
[->] restore model

美国的华莱士,我和他谈笑风生。
['B-LOC', 'I-LOC', 'O', 'B-LOC', 'I-LOC', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O']
[
{
"begin": 0,
"end": 2,
"entity": "美国",
"type": "LOC"
},
{
"begin": 3,
"end": 5,
"entity": "华莱",
"type": "LOC"
}
]
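The truncated "华莱" in this output is a faithful readout of the predicted tags: the model tagged only the first two characters of "华莱士" as B-LOC/I-LOC, so span decoding stops there. A minimal sketch of BIO span decoding (not the repo's exact implementation) reproduces the JSON above:

```python
def bio_to_entities(text, tags):
    """Group B-/I- tag runs into entity spans with begin/end character offsets."""
    entities = []
    i = 0
    while i < len(tags):
        if tags[i].startswith("B-"):
            etype = tags[i][2:]
            j = i + 1
            while j < len(tags) and tags[j] == "I-" + etype:
                j += 1  # extend the span while continuation tags match
            entities.append({"begin": i, "end": j, "entity": text[i:j], "type": etype})
            i = j
        else:
            i += 1
    return entities
```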

Memory error when training

```
2019-10-24 21:55:25.500656: W tensorflow/core/common_runtime/bfc_allocator.cc:279] ****************************************************************************************************
2019-10-24 21:55:25.500952: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at cwise_ops_common.cc:70 : Resource exhausted: OOM when allocating tensor with shape[8,509,3072] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1322, in _do_call
return fn(*args)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1307, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[8,509,3072] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: gradients/model/transformer/layer_20/ff/drop_1/dropout/mul_grad/Mul = Mul[T=DT_FLOAT, _grappler:ArithmeticOptimizer:MinimizeBroadcasts=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](ConstantFolding/gradients/model/transformer/dropout/dropout/div_grad/RealDiv_recip, model/transformer/layer_20/ff/drop_1/dropout/Floor)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

 [[Node: loss_layer/Mean/_4463 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_39982_loss_layer/Mean", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "model.py", line 429, in
tf.app.run()
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "model.py", line 424, in main
model.train()
File "model.py", line 329, in train
global_steps, loss, logits, acc, length = self.xlnet_step(sess, batch)
File "model.py", line 270, in xlnet_step
embedding, global_steps, loss, _, logits, acc, length = sess.run([self.embedded, self.global_steps, self.loss, self.train_op, self.logits, self.accuracy, self.length], feed_dict=feed)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
run_metadata_ptr)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1135, in _run
feed_dict_tensor, options, run_metadata)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
run_metadata)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[8,509,3072] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: gradients/model/transformer/layer_20/ff/drop_1/dropout/mul_grad/Mul = Mul[T=DT_FLOAT, _grappler:ArithmeticOptimizer:MinimizeBroadcasts=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](ConstantFolding/gradients/model/transformer/dropout/dropout/div_grad/RealDiv_recip, model/transformer/layer_20/ff/drop_1/dropout/Floor)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

 [[Node: loss_layer/Mean/_4463 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_39982_loss_layer/Mean", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Caused by op 'gradients/model/transformer/layer_20/ff/drop_1/dropout/mul_grad/Mul', defined at:
File "model.py", line 429, in
tf.app.run()
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "model.py", line 424, in main
model.train()
File "model.py", line 302, in train
self.__creat_model()
File "model.py", line 110, in __creat_model
self.xlnet_optimizer_layer()
File "model.py", line 253, in xlnet_optimizer_layer
self.train_op, self.learning_rate, _ = model_utils.get_train_op(FLAGS, self.loss)
File "/home/eric/Documents/projects/XLNet-ChineseNER/model_utils.py", line 145, in get_train_op
grads_and_vars = optimizer.compute_gradients(total_loss)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 511, in compute_gradients
colocate_gradients_with_ops=colocate_gradients_with_ops)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gradients_impl.py", line 532, in gradients
gate_gradients, aggregation_method, stop_gradients)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gradients_impl.py", line 701, in _GradientsHelper
lambda: grad_fn(op, *out_grads))
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gradients_impl.py", line 396, in _MaybeCompile
return grad_fn() # Exit early
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gradients_impl.py", line 701, in
lambda: grad_fn(op, *out_grads))
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/math_grad.py", line 880, in _MulGrad
math_ops.reduce_sum(gen_math_ops.mul(grad, y), rx), sx),
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_math_ops.py", line 4759, in mul
"Mul", x=x, y=y, name=name)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3414, in create_op
op_def=op_def)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1740, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

...which was originally created as op 'model/transformer/layer_20/ff/drop_1/dropout/mul', defined at:
File "model.py", line 429, in
tf.app.run()
[elided 2 identical lines from previous traceback]
File "model.py", line 302, in train
self.__creat_model()
File "model.py", line 95, in __creat_model
self.xlnet_layer()
File "model.py", line 153, in xlnet_layer
input_mask = self.input_mask)
File "/home/eric/Documents/projects/XLNet-ChineseNER/xlnet.py", line 222, in init
) = modeling.transformer_xl(**tfm_args)
File "/home/eric/Documents/projects/XLNet-ChineseNER/modeling.py", line 649, in transformer_xl
reuse=reuse)
File "/home/eric/Documents/projects/XLNet-ChineseNER/modeling.py", line 69, in positionwise_ffn
name='drop_1')
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/layers/core.py", line 271, in dropout
return layer.apply(inputs, training=training)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 774, in apply
return self.call(inputs, *args, **kwargs)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 329, in call
outputs = super(Layer, self).call(inputs, *args, **kwargs)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 703, in call
outputs = self.call(inputs, *args, **kwargs)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/layers/core.py", line 229, in call
return super(Dropout, self).call(inputs, training=training)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/keras/layers/core.py", line 144, in call
lambda: array_ops.identity(inputs))
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/keras/utils/tf_utils.py", line 51, in smart_cond
pred, true_fn=true_fn, false_fn=false_fn, name=name)
File "/home/eric/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/smart_cond.py", line 54, in smart_cond
return true_fn()

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[8,509,3072] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: gradients/model/transformer/layer_20/ff/drop_1/dropout/mul_grad/Mul = Mul[T=DT_FLOAT, _grappler:ArithmeticOptimizer:MinimizeBroadcasts=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](ConstantFolding/gradients/model/transformer/dropout/dropout/div_grad/RealDiv_recip, model/transformer/layer_20/ff/drop_1/dropout/Floor)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

 [[Node: loss_layer/Mean/_4463 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_39982_loss_layer/Mean", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

```
