
Entity-extractor-by-binary-tagging

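An entity extractor based on the "half-pointer, half-tagging" method, adapted from Su Jianlin's (苏神) triple extraction method. The adaptation drops the extraction of the subject s from the triple extraction model and instead extracts entities directly and classifies them (equivalent to extracting p and o directly). The adapted method works for short entities and also for long, sentence-length entities.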

Environment

  • python 3.6.7
  • transformers==3.0.2
  • torch==1.6.0

See requirements.txt for the remaining dependencies.

How it works

Model architecture diagram
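
As a rough illustration of the tagging scheme, the sketch below builds one binary start vector and one binary end vector per class over the characters of a record. It is a minimal character-level sketch, assuming one entity string per class key as in the sample data format; the function name build_binary_tags is hypothetical, and the repository itself presumably tags tokenizer output rather than raw characters.

```python
from typing import Dict, List


def build_binary_tags(record: Dict[str, str],
                      class_names: List[str]) -> Dict[str, Dict[str, List[int]]]:
    """Build per-class start/end 0/1 vectors over the characters of record["text"].

    Hypothetical helper for illustration only; not taken from the repository.
    """
    text = record["text"]
    tags = {}
    for name in class_names:
        start = [0] * len(text)
        end = [0] * len(text)
        entity = record.get(name, "")
        if entity:
            pos = text.find(entity)
            while pos != -1:
                start[pos] = 1                    # first character of the entity
                end[pos + len(entity) - 1] = 1    # last character of the entity
                # Continue searching after this occurrence (non-overlapping).
                pos = text.find(entity, pos + len(entity))
        tags[name] = {"start": start, "end": end}
    return tags


record = {"text": "XAAAXXXBBXX", "a": "AAA", "b": "BB"}
tags = build_binary_tags(record, ["a", "b"])
print(tags["a"]["start"])  # 1 at the index where "AAA" begins
print(tags["a"]["end"])    # 1 at the index where "AAA" ends
```

Each class gets its own pair of vectors, so entities of different classes may overlap without conflicting labels.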

Usage

  • Prepare your data following the format used in data
[
    {
        "text": "XAAAXXXBBXXCCCCCCCCCCCXX",
        "a": "AAA",
        "b": "BB",
        "c": "CCCCCCCCCCC"
    }
]
  • Set the parameters in the system.config file; class_name must match the category keys used in the JSON file
class_name=[a,b,c]
  • Select the training mode
################ Status ################
mode=train
# string: train/test/interactive_predict
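  • Raise or lower the decision_threshold hyperparameter according to the results (a position is judged to be the start or end of an entity when its sigmoid output exceeds this value)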
decision_threshold=0.5
  • Run main.py
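
To make the role of decision_threshold concrete, here is a minimal decoding sketch for a single class. It assumes the model emits one sigmoid score per position for "start" and one for "end", and pairs each detected start with the nearest end at or after it; that pairing rule and the function name decode_spans are assumptions for illustration, not necessarily what main.py does.

```python
from typing import List, Tuple


def decode_spans(start_probs: List[float],
                 end_probs: List[float],
                 threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Turn per-position sigmoid scores into (start, end) index pairs.

    Hypothetical helper: pairs each start with the nearest end at or after it.
    """
    starts = [i for i, p in enumerate(start_probs) if p > threshold]
    ends = [i for i, p in enumerate(end_probs) if p > threshold]
    spans = []
    for s in starts:
        # ends is in ascending order, so the first match is the nearest one.
        following = [e for e in ends if e >= s]
        if following:
            spans.append((s, following[0]))
    return spans


text = "XAAAXXXBBXX"
# Made-up scores for class "a"; a real run would take them from the model.
start_probs = [0.1, 0.9, 0.2, 0.1, 0.0, 0.0, 0.0, 0.1, 0.0, 0.0, 0.0]
end_probs = [0.0, 0.1, 0.3, 0.8, 0.1, 0.0, 0.0, 0.0, 0.1, 0.0, 0.0]

spans = decode_spans(start_probs, end_probs, threshold=0.5)
entities = [text[s:e + 1] for s, e in spans]
print(spans, entities)
```

With a higher threshold fewer positions qualify as boundaries, so fewer spans are returned; with a lower threshold more candidates pass, which tends to trade precision for recall.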

Results

  • example_datasets1

example_datasets1

The data pattern here is fairly simple, so the model fits the validation set easily.

  • example_datasets2

example_datasets2

The current model performs poorly on this People's Daily NER dataset and needs further hyperparameter tuning.

Testing

  • Select the test mode; the program loads the best model saved during training
################ Status ################
mode=interactive_predict
# string: train/test/interactive_predict

The interactive test results are as follows

  • example_datasets1

img04

  • example_datasets2

img05
