
deeplearning_nlp's Introduction

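A Natural Language Processing Library Based on Deep Learning

This project is a refactoring of DeepNLP. It focuses on a sounder architecture, more readable code, and looser coupling between modules, and it adds some new features.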

Environment

  • python >= 3.5
  • tensorflow >= 1.3.0
  • sklearn
  • scipy
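
The minimum versions listed above can be checked before running anything. The following is a minimal sketch using only the standard library; the helper names `parse_version`, `meets_minimum`, and `python_ok` are hypothetical and not part of this project.

```python
import re
import sys


def parse_version(text):
    """Turn a version string such as '1.3.0' or '1.4.0-rc1' into a tuple of ints."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            # Stop at the first component that does not start with a digit.
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, required):
    """Return True if the installed version is at least the required one."""
    return parse_version(installed) >= parse_version(required)


def python_ok(minimum=(3, 5)):
    """Return True if the running interpreter satisfies python >= 3.5."""
    return sys.version_info[:2] >= minimum
```

With TensorFlow installed, the check for the requirement above would be `meets_minimum(tensorflow.__version__, "1.3.0")`.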

Project Structure

The core code of this project is in the python\dnlp directory.

python/dnlp
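│  cws.py   # word segmentation
│  ner.py   # named entity recognition
│  rel_extract.py # relation extraction
│  __init__.py
│
├─config
│     config.py  # configuration options
│     __init__.py
│  
├─core  # core functionality
│  │  dnn_crf.py    # sequence labeling based on dnn-crf
│  │  dnn_crf_base.py # base class for dnn-crf sequence labeling
│  │  mmtnn.py      # max-margin tensor neural network model
│  │  re_cnn.py     # relation extraction based on cnn
│  │  __init__.py
│  
├─data_process  # preprocessing of training and test data
│     processor.py  # base class
│     process_cws.py  # preprocessing for word segmentation
│     process_emr.py 
│     process_ner.py  # preprocessing for named entity recognition
│     process_pos.py  # preprocessing for part-of-speech tagging
│     __init__.py
│  
│
├─models  # saved trained models
│
├─scripts # run scripts, including dataset initialization, training, testing, and so on
│     init_datasets.py  # initialize the training data
│     cws_ner.py    # train and use word segmentation and named entity recognition
│     __init__.py
│
├─tests  # unit tests
├─utils  # shared functions
      constant.py  # constants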
      __init__.py
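
The tree above lists word segmentation (cws.py) on top of dnn-crf sequence labeling (core/dnn_crf.py). When segmentation is treated as sequence labeling, each character receives a tag and the tags are then grouped back into words. The sketch below assumes the widely used B/M/E/S tag set; this README does not state which tag set the project uses, and `tags_to_words` is a hypothetical helper, not a function from dnlp.

```python
def tags_to_words(chars, tags):
    """Group characters into words from per-character B/M/E/S tags.

    B = first character of a multi-character word, M = middle,
    E = last character, S = single-character word.
    """
    if len(chars) != len(tags):
        raise ValueError("chars and tags must have the same length")
    words = []
    current = ""
    for ch, tag in zip(chars, tags):
        if tag in ("B", "S") and current:
            # A new word starts while one is still open: flush the open one.
            words.append(current)
            current = ""
        current += ch
        if tag in ("E", "S"):
            # The word is complete.
            words.append(current)
            current = ""
    if current:
        # Flush a trailing word that never received an E tag.
        words.append(current)
    return words
```

For example, the characters of 我爱北京 tagged S, S, B, E are grouped into three words: 我, 爱, and 北京.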
  

Running

  1. Initialize the data
python python\scripts\init_datasets.py
  2. Train
python python\scripts\cws_ner.py -t
  3. Use
python python\scripts\cws_ner.py -p
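
The commands above select a mode with `-t` or `-p`. How cws_ner.py actually parses its arguments is not shown in this README; the following is only a sketch of one way such a two-mode command line could be wired with `argparse`. The long option names, the mutual exclusion, and the `build_parser` and `main` functions are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser with two mutually exclusive mode flags (assumed layout)."""
    parser = argparse.ArgumentParser(
        description="Word segmentation and named entity recognition (sketch)")
    mode = parser.add_mutually_exclusive_group(required=True)
    # -t and -p come from the README; --train and --predict are assumed names.
    mode.add_argument("-t", "--train", action="store_true",
                      help="train the models")
    mode.add_argument("-p", "--predict", action="store_true",
                      help="use the trained models")
    return parser


def main(argv=None):
    """Return the selected mode; a real script would dispatch to training or prediction here."""
    args = build_parser().parse_args(argv)
    return "train" if args.train else "predict"
```

Calling `main(["-t"])` selects training and `main(["-p"])` selects prediction. Passing both flags or neither makes `argparse` exit with a usage error.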

Reference Papers

Chinese word segmentation && named entity recognition

Entity relation extraction

ToDo-List

  • Improve the documentation
  • Implement more algorithms
  • Support pip
  • Add TensorBoard support
  • Support TensorFlow Estimator and SavedModel
  • Add support for Java and C++

deeplearning_nlp's People

Contributors

supercoderhawk
