Natural-Language-Processing

Basic knowledge of natural language processing, including Chinese word segmentation, part-of-speech tagging and named entity recognition, term extraction and keyword extraction, syntactic analysis, text vectorization, knowledge graphs and graph embedding, and retrieval and recommendation.

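Datasets

Annual reports and patents of A-share listed companies, 1988-2019

Annual reports (currently only the information industry is included)

Baidu Netdisk link: https://pan.baidu.com/s/1vO0O9X5ewW8wvcOQZDA3Rw (extraction code: fpte)

Patents

Baidu Netdisk link: https://pan.baidu.com/s/1ZWUjw7_bvCuE-88K1j05mQ (extraction code: 6zm7)

Fintech patent classification (includes the work done on the fintech patent classification project)

GitHub link: https://github.com/gmihaila/fintech_patents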

Mathematics

Probability Theory and Mathematical Statistics

Matrix Theory

Linear Algebra

Machine Learning

Watermelon Book (西瓜书)

PDF on Baidu Netdisk: https://pan.baidu.com/s/1Tbwt2HqETIDpGvhJsJAoNA (extraction code: tj23)

Study notes: https://github.com/Vay-keen/Machine-learning-learning-notes

Pumpkin Book (南瓜书)

The Pumpkin Book explains in detail the formulas in Zhou Zhihua's Watermelon Book.

GitHub: https://github.com/datawhalechina/pumpkin-book/tree/master/docs

Read online (updated continuously): https://datawhalechina.github.io/pumpkin-book

Latest PDF download: https://github.com/datawhalechina/pumpkin-book/releases

Companion video tutorial: https://www.bilibili.com/video/BV1Mh411e7VU

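Deep Learning

Mu Li, "Dive into Deep Learning" (动手学深度学习): https://courses.d2l.ai/zh-v2/

Hung-yi Lee, Machine Learning course (Spring 2021): https://www.bilibili.com/video/BV1Wv411h7kN

Introduction to deep learning: https://github.com/Mikoto10032/DeepLearning

Chinese translation of the Deep Learning textbook: https://github.com/exacity/deeplearningbook-chinese

Deep learning and natural language processing: https://github.com/apachecn/AiLearning

Deep Learning 500 Questions: https://github.com/scutan90/DeepLearning-500-questions

Deep learning models: https://github.com/rasbt/deeplearning-models

Deep text matching, implemented with the tensorflow.keras deep learning framework: https://github.com/wangle1218/deep_text_matching

MatchZoo for PyTorch, which helps you quickly implement, compare, and share the latest text matching models: https://github.com/NTMC-Community/MatchZoo-py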

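Machine Learning and Deep Learning Models

PC-RNN (recurrent neural network): the official PyTorch implementation of the paper "Patent Citation Dynamics Modeling via Multi-Attention Recurrent Networks"

GitHub link: https://github.com/TaoranJ/PC-RNN

NLP-Projects (concepts and scripts for a range of natural language processing projects)

GitHub link: https://github.com/gaoisbest/NLP-Projects

Google BERT model

GitHub link: https://github.com/google-research/bert

BERT-related Papers: https://gitee.com/helodoger/BERT-related-papers

FinBERT (a pre-trained language model for the financial domain, based on the BERT architecture): https://github.com/valuesimplex/FinBERT

Hugging Face (PyTorch versions of various pre-trained models): https://huggingface.co/docs/transformers/index

PaddlePaddle Models

(The official PaddlePaddle (飞桨) model library, containing a variety of deep learning models from cutting-edge research and validated in industrial scenarios)

GitHub link: https://github.com/PaddlePaddle/models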

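Common Tools

HanLP

(Covers Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, semantic dependency parsing, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, and other natural language processing tasks)

GitHub link: https://github.com/hankcs/HanLP

funNLP

(A Chinese NLP resource collection)

GitHub link: https://github.com/fighting41love/funNLP

ChineseNlpCorpus

Chinese natural language processing datasets: https://github.com/InsaneLife/ChineseNLPCorpus

Chinese named entity recognition datasets: https://zhuanlan.zhihu.com/p/529541521

Chinese financial sentiment dictionary

GitHub link: https://github.com/MengLingchao/Chinese_financial_sentiment_dictionary

NLP for finance

(Covers market analysis, risk management, asset management, etc.)

GitHub link: https://github.com/icoxfog417/awesome-financial-nlp

Jiagu

(Jiagu is trained on a large-scale corpus. It is intended to provide common natural language processing functions such as Chinese word segmentation, part-of-speech tagging, named entity recognition, sentiment analysis, knowledge graph relation extraction, keyword extraction, text summarization, new word discovery, and text clustering)

GitHub link: https://github.com/ownthink/Jiagu

YEDDA

(An NLP text annotation tool)

GitHub link (original repository, supports Python 2): https://github.com/jiesutd/YEDDA

GitHub link (modified version, supports Python 3): https://github.com/Freeshman/YEDDA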

Extension Packages

Python extension packages: https://www.lfd.uci.edu/~gohlke/pythonlibs/

Tsinghua mirror: https://pypi.tuna.tsinghua.edu.cn/simple/

Aliyun mirror: http://mirrors.aliyun.com/pypi/simple/
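
As a minimal sketch of how one of the PyPI mirrors listed here might be used, the snippet below builds a `pip install` command that points at the Tsinghua index through pip's `--index-url` option. The helper name `pip_install_command` and the example package `jieba` are illustrative assumptions, not part of the original list.

```python
import sys

# Mirror URL taken from the list in this section.
TSINGHUA_MIRROR = "https://pypi.tuna.tsinghua.edu.cn/simple/"


def pip_install_command(package, index_url=TSINGHUA_MIRROR):
    """Build (but do not run) a pip command that installs from a mirror.

    Hypothetical helper, shown for illustration only.
    """
    return [sys.executable, "-m", "pip", "install",
            "--index-url", index_url, package]


if __name__ == "__main__":
    # "jieba" is only an example package name.
    print(" ".join(pip_install_command("jieba")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` to perform the installation. The sketch only assembles the command, so it can be inspected before anything is installed.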

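Sentiment Analysis

Aspect-level sentiment analysis based on hierarchical syntactic and lexical graph convolution: https://github.com/NLPWM-WHU/BiGCN

Aspect-based sentiment analysis, implemented in PyTorch: https://github.com/songyouwei/ABSA-PyTorch

https://github.com/wkk-nlp?tab=repositories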

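Named Entity Recognition

NLP datasets: https://github.com/liucongg/NLPDataSet (includes Chinese summarization datasets, Chinese span-extraction reading comprehension (QA) datasets, Chinese text similarity datasets, and Chinese NER datasets)

Chinese named entity recognition: https://github.com/taishan1994/awesome-chinese-ner (includes surveys, top-conference papers, datasets, NER tools, etc.)

ACL top-conference papers: https://aclanthology.org/

Google Scholar mirror: https://ac.scmor.com/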

Contributors

jxustliao
