chinese-literature-ner-re-dataset's People

Contributors

imwebson, jingjingxupku

chinese-literature-ner-re-dataset's Issues

Entity tags: Term and Physical

Hi, besides the 7 defined entity tags, I also found 2 additional entity tags in the NER dataset: "Term" and "Physical". How should these two types be handled, and how can results be kept consistent with the experiments in the paper?

Some questions about the data set

Hello, I have some questions about the NER dataset.
According to your paper, there are 7 kinds of entities.
However, after processing your dataset, I found 10 kinds of entities.
The three additional entity types are Term, Physical, and ABstract (only three phrases carry this tag; I suppose it should be Abstract).

Also, I found some extremely long sentences that are not separated by ".",
while other sentences are separated by "。".
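
The tag inventory described above can be checked with a short script. This is a minimal sketch, assuming each non-blank line holds a token and a BIO-style tag such as `B-Term` separated by whitespace; that layout, the function name `entity_types`, and the sample rows are assumptions for illustration and are not taken from the released files.

```python
from collections import Counter

def entity_types(lines):
    """Count entities per type from '<token> <tag>' rows.

    Assumes BIO-style tags ('B-Term', 'I-Term', 'O'). Rows that do not
    have exactly two fields (for example blank lines) are skipped.
    """
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue
        tag = parts[1]
        if tag.startswith("B-"):  # count each entity once, at its first token
            # capitalize() folds case variants such as 'ABstract' into 'Abstract'
            counts[tag[2:].capitalize()] += 1
    return counts

# Made-up rows, only to show the behaviour.
sample = ["w0 B-Person", "w1 I-Person", "w2 O", "", "w3 B-ABstract", "w4 O", "w5 B-Abstract"]
print(sorted(entity_types(sample).items()))
```

Running it without the `capitalize()` call would list `ABstract` and `Abstract` as separate types, which is one way a count of 10 rather than 9 types could arise.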

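Invitation from the OpenKG open knowledge graph community alliance

Hello!
I am a staff member of the OpenKG community. We noticed the knowledge graph datasets you have released on GitHub and would like to invite you to join us. By joining OpenKG, your organization or you personally would benefit in the following ways:

Publicity: joining raises your organization's or your own profile in fields such as knowledge graphs, artificial intelligence, and big data. OpenKG regularly or occasionally promotes the data, tools, and organizations that have joined it through conferences, its WeChat official account, and media partnerships.

Business and project collaboration: the alliance promotes business and project cooperation among member organizations, encourages them to explore the commercial value of open knowledge graphs, and helps cultivate business models. After joining, you will be invited to the OpenKG group and community, where you can meet leading research institutions and companies in China and abroad and find more opportunities to collaborate.

Data linking: the alliance will arrange for third-party organizations to build links between members' datasets, so the data you have opened will be connected more widely with other data. We believe this interlinking will bring greater value to your knowledge graph data.

Priority access to knowledge graph technical resources: the alliance was founded by university professors and technical professionals in China who work on the front line of knowledge graph research and development. Members get priority access to related technical training, technical materials, talent, and other resources.

OpenKG statement:
OpenKG is an open knowledge graph community alliance project initiated and promoted in 2015 by the Technical Committee on Language and Knowledge Computing of the Chinese Information Processing Society of China. It aims to promote the opening, interlinking, and crowdsourcing of Chinese-language knowledge graph data, as well as the open-sourcing of knowledge graph algorithms, tools, and platforms.
OpenKG has a standing working group and a management committee that coordinate its work overall. Knowledge graph teams from Zhejiang University, Southeast University, Tongji University, and other institutions jointly provide long-term technical support and day-to-day management and operation.
OpenKG is a neutral, non-profit project. Ownership of resources published on OpenKG remains with the publishing organization or individual, and the publisher is responsible for the quality, copyright, privacy protection, legality, and maintenance of those resources.

Contact: [email protected]
OpenKG website: http://openkg.cn/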

questions about the NER dataset

Your paper says: "we manually annotate 726 articles, 29,096 sentences and over 100,000 characters in total". But we found 24,165 / 1,895 / 2,837 sentences in train / dev / test. Their sum, 28,897, does not reach 29,096 sentences.
Also, in the train set, we found some extremely long sentences that are separated by ".". Does this mean that "." stands for "。"?
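
The arithmetic behind this question can be reproduced as follows. This is a rough sketch: the split sizes and the paper total are the figures quoted in this issue, while `count_sentences` and its assumption that sentences are separated by blank lines are hypothetical and may not match the released files.

```python
def count_sentences(lines):
    """Count runs of non-blank lines, treating blank lines as sentence breaks."""
    count, in_sentence = 0, False
    for line in lines:
        if line.strip():
            if not in_sentence:
                count += 1
                in_sentence = True
        else:
            in_sentence = False
    return count

# Split sizes reported in this issue, and the total quoted from the paper.
reported = {"train": 24165, "dev": 1895, "test": 2837}
paper_total = 29096

total = sum(reported.values())
print(total, paper_total - total)  # 28897 199

# Made-up rows: two sentences separated by one blank line.
print(count_sentences(["w0 O", "w1 O", "", "w2 O"]))  # 2
```

With these figures the three splits fall 199 sentences short of the number given in the paper.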

Passage separator

Thanks for sharing this interesting dataset.
The data is annotated at the passage level, but in the released dataset the words are separated only by sentence.
Would you kindly release a version with a passage separator?
The format might look like the following:

w0 tag0 
w1 tag1 
sentence_separator  
w2 tag2
w3 tag3
passage_separator  # end of a passage 
w tag
w tag 
...
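
A reader for the proposed layout could look roughly like this. It is a minimal sketch, assuming the literal marker lines `sentence_separator` and `passage_separator` from the example above and treating anything after `#` as a comment; the function name `parse_passages` is hypothetical, and no such file has been released.

```python
def parse_passages(lines):
    """Group '<word> <tag>' rows into passages, each a list of sentences.

    Assumes the marker lines 'sentence_separator' and 'passage_separator'.
    Text after '#' is dropped as a comment, so a literal '#' token would
    need a different convention.
    """
    passages, sentences, tokens = [], [], []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if line in ("sentence_separator", "passage_separator"):
            if tokens:  # close the current sentence
                sentences.append(tokens)
                tokens = []
            if line == "passage_separator" and sentences:
                passages.append(sentences)
                sentences = []
            continue
        word, tag = line.split()
        tokens.append((word, tag))
    # Flush whatever remains when the input ends without a separator.
    if tokens:
        sentences.append(tokens)
    if sentences:
        passages.append(sentences)
    return passages

example = [
    "w0 tag0", "w1 tag1", "sentence_separator",
    "w2 tag2", "w3 tag3", "passage_separator  # end of a passage",
    "w tag", "w tag",
]
passages = parse_passages(example)
print(len(passages), [len(p) for p in passages])  # 2 [2, 1]
```

Keeping passages as nested lists preserves both boundaries, so sentence-level and passage-level models can read the same file.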
