
wuyucheng2002 / chinese-ancient-poetry-text-mining


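A crawler and text-mining toolkit for classical Chinese poetry, with data on more than 30,000 poets and more than 850,000 poems across 13 dynasties, plus implementations of topic clustering, related-poem recommendation, acrostic poem generation, and poem translation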

Python 4.15% Shell 0.02% Jupyter Notebook 95.83%
chinese-poetry text-generation text-mining text-translation topic-clustering

chinese-ancient-poetry-text-mining's Introduction

Classical Chinese Poetry Crawler and Text Mining

Open-source code and data for the crawler and text-mining parts

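  • spider: crawler and data-cleaning code; each file's purpose is described in the comment on its first line
  • data: the curated datasets, including data on more than 30,000 poets across 13 dynasties, more than 850,000 poems, more than 100,000 imagery (意象) records, nearly 20,000 poems with translations, annotations, and appreciations, and poet information by province and city for each dynasty
  • topic_model&LSA: topic clustering and recommendation models
  • GPT2-Chinese-old_gpt_2: acrostic poem generation with GPT-2, including a trained model; given a metrical form, a style, and the head characters, it generates an acrostic poem automatically. Based mainly on https://github.com/Morizeyao/GPT2-Chinese
  • bert2transformer_on_NMT: a BERT-based translation model, including a trained model; given classical Chinese prose or a classical poem, it generates the corresponding modern vernacular Chinese translation. Based mainly on https://github.com/rjk-git/bert2transformer_on_NMT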
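
As a rough illustration of what related-poem recommendation involves, here is a minimal sketch that ranks poems by TF-IDF cosine similarity over character bigrams. It is not the repository's code: it is a simplified stand-in for the LSA approach (there is no SVD step), it uses only the Python standard library, and the names `char_bigrams`, `tfidf_vectors`, `cosine`, `recommend`, and `POEMS` are hypothetical.

```python
import math
from collections import Counter


def char_bigrams(text):
    """Split a poem into overlapping character bigrams, skipping punctuation and spaces."""
    chars = [c for c in text if c.isalnum()]
    return [a + b for a, b in zip(chars, chars[1:])]


def tfidf_vectors(poems):
    """Build one sparse TF-IDF vector (a dict of bigram -> weight) per poem."""
    docs = [Counter(char_bigrams(p)) for p in poems]
    n = len(docs)
    # Document frequency: in how many poems each bigram appears.
    df = Counter(term for doc in docs for term in doc)
    # Smoothed IDF, so no weight is ever zero or undefined.
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
    return [{t: c * idf[t] for t, c in doc.items()} for doc in docs]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(poems, query_index, top_k=2):
    """Return the indices of the top_k poems most similar to poems[query_index]."""
    vecs = tfidf_vectors(poems)
    query = vecs[query_index]
    scored = [(cosine(query, v), i) for i, v in enumerate(vecs) if i != query_index]
    # Highest similarity first; ties are broken by the lower index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:top_k]]


# Tiny sample corpus, for illustration only (not taken from the repository's data).
POEMS = [
    "床前明月光,疑是地上霜",
    "举头望明月,低头思故乡",
    "白日依山尽,黄河入海流",
    "春眠不觉晓,处处闻啼鸟",
]

if __name__ == "__main__":
    print(recommend(POEMS, 0, top_k=1))
```

A full LSA pipeline would additionally project these TF-IDF vectors into a low-rank space with a truncated SVD before comparing them, which lets poems match on related imagery even when they share no exact bigram.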

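Because of GitHub's file size limits, the repository mainly contains code files. The complete code, all data, and the trained model files are stored on Baidu Netdisk (link: https://pan.baidu.com/s/1ExaqJ4O35MZP-EQrgoFCIA extraction code: hg5j)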

Reference code and materials for the machine learning part

Recommended learning sites for front-end development
