youmi's Introduction

youmi

A demo of an intelligent question-answering system that matches questions by word2vec semantic similarity.

Data

Link: https://pan.baidu.com/s/1KYpyC42pi8xDT19sH02EZw Password: g22n

Results

you: 我账号被人盗了 ("My account was stolen by someone")

[('账号被盗了', 0.859556081969784), ('我的号被盗了', 0.8302024684697034), ('帐号被盗了', 0.829032549231153), ('账号被人盗绑怎么办?', 0.7960073683134146), ('账号被盗怎么办?', 0.7198214053422787), ('号刚才被盗了', 0.7108523513577553), ('找回被盗账号', 0.7011944711921758), ('账号给盗了怎么找回', 0.6763156222131452), ('账号忘了', 0.6746552683415384), ('如何找回被盗账号', 0.6667820752109237)]
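
The ranked list above pairs each library question with a similarity score. The following is a minimal sketch of how such a ranking can be computed, not the project's actual code. It assumes a sentence vector is the sum of its word vectors (as the get_vec_sen function mentioned in the issue further down suggests) and that candidates are ranked by cosine similarity, which is an assumption. The toy `EMBEDDINGS` table and the names `sentence_vector`, `cosine`, and `top_matches` are hypothetical; a real run would look words up in a trained word2vec model and segment sentences with a tokenizer first.

```python
import math

# Toy 3-dimensional word vectors, made up for illustration only.
EMBEDDINGS = {
    "账号": [0.9, 0.1, 0.0],
    "帐号": [0.85, 0.15, 0.0],
    "被盗": [0.1, 0.9, 0.1],
    "找回": [0.0, 0.3, 0.9],
    "忘": [0.2, 0.0, 0.8],
}
DIM = 3


def sentence_vector(tokens):
    """Sum the vectors of known tokens; unknown tokens contribute nothing."""
    vec = [0.0] * DIM
    for tok in tokens:
        word_vec = EMBEDDINGS.get(tok)
        if word_vec is not None:
            vec = [a + b for a, b in zip(vec, word_vec)]
    return vec


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def top_matches(query_tokens, library, k=10):
    """Return the k library questions most similar to the query, best first."""
    q = sentence_vector(query_tokens)
    scored = [(text, cosine(q, sentence_vector(toks))) for text, toks in library]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Pre-tokenized question library: (original text, tokens).
LIBRARY = [
    ("账号被盗了", ["账号", "被盗"]),
    ("找回被盗账号", ["找回", "被盗", "账号"]),
    ("账号忘了", ["账号", "忘"]),
]

results = top_matches(["账号", "被盗"], LIBRARY)
for text, score in results:
    print(text, round(score, 3))
```

Note that a sentence whose words are all missing from the embedding table ends up as the zero vector and scores 0.0 against everything, which is why vocabulary coverage matters for this approach.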

new commit:

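Added deep learning methods: the embeddings (semantic vectors) produced by training CNN/LSTM models are now used for the matching computation. This can address the problem of new words having no semantic representation.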

2019-06-29 commit

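(This project has not been updated for a long time, and the missing model file can no longer be found. The code has been changed so that it no longer loads that file, so it should run now; the overall logic is unaffected.)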

youmi's People

Contributors

zoulala

youmi's Issues

Question about the Chinese wiki word2vec model

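Hello!
I am glad to have the chance to learn from your project. The project uses the Chinese wiki word2vec model "wiki.zh.word_200v.model", and with it the project runs successfully.
However, I ran into a problem:
1. I downloaded the Chinese corpus from the official wiki site and retrained the wiki word2vec model file with the following parameters:
model = Word2Vec(LineSentence(rawdata), size=400, window=5, min_count=5,
                 workers=multiprocessing.cpu_count())
2. After replacing the model in your project with the one trained in step 1, I found that in the get_vec_sen function the words from tianlong_libs.xlsx are never found in the model, so the sum of the word vectors is always 0.
3. Are the parameters I used to build the word2vec model in step 1 incorrect?
4. How was the wiki.zh.word_200v.model that you provide trained? Could you post the training parameters?

Thank you for your help!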
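
One way to narrow down a problem like the one described above is to measure how many of the library's tokens the model actually knows. The sketch below is a hedged diagnostic, not part of the project: `vocab_coverage` and the sample data are hypothetical, and the vocabulary is passed in as a plain set. With gensim, that set could be built from `model.wv.key_to_index` (gensim 4.x) or `model.wv.vocab` (gensim 3.x). Two causes of near-zero coverage are worth checking, both offered as possibilities rather than a confirmed diagnosis: the wiki text may not have been word-segmented before training (`LineSentence` expects whitespace-separated tokens), or the corpus may be in Traditional Chinese while the question library is in Simplified Chinese.

```python
def vocab_coverage(sentences, vocab):
    """Return (share of tokens found in vocab, sorted list of missing tokens)."""
    tokens = [tok for sent in sentences for tok in sent]
    if not tokens:
        return 0.0, []
    missing = sorted({tok for tok in tokens if tok not in vocab})
    known = sum(1 for tok in tokens if tok in vocab)
    return known / len(tokens), missing


# Hypothetical vocabulary from a model trained on Traditional Chinese text.
model_vocab = {"賬號", "被盜", "怎麼辦"}

# Hypothetical Simplified Chinese tokens from the question library.
library_sentences = [["账号", "被盗"], ["账号", "怎么办"]]

ratio, missing = vocab_coverage(library_sentences, model_vocab)
print(f"coverage: {ratio:.0%}, missing: {missing}")
```

If coverage is close to zero, the training parameters are unlikely to be the cause; the mismatch is more likely in how the corpus was preprocessed.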
