nlp_base's People

Contributors

cclauss, lpty

nlp_base's Issues

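JSON conversion error when reading the config file in the XGBoost Chinese interrogative-sentence classifier

Hello, while studying your code after downloading it, I ran /interrogative/manage.py and got this error: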

Traceback (most recent call last):
  File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 1758, in <module>
    main()
  File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 1752, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 1147, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/Applications/PyCharm.app/Contents/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/Users/jinglun/PycharmProjects/DownloadProjects/nlp_base/interrogative/manage.py", line 4, in <module>
    train()
  File "/Users/jinglun/PycharmProjects/DownloadProjects/nlp_base/interrogative/src/api.py", line 17, in train
    model.train()
  File "/Users/jinglun/PycharmProjects/DownloadProjects/nlp_base/interrogative/src/model.py", line 81, in train
    self.initialize_model()
  File "/Users/jinglun/PycharmProjects/DownloadProjects/nlp_base/interrogative/src/model.py", line 40, in initialize_model
    self.max_depth = to_json(self.config.get('model', 'max_depth'))
  File "/Users/jinglun/PycharmProjects/DownloadProjects/nlp_base/interrogative/src/util.py", line 16, in to_json
    return demjson.decode(text, encoding='utf-8')
  File "/Users/jinglun/software/miniconda2/envs/nlp_base36/lib/python3.6/site-packages/demjson.py", line 5699, in decode
    return_stats=(return_stats or write_stats) )
  File "/Users/jinglun/software/miniconda2/envs/nlp_base36/lib/python3.6/site-packages/demjson.py", line 4915, in decode
    raise errors[0]
  File "/Users/jinglun/software/miniconda2/envs/nlp_base36/lib/python3.6/site-packages/demjson.py", line 2428, in set_input
    self.buf = buffered_stream( txt, encoding=encoding )
  File "/Users/jinglun/software/miniconda2/envs/nlp_base36/lib/python3.6/site-packages/demjson.py", line 1614, in __init__
    self.set_text( txt, encoding )
  File "/Users/jinglun/software/miniconda2/envs/nlp_base36/lib/python3.6/site-packages/demjson.py", line 1685, in set_text
    raise newerr
  File "/Users/jinglun/software/miniconda2/envs/nlp_base36/lib/python3.6/site-packages/demjson.py", line 1675, in set_text
    decoded = helpers.unicode_decode( txt, encoding )
  File "/Users/jinglun/software/miniconda2/envs/nlp_base36/lib/python3.6/site-packages/demjson.py", line 1256, in unicode_decode
    unitxt, numbytes = cdk.decode( txt, **cdk_kw )  # DO THE DECODE HERE!
  File "/Users/jinglun/software/miniconda2/envs/nlp_base36/lib/python3.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
demjson.JSONDecodeError: a Unicode decoding error occurred

The failing line is this one in /interrogative/model.py:

self.max_depth = to_json(self.config.get('model', 'max_depth'))

The model section of my config.py is unchanged, as follows:

'model': {
                'max_depth': [4, 5, 6],
                'eta': [0.1, 0.05, 0.02],
                'subsample': [0.5, 0.7, 1.0],
                'max_iterations': 100,
                'objective': ['binary:logistic'],
                'silent': [1],
                'num_boost_round': 2000,
                'nfold': 5,
                'stratified': 1,
                'metrics': 'auc',
                'early_stopping_rounds': 50,
                'model_path': ' src/data/{}.model'
            }

I don't really understand why this error is raised, and searching online turned up no solution. How can I fix it?
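A plausible reading of the traceback is that, under Python 3.6, the config value reaching to_json is already a str, so asking demjson to decode it again as UTF-8 fails. The sketch below is not the repository's fix; it is a hypothetical replacement for to_json that parses an already-decoded string with the standard library only. The fallback order (JSON first, then Python literals, then the raw text) is an assumption chosen to cover the value shapes shown in the config above.

```python
import ast
import json


def to_json(text):
    """Parse a config value that is already a str (hypothetical helper)."""
    try:
        # Covers values such as [4, 5, 6], 100 or [0.1, 0.05, 0.02].
        return json.loads(text)
    except ValueError:
        pass
    try:
        # Covers single-quoted lists such as ['binary:logistic'].
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Bare words such as auc are returned unchanged.
        return text


print(to_json("[4, 5, 6]"))
print(to_json("['binary:logistic']"))
print(to_json("auc"))
```

If the values really arrive as bytes rather than str, decoding them once with .decode('utf-8') before calling the helper would be the corresponding change.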

Do you have the trained model?

I want to run your code, but I couldn't find any usable model. Could you send me the model you trained and the vocabulary dictionary?

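About the F1 score for NER

Hello,

I was glad to come across your code. While running the CRF model I wanted to ask: isn't an F1 score computed over the tag classes (BIO) a poor measure? I think it would be better to compute F1 over the entities that are actually recognized.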
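The difference between the two scores can be made concrete with a small sketch. It assumes tags of the form B-TYPE, I-TYPE and O (the repository's actual tag names may differ) and counts an entity as correct only when its start, end and type all match. The function names are hypothetical.

```python
def extract_entities(tags):
    """Return the set of (start, end, type) spans in one BIO tag sequence."""
    entities, start, etype = set(), None, None
    # The extra "O" acts as a sentinel that closes an entity ending the sequence.
    for i, tag in enumerate(list(tags) + ["O"]):
        if tag.startswith("I-") and start is not None and tag[2:] == etype:
            continue  # still inside the current entity
        if start is not None:
            entities.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return entities


def entity_f1(gold_sequences, pred_sequences):
    """Micro-averaged F1 over exact entity matches."""
    tp = n_gold = n_pred = 0
    for gold, pred in zip(gold_sequences, pred_sequences):
        g, p = extract_entities(gold), extract_entities(pred)
        tp += len(g & p)
        n_gold += len(g)
        n_pred += len(p)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


gold = [["B-PER", "I-PER", "O", "B-LOC", "I-LOC"]]
pred = [["B-PER", "I-PER", "O", "B-LOC", "O"]]
print(entity_f1(gold, pred))  # 0.5
```

In this example four of the five tags agree, yet only one of the two gold entities is recovered exactly, so the entity-level score is lower than a tag-level score would suggest.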

ModuleNotFoundError: No module named 'model'

When I run 'from interrogative.api import *', the following error occurs:
Traceback (most recent call last):
File "", line 1, in
File "/home1/lx/nlp_base/interrogative/interrogative/api.py", line 7, in
from model import get_model
ModuleNotFoundError: No module named 'model'
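In Python 3, 'from model import get_model' inside a package is an absolute import, so it only succeeds when the directory that contains model.py is itself on sys.path. The sketch below builds a throwaway package (demo_pkg and its contents are hypothetical, not the repository's files) to show that an explicit relative import resolves when the module is imported through its package.

```python
import importlib
import os
import sys
import tempfile

# Create a temporary package named demo_pkg (illustrative only).
root = tempfile.mkdtemp()
pkg = os.path.join(root, "demo_pkg")
os.mkdir(pkg)

files = {
    "__init__.py": "",
    "model.py": "def get_model():\n    return 'model-object'\n",
    # Explicit relative import: resolved against the package, not sys.path.
    "api.py": "from .model import get_model\n",
}
for name, source in files.items():
    with open(os.path.join(pkg, name), "w") as handle:
        handle.write(source)

# Only the parent directory of the package goes on sys.path.
sys.path.insert(0, root)
api = importlib.import_module("demo_pkg.api")
print(api.get_model())
```

The alternative, running the code from inside the package directory so that a bare 'model' is importable, also works but ties the code to the current working directory.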

Sorry to bother you: how do I fix xgboost.core.XGBoostError: Invalid Parameter format for silent expect boolean but value='91'? Thanks

Building prefix dict from the default dictionary ...
Loading model from cache C:\Users\Lenovo\AppData\Local\Temp\jieba.cache
Loading model cost 0.906 seconds.
Prefix dict has been built succesfully.
Traceback (most recent call last):
File "C:/Users/Lenovo/Desktop/interrogative/manage.py", line 3, in
train()
File "C:\Users\Lenovo\Desktop\interrogative\src\api.py", line 17, in train
model.train()
File "C:\Users\Lenovo\Desktop\interrogative\src\model.py", line 81, in train
_, best_param, best_iter_round = self.model_param_select()
File "C:\Users\Lenovo\Desktop\interrogative\src\model.py", line 67, in model_param_select
early_stopping_rounds=self.early_stopping_rounds) # stop when metrics not get better
File "D:\python project\venv\lib\site-packages\xgboost\training.py", line 445, in cv
fold.update(i, obj)
File "D:\python project\venv\lib\site-packages\xgboost\training.py", line 230, in update
self.bst.update(self.dtrain, iteration, fobj)
File "D:\python project\venv\lib\site-packages\xgboost\core.py", line 1109, in update
dtrain.handle))
File "D:\python project\venv\lib\site-packages\xgboost\core.py", line 176, in _check_call
raise XGBoostError(py_str(_LIB.XGBGetLastError()))
xgboost.core.XGBoostError: Invalid Parameter format for silent expect boolean but value='91'
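One observation that may help: 91 is the code point of '['. A possible explanation, stated here only as an assumption, is that the config value for silent was never parsed into a list and stayed as the bytes b'[1]', so the parameter search iterated over its bytes instead of over list items. The sketch shows that effect and the parsed value that was presumably intended.

```python
import json

raw = b"[1]"  # assumed: the unparsed config value, still bytes

# Iterating over bytes in Python 3 yields integers, one per byte.
print(list(raw))  # [91, 49, 93]

# Decoding and parsing once gives the list the parameter grid expects.
silent_grid = json.loads(raw.decode("utf-8"))
print(silent_grid)  # [1]
```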

After training, I ran named entity recognition following the example above, but no entities were recognized.

from ner.api import recognize

# Adjacent string literals are joined only when wrapped in parentheses.
sentence = (
    u'新华社北京十二月三十一日电(**人民广播电台记者刘振英、新华社记者张宿堂)今天是一九九七年的最后一天。'
    u'辞旧迎新之际,国务院总理李鹏今天上午来到北京石景山发电总厂考察,向广大企业职工表示节日的祝贺,'
    u'向将要在节日期间坚守工作岗位的同志们表示慰问'
)
predict = recognize(sentence)

##########################################################################
y_predict = self.model.predict(features)
这一步出来的是<type 'list'>: [['O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O']],请问遇见过这个问题吗

AttributeError: 'float' object has no attribute 'decode'

So sorry to bother you again...

When I call train(), the following error occurs:
Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
Loading model cost 0.173 seconds.
Prefix dict has been built succesfully.

Traceback (most recent call last):
File "", line 1, in
File "interrogative/api.py", line 17, in train
model.train()
File "interrogative/model.py", line 76, in train
self.initialize_model()
File "interrogative/model.py", line 31, in initialize_model
train, label = self.corpus.generator()
File "interrogative/corpus.py", line 62, in generator
corpus = cls.read_corpus_from_file(corpus_path)
File "interrogative/corpus.py", line 34, in perform_word_segment
tokenizer = jieba.Tokenizer()
File "/home1/liuxin/anaconda3/envs/py27/lib/python2.7/site-packages/pandas/core/series.py", line 3591, in apply
mapped = lib.map_infer(values, f, convert=convert_dtype)
File "pandas/_libs/lib.pyx", line 2217, in pandas._libs.lib.map_infer
File "interrogative/corpus.py", line 34, in
tokenizer = jieba.Tokenizer()
File "/home1/liuxin/.local/lib/python2.7/site-packages/jieba/init.py", line 282, in cut
sentence = strdecode(sentence)
File "/home1/liuxin/.local/lib/python2.7/site-packages/jieba/_compat.py", line 37, in strdecode
sentence = sentence.decode('utf-8')
AttributeError: 'float' object has no attribute 'decode'
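The traceback shows jieba's strdecode calling .decode on a float. One common source of a float in a text column, assumed here rather than confirmed, is an empty CSV cell that pandas reads as NaN. A minimal sketch of filtering such values before segmentation follows; clean_texts is a hypothetical helper, not part of the repository.

```python
import math


def clean_texts(values):
    """Drop missing cells and coerce the rest to str before word segmentation."""
    cleaned = []
    for value in values:
        if value is None:
            continue
        if isinstance(value, float) and math.isnan(value):
            continue  # NaN marks an empty cell
        cleaned.append(value if isinstance(value, str) else str(value))
    return cleaned


rows = ["在么", float("nan"), "你好", 3.5]
print(clean_texts(rows))  # ['在么', '你好', '3.5']
```

When the corpus is held in a pandas DataFrame, calling dropna on the text column before applying the tokenizer has the same effect.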

Data set error

The content of question_recog.csv is:
content,label
在么,1
你好,0
公司在哪里,1
需要多少钱,1
未成年可以贷款吗,1
你现在在干什么,1
我在这里,0

When I call train(), the following error occurs:

Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
Loading model cost 0.201 seconds.
Prefix dict has been built succesfully.
Traceback (most recent call last):
File "", line 1, in
File "interrogative/api.py", line 17, in train
model.train()
File "interrogative/model.py", line 77, in train
_, best_param, best_iter_round = self.model_param_select()
File "interrogative/model.py", line 63, in model_param_select
early_stopping_rounds=self.early_stopping_rounds) # stop when metrics not get better
File "/home1/liuxin/anaconda3/envs/py27/lib/python2.7/site-packages/xgboost/training.py", line 446, in cv
res = aggcv([f.eval(i, feval) for f in cvfolds])
File "/home1/liuxin/anaconda3/envs/py27/lib/python2.7/site-packages/xgboost/training.py", line 234, in eval
return self.bst.eval_set(self.watchlist, iteration, feval)
File "/home1/liuxin/anaconda3/envs/py27/lib/python2.7/site-packages/xgboost/core.py", line 1173, in eval_set
ctypes.byref(msg)))
File "/home1/liuxin/anaconda3/envs/py27/lib/python2.7/site-packages/xgboost/core.py", line 178, in _check_call
raise XGBoostError(_LIB.XGBGetLastError())
xgboost.core.XGBoostError: [10:15:09] /workspace/src/metric/rank_metric.cc:144: Check failed: !auc_error AUC: the dataset only contains pos or neg samples
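AUC is undefined on a validation fold that contains only one class. With the seven rows quoted above there are just two negative samples, so a five-fold split cannot give every fold a negative example. The sketch below is a rough pre-flight check under two assumptions: that those seven rows are the whole data set, and that nfold is 5 as in the default config quoted in an earlier issue. check_cv_feasible is a hypothetical helper.

```python
from collections import Counter


def check_cv_feasible(labels, nfold):
    """Return human-readable problems that would leave a fold with one class."""
    counts = Counter(labels)
    problems = []
    if len(counts) < 2:
        problems.append("only one class present")
    for label, count in sorted(counts.items()):
        if count < nfold:
            problems.append(
                "class %r has %d samples, fewer than nfold=%d" % (label, count, nfold)
            )
    return problems


labels = [1, 0, 1, 1, 1, 1, 0]  # labels of the seven rows quoted above
problems = check_cv_feasible(labels, nfold=5)
for problem in problems:
    print(problem)
```

Adding more examples of the minority class, or lowering nfold until every class has at least that many samples, would be the usual ways to clear the warning.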

Help

Hello, could you share the data, or point me to a way to obtain the training corpus?

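Hello, I have a question about your post (Dependency parsing: implementing a Chinese dependency parsing model based on sequence labeling, https://blog.csdn.net/sinat_33741547/article/details/79321401): could you provide the program that preprocesses the corpus? Also, my run fails with the error ConfigParser.NoSectionError: No section: 'depparser'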

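Hello, I have a question about your post (Dependency parsing: implementing a Chinese dependency parsing model based on sequence labeling, https://blog.csdn.net/sinat_33741547/article/details/79321401): could you provide the program that preprocesses the corpus? Also, my run fails with the error ConfigParser.NoSectionError: No section: 'depparser'. If possible, could you guide me remotely? I have been debugging for many days. Contact: [email protected].
I have already got your word segmentation code working, and I would very much appreciate your help.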
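A NoSectionError often means the config file was never read at all, because ConfigParser.read silently skips paths it cannot open. The sketch below uses Python 3's configparser (the error above comes from Python 2, where the module is named ConfigParser) and checks both the list of files that were read and the presence of the section. load_section, the file content and the model_path key are hypothetical.

```python
import configparser
import tempfile


def load_section(path, section):
    """Read one section, failing loudly if the file or section is missing."""
    parser = configparser.ConfigParser()
    loaded = parser.read(path, encoding="utf-8")  # list of files actually parsed
    if not loaded:
        raise IOError("config file not found or unreadable: %s" % path)
    if not parser.has_section(section):
        raise KeyError("section %r missing; found %s" % (section, parser.sections()))
    return dict(parser.items(section))


# Write a small illustrative config file and read it back.
with tempfile.NamedTemporaryFile("w", suffix=".conf", delete=False) as handle:
    handle.write("[depparser]\nmodel_path = data/depparser.model\n")
    path = handle.name

print(load_section(path, "depparser"))
```

If the first check fails, the path is usually relative to a working directory other than the one the script was started from.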
