nlpir-python's Introduction

NLPIR-python: A Python wrapper and toolkit for NLPIR

nlpir-python is a Python wrapper for the NLPIR modules.

About

This package provides two levels of methods:

1. Native calls into the Dynamic Link Library (DLL)

The native methods call the NLPIR API directly from the DLL, with some simplification and Pythonic adaptation. They are easy to use if you are already familiar with the NLPIR modules.

    from nlpir.native import ICTCLAS

    # Sample Chinese paragraph to segment
    test_str = "法国启蒙思想家孟德斯鸠曾说过:“一切有权力的人都容易滥用" \
               "权力,这是一条千古不变的经验。有权力的人直到把权力用到" \
               "极限方可休止。”另一法国启蒙思想家卢梭从社会契约论的观点" \
               "出发,认为国家权力是公民让渡其全部“自然权利”而获得的," \
               "他在其名著《社会契约论》中写道:“任何国家权力无不是以民" \
               "众的权力(权利)让渡与公众认可作为前提的”。"
    # Create the native ICTCLAS segmenter
    ictclas = ICTCLAS()
    # Segment the paragraph; the second argument (0) requests output without POS tags
    ictclas.paragraph_process(test_str, 0)

2. High-level Pythonic methods

The native API is powerful but not very friendly to beginners. The high-level methods wrap the native calls and add some utility functions to make the package easier to use.

    from nlpir import ictclas, tools

    # Update the NLPIR license before the first call
    tools.update_license()
    # Sample Chinese paragraph to segment
    test_str = "法国启蒙思想家孟德斯鸠曾说过:“一切有权力的人都容易滥用" \
               "权力,这是一条千古不变的经验。有权力的人直到把权力用到" \
               "极限方可休止。”另一法国启蒙思想家卢梭从社会契约论的观点" \
               "出发,认为国家权力是公民让渡其全部“自然权利”而获得的," \
               "他在其名著《社会契约论》中写道:“任何国家权力无不是以民" \
               "众的权力(权利)让渡与公众认可作为前提的”。"

    # Iterate over (word, part-of-speech) pairs
    for word, pos in ictclas.segment(test_str, pos_tagged=True):
        print(word, pos)

NOTE: This module supports only Python 3.6+.

NOTE: This repository uses Git LFS. Install Git LFS before cloning or pulling it.
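If the repository is cloned without Git LFS installed, LFS-tracked files are checked out as small text pointer stubs rather than their real content. The sketch below shows one way to spot such stubs. It relies only on the fact that a Git LFS pointer file begins with the line `version https://git-lfs.github.com/spec/v1`. The helper names `is_lfs_pointer` and `find_lfs_pointers` are hypothetical and not part of nlpir-python, and which files this repository tracks with LFS is an assumption to verify against its `.gitattributes`.

```python
from pathlib import Path

# First line of a Git LFS pointer file, as defined by the Git LFS pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file at `path` looks like a Git LFS pointer stub."""
    with open(path, "rb") as handle:
        head = handle.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def find_lfs_pointers(root):
    """List files under `root` that are still pointer stubs (hypothetical helper)."""
    return sorted(
        str(p) for p in Path(root).rglob("*") if p.is_file() and is_lfs_pointer(p)
    )
```

Calling `find_lfs_pointers` on the checkout directory should return an empty list once `git lfs pull` has fetched the real files.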

Supported Table

Module | Native | Native Doc | Native Test | High-Level | High-Level Doc | High-Level Test | Tutorial
ICTCLAS
NewWordFinder
KeyExtract
Summary
SentimentNew
SentimentAnalysis
Classify
DeepClassify
Cluster
EyeChecker
DocCompare
DocExtractor
DocParser
iEncoder
HTMLParser
KeyScanner
RedupRemover
SpellChecker
SplitSentence
TextSimilarity
Word2vec

nlpir-python's People

Contributors

cabbagenoob, yangyaofei

nlpir-python's Issues

Error when using sentiment analysis

The error info is listed below:

    st=sentiment.SentimentNew()
    Traceback (most recent call last):
      File "", line 1, in
      File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/nlpir/native/sentiment.py", line 22, in __init__
        super().__init__(encode, lib_path, data_path, license_code)
      File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/nlpir/native/nlpir_base.py", line 132, in __init__
        self.lib_nlpir, self.lib_path = self.load_library(sys.platform)
      File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/nlpir/native/nlpir_base.py", line 224, in load_library
        lib_nlpir = ctypes.cdll.LoadLibrary(lib)
      File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ctypes/__init__.py", line 426, in LoadLibrary
        return self._dlltype(name)
      File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ctypes/__init__.py", line 348, in __init__
        self._handle = _dlopen(self._name, mode)
    OSError: dlopen(/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/nlpir/lib/libSentimentNewdarwin.so, 6): no suitable image found. Did find:
    /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/nlpir/lib/libSentimentNewdarwin.so: unknown file type, first eight bytes: 0x7F 0x45 0x4C 0x46 0x02 0x01 0x01 0x00
    /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/nlpir/lib/libSentimentNewdarwin.so: unknown file type, first eight bytes: 0x7F 0x45 0x4C 0x46 0x02 0x01 0x01 0x0
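The first four bytes quoted in the error, 0x7F 0x45 0x4C 0x46, are the ELF magic number (`\x7fELF`). This suggests that the file named `libSentimentNewdarwin.so` is a Linux ELF binary rather than a macOS Mach-O image, which would explain why `dlopen` rejects it. The sketch below classifies a binary by its leading bytes. The helper `detect_binary_format` is hypothetical and not part of nlpir-python, and the table covers only a few common formats.

```python
# Leading magic bytes of a few common native binary formats.
MAGIC_NUMBERS = {
    b"\x7fELF": "ELF (Linux/Unix)",
    b"\xcf\xfa\xed\xfe": "Mach-O 64-bit (macOS)",
    b"\xce\xfa\xed\xfe": "Mach-O 32-bit (macOS)",
    # The same four bytes also open Java class files.
    b"\xca\xfe\xba\xbe": "Mach-O universal binary (macOS)",
    b"MZ": "PE (Windows)",
}


def detect_binary_format(header):
    """Name the binary format whose magic number starts `header` (hypothetical helper)."""
    for magic, name in MAGIC_NUMBERS.items():
        if header.startswith(magic):
            return name
    return "unknown"


def detect_file_format(path):
    """Read the first eight bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return detect_binary_format(handle.read(8))


# The eight bytes quoted in the dlopen error message.
reported = bytes([0x7F, 0x45, 0x4C, 0x46, 0x02, 0x01, 0x01, 0x00])
print(detect_binary_format(reported))  # prints "ELF (Linux/Unix)"
```

Running `detect_file_format` on each file in the package's `lib` directory is one way to check whether a library matching the current platform is actually present.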

Error when using the keyword extraction module

    from nlpir.native import KeyExtract
    from nlpir import native, PACKAGE_DIR, clean_logs, ictclas

    # test_str is the sample paragraph from the README example
    for word, pos in ictclas.segment(test_str, pos_tagged=True):
        print(word, pos)
    json_out = native.OUTPUT_FORMAT_JSON
    def get_key_extract(encode=native.UTF8_CODE):
        return KeyExtract(encode=encode)
    key_extract = get_key_extract()
    word_lists = key_extract.get_key_words("法国启蒙思想家孟德斯鸠曾说过")
    print(word_lists)

Word segmentation works normally, but the process exits abnormally during keyword extraction. What could be causing this?

Process finished with exit code -1073740940 (0xC0000374)
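The exit code reported here is a Windows NTSTATUS value shown as a signed 32-bit integer. Read as unsigned, -1073740940 is 0xC0000374, which Windows defines as STATUS_HEAP_CORRUPTION. That points to memory corruption inside native code, although it does not identify the cause by itself. The sketch below shows the conversion. The helper names and the short lookup table are illustrative and not part of nlpir-python.

```python
# A few Windows NTSTATUS values that commonly appear as process exit codes.
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
    0xC0000374: "STATUS_HEAP_CORRUPTION",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
}


def to_ntstatus(exit_code):
    """Reinterpret a signed 32-bit exit code as an unsigned NTSTATUS value."""
    return exit_code & 0xFFFFFFFF


def describe_exit_code(exit_code):
    """Format an exit code as hex with its NTSTATUS name when known."""
    status = to_ntstatus(exit_code)
    name = KNOWN_NTSTATUS.get(status, "unknown status")
    return f"0x{status:08X} ({name})"


print(describe_exit_code(-1073740940))  # prints "0xC0000374 (STATUS_HEAP_CORRUPTION)"
```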
