ckip-classic

Introduction

A Python wrapper for Linux around the classic CKIP tools: CKIP Word Segmentation and CKIP Parser.

Attention

Please use CKIPNLP for structured data types and pipeline drivers.

Attention

For Python 2 users, please use PyCkip 0.4.2 instead.

Git

https://github.com/ckiplab/ckip-classic

PyPI

https://pypi.org/project/ckip-classic

Documentation

https://ckip-classic.readthedocs.io/

Contributors

Requirements

Note that this project requires CKIPWS and/or CKIPParser, which must be obtained separately.

Installation

Attention

  • Offline version: CKIPWS (Academic/Commercial License) and CKIPParser (Commercial License).
  • Online version: CKIPParser (Academic License).

Offline Version

Download CKIPWS and/or CKIPParser from the links above. Let <ckipws-linux-root> denote the folder containing CKIPWS, and <ckipparser-linux-root> the folder containing CKIPParser.

pip install --force-reinstall --upgrade ckip-classic \
   --install-option='--ws' \
   --install-option='--ws-dir=<ckipws-linux-root>' \
   --install-option='--parser' \
   --install-option='--parser-dir=<ckipparser-linux-root>'

Omit the ws options if you do not have CKIPWS, and the parser options if you do not have CKIPParser.

Attention

Please use absolute paths.
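
Because the installer expects absolute paths, it can help to assemble the command programmatically. The following is a minimal sketch, not part of ckip-classic: the helper name build_install_command is hypothetical, and only the pip flags shown above are taken from this README.

```python
import os


def build_install_command(ws_dir=None, parser_dir=None):
    """Build the pip argument list for the offline install.

    Hypothetical helper (not part of ckip-classic). Relative paths are
    converted to absolute ones before being passed to the installer.
    """
    cmd = ['pip', 'install', '--force-reinstall', '--upgrade', 'ckip-classic']
    if ws_dir is not None:
        # Only add the ws options when a CKIPWS folder is given.
        cmd += ['--install-option=--ws',
                '--install-option=--ws-dir=' + os.path.abspath(ws_dir)]
    if parser_dir is not None:
        # Only add the parser options when a CKIPParser folder is given.
        cmd += ['--install-option=--parser',
                '--install-option=--parser-dir=' + os.path.abspath(parser_dir)]
    return cmd
```

The returned list can be passed to subprocess.run, or joined with spaces and pasted into a shell.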

Online Version

Register an account at http://parser.iis.sinica.edu.tw/v1/reg.exe

pip install --upgrade ckip-classic

Installation Options

Option                                 Detail                          Default Value
--[no-]ws                              Enable/disable CKIPWS.          False
--[no-]parser                          Enable/disable CKIPParser.      False
--ws-dir=<ws-dir>                      CKIPWS root directory.
--ws-lib-dir=<ws-lib-dir>              CKIPWS libraries directory      <ws-dir>/lib
--ws-share-dir=<ws-share-dir>          CKIPWS share directory          <ws-dir>
--parser-dir=<parser-dir>              CKIPParser root directory.
--parser-lib-dir=<parser-lib-dir>      CKIPParser libraries directory  <parser-dir>/lib
--parser-share-dir=<parser-share-dir>  CKIPParser share directory      <parser-dir>
--data2-dir=<data2-dir>                "Data2" directory               <ws-share-dir>/Data2
--rule-dir=<rule-dir>                  "Rule" directory                <parser-share-dir>/Rule
--rdb-dir=<rdb-dir>                    "RDB" directory                 <parser-share-dir>/RDB
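
The defaults in the table depend on one another: for example, the "Data2" directory follows the CKIPWS share directory, which in turn follows the CKIPWS root. The sketch below only mirrors the table to make that chain explicit. The helper resolve_install_dirs and its keyword names are hypothetical, and how the installer itself resolves the defaults is not confirmed here.

```python
import posixpath  # the wrapper targets Linux, so POSIX-style paths are used


def resolve_install_dirs(ws_dir=None, parser_dir=None, **overrides):
    """Return the directories implied by the option table.

    Hypothetical helper. Keys are the option names without leading dashes;
    overrides use underscores, e.g. ws_share_dir='/srv/share'.
    """
    dirs = {}
    if ws_dir is not None:
        dirs['ws-dir'] = ws_dir
        dirs['ws-lib-dir'] = overrides.get('ws_lib_dir', posixpath.join(ws_dir, 'lib'))
        dirs['ws-share-dir'] = overrides.get('ws_share_dir', ws_dir)
        # Data2 defaults to a subfolder of the (possibly overridden) share directory.
        dirs['data2-dir'] = overrides.get(
            'data2_dir', posixpath.join(dirs['ws-share-dir'], 'Data2'))
    if parser_dir is not None:
        dirs['parser-dir'] = parser_dir
        dirs['parser-lib-dir'] = overrides.get(
            'parser_lib_dir', posixpath.join(parser_dir, 'lib'))
        dirs['parser-share-dir'] = overrides.get('parser_share_dir', parser_dir)
        # Rule and RDB default to subfolders of the parser share directory.
        dirs['rule-dir'] = overrides.get(
            'rule_dir', posixpath.join(dirs['parser-share-dir'], 'Rule'))
        dirs['rdb-dir'] = overrides.get(
            'rdb_dir', posixpath.join(dirs['parser-share-dir'], 'RDB'))
    return dirs
```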

Usage

See https://ckip-classic.readthedocs.io/ for API details.

CKIPWS

CKIP Word Segmentation offline driver.

import ckip_classic.ws
print(ckip_classic.__name__, ckip_classic.__version__)

# Create the segmenter (only one instance may exist; see the FAQ).
ws = ckip_classic.ws.CkipWs(logger=False)

# Segment a single sentence.
print(ws('中文字喔'))

# Segment a list of sentences.
for line in ws.apply_list(['中文字喔', '啊哈哈哈']):
    print(line)

# Segment a file, writing the results to ofile and uwfile.
ws.apply_file(ifile='sample/sample.txt', ofile='output/sample.tag', uwfile='output/sample.uw')
with open('output/sample.tag') as fin:
    print(fin.read())
with open('output/sample.uw') as fin:
    print(fin.read())

CKIPParser

CKIP Parser offline driver.

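import ckip_classic.parser
print(ckip_classic.__name__, ckip_classic.__version__)

# Create the parser (only one instance may exist; see the FAQ).
ps = ckip_classic.parser.CkipParser(logger=False)

# Parse a single sentence.
print(ps('中文字喔'))

# Parse a list of sentences.
for line in ps.apply_list(['中文字喔', '啊哈哈哈']):
    print(line)

# Parse a file, writing the trees to ofile.
ps.apply_file(ifile='sample/sample.txt', ofile='output/sample.tree')
with open('output/sample.tree') as fin:
    print(fin.read())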

CKIPParserClient

CKIP Parser online client.

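import ckip_classic.client
print(ckip_classic.__name__, ckip_classic.__version__)

# Log in with the account registered for the online version.
ps = ckip_classic.client.CkipParserClient(username='USERNAME', password='PASSWORD')

# Parse a sentence that is already segmented and POS-tagged.
print(ps('中文字(Na) 耶(T) ,(COMMACATEGORY)'))

# Parse a list of segmented, tagged sentences.
for line in ps.apply_list(['中文字(Na) 耶(T) ,(COMMACATEGORY)', '啊(I) 哈(D) 哈(D) 哈(D) 。(PERIODCATEGORY)']):
    print(line)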
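
The client examples above pass text that is already segmented and tagged, written as word(POS) tokens. If the tokens are held as (word, tag) pairs, a small helper can produce that string. This is a sketch only: format_tagged is a hypothetical name, and the single-space separator is an assumption read off the examples above, so it is left as a parameter.

```python
def format_tagged(pairs, sep=' '):
    """Join (word, pos_tag) pairs into a 'word(POS)' string.

    Hypothetical helper, not part of ckip-classic. The separator is an
    assumption based on the README examples; adjust it if the service
    expects a different one.
    """
    return sep.join('{}({})'.format(word, tag) for word, tag in pairs)


# Example input mirroring the first sentence in the client example.
sentence = format_tagged([('中文字', 'Na'), ('耶', 'T'), (',', 'COMMACATEGORY')])
```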

FAQ

Danger

Because of the underlying C implementation, CkipWs and CkipParser can each be instantiated only once.
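
One way to respect this limit is to build each driver lazily and hand out the same object afterwards. The sketch below is a general Python pattern, not an API of ckip-classic; make_shared_getter is a hypothetical name. It does not stop other code from constructing a second instance directly, so all callers need to go through the getter.

```python
import functools


def make_shared_getter(factory):
    """Return a zero-argument function that calls factory() on first use
    and returns the same object on every later call."""
    @functools.lru_cache(maxsize=None)
    def get():
        return factory()
    return get


# Usage with the real driver (requires the offline CKIPWS install):
#   import ckip_classic.ws
#   get_ws = make_shared_getter(lambda: ckip_classic.ws.CkipWs(logger=False))
#   ws = get_ws()   # constructed here
#   ws = get_ws()   # same object, no second construction
```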


Warning

CKIPParser fails if the input text contains special characters such as ()+-:|. One may replace these characters with their full-width counterparts:

text = (
    text
    .replace('(', '（')
    .replace(')', '）')
    .replace('+', '＋')
    .replace('-', '－')
    .replace(':', '：')
    .replace('|', '｜')
)
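
As an alternative to chained replace calls, the substitution can be done in a single pass with str.translate. This sketch assumes the intended replacements are the full-width forms of the six characters; the function name to_fullwidth is hypothetical.

```python
# Full-width forms sit at a fixed offset (0xFEE0) from their ASCII code points.
_FULLWIDTH_OFFSET = 0xFEE0
_SPECIAL_CHARS = '()+-:|'

# Translation table: code point of each special character -> full-width character.
_TABLE = {ord(c): chr(ord(c) + _FULLWIDTH_OFFSET) for c in _SPECIAL_CHARS}


def to_fullwidth(text):
    """Replace ( ) + - : | with their full-width counterparts in one pass."""
    return text.translate(_TABLE)
```

Characters outside the table, including Chinese text, pass through unchanged.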

Tip

Installation fails with "fatal error: Python.h: No such file or directory". What should I do?

Install the Python development package.

sudo apt-get install python3-dev

Tip

CKIPWS throws "what(): locale::facet::_S_create_c_locale name not valid". What should I do?

Install locale data.

sudo apt-get install locales-all

Tip

CKIPParser throws "ImportError: libCKIPParser.so: cannot open shared object file: No such file or directory". What should I do?

Add the following line to ~/.bashrc:

export LD_LIBRARY_PATH=<ckipparser-linux-root>/lib:$LD_LIBRARY_PATH
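
To confirm that the setting took effect in a new shell, one can check whether any directory on LD_LIBRARY_PATH actually contains the library. This is a minimal sketch with a hypothetical helper name, find_shared_lib. It only inspects the environment variable, not the system linker cache, and the variable has to be set before Python starts for the loader to use it.

```python
import os


def find_shared_lib(lib_name, search_path):
    """Return the first directory in search_path that contains lib_name,
    or None if no directory does. search_path uses os.pathsep (':' on Linux)."""
    for directory in search_path.split(os.pathsep):
        if directory and os.path.isfile(os.path.join(directory, lib_name)):
            return directory
    return None


# Inspect the current environment.
location = find_shared_lib('libCKIPParser.so', os.environ.get('LD_LIBRARY_PATH', ''))
print(location or 'libCKIPParser.so not found on LD_LIBRARY_PATH')
```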

License

GPL-3.0

Copyright (c) 2018-2020 CKIP Lab under the GPL-3.0 License.
