ckipnlp's Introduction

CKIP CoreNLP Toolkit

Features

  • Sentence Segmentation
  • Word Segmentation
  • Part-of-Speech Tagging
  • Named-Entity Recognition
  • Constituency Parsing
  • Coreference Resolution

Git

https://github.com/ckiplab/ckipnlp

PyPI

https://pypi.org/project/ckipnlp

Documentation

https://ckipnlp.readthedocs.io/

Online Demo

https://ckip.iis.sinica.edu.tw/service/corenlp

Contributors

Installation

Requirements

Driver Requirements

Driver                     Built-in  CkipTagger  CkipClassic
Sentence Segmentation      ✔
Word Segmentation†                   ✔           ✔
Part-of-Speech Tagging†              ✔           ✔
Constituency Parsing                             ✔
Named-Entity Recognition             ✔
Coreference Resolution‡    ✔         ✔           ✔
  • † These drivers require only one of the two backends.
  • ‡ Coreference implementation does not require any backend, but requires results from word segmentation, part-of-speech tagging, constituency parsing, and named-entity recognition.

Installation via Pip

  • No backend (not recommended): pip install ckipnlp.
  • With CkipTagger backend (recommended): pip install ckipnlp[tagger] or pip install ckipnlp[tagger-gpu].
  • With CkipClassic Parser Client backend (recommended): pip install ckipnlp[classic].
  • With CkipClassic offline backend: please refer to https://ckip-classic.readthedocs.io/en/latest/main/readme.html#installation for the CkipClassic installation guide.
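
After installing, it can be useful to confirm which backends are actually importable. The following is a minimal sketch using only the standard library. The module name ckiptagger appears in a traceback later on this page; the module name ckip_classic is an assumption about how the ckip-classic package is imported, so adjust it if your installation differs.

```python
from importlib.util import find_spec

# Display name -> top-level module name.
# "ckip_classic" is an assumed import name for the ckip-classic package.
DEFAULT_BACKENDS = {"CkipTagger": "ckiptagger", "CkipClassic": "ckip_classic"}

def available_backends(modules=None):
    """Return {display name: True/False} depending on whether each module can be found."""
    modules = DEFAULT_BACKENDS if modules is None else modules
    return {name: find_spec(module) is not None for name, module in modules.items()}

if __name__ == "__main__":
    for name, found in available_backends().items():
        print(f"{name}: {'installed' if found else 'missing'}")
```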

Attention!

To use the CkipClassic Parser Client backend:

  1. Register an account at http://parser.iis.sinica.edu.tw/v1/reg.php
  2. Set the username and password in the pipeline's options:
pipeline = CkipPipeline(opts={'con_parser': {'username': YOUR_USERNAME, 'password': YOUR_PASSWORD}})
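
To keep credentials out of source files, the options dictionary from step 2 can be built from environment variables. This is only a sketch: the variable names CKIP_USERNAME and CKIP_PASSWORD and the helper conparser_opts are hypothetical and not part of ckipnlp; only the shape of the returned dictionary comes from the instructions above.

```python
import os

def conparser_opts(env=None):
    """Build the pipeline options for the parser client from environment variables."""
    env = os.environ if env is None else env
    # CKIP_USERNAME / CKIP_PASSWORD are hypothetical names chosen for this sketch.
    username = env.get("CKIP_USERNAME")
    password = env.get("CKIP_PASSWORD")
    if not username or not password:
        raise RuntimeError("Set CKIP_USERNAME and CKIP_PASSWORD before building the pipeline")
    # Same structure as the opts argument shown in step 2.
    return {"con_parser": {"username": username, "password": password}}
```

The result would then be passed as CkipPipeline(opts=conparser_opts()).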

Detail

See https://ckipnlp.readthedocs.io/ for full documentation.

License

GPL-3.0

Copyright (c) 2018-2023 CKIP Lab under the GPL-3.0 License.

ckipnlp's People

Contributors

andreawwenyi, emfomy

ckipnlp's Issues

Constituency Parsing in CkipPipeline is not working

I want to use constituency parsing in the CKIP package. The steps I took and the issue I encountered are as follows.

  1. I registered an account at http://parser.iis.sinica.edu.tw/v1/reg.exe and got my username and password
  2. I followed the usage guideline in Pipelines (https://ckipnlp.readthedocs.io/en/latest/main/usage/pipeline.html).
  3. The ws and pos stages in CkipPipeline work properly. However, the constituency parsing stage runs very slowly and outputs an empty list ([[]]), which does not match the output of the online demo.
    Attached are the code, its output, and the online demo's output for the same input (screenshots 0817_1, 0817_2, 0817_3).

Co-Reference Pipeline sample code is not working

I used the sample code from the documentation, but I get the error message NodeIDAbsentError: Node 'None' is not in the tree.
Am I misusing this sample code?

from ckipnlp.pipeline import CkipCorefPipeline, CkipDocument

pipeline = CkipCorefPipeline()
doc = CkipDocument(raw='畢卡索他想,完蛋了')

# Co-Reference
corefdoc = pipeline(doc)
print(corefdoc.coref)
for line in corefdoc.coref:
    print(line.to_text())

Use structured data by default.

Use structured data (the utility classes) as input/output for CkipWs and CkipParser by default, and provide a raw flag for plain-text input/output.

OSError: when running usage example

Hi,
I was running the CKIPWS and CKIP-Parser usage examples but got the following error.

~/.local/lib/python3.6/site-packages/ckipnlp/parser/__init__.py in __init__(self, logger, inifile, wsinifile, **options)
     63         CkipParser(**options)
     64 
---> 65         self.__core.init_data(inifile)
     66 
     67         try:

src/parser/ckipparser.pyx in ckipnlp._core.parser.CkipParserCore.init_data()

OSError: 

I have already set LD_LIBRARY_PATH and was able to run CKIPWS.py and CKIPCoreNLP3.py.

Could you help me address this problem? 🤔

Special characters cause parser segmentation fault

A segmentation fault (core dumped) occurs when CkipParser(do_ws=False) is called on input that contains half-width symbols, e.g. -

To Reproduce:

import ckipnlp.parser
ps = ckipnlp.parser.CkipParser(do_ws=False)
ps("103-105(Nd)")
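
One possible workaround to try is replacing the offending half-width symbols with their full-width forms before calling the parser. The sketch below is an untested assumption, not a confirmed fix: the report does not say whether full-width symbols avoid the crash. The helper widen_symbols is hypothetical, and by default it converts only the hyphen so that the parentheses of the (Nd) tag stay intact.

```python
# ASCII '!'..'~' map to the full-width block U+FF01..U+FF5E at a fixed offset.
FULLWIDTH_OFFSET = 0xFEE0

def widen_symbols(text, symbols="-"):
    """Replace each listed half-width ASCII symbol with its full-width form."""
    table = {
        ord(ch): ord(ch) + FULLWIDTH_OFFSET
        for ch in symbols
        if "!" <= ch <= "~"  # only printable ASCII has a full-width counterpart here
    }
    return text.translate(table)

if __name__ == "__main__":
    print(widen_symbols("103-105(Nd)"))
```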

Add alignment routine.

CKIPWS and CKIPParser change some characters. An alignment routine would help users trace the output back to the original text.
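
As an illustration of what such a routine could look like, here is a minimal sketch built on the standard library's difflib. It is not the project's implementation. It assumes the tools mostly substitute or drop individual characters, and the function name align is hypothetical.

```python
from difflib import SequenceMatcher

def align(original, processed):
    """Map each character index of `processed` to an index in `original`, or None."""
    mapping = [None] * len(processed)
    matcher = SequenceMatcher(None, original, processed, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        same_length = (i2 - i1) == (j2 - j1)
        # "equal" spans align trivially; same-length "replace" spans
        # (e.g. a half-width symbol turned full-width) are paired one to one.
        if tag == "equal" or (tag == "replace" and same_length):
            for k in range(j2 - j1):
                mapping[j1 + k] = i1 + k
        # Inserted or unevenly replaced characters stay None.
    return mapping

if __name__ == "__main__":
    print(align("103-105", "103\uff0d105"))
```

Characters that have no clear counterpart are left as None, so callers can tell which positions could not be traced back.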

AttributeError: 'NoneType' object has no attribute 'TF_CloseSession'

When I run

pipeline = CkipPipeline(opts={'con_parser': {'username': '***', 'password': '***'}})
......
pipeline.get_conparse(doc)

in a .py file, each result prints correctly while the script is running.
But once all inputs have been processed and the script is about to exit, I get the following error.

Traceback (most recent call last):
  File "C:\Users\User\AppData\Local\Programs\Python\Python38\lib\site-packages\ckiptagger\api.py", line 65, in __del__
  File "C:\Users\User\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\client\session.py", line 765, in close
AttributeError: 'NoneType' object has no attribute 'TF_CloseSession'
Exception ignored in: <function POS.__del__ at 0x0000017E5D348550>
Traceback (most recent call last):
  File "C:\Users\User\AppData\Local\Programs\Python\Python38\lib\site-packages\ckiptagger\api.py", line 185, in __del__
  File "C:\Users\User\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\client\session.py", line 765, in close
AttributeError: 'NoneType' object has no attribute 'TF_CloseSession'

It's worth mentioning that when I run it in a .ipynb notebook in the same environment, it runs without this error message.
Has anyone encountered this error message before?

What does `%` mean in the parsing tree?

When I parse a slightly longer sentence, I always get % at the root of the parse tree instead of S, VP, or NP.
I cannot find anything in the documentation that explains what it means.
Does anyone know?

Coreference Resolution Example

About sentence segmentation

Dear authors/developers/editors,

I cannot find the syntax or a usage example for the sentence segmentation function, even after searching the internet. Has this function been deprecated or removed?
P.S. I know there is word segmentation, but what I want is sentence segmentation, that is, splitting a paragraph into meaningful sentences even when there is no punctuation.
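
For readers unsure what delimiter-based sentence segmentation looks like in general, here is a generic standard-library sketch. It is not ckipnlp's API or implementation, the delimiter set is an assumption, and, as the question anticipates, this approach cannot split text that has no punctuation.

```python
import re

# Assumed delimiter set: common full-width and half-width sentence punctuation.
DEFAULT_DELIMS = ",。!?;,!?;"

def split_by_delims(text, delims=DEFAULT_DELIMS):
    """Split text on runs of delimiter characters and drop empty pieces."""
    pattern = "[" + re.escape(delims) + "]+"
    return [part.strip() for part in re.split(pattern, text) if part.strip()]

if __name__ == "__main__":
    print(split_by_delims("今天天氣很好。我們去公園,好嗎?"))
```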

Failed to install ckipnlp[classic]

I need to do constituency parsing on Chinese sentences for research.
Following the instructions in the README, I ran pip install ckipnlp[classic] but got these error messages:

  Downloading ckip-classic-1.2.1.tar.gz (15 kB)
    ERROR: Command errored out with exit status 1:
     command: /shared_home/r08922129/anaconda3/envs/syntax/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-lz9bumwv/ckip-classic_a6bd4cf92dbc4779965dbd862e3b8187/setup
.py'"'"'; __file__='"'"'/tmp/pip-install-lz9bumwv/ckip-classic_a6bd4cf92dbc4779965dbd862e3b8187/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"
'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-jm753tfu
         cwd: /tmp/pip-install-lz9bumwv/ckip-classic_a6bd4cf92dbc4779965dbd862e3b8187/
    Complete output (9 lines):
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-lz9bumwv/ckip-classic_a6bd4cf92dbc4779965dbd862e3b8187/setup.py", line 22, in <module>
        assert StrictVersion(setuptools.__version__) >= StrictVersion('40.0'), \
      File "/shared_home/r08922129/anaconda3/envs/syntax/lib/python3.7/distutils/version.py", line 40, in __init__
        self.parse(vstring)
      File "/shared_home/r08922129/anaconda3/envs/syntax/lib/python3.7/distutils/version.py", line 137, in parse
        raise ValueError("invalid version number '%s'" % vstring)
    ValueError: invalid version number '52.0.0.post20210125'
    ----------------------------------------
WARNING: Discarding https://files.pythonhosted.org/packages/76/d3/4e7abd934536bf23dabbf41ea5153da893b2a76a2cfadf4746fa9698f3ad/ckip-classic-1.2.1.tar.gz#sha256=67617018e2c2c4a35dee80b9f825205a7503865bb45$
bfd437f8e3f0a124a3b7 (from https://pypi.org/simple/ckip-classic/) (requires-python:>=3.6). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
  Downloading ckip-classic-1.2.0.tar.gz (14 kB)
    ERROR: Command errored out with exit status 1:
     command: /shared_home/r08922129/anaconda3/envs/syntax/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-lz9bumwv/ckip-classic_cae0b38a151740bc9c384c1a6364e8a5/setu$
.py'"'"'; __file__='"'"'/tmp/pip-install-lz9bumwv/ckip-classic_cae0b38a151740bc9c384c1a6364e8a5/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'$
'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-8qt0_a4g
         cwd: /tmp/pip-install-lz9bumwv/ckip-classic_cae0b38a151740bc9c384c1a6364e8a5/
    Complete output (9 lines):
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-lz9bumwv/ckip-classic_cae0b38a151740bc9c384c1a6364e8a5/setup.py", line 22, in <module>
        assert StrictVersion(setuptools.__version__) >= StrictVersion('40.0'), \
      File "/shared_home/r08922129/anaconda3/envs/syntax/lib/python3.7/distutils/version.py", line 40, in __init__
        self.parse(vstring)
      File "/shared_home/r08922129/anaconda3/envs/syntax/lib/python3.7/distutils/version.py", line 137, in parse
        raise ValueError("invalid version number '%s'" % vstring)
    ValueError: invalid version number '52.0.0.post20210125'
    ----------------------------------------
WARNING: Discarding https://files.pythonhosted.org/packages/e5/df/c42e352e906fc73619cbc60bfafde2adeab7164dd86de172a25fbb8c7b04/ckip-classic-1.2.0.tar.gz#sha256=93e97c0a0c82d8a63fbcc7d909b46bc412be87225dc$
74886bff57068eaf7c11 (from https://pypi.org/simple/ckip-classic/) (requires-python:>=3.6). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
  Downloading ckip-classic-1.1.2.tar.gz (14 kB)
    ERROR: Command errored out with exit status 1:
     command: /shared_home/r08922129/anaconda3/envs/syntax/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-lz9bumwv/ckip-classic_f234e4803640443681e72f5ec9dfe94c/setu$
.py'"'"'; __file__='"'"'/tmp/pip-install-lz9bumwv/ckip-classic_f234e4803640443681e72f5ec9dfe94c/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'$
'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-1fxjahsn
         cwd: /tmp/pip-install-lz9bumwv/ckip-classic_f234e4803640443681e72f5ec9dfe94c/
    Complete output (9 lines):
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-lz9bumwv/ckip-classic_f234e4803640443681e72f5ec9dfe94c/setup.py", line 22, in <module>
        assert StrictVersion(setuptools.__version__) >= StrictVersion('40.0'), \
      File "/shared_home/r08922129/anaconda3/envs/syntax/lib/python3.7/distutils/version.py", line 40, in __init__
        self.parse(vstring)
      File "/shared_home/r08922129/anaconda3/envs/syntax/lib/python3.7/distutils/version.py", line 137, in parse
        raise ValueError("invalid version number '%s'" % vstring)
    ValueError: invalid version number '52.0.0.post20210125'

Another question: I have registered an account at http://parser.iis.sinica.edu.tw/v1/reg.exe, but I cannot find a guide in the documentation on how to use the online parsing system. The API shown in the documentation does not work either. Could anyone show how to use it? Thanks!
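
The ValueError in the log above comes from the version check in setup.py: StrictVersion accepts only versions of the form N.N or N.N.N with an optional a/b pre-release tag, so the setuptools version '52.0.0.post20210125' is rejected. The sketch below reproduces that rule with a regular expression that mirrors the StrictVersion pattern, without importing distutils, and shows a more tolerant comparison. The helper names are hypothetical and this is not the fix adopted by the project.

```python
import re

# Mirrors the pattern used by distutils.version.StrictVersion:
# major.minor[.patch] followed by an optional a<N> or b<N> pre-release tag.
STRICT_VERSION_RE = re.compile(r"^(\d+)\.(\d+)(\.(\d+))?([ab](\d+))?$", re.ASCII)

def is_strict_version(vstring):
    """True if StrictVersion would accept the string."""
    return STRICT_VERSION_RE.match(vstring) is not None

def release_tuple(vstring):
    """Leading numeric components only, ignoring suffixes such as '.post20210125'."""
    match = re.match(r"\d+(?:\.\d+)*", vstring)
    if match is None:
        raise ValueError(f"no numeric release in {vstring!r}")
    return tuple(int(part) for part in match.group().split("."))

if __name__ == "__main__":
    installed = "52.0.0.post20210125"
    print(is_strict_version(installed))                      # the strict check rejects it
    print(release_tuple(installed) >= release_tuple("40.0"))  # the tolerant check passes
```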
