mecab-python3

This is a Python wrapper for the MeCab morphological analyzer for Japanese text. It works with Python 3.5 and later, as well as Python 2.7. (Note: Python 3.5 is not supported on macOS; see the project's issue tracker.)

Basic usage

>>> import MeCab
>>> wakati = MeCab.Tagger("-Owakati")
>>> wakati.parse("pythonが大好きです").split()
['python', 'が', '大好き', 'です']

>>> chasen = MeCab.Tagger("-Ochasen")
>>> print(chasen.parse("pythonが大好きです"))
python	python	python	名詞-固有名詞-組織
が	ガ	が	助詞-格助詞-一般
大好き	ダイスキ	大好き	名詞-形容動詞語幹
です	デス	です	助動詞	特殊・デス	基本形
EOS
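
The -Ochasen output above is plain text, so it usually has to be split into fields before further processing. The following is a minimal sketch, assuming each token line is tab-separated and the sentence ends with an `EOS` line; `parse_chasen` is a hypothetical helper name, not part of mecab-python3.

```python
def parse_chasen(output):
    """Split -Ochasen style output into one list of fields per token.

    Assumes tab-separated fields and an "EOS" line closing the sentence.
    """
    rows = []
    for line in output.splitlines():
        if line == "EOS":  # end-of-sentence marker, not a token
            break
        if line:  # skip blank lines
            rows.append(line.split("\t"))
    return rows


# Hand-written sample in the same shape as the output shown above.
sample = "python\tpython\tpython\t名詞-固有名詞-組織\nが\tガ\tが\t助詞-格助詞-一般\nEOS\n"
for fields in parse_chasen(sample):
    print(fields[0], fields[3])  # surface form and part of speech
```

With a real tagger, the string returned by `chasen.parse(...)` would be passed in place of `sample`.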

The API for mecab-python3 closely follows the API for MeCab itself, even when this makes it not very “Pythonic.” Please consult the MeCab documentation for more information.
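
As one illustration of the non-Pythonic style, MeCab returns parsed tokens as a linked list of nodes rather than a Python sequence. The sketch below wraps that list in a generator. It assumes the tagger exposes `parseToNode` and that each node has `surface`, `feature`, and `next` attributes, as in MeCab's own API; `iter_tokens` is a hypothetical helper name.

```python
def iter_tokens(tagger, text):
    """Yield (surface, feature) pairs by walking MeCab's linked list of nodes."""
    node = tagger.parseToNode(text)
    while node is not None:
        # The beginning- and end-of-sentence nodes have an empty surface,
        # so they are skipped here.
        if node.surface:
            yield node.surface, node.feature
        node = node.next
```

With the module installed, this would be called as `iter_tokens(MeCab.Tagger(), "pythonが大好きです")`.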

Installation

Binary wheels are available for macOS and Linux, and pip installs them by default:

pip install mecab-python3

These wheels include an internal (statically linked) copy of the MeCab library, and a copy of the mecab-ipadic dictionary (using UTF-8 text encoding), which is automatically used by default. If you wish to use a different dictionary, you will need to install it yourself, write a mecabrc file directing MeCab to use it, and set the environment variable MECABRC to point to this file.
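
The mecabrc step can also be scripted. The following is a rough sketch, assuming a mecabrc file only needs a `dicdir` entry naming the dictionary directory; `use_dictionary` and the example path are hypothetical, and the function should be called before any `MeCab.Tagger` is created.

```python
import os
import tempfile


def use_dictionary(dicdir):
    """Write a minimal mecabrc pointing at dicdir and export MECABRC."""
    fd, rc_path = tempfile.mkstemp(prefix="mecabrc-")
    with os.fdopen(fd, "w") as rc:
        rc.write("dicdir = {}\n".format(dicdir))
    # MeCab is expected to read this variable when a Tagger is constructed.
    os.environ["MECABRC"] = rc_path
    return rc_path


# Hypothetical dictionary location; replace with the directory you installed.
rc_file = use_dictionary("/path/to/your/dictionary")
```

A temporary file keeps the sketch self-contained; in practice a mecabrc stored at a fixed, documented location is easier to maintain.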

To build from source using pip:

pip install --no-binary :all: mecab-python3

Alternatively, you can use pip to download the source, then build it by hand:

pip download --no-binary :all: mecab-python3
tar zxf mecab-python3-{version}.tar.gz
cd mecab-python3-{version}
python3 setup.py build
# install as you like

When the module is built from source, it requires the system to provide the MeCab library and at least one dictionary. You must have SWIG, the MeCab library and headers, and a dictionary installed before running pip install or setup.py build. For instance, on Debian-based Linux:

sudo apt-get install swig libmecab-dev mecab-ipadic-utf8
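
Before starting a source build, it can help to confirm that the build tools are on the PATH. The sketch below looks for `swig` and for `mecab-config`, the helper script that normally ships with MeCab's development files. Whether the build relies on `mecab-config` is an assumption here, `missing_build_tools` is a hypothetical helper name, and the check does not verify that a dictionary is installed.

```python
import shutil


def missing_build_tools(tools=("swig", "mecab-config"), which=shutil.which):
    """Return the names of the tools that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


missing = missing_build_tools()
if missing:
    print("Missing build prerequisites:", ", ".join(missing))
else:
    print("swig and mecab-config were found on the PATH")
```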

Building wheels with a bundled library and dictionary is only supported in a sanitized CI environment. Consult the scripts in the scripts subdirectory of the source tree to see how it’s done.

Licensing

Like MeCab itself, mecab-python3 is copyrighted free software by Taku Kudo and Nippon Telegraph and Telephone Corporation, and is distributed under a 3-clause BSD license (see the file BSD). Alternatively, it may be redistributed under the terms of the GNU General Public License, version 2 (see the file GPL) or the GNU Lesser General Public License, version 2.1 (see the file LGPL).

