rakutenma 0.3.3

morphological analyzer (word segmentor + PoS Tagger) for Chinese and Japanese

Rakuten MA Python

Rakuten MA Python (morphological analyzer) is a Python version of Rakuten MA (word segmentor + PoS Tagger) for Chinese and Japanese.

For details about Rakuten MA, See

See also (In Japanese)

Contributions are welcome!


pip install rakutenma


from rakutenma import RakutenMA

# Initialize a RakutenMA instance with an empty model
# the default ja feature set is set already
rma = RakutenMA()

# Let's analyze a sample sentence (from
# With a disastrous result, since the model is empty!

# Feed the model with ten sample sentences from
# "tatoeba.json" is available at
import json
with open("tatoeba.json") as f:
    tatoeba = json.load(f)
for i in tatoeba:
    rma.train_one(i)

# Now what does the result look like?

# Initialize a RakutenMA instance with a pre-trained model
rma = RakutenMA(phi=1024, c=0.007812)  # Specify hyperparameters for SCW (for demonstration purposes)
rma.load("model_ja.json")  # bundled model file; adjust the path to where it is installed

# Set the feature hash function (15bit)
rma.hash_func = rma.create_hash_func(15)

# Tokenize one sample sentence

# Re-train the model feeding the right answer (pairs of [token, PoS tag]);
# `answer` below is a placeholder name for that list of pairs
res = rma.train_one(answer)
# The result of train_one contains:
#   sys: the system output (using the current model)
#   ans: answer fed by the user
#   update: whether the model was updated

# Now what does the result look like?


Added API

Compared with the original Rakuten MA, the following methods have been added:

  • RakutenMA::load(model_path) - Load model from JSON file
  • RakutenMA::save(model_path) - Save model to path
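
Since the model file is JSON, the round trip that these two methods provide can be pictured with the standard json module. This is only a sketch of the idea, assuming save writes the same JSON format that load reads; the dictionary is a stand-in rather than a real trained model, and save_model and load_model are hypothetical helpers, not rakutenma functions.

```python
import json
import os
import tempfile


def save_model(model, model_path):
    """Write a model dictionary to model_path as JSON (hypothetical helper)."""
    with open(model_path, "w") as f:
        json.dump(model, f)


def load_model(model_path):
    """Read a model dictionary back from a JSON file (hypothetical helper)."""
    with open(model_path) as f:
        return json.load(f)


# Stand-in data; the real model layout is not described in this document.
toy_model = {"weights": {"feature-a": 0.5, "feature-b": -1.25}}
path = os.path.join(tempfile.mkdtemp(), "toy_model.json")
save_model(toy_model, path)
restored = load_model(path)
```

With the library itself, the corresponding calls would be rma.save(model_path) followed later by rma.load(model_path) on a RakutenMA instance.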


By default, the following values are set:

  • rma.featset = CTYPE_JA_PATTERNS # RakutenMA.default_featset_ja
  • rma.hash_func = rma.create_hash_func(15)
  • rma.tag_scheme = "SBIEO" # for Chinese, set "IOB2"
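
To make the tag_scheme setting concrete, here is a minimal sketch of how a segmented sentence maps to per-character labels under the two schemes. It assumes the usual reading of the letters (S = single-character token, B = beginning, I = inside, E = end) and that each label is joined to the PoS tag with a hyphen. The helper tokens_to_char_tags is hypothetical and not part of the rakutenma API, and the tokens and tags are made-up placeholders.

```python
def tokens_to_char_tags(tokens, scheme="SBIEO"):
    """Map [surface, pos] pairs to one (char, label) pair per character.

    Illustrative only: this helper is hypothetical, not a rakutenma function.
    """
    if scheme not in ("SBIEO", "IOB2"):
        raise ValueError("unknown tag scheme: %r" % (scheme,))
    tagged = []
    for surface, pos in tokens:
        last = len(surface) - 1
        for i, ch in enumerate(surface):
            if scheme == "IOB2":
                # IOB2 only marks the first character of each token.
                prefix = "B" if i == 0 else "I"
            elif last == 0:
                prefix = "S"  # single-character token
            elif i == 0:
                prefix = "B"  # first character
            elif i == last:
                prefix = "E"  # last character
            else:
                prefix = "I"  # any character in between
            tagged.append((ch, prefix + "-" + pos))
    return tagged


# Placeholder tokens and PoS tags, chosen only to exercise every label.
sample = [["abc", "N"], ["d", "P"], ["ef", "V"]]
sbieo = tokens_to_char_tags(sample)
iob2 = tokens_to_char_tags(sample, scheme="IOB2")
```

Under SBIEO the token boundaries are marked at both ends, while IOB2 marks only where each token starts.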


Apache License version 2.0


0.3.3 (2017-05-22)

  • Fix a bug in training

0.3.2 (2017-02-01)

  • Use ujson when possible
  • Allow PoS tags to be converted to MeCab style
  • Support Python 3.5 and 3.6

0.3 (2016-04-10)

  • Add a command-line interface ($ rakutenma)

0.2.2 (2016-04-09)

  • Bundle model files (model_ja.json, model_ja_min.json)
  • Support Windows

0.2 (2015-01-10)

  • Support Python 2.6 and 2.7

0.1.1 (2015-01-08)

  • Slightly improve performance

0.1 (2015-01-01)

  • First release.

File                    Type    Py Version  Uploaded on  Size
rakutenma-0.3.3.tar.gz  Source              2017-05-22   23MB