Fast MUTF-8 encoder & decoder

mutf-8

This package contains simple pure-Python as well as C encoders and decoders for the MUTF-8 character encoding. In most cases, it can also parse the even rarer CESU-8 encoding.

These days, you'll most likely encounter MUTF-8 when working on files or protocols related to the JVM. Strings in a Java .class file are encoded using MUTF-8, as are strings passed through the JNI and strings exported by the object serializer.
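To make the difference from standard UTF-8 concrete, here is a minimal illustrative sketch of MUTF-8 encoding. It is not this package's implementation, and the names `sketch_encode_mutf8` and `_three_bytes` are made up for the example. It shows the two places where MUTF-8 departs from UTF-8: U+0000 becomes the two bytes C0 80, and code points above U+FFFF are written as a UTF-16 surrogate pair with each half encoded in three bytes. CESU-8 shares the surrogate-pair rule but keeps U+0000 as a single zero byte.

```python
def _three_bytes(unit: int) -> bytes:
    """Encode one 16-bit value using the three-byte UTF-8 pattern."""
    return bytes((
        0xE0 | (unit >> 12),
        0x80 | ((unit >> 6) & 0x3F),
        0x80 | (unit & 0x3F),
    ))


def sketch_encode_mutf8(text: str) -> bytes:
    """Illustrative MUTF-8 encoder (hypothetical helper, not the library's code)."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp == 0:
            # MUTF-8 never emits a raw zero byte.
            out += b"\xC0\x80"
        elif cp < 0x80:
            out.append(cp)
        elif cp < 0x800:
            out.append(0xC0 | (cp >> 6))
            out.append(0x80 | (cp & 0x3F))
        elif cp < 0x10000:
            out += _three_bytes(cp)
        else:
            # Supplementary characters: split into a surrogate pair first.
            cp -= 0x10000
            out += _three_bytes(0xD800 | (cp >> 10))
            out += _three_bytes(0xDC00 | (cp & 0x3FF))
    return bytes(out)


print(sketch_encode_mutf8("\x00").hex())        # c080
print(sketch_encode_mutf8("\U00020000").hex())  # eda180edb080
```

For text that contains neither U+0000 nor supplementary characters, the output is identical to ordinary UTF-8.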

This library was extracted from Lawu, a Python library for working with JVM class files.

🎉 Installation

Install the package from PyPI:

pip install mutf8

Binary wheels are available for the following:

                    py3.6  py3.7  py3.8  py3.9
OS X (x86_64)       y      y      y      y
Windows (x86_64)    y      y      y      y
Linux (x86_64)      y      y      y      y

If no binary wheel is available for your platform, the installer will attempt to build the C extension from source with any C99 compiler. If the build fails, the package falls back to the pure-Python version.

Usage

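Encoding and decoding are simple:

from mutf8 import encode_modified_utf8, decode_modified_utf8

# byte_like_object is any bytes-like value holding MUTF-8 data.
# Avoid naming results `bytes`, which would shadow the builtin.
text = decode_modified_utf8(byte_like_object)
data = encode_modified_utf8(text)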

This module does not register itself globally as a codec, since importing should be side-effect-free.
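If you do want `str.encode` / `bytes.decode` integration in your own application, you can register a codec yourself with the standard `codecs` module. The following is a sketch under assumptions, not something this package provides: `register_codec` and the codec name `"mutf8"` are hypothetical, and the helper takes the encode and decode callables as arguments so it does not depend on how they are imported.

```python
import codecs


def register_codec(encode, decode, name="mutf8"):
    """Register `encode`/`decode` under `name` (hypothetical helper).

    `encode` maps str -> bytes and `decode` maps bytes -> str, for example
    encode_modified_utf8 and decode_modified_utf8 from this package.
    """
    def _encode(text, errors="strict"):
        # The codec API expects (output, number of input items consumed).
        return encode(text), len(text)

    def _decode(data, errors="strict"):
        data = bytes(data)
        return decode(data), len(data)

    info = codecs.CodecInfo(encode=_encode, decode=_decode, name=name)

    def _search(requested):
        return info if requested == name else None

    codecs.register(_search)
    return _search


# Assumed usage with this package installed:
#   from mutf8 import encode_modified_utf8, decode_modified_utf8
#   register_codec(encode_modified_utf8, decode_modified_utf8)
#   "text".encode("mutf8")
```

Registration is process-wide, so it is best done once at application start-up rather than inside a library. This sketch ignores the `errors` argument.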

📈 Benchmarks

The C extension is significantly faster than the pure-Python version, often by 20x to 40x.

MUTF-8 Decoding

Name                          Min (μs)  Max (μs)  StdDev   Ops
cmutf8-decode_modified_utf8   0.00009   0.00080   0.00000  9957678.56358
pymutf8-decode_modified_utf8  0.00190   0.06040   0.00000  450455.96019

MUTF-8 Encoding

Name                          Min (μs)  Max (μs)  StdDev   Ops
cmutf8-encode_modified_utf8   0.00008   0.00151   0.00000  11897361.05101
pymutf8-encode_modified_utf8  0.00180   0.16650   0.00000  474390.98091
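As a rough check on the speedup figure, the ratio of the Ops columns in the two tables can be computed directly. The snippet only reuses the numbers published above; the dictionary layout and the `speedup` helper are just for illustration.

```python
# Ops values copied from the benchmark tables above.
OPS = {
    "decode": {"c": 9957678.56358, "py": 450455.96019},
    "encode": {"c": 11897361.05101, "py": 474390.98091},
}


def speedup(kind: str) -> float:
    """Ratio of C-extension throughput to pure-Python throughput."""
    return OPS[kind]["c"] / OPS[kind]["py"]


for kind in OPS:
    print(f"{kind}: {speedup(kind):.1f}x")
```

For these particular runs the ratio comes out at roughly 22x for decoding and 25x for encoding, the lower end of the quoted range. Results will vary with input size and hardware.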

C Extension

The C extension is optional. If a binary package is not available, or a C compiler is not present, the pure-Python version will be used instead. If you want to ensure you're using the C version, import it directly:

from mutf8.cmutf8 import decode_modified_utf8

# These six bytes are a surrogate pair that decodes to U+20000.
decode_modified_utf8(b'\xED\xA1\x80\xED\xB0\x80')
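If you would rather detect the C version than fail with an `ImportError`, one option is to probe for the `mutf8.cmutf8` module at start-up. This is a small sketch, and `c_extension_available` is a hypothetical helper, not part of the package.

```python
def c_extension_available() -> bool:
    """Return True if the compiled mutf8.cmutf8 module can be imported."""
    try:
        import mutf8.cmutf8  # noqa: F401
    except ImportError:
        # Raised when the extension was not built or mutf8 is not installed.
        return False
    return True


if c_extension_available():
    print("using the C extension")
else:
    print("C extension unavailable (pure-Python fallback, or mutf8 not installed)")
```

This is useful in test suites or deployment checks where silently falling back to the slower implementation would be a problem.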

