Skip to main content

Encoder for unicode to RTF 1.5 command sequences

Project description

This module implements an encoder for unicode to RTF 1.5 command sequences; it can be used to generate valid RTF output with international characters.

Importing this module adds a new rtfunicode codec, allowing you to encode unicode strings to valid RTF bytecode sequences:

>>> import rtfunicode
>>> u'RTF and unicode mix just fine! \u263A'.encode('rtfunicode')
'RTF and unicode mix just fine! \\u9786?'

The RTF command code for a unicode character is uN?, where N is a signed 16-bit integer and the ? is a placeholder character for older RTF readers. This module sets the latter to the ‘?’ literal for simlicity’s sake.

Limitations

This encoder does not support officially unicode outside the BMP (codepoints u0000-uffff). If you do pass in codepoints beyond this point, behaviour depends on wether or not you have a python interpreter compiled with UCS-2 or UCS-4; the former will translate such codepoints using a surragate pair in platform-dependent ordering, but with UCS-4, encoding using ‘strict’ will raise a UnicodeEncodeError exception.

In any case, the RTF standard does not specify how codepoints outside the BMP should be handled.

Requirements

  • Python 2.4 or newer (including python 3.x)

Development

The project code is hosted on GitHub, feel free to report issues, fork the code and issue pull requests.

License

BSD (simplified), see: LICENSE.txt

Author

Martijn Pieters <mj@zopatista.com>

1.0 (2012-06-06)

  • Initial release.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

rtfunicode-1.0.zip (7.6 kB view hashes)

Uploaded Source

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page