skip to navigation
skip to content

chardet 3.0.4

Universal encoding detector for Python 2 and 3

Chardet: The Universal Character Encoding Detector

  • ASCII, UTF-8, UTF-16 (2 variants), UTF-32 (4 variants)
  • Big5, GB2312, EUC-TW, HZ-GB-2312, ISO-2022-CN (Traditional and Simplified Chinese)
  • EUC-JP, SHIFT_JIS, CP932, ISO-2022-JP (Japanese)
  • EUC-KR, ISO-2022-KR (Korean)
  • KOI8-R, MacCyrillic, IBM855, IBM866, ISO-8859-5, windows-1251 (Cyrillic)
  • ISO-8859-5, windows-1251 (Bulgarian)
  • ISO-8859-1, windows-1252 (Western European languages)
  • ISO-8859-7, windows-1253 (Greek)
  • ISO-8859-8, windows-1255 (Visual and Logical Hebrew)
  • TIS-620 (Thai)


Our ISO-8859-2 and windows-1250 (Hungarian) probers have been temporarily disabled until we can retrain the models.

Requires Python 2.6, 2.7, or 3.3+.


Install from PyPI:

pip install chardet


For users, docs are now available at

Command-line Tool

chardet comes with a command-line script which reports on the encodings of one or more files:

% chardetect somefile someotherfile
somefile: windows-1252 with confidence 0.5
someotherfile: ascii with confidence 1.0


This is a continuation of Mark Pilgrim’s excellent chardet. Previously, two versions needed to be maintained: one that supported python 2.x and one that supported python 3.x. We’ve recently merged with Ian Cordasco’s charade fork, so now we have one coherent version that works for Python 2.6+.

maintainer:Dan Blanchard
File Type Py Version Uploaded on Size
chardet-3.0.4-py2.py3-none-any.whl (md5) Python Wheel py2.py3 2017-06-08 130KB
chardet-3.0.4.tar.gz (md5) Source 2017-06-08 1MB