chardet 2.2.1

Universal encoding detector for Python 2 and 3

Latest Version: 2.3.0

Chardet: The Universal Character Encoding Detector

Detects
  • ASCII, UTF-8, UTF-16 (2 variants), UTF-32 (4 variants)
  • Big5, GB2312, EUC-TW, HZ-GB-2312, ISO-2022-CN (Traditional and Simplified Chinese)
  • EUC-JP, SHIFT_JIS, ISO-2022-JP (Japanese)
  • EUC-KR, ISO-2022-KR (Korean)
  • KOI8-R, MacCyrillic, IBM855, IBM866, ISO-8859-5, windows-1251 (Cyrillic)
  • ISO-8859-2, windows-1250 (Hungarian)
  • ISO-8859-5, windows-1251 (Bulgarian)
  • windows-1252 (English)
  • ISO-8859-7, windows-1253 (Greek)
  • ISO-8859-8, windows-1255 (Visual and Logical Hebrew)
  • TIS-620 (Thai)
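
The UTF-16 and UTF-32 variants listed above differ in byte order, which a byte order mark (BOM) at the start of the data can reveal. The following is a minimal sketch of BOM sniffing using only the standard library `codecs` constants. It is an illustration, not chardet's implementation: `sniff_bom` is a hypothetical helper, and it covers only the two common byte orders of UTF-32.

```python
import codecs

# Check longer BOMs first: the UTF-32-LE BOM begins with the UTF-16-LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF8, "UTF-8-SIG"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]


def sniff_bom(data):
    """Return an encoding name if data starts with a known BOM, else None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None
```

Data without a BOM yields `None`, which is the case where statistical detection of the kind chardet performs becomes necessary.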

Requires Python 2.6 or later

Command-line Tool

chardet comes with a command-line script which reports on the encodings of one or more files:

% chardetect somefile someotherfile
somefile: windows-1252 with confidence 0.5
someotherfile: ascii with confidence 1.0
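
The same kind of report can be produced from Python code. The sketch below assumes chardet is installed and relies on `chardet.detect`, which takes a byte string and returns a dict with `encoding` and `confidence` keys. The helper names `describe` and `main` are hypothetical, and the output format merely imitates the command-line output shown above.

```python
def describe(name, data, detect):
    """Format one report line in the style of the chardetect output.

    `detect` is any callable that takes bytes and returns a dict with
    'encoding' and 'confidence' keys, such as chardet.detect.
    """
    result = detect(data)
    return "%s: %s with confidence %s" % (
        name, result["encoding"], result["confidence"])


def main(paths):
    """Print one report line per file path."""
    import chardet  # third-party package; assumed to be installed

    for path in paths:
        # Read raw bytes: detection must see the undecoded data.
        with open(path, "rb") as handle:
            print(describe(path, handle.read(), chardet.detect))
```

Calling `main(sys.argv[1:])` from a script would report on each file named on the command line. Passing the detector in as an argument keeps `describe` usable with any function that returns the same dict shape.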

About

This is a continuation of Mark Pilgrim’s excellent chardet. Previously, two versions had to be maintained: one that supported Python 2.x and one that supported Python 3.x. We’ve recently merged with Ian Cordasco’s charade fork, so there is now one coherent version that works with Python 2.6+.

maintainer: Dan Blanchard
maintainer: Ian Cordasco
 
File                                      Type          Py Version  Uploaded on  Size
chardet-2.2.1-py2.py3-none-any.whl (md5)  Python Wheel  2.7         2013-12-18   176KB
chardet-2.2.1.tar.gz (md5)                Source                    2013-12-18   176KB
Downloads (All Versions):
  • 4041 downloads in the last day
  • 19601 downloads in the last week
  • 75168 downloads in the last month