GuessIt - a library for guessing information from video files.
Project description
GuessIt
GuessIt is a python library that tries to extract as much information as possible from a video file.
It has a very powerful filename matcher that allows to guess a lot of metadata from a video using only its filename. This matcher works with both movies and tv shows episodes.
For example, GuessIt can do the following:
$ guessit "Treme.1x03.Right.Place,.Wrong.Time.HDTV.XviD-NoTV.avi" For: Treme.1x03.Right.Place,.Wrong.Time.HDTV.XviD-NoTV.avi GuessIt found: { [1.00] "mimetype": "video/x-msvideo", [0.80] "episodeNumber": 3, [0.80] "videoCodec": "XviD", [1.00] "container": "avi", [1.00] "format": "HDTV", [0.70] "series": "Treme", [0.50] "title": "Right Place, Wrong Time", [0.80] "releaseGroup": "NoTV", [0.80] "season": 1, [1.00] "type": "episode" }
Features
At the moment, the filename matcher is able to recognize the following property types:
[ title, # for movies and episodes series, season, # for episodes only episodeNumber, episodeDetails, # for episodes only date, year, # 'date' instance of datetime.date language, subtitleLanguage, # instances of babelfish.Language country, # instances of babelfish.Country fileSize, duration, # when detecting video file metadata container, format, videoCodec, audioCodec, videoProfile, audioProfile, audioChannels, screenSize, releaseGroup, website, cdNumber, cdNumberTotal, filmNumber, filmSeries, bonusNumber, edition, idNumber, # tries to identify a hash or a serial number other ]
GuessIt also allows you to compute a whole lof of hashes from a file, namely all the ones you can find in the hashlib python module (md5, sha1, …), but also the Media Player Classic hash that is used (amongst others) by OpenSubtitles and SMPlayer, as well as the ed2k hash.
If you have the ‘guess-language’ python package installed, GuessIt can also analyze a subtitle file’s contents and detect which language it is written in.
If you have the ‘enzyme’ python package installed, GuessIt can also detect the properties from the actual video file metadata.
Install
Installing GuessIt is simple with pip:
$ pip install guessit
or, with easy_install:
$ easy_install guessit
But, you really shouldn’t do that.
Support
The project website for GuessIt is hosted at ReadTheDocs. There you will also find the User guide and Developer documentation.
This project is hosted on GitHub: https://github.com/wackou/guessit
Please report issues and/or feature requests via the bug tracker.
You can also report issues using the command-line tool:
$ guessit --bug "filename.that.fails.avi"
Contribute
GuessIt is under active development, and contributions are more than welcome!
Check for open issues or open a fresh issue to start a discussion around a feature idea or a bug. There is a Contributor Friendly tag for issues that should be ideal for people who are not very familiar with the codebase yet.
Fork the repository on Github to start making your changes to the master branch (or branch off of it).
Write a test which shows that the bug was fixed or that the feature works as expected.
Send a pull request and bug the maintainer until it gets merged and published. :)
License
GuessIt is licensed under the LGPLv3 license.
History
0.8 (2014-06-06)
New webservice that allows to use GuessIt just by sending a POST request to the http://guessit.io/guess url
Command-line util can now report bugs to the http://guessit.io/bugs service by specifying the -b or --bug flag
GuessIt can now use the Enzyme python package to detect metadata out of the actual video file metadata instead of the filename
Finished transition to babelfish.Language and babelfish.Country
New property: duration which returns the duration of the video in seconds This requires the Enzyme package to work
New property: fileSize which returns the size of the file in bytes
Renamed property special to episodeDetails
Added support for Python 3.4
Optimization and bugfixes
0.7.1 (2014-03-03)
New property “special”: values can be trailer, pilot, unaired
New options for the guessit cmdline util: -y, --yaml outputs the result in yaml format and -n, --name-only analyzes the input as simple text (instead of filename)
Added properties formatters and validators
Removed support for python 3.2
A healthy amount of code cleanup/refactoring and fixes :)
0.7 (2014-01-29)
New plugin API that allows to register custom patterns / transformers
Uses Babelfish for language and country detection
Added Quality API to rate file quality from guessed property values
Better and more accurate overall detection
Added roman and word numeral detection
Added ‘videoProfile’ and ‘audioProfile’ property
Moved boolean properties to ‘other’ property value (‘is3D’ became ‘other’ = ‘3D’)
Added more possible values for various properties.
Added command line option to list available properties and values
Fixes for Python3 support
0.6.2 (2013-11-08)
Added support for nfo files
GuessIt can now output advanced information as json (‘-a’ on the command line)
Better language detection
Added new property: ‘is3D’
0.6.1 (2013-09-18)
New property “idNumber” that tries to identify a hash value or a serial number
The usual bugfixes
0.6 (2013-07-16)
Better packaging: unittests and doc included in source tarball
Fixes everywhere: unicode, release group detection, language detection, …
A few speed optimizations
0.5.4 (2013-02-11)
guessit can be installed as a system wide script (thanks @dplarson)
Enhanced logging facilities
Fixes for episode number and country detection
0.5.3 (2012-11-01)
GuessIt can now optionally act as a wrapper around the ‘guess-language’ python module, and thus provide detection of the natural language in which a body of text is written
Lots of fixes everywhere, mostly for properties and release group detection
0.5.2 (2012-10-02)
Much improved auto-detection of filetype
Fixed some issues with the detection of release groups
0.5.1 (2012-09-23)
now detects ‘country’ property; also detect ‘year’ property for series
more patterns and bugfixes
0.5 (2012-07-29)
Python3 compatibility
the usual assortment of bugfixes
0.4.2 (2012-05-19)
added Language.tmdb language code property for TheMovieDB
added ability to recognize list of episodes
bugfixes for Language.__nonzero__ and episode regexps
0.4.1 (2012-05-12)
bugfixes for unicode, paths on Windows, autodetection, and language issues
0.4 (2012-04-28)
much improved language detection, now also detect language variants
supports more video filetypes (thanks to Rob McMullen)
0.3.1 (2012-03-15)
fixed package installation from PyPI
better imports for the transformations (thanks Diaoul!)
some small language fixes
0.3 (2012-03-12)
fix to recognize 1080p format (thanks to Jonathan Lauwers)
0.3b2 (2012-03-02)
fixed the package installation
0.3b1 (2012-03-01)
refactored quite a bit, code is much cleaner now
fixed quite a few tests
re-vamped the documentation, wrote some more
0.2 (2011-05-27)
new parser/matcher completely replaced the old one
quite a few more unittests and fixes
0.2b1 (2011-05-20)
brand new parser/matcher that is much more flexible and powerful
lots of cleaning and a bunch of unittests
0.1 (2011-05-10)
fixed a few minor issues & heuristics
0.1b2 (2011-03-12)
Added PyPI trove classifiers
fixed version number in setup.py
0.1b1 (2011-03-12)
first pre-release version; imported from Smewt with a few enhancements already in there.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.