GuessIt
=======

.. image:: http://img.shields.io/pypi/v/guessit.svg
    :target: https://pypi.python.org/pypi/guessit
    :alt: Latest Version

.. image:: http://img.shields.io/badge/license-LGPLv3-blue.svg
    :target: https://pypi.python.org/pypi/guessit
    :alt: LGPLv3 License

.. image:: http://img.shields.io/travis/guessit-io/guessit.svg
    :target: https://travis-ci.org/guessit-io/guessit
    :alt: Build Status

.. image:: http://img.shields.io/coveralls/guessit-io/guessit/master.svg
    :target: https://coveralls.io/github/guessit-io/guessit?branch=master
    :alt: Coveralls

.. image:: https://img.shields.io/badge/Hu-Board-7965cc.svg
    :target: https://huboard.com/guessit-io/guessit
    :alt: HuBoard


GuessIt is a Python library that extracts as much information as possible from a video filename.

Its matcher guesses the properties of a video from the filename alone.
It works with both movies and TV show episodes.

For example, GuessIt can do the following::

    $ guessit "Treme.1x03.Right.Place,.Wrong.Time.HDTV.XviD-NoTV.avi"
    For: Treme.1x03.Right.Place,.Wrong.Time.HDTV.XviD-NoTV.avi
    GuessIt found: {
        "title": "Treme",
        "season": 1,
        "episode": 3,
        "episode_title": "Right Place, Wrong Time",
        "source": "HDTV",
        "video_codec": "Xvid",
        "release_group": "NoTV",
        "container": "avi",
        "mimetype": "video/x-msvideo",
        "type": "episode"
    }

More information is available at `ReadTheDocs <http://guessit.readthedocs.org/>`_.
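
The guessed properties are plain key/value pairs, so they are easy to consume from a script. The sketch below is only an illustration: it assumes that the JSON output (``-j``) has the same shape as the object in the example output above, and it parses a copy of that sample rather than calling GuessIt. The ``label`` format is an arbitrary choice, not something GuessIt produces.

```python
import json

# Sample copied from the example output above. It is assumed (not verified
# here) that ``guessit -j`` emits an object with the same keys.
sample_output = """
{
    "title": "Treme",
    "season": 1,
    "episode": 3,
    "episode_title": "Right Place, Wrong Time",
    "source": "HDTV",
    "video_codec": "Xvid",
    "release_group": "NoTV",
    "container": "avi",
    "mimetype": "video/x-msvideo",
    "type": "episode"
}
"""

# Parse the JSON text into a regular dict.
guess = json.loads(sample_output)

# Build a display label from a few of the guessed properties.
# Extra keys in ``guess`` are simply ignored by str.format.
label = "{title} S{season:02d}E{episode:02d} - {episode_title}".format(**guess)
print(label)
```

Properties that GuessIt cannot find are absent from the result, so production code should use ``guess.get(...)`` rather than assume every key is present.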

Migration note
--------------
GuessIt 2 has been rewritten from scratch. GuessIt is now a release name parser only, and support for additional
features such as hash computation has been dropped.

To migrate from guessit ``0.x`` or ``1.x`` to ``guessit 2.x``, please read
`migration.rst <https://github.com/guessit-io/guessit/blob/master/docs/migration.rst>`_.

To migrate from guessit ``2.x`` to ``3.x``, please read `migration2to3.rst <https://github.com/guessit-io/guessit/blob/master/docs/migration2to3.rst>`_.
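
As a rough illustration of what the ``2.x`` to ``3.x`` migration involves for code that stores or compares guessed values, the sketch below renames a handful of properties and values listed in the 3.0.0 history entries later in this document (for example ``format`` became ``source`` and ``h264`` became ``H.264``). The ``upgrade_guess`` helper and both mapping tables are hypothetical and deliberately partial; they are not part of GuessIt, and the migration guide linked above remains the reference.

```python
# Hypothetical, partial mappings built from the 3.0.0 history entries.
RENAMED_PROPERTIES = {"format": "source"}

RENAMED_VALUES = {
    "source": {"WEB-DL": "Web", "PPV": "Pay-per-view", "DVB": "Digital TV"},
    "video_codec": {"h264": "H.264", "h265": "H.265", "XviD": "Xvid"},
    "audio_codec": {"AC3": "Dolby Digital", "EAC3": "Dolby Digital Plus"},
}


def upgrade_guess(old_guess):
    """Return a copy of a 2.x-style result dict using 3.x names where known."""
    new_guess = {}
    for key, value in old_guess.items():
        # Rename the property first, then look up value renames under the new name.
        new_key = RENAMED_PROPERTIES.get(key, key)
        value_map = RENAMED_VALUES.get(new_key, {})
        # Only plain strings are translated; lists, numbers, etc. pass through.
        if isinstance(value, str):
            value = value_map.get(value, value)
        new_guess[new_key] = value
    return new_guess


old = {"title": "Treme", "format": "WEB-DL", "video_codec": "h264", "season": 1}
print(upgrade_guess(old))
```

Some 3.0.0 changes split one value into several properties (``WEBRip`` became ``source``: ``Web`` plus ``other``: ``Rip``), which a one-to-one table like this cannot express.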

Install
-------

Installing GuessIt is simple with `pip <http://www.pip-installer.org/>`_::

    $ pip install guessit

You can also `install GuessIt from sources <https://github.com/guessit-io/guessit/blob/master/docs/sources.rst>`_.

Usage
-----

GuessIt can be used from the command line::

    $ guessit
    usage: guessit [-h] [-t TYPE] [-n] [-Y] [-D] [-L ALLOWED_LANGUAGES]
                   [-C ALLOWED_COUNTRIES] [-E] [-T EXPECTED_TITLE] [-G EXPECTED_GROUP]
                   [-f INPUT_FILE] [-v] [-P SHOW_PROPERTY] [-a] [-s] [-l] [-j] [-y]
                   [-c CONFIG] [--no-embedded-config] [-p] [-V] [--version]
                   [filename [filename ...]]

    positional arguments:
      filename              Filename or release name to guess

    optional arguments:
      -h, --help            show this help message and exit

    Naming:
      -t TYPE, --type TYPE  The suggested file type: movie, episode. If undefined,
                            type will be guessed.
      -n, --name-only       Parse files as name only, considering "/" and "\" like
                            other separators.
      -Y, --date-year-first
                            If short date is found, consider the first digits as
                            the year.
      -D, --date-day-first  If short date is found, consider the second digits as
                            the day.
      -L ALLOWED_LANGUAGES, --allowed-languages ALLOWED_LANGUAGES
                            Allowed language (can be used multiple times)
      -C ALLOWED_COUNTRIES, --allowed-countries ALLOWED_COUNTRIES
                            Allowed country (can be used multiple times)
      -E, --episode-prefer-number
                            Guess "serie.213.avi" as the episode 213. Without this
                            option, it will be guessed as season 2, episode 13
      -T EXPECTED_TITLE, --expected-title EXPECTED_TITLE
                            Expected title to parse (can be used multiple times)
      -G EXPECTED_GROUP, --expected-group EXPECTED_GROUP
                            Expected release group (can be used multiple times)

    Input:
      -f INPUT_FILE, --input-file INPUT_FILE
                            Read filenames from an input text file. File should
                            use UTF-8 charset.

    Output:
      -v, --verbose         Display debug output
      -P SHOW_PROPERTY, --show-property SHOW_PROPERTY
                            Display the value of a single property (title, series,
                            video_codec, year, ...)
      -a, --advanced        Display advanced information for filename guesses, as
                            json output
      -s, --single-value    Keep only first value found for each property
      -l, --enforce-list    Wrap each found value in a list even when property has
                            a single value
      -j, --json            Display information for filename guesses as json
                            output
      -y, --yaml            Display information for filename guesses as yaml
                            output

    Configuration:
      -c CONFIG, --config CONFIG
                            Filepath to the configuration file. Configuration
                            contains the same options as those command line
                            options, but option names have "-" characters replaced
                            with "_". If not defined, guessit tries to read a
                            default configuration file at
                            ~/.guessit/options.(json|yml|yaml) and
                            ~/.config/guessit/options.(json|yml|yaml). Set to
                            "false" to disable default configuration file loading.
      --no-embedded-config  Disable default configuration.

    Information:
      -p, --properties      Display properties that can be guessed.
      -V, --values          Display property values that can be guessed.
      --version             Display the guessit version.


It can also be used as a Python module::

    >>> from guessit import guessit
    >>> guessit('Treme.1x03.Right.Place,.Wrong.Time.HDTV.XviD-NoTV.avi')
    MatchesDict([('title', 'Treme'), ('season', 1), ('episode', 3), ('episode_title', 'Right Place, Wrong Time'), ('source', 'HDTV'), ('video_codec', 'Xvid'), ('release_group', 'NoTV'), ('container', 'avi'), ('mimetype', 'video/x-msvideo'), ('type', 'episode')])

``MatchesDict`` is a dict that preserves the order of the matches.

Command-line options can be passed as the second argument, either as a dict or as a string.
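
The ``--config`` help text above says that configuration keys are the command-line option names with ``-`` replaced by ``_``. The sketch below applies that rule to build an options mapping and serialize it as JSON, which could be saved as one of the default configuration files named in the help text. The ``option_key`` helper and the option values are hypothetical, and it is an assumption here that the dict accepted by ``guessit()`` uses the same key names.

```python
import json


def option_key(flag):
    """Turn a long flag such as ``--episode-prefer-number`` into a config key."""
    # Hypothetical helper: drop the leading dashes, then replace "-" with "_".
    return flag.lstrip("-").replace("-", "_")


# Long option names taken from the usage text above; the values are illustrative.
cli_options = {
    "--type": "episode",
    "--episode-prefer-number": True,
    "--expected-title": ["Treme"],
}

# Build the configuration mapping using the "-" -> "_" naming rule.
config = {option_key(flag): value for flag, value in cli_options.items()}

# Serialize as JSON, e.g. to store as ~/.guessit/options.json.
config_text = json.dumps(config, indent=2)
print(config_text)
```

Under the same assumption, the resulting ``config`` dict is the kind of value that could be passed as the second argument to ``guessit()``, and a string such as ``'--type episode'`` would be the string form of the same option.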

Docker
------

GuessIt is also available on `Docker Hub <https://hub.docker.com/r/guessit/guessit/>`_ as a Docker image::

    $ docker run -it guessit/guessit "Treme.1x03.Right.Place,.Wrong.Time.HDTV.XviD-NoTV.avi"
    For: Treme.1x03.Right.Place,.Wrong.Time.HDTV.XviD-NoTV.avi
    GuessIt found: {
        "title": "Treme",
        "season": 1,
        "episode": 3,
        "episode_title": "Right Place, Wrong Time",
        "source": "HDTV",
        "video_codec": "Xvid",
        "release_group": "NoTV",
        "container": "avi",
        "mimetype": "video/x-msvideo",
        "type": "episode"
    }

Support
-------

This project is hosted on `GitHub <https://github.com/guessit-io/guessit>`_. Feel free to open an issue if you think you
have found a bug or that something is missing from GuessIt.

GuessIt relies on the `Rebulk <https://github.com/Toilal/rebulk>`_ project for pattern and rule registration.

License
-------

GuessIt is licensed under the `LGPLv3 license <http://www.gnu.org/licenses/lgpl.html>`_.


History
=======

3.0.1 (2018-10-17)
------------------

- Removed `Extras` and `Bonus` values from `episode_details` property as those tags may also appear in movies
- Add `Scalable Video Coding`, `Advanced Video Codec High Definition` and `High Efficiency Video Coding` values to
`video_profile`
- Add support for Python 3.7
- Add `mk3d` value to `container`
- Better title cleanup containing acronyms (like `Marvel's Agents of S.H.I.E.L.D`)
- Fix issue with ES audio_profile breaking titles
- Fix crash for files ending with `Rip`


3.0.0 (2018-05-22)
------------------

- Renamed `format` property to `source`.
- `source` property `Cam` is now `Camera` or `HD Camera`
- `source` property `Telesync` is now `Telesync` or `HD Telesync`
- `source` property `PPV` is now `Pay-per-view`
- `source` property `DVB` is now `Digital TV`
- `source` property `VOD` is now `Video on Demand`
- `source` property `WEBRip` is now `Web` with additional property `other`: `Rip`
- `source` property `WEB-DL` is now `Web`
- `source` property `AHDTV` is now `Analog HDTV`
- `source` property `UHDTV` is now `Ultra HDTV`
- `source` property `HDTC` is now `HD Telecine`
- `screen_size` property `360i` was added.
- `screen_size` property `480i` was added.
- `screen_size` property `576i` was added.
- `screen_size` property `900i` was added.
- `screen_size` property `1440p` was added.
- `screen_size` property `4K` is now `2160p`
- `screen_size` property `4320p` was added.
- `video_codec` property `h264` is now `H.264`
- `video_codec` property `h265` is now `H.265`
- `video_codec` property `Mpeg2` is now `MPEG-2`
- `video_codec` property `Real` is now `RealVideo`
- `video_codec` property `XviD` is now `Xvid`
- `video_profile` property `BP` is now `Baseline`.
- `video_profile` property `HP` is now `High`.
- `video_profile` property `XP` is now `Extended`.
- `video_profile` property `MP` is now `Main`.
- `video_profile` property `Hi422P` is now `High 4:2:2`.
- `video_profile` property `Hi444PP` is now `High 4:4:4 Predictive`.
- `video_profile` property `High 10` was added.
- `video_profile` property `8bit` was removed. `8bit` is detected as `color_depth`: `8-bit`
- `video_profile` property `10bit` was removed. `10bit` is detected as `color_depth`: `10-bit`
- `audio_codec` property `DTS-HD` was added.
- `audio_codec` property `AC3` is now `Dolby Digital`
- `audio_codec` property `EAC3` is now `Dolby Digital Plus`
- `audio_codec` property `TrueHD` is now `Dolby TrueHD`
- `audio_codec` property `DolbyAtmos` is now `Dolby Atmos`.
- `audio_profile` property `HE` is now `High Efficiency`.
- `audio_profile` property `LC` is now `Low Complexity`.
- `audio_profile` property `HQ` is now `High Quality`.
- `audio_profile` property `HDMA` is now `Master Audio`.
- `edition` property `Collector Edition` is now `Collector`
- `edition` property `Special Edition` is now `Special`
- `edition` property `Criterion Edition` is now `Criterion`
- `edition` property `Deluxe Edition` is now `Deluxe`
- `edition` property `Limited Edition` is now `Limited`
- `edition` property `Theatrical Edition` is now `Theatrical`
- `edition` property `Director's Definitive Cut` was added.
- `episode_details` property `Oav` and `Ova` were removed. They are now `other: Original Animated Video`
- `episode_details` property `Omake` is now `Extras`
- `episode_details` property `Final` was added.
- `other` property `Rip` was added.
- `other` property `DDC` was removed. `DDC` is now `edition`: `Director's Definitive Cut`
- `other` property `CC` was removed. `CC` is now `edition`: `Criterion`
- `other` property `FINAL` was removed. `FINAL` is now `episode_details`: `Final`
- `other` property `Original Animated Video` was added.
- `other` property `OV` is now `Original Video`
- `other` property `AudioFix` is now `Audio Fixed`
- `other` property `SyncFix` is now `Sync Fixed`
- `other` property `DualAudio` is now `Dual Audio`
- `other` property `Fansub` is now `Fan Subtitled`
- `other` property `Fastsub` is now `Fast Subtitled`
- `other` property `FullHD` is now `Full HD`
- `other` property `UltraHD` is now `Ultra HD`
- `other` property `mHD` and `HDLight` are now `Micro HD`
- `other` property `HQ` is now `High Quality`
- `other` property `HR` is now `High Resolution`
- `other` property `LD` is now `Line Dubbed`
- `other` property `MD` is now `Mic Dubbed`
- `other` property `Low Definition` was added.
- `other` property `LiNE` is now `Line Audio`
- `other` property `R5` is now `Region 5`
- `other` property `Region C` was added.
- `other` property `ReEncoded` is now `Reencoded`
- `other` property `WideScreen` is now `Widescreen`
- Added `Ultra HD Blu-ray` as new `source` possible value.
- Added `Standard Dynamic Range` as new `other` possible value.
- Added `HDR10` as new `other` possible value.
- Added `Dolby Vision` as new `other` possible value.
- Added `BT.2020` as new `other` possible value.
- Added `12-bit` as new `color_depth` possible value.
- Added `IMAX` as new `edition` possible value.
- Added `Upscaled` as new `other` possible value.
- Added `High Frame Rate` as new `other` possible value.
- Added `Ultimate` as new `edition` possible value.
- Added `Fan` as new `edition` possible value.
- Added `High Resolution Audio` as new `audio_profile` possible value.
- Added `Extended Surround` as new `audio_profile` possible value.
- Added `EX` as new `audio_profile` possible value
- Added `Opus` as new `audio_codec` possible value
- Added `aspect_ratio` as new property. Also used to validate if a screen_size is a standard resolution.
- Fixed unwanted language and country detection for exotic languages.
- Added default and configurable list of allowed languages and countries
- Added `VC-1` as new `video_codec` possible value
- Enhanced dash-separated `release_group` detection.
- Changed `size` output to return `guessit.Quantity` object.
- Changed `size` output to return `guessit.Size` object.
- Added `audio_bit_rate` as new possible property.
- Added `video_bit_rate` as new possible property.
- Added `frame_rate` as new possible property.
- Added `disc` as a new possible property.
- Added `H.263` as new `video_codec` possible value.
- Added `VP7` as new `video_codec` possible value.
- Added `VP8` as new `video_codec` possible value.
- Added `VP9` as new `video_codec` possible value.
- Added `Vorbis` as new `audio_codec` possible value.
- Added `PCM` as new `audio_codec` possible value.
- Added `LPCM` as new `audio_codec` possible value.
- Added `Digital Master` as new `source` possible value.
- Added several new values for `streaming_service`.
- Added new options `--includes` and `--excludes`.
- Added `Sample` as new `other` possible value.
- Added `Obfuscated` as new `other` possible value.
- Added `Proof` as new `other` possible value.
- Added `Repost` as new `other` possible value.
- Added advanced guessit configuration to config files.
- Add support for `pathlib.Path` objects on guessit API input.

2.1.4 (2017-06-01)
------------------

- Fix broken match function when using `rebulk>=0.9.0`.

2.1.3 (2017-05-31)
------------------

- Add `nzb` as new `container` possible value
- Add `EAC3` as new `audio_codec` possible value
- Add `FullHD` as new `other` possible value
- Added python 3.6 support
- Dropped python 2.6 support
- Make `container` values consistent and always lowercase
- Fix `--type movie` being ignored for movies that start with numbers
- Fix invalid `language` detection due to the common words `audio`, `true` and `unknown`
- Fix `episode` type detection when series name contains `year` followed by SEE pattern

2.1.2 (2017-04-03)
------------------

- Many fixes, additions and improvements (thanks to @ratoaq2).

2.1.1 (2016-12-04)
------------------

- Add `~` to episode/season separators.
- Add `AHDTV`, `HDTC`, `SATRip` as new `format` possible values.
- Add `streaming_service` property.
- Add `DDP` pattern as `audio_codec`=`DolbyDigital`.
- Add `LDTV` as possible tag for `other`=`LD`.
- Add `StripSeparators` Post Processor to strip separators from all matches.
- Fix invalid guess `1 x 2` with `--type episode`.
- Fix `part` property.
- Fix `cd_count` issue with `x264-CD`.
- Fix `HDD` group detected as `DolbyDigital`.
- Fix invalid comparator in `audio_codec` conflict solver.
- Fix validation of `film` property.
- Fix `date` followed by `screen_size` invalid guess.
- Fix `episode` not detected when smaller filepart repeats the `season` and uses `SSEE` pattern.
- Enhance `season`/`episode` conflict solver to keep most specific value.
- Enhance `video_profile` detection.
- Enhance `episode`/`season` range and sequence guessing.
- Enhance performance with rebulk upgrade to `0.8.2`.
- Enhance `season`/`episode`.
- Enhance `other`=`Complete` guessing.
- Enhance `release_group` guessing.
- Enhance command line options parsing related to unicode.
- Ensure roman numbers are surrounded with separators to be guessed as numbers.

2.1.0 (2016-09-08)
------------------

- Drop support for `regex` native module.
- Remove dependency constraint on `python-dateutil`.
- Enhance language/country guessing in edge cases.
- Enhance rule to guess `release_group` in more file templates.
- Fix edge cases for subtitle language detection.
- Fix invalid conflict solving in `season`/`episode` occurring between `SssEee` and `ssXee` pattern.
- Fix issue when running guessit in non-interactive shell with python 2
- Guess Dolby keyword as DolbyDigital in `audio_codec`.
- Avoid `title` to be guessed as `website` (Dark.Net)
- Avoid `season`/`episode` to be guessed when pattern is included inside words.
- Enhance `screen_size` to detect `720pHD` and `1080pHD`
- Add support for `format` and `video_codec` when no separators between themselves. (HDTVx264, PDTVx264, ...)
- Add rebulk version in `--version` option.
- Upgrade rebulk to `0.7.3`.

2.0.5 (2016-04-10)
------------------

- Fix inconsistent properties returned by guessit -p.
- Add support for titles containing dots.
- Lock python-dateutil dependency to <2.5.2.

2.0.4 (2016-02-03)
------------------

- Add an Exception Report when an unexpected exception occurs.


2.0.3 (2016-01-30)
------------------

- Something went wrong with the 2.0.2 release ...


2.0.2 (2016-01-30)
------------------

- Fix possible issue with unicode characters encoding/decoding.
- Pypy is now supported.


2.0.1 (2016-01-28)
------------------

- Add support for any type of string with python 2 and python 3 (binary, str, unicode).


2.0.0 (2016-01-27)
------------------

- Final release.


2.0rc8 (2016-01-26)
-------------------

- Remove regex native module from required dependencies. It will now be used only if present.


2.0rc7 (2016-01-18)
-------------------

- Fix packaging issues on Python 2.7.


2.0rc6 (2016-01-18)
-------------------

- Fix packaging issues.


2.0rc5 (2016-01-18)
-------------------

- Guessit is now available as a docker container on Docker Hub (https://hub.docker.com/r/toilal/guessit).
- `country` 2-letter code is not added to `title` value anymore.
- All `container` values are now capitalized.
- `alternativeTitle` has been renamed to `alternative_title` and added to the docs.
- `mimetype` property is now in the docs.
- Add more excluded words for `language` property.
- Add more possible values for `other` property.
- Fix an issue occurring with `title` values starting with `Scr`.
- `film` property is now guessed only if less than `100` to avoid possible conflicts with `crc32`.


2.0rc4 (2015-12-03)
-------------------

- Add docs.
- Add exotic `screen_size` patterns support like `720hd` and `720p50`.
- Rename `audio_codec` value `true-HD` to `trueHD`.


2.0rc3 (2015-11-29)
-------------------

- Add ``__version__`` to main module.


2.0rc2 (2015-11-28)
-------------------

- Single digit episodes are now guessed for ``--type episode`` instead of ``--episode-prefer-number``.
- Fix separators that could cause some titles to be split with & and ;.
- Avoid possible ``NoneType`` error.


2.0rc1 (2015-11-27)
-------------------

- Fallback to default title guessing when ``expected-title`` is not found.


2.0b4 (2015-11-24)
------------------

- Add ``expected-group`` option.
- Add validation rule for single digit ``episode`` to avoid false positives.
- Add ``verbose`` option.
- Fix ``expected-title`` option.
- Better unicode support in ``expected-group``/``expected-title`` option.


2.0b3 (2015-11-15)
------------------

- Add support for ``part`` with no space before number.
- Avoid ``uuid`` and ``crc32`` collision with ``season``/``episode`` properties.
- Add better space support for ``season``/``episode`` properties.
- Ensure ``date`` property is found when conflicting with ``season``/``episode`` properties.
- Fix ``IndexError`` when input has a closing group character with no opening one before.
- Add ``--type`` option.
- Add rebulk implicit option support.

2.0b2 (2015-11-14)
------------------

- Add python 2.6 support.


2.0b1 (2015-11-11)
------------------

- Enhance title guessing.
- Upgrade rebulk to ``0.6.1``.
- Rename ``properCount`` to ``proper_count``
- Avoid crash when using ``-p``/``-V`` option with ``--yaml`` and ``yaml`` module is not available.

2.0a4 (2015-11-09)
------------------

- Add ``-p``/``-V`` options to display properties and values that can be guessed.


2.0a3 (2015-11-08)
------------------

- Allow rebulk customization in API module.

2.0a2 (2015-11-07)
------------------

- Raise TypeError instead of AssertionError when non text is given to guessit API.
- Fix packaging issues with previous release blocking installation.

2.0a1 (2015-11-07)
------------------

- Rewrite from scratch using Rebulk.
- Read MIGRATION.rst for migration guidelines.
