skip to navigation
skip to content

videogrep 0.4.4

Python utility for creating video out of their subtitle files

Latest Version: 0.5.0


Videogrep searches through dialog in video files (using .srt subtitle tracks or pocketsphinx transcriptions) and makes supercuts based on what it finds.


Install with pip
pip install videogrep
Install [ffmpeg]( with Ogg/Vorbis support. If you're on a mac with homebrew you can install ffmpeg with:
brew install ffmpeg --with-libvpx --with-libvorbis

(OPTIONAL) Install pocketsphinx for word-level transcriptions. On a mac:
brew tap watsonbox/cmu-sphinx
brew install --HEAD watsonbox/cmu-sphinx/cmu-sphinxbase
brew install --HEAD watsonbox/cmu-sphinx/cmu-sphinxtrain # optional
brew install --HEAD watsonbox/cmu-sphinx/cmu-pocketsphinx

##How to use it
The most basic use:
videogrep --input path/to/video_or_folder --search 'search phrase'
You can put any regular expression in the search phrase.

You can also search for part-of-speech tags using Pattern. See the [Pattern-Search documentation]( for some details about how this works, and the [Penn Tree bank tag set]( for a list of usuable part-of-speech tags. For example the following will search for every line of dialog that contains an adjective (JJ) followed by a singular noun (NN):
videogrep --input path/to/video_or_folder --search 'JJ NN' --search-type pos
You can also do a [hypernym]( search - which essentially searches for words that fit into a specific category. The following, for example, will search for any line of dialog that references a liquid (like water, coffee, beer, etc.):
videogrep --input path/to/video_or_folder --search 'liquid' --search-type hyper

**NOTE: videogrep requires the subtitle track and the video file to have the exact same name, up to the extension.** For example, my_movie.mp4 and will work, my_movie.mp4 and will not work.


videogrep can take a number of options:

####--input / -i
Video or subtitle file, or folder containing multiple files

####--output / -o
Name of the file to generate. By default this is "supercut.mp4"

####--search / -s
Search term

####--search-type / -st
Type of search you want to perform. There are three options:
* re: [regular expression]( (this is the default).
* pos: part of speech search (uses []( For example 'JJ NN' would return all lines of dialog that contain an adjective followed by a noun.
* hyper: hypernym search. For example 'body parts' grabs all lines of dialog that reference a body part
* word: extract individual words - for multiple words use the '|' symbol (requires pocketsphinx).
* franken: create a "frankenstein" sentence (requires pocketsphinx)
* fragment: multiple words with allowed wildcards like 'blue \*' (requires pocketsphinx)

####--max-clips / -m
Maximum number of clips to use for the supercut

####--demo / -d
Show the search results without making the supercut

####--randomize / -r
Randomize the order of the clips

####--padding / -p
Padding in milliseconds to add to the start and end of each clip

####--transcribe / -tr
Transcribe the video using audiogrep/pocketsphinx. You must install pocketsphinx first!

####--use-transcript / -t
Use the pocketsphinx transcript rather than a subtitle file for searching. If this is enabled you can do
word-level searches.

* [All the instances of the phrase "time" in the movie "In Time"](
* [All the one to two second silences in "Total Recall"](
* [The President's former press secretary telling us what he can tell us](

###Use it as a module

from videogrep import videogrep

videogrep.videogrep('path/to/your/files','output_file_name.mp4', 'search_term', 'search_type')
The videogrep module accepts the same parameters as the command line script. To see the usage check out the source.  
File Type Py Version Uploaded on Size
videogrep-0.4.4-py2-none-any.whl (md5) Python Wheel 2.7 2015-06-30 17KB
videogrep-0.4.4.tar.gz (md5) Source 2015-06-30 14KB