sumy 0.1.0

Module for automatic summarization of text documents and HTML pages.


Make sure you have Python 2.6+/3.2+ and pip (Windows, Linux) installed. The preferred way to install sumy is:

$ [sudo] pip install sumy

Or, for the latest development version:

$ [sudo] pip install git+git://

Or, if you have to, install from a downloaded source archive:

$ wget # download the sources
$ unzip # extract the downloaded file
$ cd sumy-master/
$ [sudo] python setup.py install # install the package


Sumy contains a command-line utility for quick summarization of documents.

$ sumy lex-rank --length=10 --url= # what's summarization?
$ sumy luhn --language=czech --url=
$ sumy edmundson --language=czech --length=3% --url=
$ sumy --help # for more info
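
The commands above pass --length either as a plain number (10) or as a percentage (3%). The following is a minimal sketch of how such a value could be turned into a sentence count. The helper name resolve_length and its rounding rule are assumptions made for illustration; this is not sumy's own code.

```python
import math


def resolve_length(value, sentence_count):
    """Turn a length such as "10" or "3%" into a number of sentences.

    Hypothetical helper for illustration only; not part of sumy.
    """
    text = str(value).strip()
    if text.endswith("%"):
        percent = float(text[:-1])
        # Assumption: round up so a small percentage still yields a sentence.
        count = int(math.ceil(sentence_count * percent / 100.0))
        return min(max(1, count), sentence_count)
    # A plain number is an absolute count, capped at the document size.
    return min(int(text), sentence_count)


print(resolve_length("10", 200))  # 10
print(resolve_length("3%", 200))  # 6
```

Under these assumptions, a 200-sentence document summarized with --length=3% would keep 6 sentences.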

To evaluate a summarization method against a reference summary, run the commands below:

$ sumy_eval lex-rank reference_summary.txt --url=
$ sumy_eval lsa reference_summary.txt --language=czech --url=
$ sumy_eval edmundson reference_summary.txt --language=czech --url=
$ sumy_eval --help # for more info
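
To give an idea of what comparing a summary with reference_summary.txt can mean, here is a minimal sketch of one common measure: sentence-level precision, recall and F-score. The function evaluate_summary is a hypothetical helper written for this example. It is not sumy's evaluation code, and the measures sumy_eval actually reports may differ.

```python
def evaluate_summary(evaluated_sentences, reference_sentences):
    """Return (precision, recall, f_score) for sentence-level overlap.

    Hypothetical helper for illustration only; not sumy's evaluation code.
    """
    evaluated = set(evaluated_sentences)
    reference = set(reference_sentences)
    if not evaluated or not reference:
        raise ValueError("Both summaries must contain at least one sentence.")

    # Sentences that appear in both the evaluated and the reference summary.
    common = len(evaluated & reference)
    if common == 0:
        return 0.0, 0.0, 0.0

    precision = common / float(len(evaluated))
    recall = common / float(len(reference))
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


reference = ["A.", "B.", "C.", "D."]
evaluated = ["A.", "B.", "X."]
precision, recall, f_score = evaluate_summary(evaluated, reference)
print(precision, recall, f_score)
```

Here two of the three evaluated sentences are in the reference (precision 2/3) and two of the four reference sentences were found (recall 1/2).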

Python API

You can also use sumy as a library in your project.

# -*- coding: utf8 -*-

from __future__ import absolute_import
from __future__ import division, print_function, unicode_literals

from sumy.parsers.html import HtmlParser
from sumy.nlp.tokenizers import Tokenizer
from sumy.summarizers.lsa import LsaSummarizer
from sumy.nlp.stemmers.czech import stem_word
from sumy.utils import get_stop_words

if __name__ == "__main__":
    url = ""
    parser = HtmlParser.from_url(url, Tokenizer("czech"))

    summarizer = LsaSummarizer(stem_word)
    summarizer.stop_words = get_stop_words("czech")

    # Print the 20 sentences selected for the summary.
    for sentence in summarizer(parser.document, 20):
        print(sentence)
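
To make the roles of the stop words and the sentence count in the example above more concrete, here is a toy extractive summarizer in plain Python. It scores sentences by word frequency, which is a deliberately simple stand-in and not the LSA method used above. The function summarize and its splitting rules are assumptions for illustration, not sumy's implementation.

```python
import re
from collections import Counter


def summarize(text, stop_words, sentences_count):
    """Toy frequency-based extractive summarizer (illustration only)."""
    # Naive sentence splitting on '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    def significant_words(sentence):
        # Lowercased words with the stop words filtered out.
        words = re.findall(r"\w+", sentence.lower())
        return [w for w in words if w not in stop_words]

    # Count how often each non-stop word occurs in the whole text.
    frequencies = Counter(w for s in sentences for w in significant_words(s))

    def score(index):
        # A sentence scores the summed frequency of its significant words.
        return sum(frequencies[w] for w in significant_words(sentences[index]))

    ranked = sorted(range(len(sentences)), key=score, reverse=True)
    chosen = sorted(ranked[:sentences_count])  # restore document order
    return [sentences[i] for i in chosen]


text = ("Cats chase mice. Mice eat cheese. "
        "The weather is nice. Cats and mice are enemies.")
for sentence in summarize(text, {"the", "is", "and", "are"}, 2):
    print(sentence)
```

As in the sumy example, the caller supplies the stop words and the number of sentences to keep, and gets the selected sentences back in document order.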


Run the tests with:

$ nosetests-2.6 && nosetests-3.2 && nosetests-2.7 && nosetests-3.3


0.1.0 (2013-10-20)

  • First public release.
