GISMO is an NLP tool to rank and organize a corpus of documents according to a query.

GISMO

Gismo stands for Generic Information Search… with a Mind of its Own.

Features

Gismo combines three main ideas:

  • TF-IDTF: a symmetric version of the TF-IDF embedding.

  • DI-Iteration: a fast, push-based variant of the PageRank algorithm.

  • Fuzzy dendrogram: a variant of the Louvain clustering algorithm.
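
To give a feel for what "push-based" means for a PageRank-style computation, here is a minimal, generic sketch. It is not Gismo's implementation and does not use Gismo's API: the function name `push_pagerank`, its parameters, and the toy graph are all hypothetical and chosen for illustration only. Each node holds some pending "fluid"; the node with the most fluid is repeatedly selected, part of its fluid is credited to its score, and the rest is pushed to its successors.

```python
def push_pagerank(out_links, seed, damping=0.85, tol=1e-9):
    """Illustrative push-based diffusion (hypothetical helper, not Gismo's API).

    out_links: dict mapping each node to the list of its successors.
    seed: dict mapping nodes to their initial fluid (ideally summing to 1).
    Returns a dict mapping each visited node to its accumulated score.
    """
    fluid = dict(seed)  # pending mass waiting to be pushed
    scores = {node: 0.0 for node in out_links}
    while fluid:
        # Select the node that currently holds the most pending fluid.
        node = max(fluid, key=fluid.get)
        if fluid[node] <= tol:
            break  # every remaining amount is negligible
        amount = fluid.pop(node)
        # Credit a (1 - damping) share of the fluid to the node's score.
        scores[node] = scores.get(node, 0.0) + (1.0 - damping) * amount
        successors = out_links.get(node, [])
        if successors:
            # Push the damped remainder evenly to the successors.
            share = damping * amount / len(successors)
            for nxt in successors:
                fluid[nxt] = fluid.get(nxt, 0.0) + share
        # A node without successors simply drops its damped remainder.
    return scores


# Toy two-node cycle, with all the initial fluid on "a".
links = {"a": ["b"], "b": ["a"]}
scores = push_pagerank(links, {"a": 1.0})
```

Because each push removes a fixed fraction of the selected fluid from circulation, the total pending fluid shrinks at every step and the loop terminates. In this sketch the seed plays the role of the query: nodes close to the seed end up with higher scores.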

Quickstart

Install gismo:

$ pip install gismo

Import gismo in a Python project:

import gismo as gs

Credits

Thomas Bonald, Anne Bouillard, Marc-Olivier Buob, Dohy Hong.

This package was created with Cookiecutter and the francois-durand/package_helper project template.

History

0.2.3 (2020-05-04)

  • ACM and DBLP dataset creation added.

0.2.2 (2020-05-04)

  • Notebook tutorials added (early version)

0.2.1 (2020-05-03)

  • Actual code

  • Coverage badge

0.1.0 (2020-04-30)

  • First release on PyPI.
