Graph compression with Answer Set Programming

# PowerGrASP
The Python [Powergraph analysis](https://en.wikipedia.org/wiki/Power_graph_analysis) tool,
based on [Answer Set Programming](https://en.wikipedia.org/wiki/Answer_set_programming) solving and formal concept analysis.

More technical documentation about PowerGrASP can be found in the [documentation file](doc/documentation.mkd).
PowerGrASP results, compared to the existing BIOTEC tool, can be found in the [results file](doc/results.mkd).



## Installation & Requirements

    pip install powergrasp

Just make sure that a clingo binary is available in your $PATH.
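
The presence of the solver can be checked from Python before starting a compression. This is a minimal sketch using only the standard library; the helper name `find_clingo` is hypothetical and is not part of PowerGrASP.

```python
import shutil


def find_clingo(binary_name="clingo"):
    """Return the absolute path of the binary found in $PATH, or None.

    `find_clingo` is an illustrative helper, not a PowerGrASP function.
    """
    return shutil.which(binary_name)


path = find_clingo()
if path is None:
    print("clingo was not found in $PATH; install it before running PowerGrASP")
else:
    print("clingo found at", path)
```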


## Basic use
PowerGrASP can be used as a script:

    python3 -m powergrasp --graph-data=human_proteom.lp --output-file=for_cytoscape.bbl

Or can be embedded in any python program:

    import powergrasp
    powergrasp.recipes.powergraph('human_proteom.lp', 'for_cytoscape.bbl')

## Other recipes
Some other recipes are implemented, including oriented graph compression, and alternative scoring methods.


## General overview
The compression is configurable through command line arguments or through the parameters of the compress function.
The ASP source code in use can be changed, an interactive mode can be enabled, and so on. Please look at the help and the docstrings:

    # in terminal
    python3 -m powergrasp --help
    # in python
    >>> help(powergrasp)
    # or, in the PowerGrASP git directory, with make
    make help



## I/O
NB: Support for another input or output format can be added by implementing a new Converter class (see *powergrasp/converter/*).

### Input file
PowerGrASP does not produce logs just for the fun of it: it actually processes the input data, if provided.
The currently supported input file formats are:
- ASP: atoms edge/2, with edge(X,Y) describing a link between nodes X and Y.
- SBML: a regular SBML file, where species and reactions are treated as nodes.
- GML: a regular Graph Modeling Language file, readable by the networkx Python module.
- GraphML: a regular [GraphML](http://graphml.graphdrawing.org/) file, readable by [networkx python module](https://networkx.github.io/documentation/networkx-1.9/reference/readwrite.graphml.html).
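
As an illustration of the ASP input format, an edge list can be turned into `edge/2` atoms with a few lines of Python. This is only a sketch: the helper `edges_to_asp` is hypothetical, and quoting every node label is an assumption made here so that arbitrary labels remain valid ASP terms.

```python
def edges_to_asp(edges):
    """Return one ASP fact edge(X,Y) per line for an iterable of (X, Y) pairs.

    Illustrative helper, not part of PowerGrASP.
    """
    def quote(node):
        # Assumption: labels are emitted as quoted ASP strings, with
        # backslashes and double quotes escaped.
        escaped = str(node).replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'

    return "\n".join(
        "edge({},{}).".format(quote(x), quote(y)) for x, y in edges
    )


# A small star graph: one hub linked to three leaves.
star = [("hub", "a"), ("hub", "b"), ("hub", "c")]
print(edges_to_asp(star))
```

The resulting text can be written to a `.lp` file and passed to the *--graph-data* option.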

Examples of inputs can be found in the [powergrasp/tests/](powergrasp/tests) directory.
Some of these are simple and used for unit testing. Take a look at [ddiam](powergrasp/tests/double_biclique.lp), [coli](powergrasp/tests/ecoli_2896-23.gml) or [star](powergrasp/tests/star.lp).

Other formats will be supported in the future.

### Output

#### Bubble formatted file
The output of PowerGrASP is a *Bubble* formatted file. This file can be used to get a visualization of the compressed graph.
The *Bubble* format is a dedicated format designed by Royer et al. for describing power graphs.

Bubble files can be used in several ways:

##### Cytoscape
Displaying a power graph in Cytoscape is made possible by the [CyOog](http://www.biotec.tu-dresden.de/research/schroeder/powergraphs/) plugin,
which handles the *Bubble* format.
To use the CyOog plugin, Cytoscape must be in version __[2.x](http://www.cytoscape.org/download_old_versions.html)__.


###### Caveats
CyOog does not handle edges that are printed multiple times, nor oriented edges.
Thus, oriented compression is not efficiently represented using Cytoscape.

##### Oog Command line tool
The BIOTEC team also released a [command line tool](http://www.biotec.tu-dresden.de/research/schroeder/powergraphs/download-command-line-tool.html) for power graph analysis.
This tool can render a bubble file without using Cytoscape, with something like:

    java -jar Oog.jar -inputfiles=path/to/bubble.bbl -img -f=png

Add *&>/dev/null* to prevent any logging output.
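
The same call can be scripted from Python. The sketch below builds the command shown above and discards both output streams, which mirrors *&>/dev/null*. The function names and the default jar location are hypothetical, and only the `png` format is taken from the example above; other values for `-f` are not verified here.

```python
import subprocess


def oog_command(bubble_file, jar="Oog.jar", image_format="png"):
    """Build the Oog command line as a list of arguments (illustrative helper)."""
    return [
        "java", "-jar", jar,
        "-inputfiles=" + bubble_file,
        "-img",
        "-f=" + image_format,
    ]


def render_bubble(bubble_file, jar="Oog.jar", image_format="png"):
    """Run Oog silently and return its exit code.

    Requires java and the Oog jar to be available; not called here.
    """
    completed = subprocess.run(
        oog_command(bubble_file, jar, image_format),
        stdout=subprocess.DEVNULL,  # same effect as &>/dev/null
        stderr=subprocess.DEVNULL,
    )
    return completed.returncode


print(" ".join(oog_command("path/to/bubble.bbl")))
```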


##### Bubbletools
[This Python library](https://github.com/Aluriak/bubble-tools) can help generate other types of output from bubble files. It provides some examples, such as [GEXF](https://gephi.org/gexf/format/) for [Gephi](https://gephi.org/) (which [stopped handling it](https://github.com/gephi/graphstore/pull/128) as of its last release). PowerGrASP itself uses it to convert bubble output to other formats.


## Standard output management
By default, PowerGrASP writes a lot of output to stdout, mainly for debugging and for tracking the compression.
With the *loglevel* option, it is possible to control this behavior:

    python3 -m powergrasp --graph-data=tests/proteome_yeast_2.lp --loglevel=warning

This will block all outputs with a priority strictly lower than warning.
The available levels come from the Python logging API:

log level | PowerGrASP
----------|---------------------------------
critical  | totally silent
error     | very rarely disturbing
warning   | rarely disturbing
info      | trackable
debug     | trackable with __high__ verbosity
notset    | kraken released
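
The ordering of these levels can be checked with the standard `logging` module. The sketch below assumes that PowerGrASP filters messages the way standard `logging` does, by comparing numeric levels; the helpers `numeric_level` and `is_shown` are hypothetical and only illustrate that comparison.

```python
import logging

LEVELS = ["critical", "error", "warning", "info", "debug", "notset"]


def numeric_level(name):
    """Map a level name from the table to its numeric value in logging."""
    return getattr(logging, name.upper())


def is_shown(message_level, threshold):
    """True if a message at `message_level` passes the `threshold` level."""
    return numeric_level(message_level) >= numeric_level(threshold)


# With a warning threshold, only critical, error and warning get through.
for name in LEVELS:
    print(name, numeric_level(name), is_shown(name, "warning"))
```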

Please note that some options (notably *count-models* and *count-cc*) are completely independent of this logging management, as they print to the standard output.


## Statistics
The compression computes some statistics about itself, and prints the final results
to the standard output once the compression ends.
With some arguments, you can also display a colored plot:

    python3 -m powergrasp --graph-data=tests/proteome_yeast_2.lp --stats-file=data/statistics.csv --plot-stats

Instead of displaying it, PowerGrASP can save it as a PNG file (note that the *--plot-stats* flag is not necessary when the *plot-file* option is given):

    python3 -m powergrasp --graph-data=tests/proteome_yeast_2.lp --stats-file=data/statistics.csv --plot-file=data/statistics.png


## Answer Set Programming
ASP is a declarative logic programming language, designed for solving combinatorial problems (such as graph compression).
The implementation used in this project is the [*Potsdam Answer Set Solving Collection*](http://potassco.sourceforge.net/index.html).

All the ASP source code needed by PowerGrASP can be found in the *powergrasp/ASPsources/* directory.



## Interests & References

The [Power Graph approach for graph compression](https://en.wikipedia.org/wiki/Power_graph_analysis) allows a lossless compression with an emphasis on biological meaning.
In fact, the formal concepts used by Power Graph analysis are meaningful in biology, especially in the case of proteomes.

All graphs can be compressed through Power Graph, and will be more readable once compressed,
but interactomes, at least, also gain in interpretability.
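
To make the idea of lossless compression concrete, here is a small sketch of the biclique motif described by Royer et al.: when every node of one set is linked to every node of another set, all those edges can be replaced by a single power edge between two power nodes. The helpers below are illustrative only; they are not PowerGrASP's search, which is written in ASP.

```python
from itertools import product


def is_biclique(edges, left, right):
    """True if every node in `left` is linked to every node in `right`.

    Edges are treated as undirected; `left` and `right` are assumed disjoint.
    """
    undirected = {frozenset(edge) for edge in edges}
    return all(frozenset((u, v)) in undirected for u, v in product(left, right))


def edges_saved(left, right):
    """Number of edges saved when one power edge replaces the biclique."""
    return len(left) * len(right) - 1


edges = [
    ("p1", "q1"), ("p1", "q2"), ("p1", "q3"),
    ("p2", "q1"), ("p2", "q2"), ("p2", "q3"),
]
left, right = {"p1", "p2"}, {"q1", "q2", "q3"}

print(is_biclique(edges, left, right))  # True
print(edges_saved(left, right))         # 5
```

The original six edges can be recovered as the product of the two sets, which is why nothing is lost.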

The main inspiration of PowerGrASP, Power Graph Analysis:

Loïc Royer, Matthias Reimann, Bill Andreopoulos, and Michael Schroeder.
Unraveling Protein Networks with Power Graph Analysis.
PLoS Comput Biol, 4(7):e1000108, July 2008.

Usage of the PowerGraph Analysis:

Loïc Royer, Matthias Reimann, A. Francis Stewart, and Michael Schroeder.
Network Compression as a Quality Measure for Protein Interaction Networks.
PLoS ONE, 7(6):e35729, June 2012.

Yun Zhang, Charles A Phillips, Gary L Rogers, Erich J Baker, Elissa J Chesler, and Michael A Langston.
On finding bicliques in bipartite graphs: a novel algorithm and
its application to the integration of diverse biological data types.
BMC Bioinformatics, 15(1):110, 2014.

ASP through Potassco implementation:

M. Gebser, R. Kaminski, B. Kaufmann, M. Ostrowski, T. Schaub, and M. Schneider.
Potassco: The Potsdam answer set solving collection.
AI Communications, 24(2):107–124, 2011.
