RNAlysis provides a modular analysis pipeline for RNA sequencing data. The package includes various methods for filtering, data visualisation, exploratory analyses, enrichment analyses and clustering.

Project description

Useful links: Documentation | Source code | Bug reports


What is RNAlysis?

RNAlysis is a Python package providing a modular analysis pipeline for RNA sequencing data. The package includes various filtering methods, data visualisation, clustering analyses, enrichment analyses and exploratory analyses.

RNAlysis allows you to perform filtering and analyses in any order you wish. It can save or load your progress at any given phase, and it documents the order of operations you performed in the saved file names.

RNAlysis works with sequencing count matrices and differential expression output files in general, and integrates in particular with Python’s HTSeq-count and R’s DESeq2.


Main Features

  • Filter your read count matrices, fold change data, differential expression tables, and tabular data in general.

  • Normalize your read count data

  • Visualise, explore and describe your sequencing data

  • Find global relationships between sample expression profiles with clustering and dimensionality reduction

  • Create and share analysis pipelines

  • Perform enrichment analysis with pre-determined Gene Ontology terms, or with user-defined attributes
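
To make the read count normalization feature above more concrete, here is a minimal, library-independent sketch of reads-per-million (RPM) scaling. It does not use RNAlysis itself: the function name `normalize_to_rpm_sketch` and the toy data are hypothetical, and the library size is assumed to be the sum of the counts listed for each sample, which may differ from how RNAlysis determines it.

```python
def normalize_to_rpm_sketch(counts):
    """Scale raw read counts to reads per million (RPM).

    counts: dict mapping sample name -> dict mapping gene name -> raw count.
    Assumption: the library size of a sample is the sum of its listed counts.
    """
    normalized = {}
    for sample, gene_counts in counts.items():
        library_size = sum(gene_counts.values())
        if library_size == 0:
            raise ValueError(f"sample {sample!r} has no reads")
        scale = 1_000_000 / library_size
        normalized[sample] = {
            gene: count * scale for gene, count in gene_counts.items()
        }
    return normalized


# Hypothetical toy count matrix with two samples and three genes.
toy_counts = {
    "sample_a": {"gene1": 10, "gene2": 30, "gene3": 60},
    "sample_b": {"gene1": 5, "gene2": 5, "gene3": 10},
}
rpm = normalize_to_rpm_sketch(toy_counts)
```

After scaling, the values of each sample sum to one million, so samples sequenced at different depths become comparable.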


Dependencies

  • numpy

  • pandas

  • scipy

  • matplotlib

  • seaborn

  • tissue_enrichment_analysis

  • statsmodels

  • scikit-learn

  • ipyparallel

  • grid_strategy

  • Distance

  • pyyaml

  • UpSetPlot

  • matplotlib-venn


Where to get it

Use the following command in your terminal:

pip install RNAlysis
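
One way to confirm that the installation succeeded is to ask Python for the installed version. The sketch below relies only on the standard library (`importlib.metadata`, available from Python 3.8). The helper name `installed_version` is hypothetical and is not part of RNAlysis.

```python
from importlib import metadata


def installed_version(package_name="RNAlysis"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("RNAlysis is not installed in this environment.")
else:
    print(f"RNAlysis version {version} is installed.")
```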

Credits

Development Lead

Contributors

  • Or Ganon

  • Netta Dunsky

  • Shachar Shani


This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

History

1.3.5 (2020-05-27)

  • This version introduces minor bug fixes and a few more visualization options.

Added

  • Added the visualization function CountFilter.box_plot().

Changed

  • Updated docstrings and printouts of several functions.

  • Slightly improved speed and performance across the board.

  • Filter.feature_string() is now sorted alphabetically.

  • Enrichment randomization functions in the enrichment module now accept a ‘random_seed’ argument, to be able to generate consistent results over multiple sessions.

  • Enrichment randomization functions can now return the matplotlib Figure object, in addition to the results table.
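
To illustrate why a ‘random_seed’ argument gives consistent results, here is a generic sketch of a seeded randomization test written with the standard library only. It is not the RNAlysis implementation: the function `randomization_pvalue`, its parameters other than `random_seed`, and the toy gene names are all hypothetical.

```python
import random


def randomization_pvalue(gene_set, background, annotated, reps=1000, random_seed=None):
    """Estimate how often a random gene set has at least as many annotated
    genes as the observed set. Hypothetical sketch, not the RNAlysis API."""
    rng = random.Random(random_seed)  # a private generator, seeded for reproducibility
    annotated = set(annotated)
    gene_set = set(gene_set)
    background = sorted(background)  # fixed order, so a given seed yields the same draws
    observed = len(gene_set & annotated)
    at_least_as_extreme = 0
    for _ in range(reps):
        sample = rng.sample(background, len(gene_set))
        if sum(gene in annotated for gene in sample) >= observed:
            at_least_as_extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (reps + 1)


# Hypothetical toy data: 100 background genes, 10 of them annotated.
background = [f"gene{i}" for i in range(100)]
annotated = background[:10]
gene_set = background[:5] + background[50:55]

p_first = randomization_pvalue(gene_set, background, annotated, random_seed=42)
p_second = randomization_pvalue(gene_set, background, annotated, random_seed=42)
```

Calling the function twice with the same seed returns the same estimate, whereas leaving `random_seed=None` lets the estimate vary from run to run.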

Fixed

  • Fixed DeprecationWarning on parsing functions from the general module.

  • Fixed a bug where saving csv files on Linux systems would save the files under the wrong directory.

  • Fixed a bug where UTF-8-encoded Reference Tables would not be loaded correctly.

1.3.4 (2020-04-07)

  • This version fixed a bug that prevented installation of the package.

Changed

  • Updated docstrings and printouts of several functions.

Fixed

  • Fixed a bug with installation of the previous version.

1.3.3 (2020-03-28)

  • First stable release on PyPI.

Added

  • Added Filter.sort(), and upgraded the functionality of Filter.filter_top_n().

  • Added UpSet plots and Venn diagrams to enrichment module.

  • User-defined biotype reference tables can now be used.

  • Filter operations now print out the result of the operation.

  • Enrichment randomization tests now also support non-WBGene indexing.

  • Filter.biotypes() and FeatureSet.biotypes() now report genes that don’t appear in the biotype reference table.

  • Filter.biotypes() can now give a long-form report with descriptive statistics of all columns, grouped by biotype.

  • Added code examples to the user guide and to the docstrings of most functions.

Changed

  • Changed argument order and default values in filtering.CountFilter.from_folder().

  • Changed default title in scatter_sample_vs_sample().

  • Changed default filename in CountFilter.fold_change().

  • Settings are now saved in a .yaml format. Reading and writing of settings have been modified.

  • Changed argument name ‘deseq_highlight’ to ‘highlight’ in scatter_sample_vs_sample(). It can now accept any Filter object.

  • Updated documentation and default ‘mode’ value for FeatureSet.go_enrichment().

  • Updated the signature and function of general.load_csv() to be clearer and more predictable.

  • Changed argument names in CountFilter.from_folder().

  • Modified names and signatures of the .csv test file functions to make them more comprehensible.

  • Renamed ‘Filter.filter_by_ref_table_attr()’ to ‘Filter.filter_by_attribute()’.

  • Renamed ‘Filter.split_by_ref_table_attr()’ to ‘Filter.split_by_attribute()’.

  • Renamed ‘Filter.norm_reads_with_size_factor()’ to ‘Filter.normalize_with_scaling_factors()’. It can now use any set of scaling factors to normalize libraries.

  • Renamed ‘Filter.norm_reads_to_rpm()’ to ‘Filter.normalize_to_rpm()’.

  • Made some functions in the general module hidden.

Fixed

  • Various bug fixes

Removed

  • Removed the ‘feature_name_to_wbgene’ module from RNAlysis.

1.3.2 (2019-12-11)

  • First beta release on PyPI.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

RNAlysis-1.3.5.tar.gz (2.8 MB)

Uploaded Source

Built Distribution

RNAlysis-1.3.5-py2.py3-none-any.whl (673.7 kB)

Uploaded Python 2 Python 3
