isovar

Assemble transcript sequence fragments near variants

Development Status
- 3 - Alpha
Environment
- Console
Intended Audience
- Science/Research
License
- OSI Approved :: Apache Software License
Operating System
- OS Independent
Programming Language
- Python
Topic
- Scientific/Engineering :: Bio-Informatics

Project description

[![DOI](https://zenodo.org/badge/18834/hammerlab/isovar.svg)](https://zenodo.org/badge/latestdoi/18834/hammerlab/isovar) [![Build Status](https://travis-ci.org/hammerlab/isovar.svg?branch=master)](https://travis-ci.org/hammerlab/isovar) [![Coverage Status](https://coveralls.io/repos/github/hammerlab/isovar/badge.svg?branch=master)](https://coveralls.io/github/hammerlab/isovar?branch=master)

# isovar
Abundance quantification of distinct transcript sequences containing somatic variants from cancer RNAseq

## Example

```sh
$ isovar-protein-sequences.py \
    --vcf somatic-variants.vcf \
    --bam rnaseq.bam \
    --genome hg19 \
    --min-reads 2 \
    --protein-sequence-length 30 \
    --output isovar-results.csv

  chr       pos ref alt                      amino_acids  \
0  22  46931060   A   C   FGVEAVDHGWPSMSSGSSWRASRGPPPPPR
1  22  46931062   G   A  CFGVEAVDHGWPPMSLAHGGPAVVHRLHPEA

   variant_aa_interval_start  variant_aa_interval_end ends_with_stop_codon  \
0                         16                       17                False
1                         16                       17                False

  frameshift  translations_count  supporting_variant_reads_count  \
0      False                   1                               1
1      False                   1                               1

   total_variant_reads  supporting_transcripts_count  total_transcripts  \
0                  130                             2                  2
1                  127                             2                  2

     gene
0  CELSR1
1  CELSR1
```
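
As a rough illustration of how a results file like the one above might be consumed, the sketch below reads rows with Python's standard `csv` module and marks the variant residues inside each protein fragment. It assumes that the CSV uses the column names printed above and that `variant_aa_interval_start`/`variant_aa_interval_end` form a 0-based, half-open interval into `amino_acids`; neither assumption is confirmed by this README. The helper `highlight_variant` is hypothetical and not part of isovar.

```python
import csv
import io


def highlight_variant(row):
    """Lower-case the flanking residues and keep the variant residues upper-case.

    Assumes a 0-based, half-open [start, end) interval (an assumption,
    not something this README states).
    """
    start = int(row["variant_aa_interval_start"])
    end = int(row["variant_aa_interval_end"])
    aa = row["amino_acids"]
    return aa[:start].lower() + aa[start:end] + aa[end:].lower()


# A one-row stand-in for isovar-results.csv, using values from the example above.
SAMPLE_CSV = (
    "chr,pos,ref,alt,amino_acids,variant_aa_interval_start,variant_aa_interval_end\n"
    "22,46931060,A,C,FGVEAVDHGWPSMSSGSSWRASRGPPPPPR,16,17\n"
)

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for row in rows:
    print(row["chr"], row["pos"], highlight_variant(row))
```

To read a real output file, replace the `io.StringIO` object with an opened file handle.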

## Algorithm/Design

The one-line explanation of isovar: `ProteinSequence = VariantSequence + ReferenceContext`.

A little more detail about the algorithm:
1. Scan through an RNAseq BAM file and extract sequences overlapping a variant locus (represented by `ReadAtLocus`).
2. Make sure that the read contains the variant allele and split its sequence into prefix/alt/suffix string parts (represented by `VariantRead`).
3. Combine multiple `VariantRead` records into a `VariantSequence`.
4. Gather possible reading frames for distinct reference sequences around the variant locus (represented by `ReferenceContext`).
5. Use the reading frame from a `ReferenceContext` to translate a `VariantSequence` into a protein fragment (represented by `Translation`).
6. Multiple distinct variant sequences and reference contexts can generate the same translations, so we aggregate those equivalent `Translation` objects into a `ProteinSequence`.
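
The steps above can be sketched in a few lines of plain Python. This is a toy model under simplifying assumptions, not isovar's implementation: reads are plain strings that all start at the same genomic position, identical reads are simply counted rather than assembled, and the reading frame is passed in as a single hypothetical `frame_offset` instead of being derived from a `ReferenceContext`. The names `ToyVariantRead`, `split_read`, `translate`, and `toy_protein_sequences` are invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


@dataclass(frozen=True)
class ToyVariantRead:
    """Toy stand-in for step 2: a read split around the alt allele."""
    prefix: str
    alt: str
    suffix: str


def split_read(read_seq, variant_offset, alt):
    """Return a ToyVariantRead if the read carries `alt`, else None."""
    end = variant_offset + len(alt)
    if read_seq[variant_offset:end] != alt:
        return None
    return ToyVariantRead(read_seq[:variant_offset], alt, read_seq[end:])


def translate(sequence, frame_offset):
    """Translate from `frame_offset`, stopping at a stop codon or the end."""
    amino_acids = []
    for i in range(frame_offset, len(sequence) - 2, 3):
        aa = CODON_TABLE[sequence[i:i + 3]]
        if aa == "*":
            break
        amino_acids.append(aa)
    return "".join(amino_acids)


def toy_protein_sequences(reads, variant_offset, alt, frame_offset):
    """Map each translated protein fragment to its number of supporting reads."""
    # Steps 1-2: keep only reads that contain the variant allele.
    variant_reads = [split_read(r, variant_offset, alt) for r in reads]
    # Step 3 (simplified): group identical variant reads and count them.
    sequence_counts = Counter(vr for vr in variant_reads if vr is not None)
    # Steps 5-6: translate each distinct sequence, then merge equal translations.
    protein_counts = Counter()
    for vr, n_reads in sequence_counts.items():
        protein = translate(vr.prefix + vr.alt + vr.suffix, frame_offset)
        protein_counts[protein] += n_reads
    return dict(protein_counts)


reads = [
    "CATGGCTGAAGTT",  # carries the alt allele T at offset 6
    "CATGGCTGAAGTT",
    "CATGGCTGAGGTT",  # alt allele plus a synonymous difference downstream
    "CATGGCCGAAGTT",  # reference allele, filtered out
]
print(toy_protein_sequences(reads, variant_offset=6, alt="T", frame_offset=1))
# {'MAEV': 3}
```

The two distinct variant sequences translate to the same fragment, so their read counts are merged into one entry, mirroring step 6.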

Since we may not want to deal with *every* possible translation of *every* distinct sequence detected around a variant, `isovar` sorts the variant sequences by the number of supporting reads and the reference contexts by protein length. A configurable number of translated protein fragments can then be kept from this ordering.
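
A minimal sketch of this ranking, assuming that both orderings are descending (most reads first, longest protein first) and that variant sequences are the outer loop. The dictionary keys, the helper names `ranked_pairs` and `top_pairs`, and the `max_kept` parameter are hypothetical and chosen only for illustration.

```python
from itertools import islice


def ranked_pairs(variant_sequences, reference_contexts):
    """Yield (variant sequence, reference context) names in priority order."""
    by_reads = sorted(variant_sequences, key=lambda vs: vs["reads"], reverse=True)
    by_length = sorted(
        reference_contexts, key=lambda rc: rc["protein_length"], reverse=True
    )
    for vs in by_reads:
        for rc in by_length:
            yield vs["name"], rc["name"]


def top_pairs(variant_sequences, reference_contexts, max_kept):
    """Keep only the first `max_kept` candidates from the ranking."""
    return list(islice(ranked_pairs(variant_sequences, reference_contexts), max_kept))


variant_sequences = [
    {"name": "seqA", "reads": 3},
    {"name": "seqB", "reads": 10},
]
reference_contexts = [
    {"name": "ctx1", "protein_length": 120},
    {"name": "ctx2", "protein_length": 450},
]
print(top_pairs(variant_sequences, reference_contexts, max_kept=3))
# [('seqB', 'ctx2'), ('seqB', 'ctx1'), ('seqA', 'ctx2')]
```

Using a generator with `islice` means candidates beyond the cutoff are never produced, which matters when each candidate would require a translation.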


Release history

1.3.0

Aug 1, 2023

1.1.1

Aug 19, 2020

1.1.0

Oct 24, 2019

1.0.10

Aug 21, 2019

1.0.9

Jun 19, 2019

1.0.8

Jun 18, 2019

1.0.7

Jun 18, 2019

1.0.6

Jun 17, 2019

1.0.5

Jun 17, 2019

1.0.4

Jun 16, 2019

1.0.3

Jun 16, 2019

1.0.2

Jun 16, 2019

1.0.1

Jun 16, 2019

1.0.0

Jun 12, 2019

0.9.0

Jul 31, 2018

0.8.5

Jul 30, 2018

0.8.3

May 21, 2018

0.8.2

May 21, 2018

0.8.1

May 20, 2018

0.8.0

May 20, 2018

0.7.5

Feb 26, 2018

0.7.4

Feb 26, 2018

0.7.3

Feb 26, 2018

0.7.2

Feb 23, 2018

0.7.1

Feb 21, 2018

0.7.0

Jun 21, 2017

0.6.1

May 24, 2017

0.6.0

Jan 18, 2017

0.5.2

Dec 2, 2016

0.5.0

Nov 25, 2016

0.4.0

Nov 17, 2016

0.2.4

Oct 13, 2016

This version

0.2.3

Oct 13, 2016

0.2.2

Sep 23, 2016

0.2.1

Sep 16, 2016

0.2.0

Aug 6, 2016

0.1.5

Aug 2, 2016

0.1.3

Jul 5, 2016

0.1.0

Jul 4, 2016

0.0.6

Jun 14, 2016

0.0.5

Jun 10, 2016

Download files

Source Distribution

isovar-0.2.3.tar.gz (44.6 kB)

Uploaded Oct 13, 2016 Source

Hashes for isovar-0.2.3.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `e65c1f1ccda0022b405216e0723559ed999b7c24cbc4362a9f36f8e0da69932f` |
| MD5 | `e427c8a8dbd61490c205057c9e9b03fc` |
| BLAKE2b-256 | `72bf82e36d137b0866be27e0a27c1545fa4fb0b519e1aac31734b239fff5bff9` |