Represent genomic annotations in Python. Equivalent to Bioconductor's [GRanges](https://bioconductor.org/packages/release/bioc/html/GenomicRanges.html).
GenomicRanges
A Python equivalent to Bioconductor's GenomicRanges for representing genomic locations and supporting genomic analysis. It builds on efficient structures already available in the Python/pandas/NumPy ecosystem and adds a familiar interface.
Install
The package is published on PyPI:
pip install genomicranges
Usage
The package provides several ways to represent genomic intervals.
Pandas DataFrame
A pandas DataFrame is a common representation for tabular datasets in Python. A DataFrame can be converted into a GenomicRanges object.
Note: The DataFrame must contain the columns seqname, start and end, which represent the chromosome and genomic coordinates.
from genomicranges import GenomicRanges
gr = GenomicRanges.fromPandas(<PANDAS DATA FRAME>)
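Because the conversion depends on specific column names, it can help to check a table before passing it on. The following is a minimal sketch, not part of genomicranges: `missing_columns` and `REQUIRED_COLUMNS` are hypothetical names introduced here, and the required names are simply those listed in the note above.

```python
# Hypothetical helper, not provided by genomicranges.
# The required names are copied from the note in this README.
REQUIRED_COLUMNS = ("seqname", "start", "end")


def missing_columns(columns, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from `columns`."""
    present = set(columns)
    return [name for name in required if name not in present]


# With a pandas DataFrame `df`, the call would be missing_columns(df.columns).
print(missing_columns(["seqname", "start", "end", "score"]))
print(missing_columns(["chrom", "start"]))
```

An empty list means every required column is present; otherwise the list names the columns to add or rename before conversion.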
From UCSC or GTF file
Methods are available to access UCSC genomes or to load a genome annotation from a GTF file:
from genomicranges import GenomicRanges
gr = GenomicRanges.fromGTF(<PATH TO GTF>)
# OR
gr = GenomicRanges.fromUCSC(genome="hg19")
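For readers unfamiliar with GTF, each annotation line holds nine tab-separated fields, of which the sequence name, start and end describe the interval. The sketch below uses only the standard library to show that layout; it is an illustration and not how genomicranges itself reads GTF files. `read_gtf_records` and `GTF_FIELDS` are hypothetical names, and the sample line is made up for the example.

```python
import csv
import io

# The nine tab-separated fields of a GTF annotation line.
GTF_FIELDS = [
    "seqname", "source", "feature", "start", "end",
    "score", "strand", "frame", "attribute",
]


def read_gtf_records(handle):
    """Yield one dict per annotation line, skipping comments and malformed rows."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # blank line or comment/header line
        if len(row) != len(GTF_FIELDS):
            continue  # not a complete nine-field record
        record = dict(zip(GTF_FIELDS, row))
        # GTF coordinates are 1-based and inclusive.
        record["start"] = int(record["start"])
        record["end"] = int(record["end"])
        yield record


# Made-up sample content standing in for a real GTF file.
example = io.StringIO(
    "#!genome-build example\n"
    'chr1\tdemo\tgene\t100\t200\t.\t+\t.\tgene_id "g1";\n'
)
records = list(read_gtf_records(example))
print(records[0]["seqname"], records[0]["start"], records[0]["end"])
```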
Interval Operations
The package currently supports the nearest genomic positions operation from Bioconductor; more operations are planned.
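import pandas as pd
from genomicranges import GenomicRanges

# Subject: annotation for the hg38 genome, loaded from UCSC
subject = GenomicRanges.fromUCSC(genome="hg38")

# Query: three short intervals built from a DataFrame
query = GenomicRanges.fromPandas(
    pd.DataFrame(
        {
            "seqnames": ["chr1", "chr2", "chr3"],
            "starts": [100, 115, 119],
            "ends": [103, 116, 120],
        }
    )
)

# Nearest-position lookup between subject and query
hits = subject.nearest(query)
print(hits)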
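To make the idea of a nearest operation concrete, here is a brute-force sketch in plain Python. It assumes that, for each query interval, the closest subject interval on the same sequence is reported, with distance 0 for overlapping intervals. That reading, the function names `gap` and `nearest`, and the tuple layout are assumptions made for illustration; the library's own semantics and return type may differ.

```python
def gap(a_start, a_end, b_start, b_end):
    """Return max(0, b_start - a_end, a_start - b_end); 0 means the intervals overlap."""
    return max(0, b_start - a_end, a_start - b_end)


def nearest(subject, query):
    """For each query interval, return the index of the closest subject interval
    on the same sequence, or None if the subject has nothing on that sequence."""
    hits = []
    for q_seq, q_start, q_end in query:
        best_index, best_gap = None, None
        for i, (s_seq, s_start, s_end) in enumerate(subject):
            if s_seq != q_seq:
                continue  # only compare intervals on the same sequence
            d = gap(q_start, q_end, s_start, s_end)
            if best_gap is None or d < best_gap:
                best_index, best_gap = i, d
        hits.append(best_index)
    return hits


# Intervals as (seqname, start, end) tuples; the subject values are made up.
subject = [("chr1", 10, 50), ("chr1", 200, 300), ("chr2", 100, 120)]
query = [("chr1", 100, 103), ("chr2", 115, 116), ("chr3", 119, 120)]
print(nearest(subject, query))  # [0, 2, None]
```

This scans every subject interval for every query, which is fine for a demonstration but too slow for a full genome annotation; sorted arrays or interval trees are the usual choice at that scale.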
For more use cases, check out the documentation.
Note
This project has been set up using PyScaffold 4.1.1. For details and usage information on PyScaffold see https://pyscaffold.org/.
Built Distribution
Hashes for GenomicRanges-0.1.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | da75f2904909631d0751558dd0c7ffadba8be4365fdbfed3edc6dce8580ab817
MD5 | 15ce5e2b160a44c8abc9b6b110797ab2
BLAKE2b-256 | 31298def5a85d78239bc4ece55984934faf89c0e0fa3afe7eddae7e83603d101