pytabix 0.0.2

Python interface for tabix

April 16, 2014

This module allows fast random access to files compressed with bgzip and indexed by tabix. It includes a C extension with code from klib. The bgzip and tabix programs are available here.

Installation

pip install --user pytabix

Synopsis

Genomic data is often stored in a table where each row corresponds to a genomic region (start, end) or a single position:

chrom  pos      snp
1      1000760  rs75316104
1      1000894  rs114006445
1      1000910  rs79750022
1      1001177  rs4970401
1      1001256  rs78650406

With tabix, you can quickly retrieve all rows in a genomic region by specifying a query with a sequence name, start, and end:

import tabix

# Open a remote or local file.
url = "ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20100804/"
url += "ALL.2of4intersection.20100804.genotypes.vcf.gz"

tb = tabix.open(url)

# These three queries are equivalent. Each returns an iterator over the matching records.
records = tb.query("1", 1000000, 1250000)

records = tb.queryi(0, 1000000, 1250000)

records = tb.querys("1:1000000-1250000")

# Each record is a list of strings.
for record in records:
    print(record[:5])
    break
['1', '1000071', '.', 'C', 'T']
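
Because each record is a list of strings, numeric columns such as the position need converting before use. The following is a minimal sketch, not part of pytabix: format_region and parse_snp_record are hypothetical helper names, the three-column layout (chrom, pos, snp) is assumed from the example table above, and a plain list stands in for the iterator a query would return so that the code runs without an indexed file.

```python
def format_region(chrom, start, end):
    """Build a region string such as "1:1000000-1250000" for tb.querys()."""
    if start > end:
        raise ValueError("start must not exceed end")
    return "{0}:{1}-{2}".format(chrom, start, end)


def parse_snp_record(record):
    """Convert one record (a list of strings) into (chrom, pos, snp).

    Assumes the three-column layout of the example table:
    chrom, pos, snp. Other files will need a different mapping.
    """
    chrom, pos, snp = record[:3]
    return chrom, int(pos), snp


# Stand-in for the iterator returned by a query (assumption: real
# results have the same list-of-strings shape).
records = [
    ["1", "1000760", "rs75316104"],
    ["1", "1000894", "rs114006445"],
]

region = format_region("1", 1000000, 1250000)
parsed = [parse_snp_record(r) for r in records]
```

With a real file, the list above would be replaced by the result of tb.querys(region), and the parsing step would stay the same as long as the columns match the assumed layout.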
 
File                        Type    Py Version  Uploaded on  Size
pytabix-0.0.2.tar.gz (md5)  Source              2015-03-21   45KB