
A hash-table-based counter implemented in CUDA for high throughput.

Project description

CuCounter: A Python kmer frequency counter object based on a massively parallel CUDA hash table

Installation

CuCounter requires NumPy and CuPy. It currently supports only NVIDIA GPUs with CUDA.

CuCounter can be installed using pip:

pip install kage-cucounter

or installed manually:

  • clone the CuCounter repository
  • from inside the cloned repository, use pip to install the dependencies and then CuCounter itself
git clone https://github.com/jorgenwh/cucounter.git
cd cucounter
pip install -r requirements.txt
pip install .
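
After installing, it can be useful to confirm that the required modules are importable before running anything on the GPU. The snippet below is a minimal sketch using only the Python standard library; the helper name `missing_modules` is hypothetical and not part of CuCounter. It only checks that the modules can be found, not that a CUDA device is available.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Module names taken from the requirements and import statement in this README.
    missing = missing_modules(["numpy", "cupy", "cucounter"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```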

Usage

All of CuCounter's methods, including its constructor, accept either NumPy or CuPy arrays. CuPy arrays are preferred because they avoid copying memory back and forth between the host and the device. The example below uses NumPy, but the same code works if the NumPy arrays are replaced with CuPy arrays.

from cucounter import Counter
import numpy as np

# Create a static set of 100 million unique 64-bit encoded kmers as keys for the counter
unique_kmers = np.arange(100000000, dtype=np.uint64)

# Create counter object
counter = Counter(keys=unique_kmers)

# Create a chunk of 200 million kmers to count
kmers = np.random.randint(low=0, high=0xFFFFFFFFFFFFFFFF, size=(200000000,), dtype=np.uint64)

# Count the observed kmer frequencies. Kmers not present in the original key set are ignored
counter.count(kmers)

# Fetch the observed frequencies for the original key set
counts = counter[unique_kmers] 

counts.dtype # np.uint32
counts.shape # (100000000,)
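
To check results on small inputs, the counting semantics described above (a static key set, with kmers outside that set ignored) can be reproduced on the CPU. The following is a minimal NumPy sketch, not CuCounter's implementation, which uses a CUDA hash table. The function name `count_against_keys` is hypothetical, and the sketch assumes the keys are unique.

```python
import numpy as np


def count_against_keys(keys, observed):
    """Count how often each key occurs in `observed`.

    Values in `observed` that are not in `keys` are ignored.
    Assumes `keys` contains no duplicates.
    """
    keys = np.asarray(keys, dtype=np.uint64)
    observed = np.asarray(observed, dtype=np.uint64)
    if keys.size == 0 or observed.size == 0:
        return np.zeros(keys.size, dtype=np.uint32)

    # Sort the keys once so each observed value can be located by binary search.
    order = np.argsort(keys, kind="stable")
    sorted_keys = keys[order]

    pos = np.searchsorted(sorted_keys, observed)
    pos[pos == sorted_keys.size] = 0          # clamp out-of-range positions
    hits = sorted_keys[pos] == observed       # True only for values present in keys

    # Map hits back to the original key order and tally them.
    counts = np.bincount(order[pos[hits]], minlength=keys.size)
    return counts.astype(np.uint32)


keys = np.array([5, 1, 9], dtype=np.uint64)
observed = np.array([1, 1, 9, 7, 5, 1, 100], dtype=np.uint64)
print(count_against_keys(keys, observed))  # counts are returned in the order of `keys`
```

The result is aligned with the order of `keys`, mirroring how `counter[unique_kmers]` returns one count per requested key.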

CuCounter can also count the reverse complement of each kmer in addition to the kmer itself.

counter.count(kmers, count_revcomps=True)
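
For readers unfamiliar with reverse complements of encoded kmers, the sketch below shows one way to compute them on the CPU. It rests on assumptions that this README does not confirm: a 2-bit encoding with A=0, C=1, G=2, T=3, the first base stored in the most significant bits, and a known kmer length `k`. CuCounter's actual encoding may differ, and the helper names `encode` and `reverse_complement` are hypothetical.

```python
# Assumed 2-bit encoding (not confirmed by CuCounter): A=0, C=1, G=2, T=3.
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode(seq):
    """Pack a DNA string into an integer, first base in the most significant bits."""
    value = 0
    for base in seq:
        value = (value << 2) | BASE_TO_BITS[base]
    return value


def reverse_complement(kmer, k):
    """Return the reverse complement of a 2-bit encoded kmer of length k."""
    mask = (1 << (2 * k)) - 1
    # With this encoding, complementing a base flips both of its bits (A<->T, C<->G).
    comp = ~kmer & mask
    # Reverse the order of the 2-bit groups.
    rc = 0
    for _ in range(k):
        rc = (rc << 2) | (comp & 3)
        comp >>= 2
    return rc


# The reverse complement of ACG is CGT.
assert reverse_complement(encode("ACG"), 3) == encode("CGT")
```

Under these assumptions, counting with reverse complements would amount to tallying both `kmer` and `reverse_complement(kmer, k)` for every observed kmer.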

Download files


Source Distribution

kage-cucounter-1.0.1.tar.gz (9.7 kB)
