Data file generation for CGAP's Higlass browsers
higlass-data
Package that creates data files for CGAP's Higlass browsers
Installation
Run pip install cgap-higlass-data to install the package. You need at least Python 3.8.
To develop this package, clone this repo, make sure poetry is installed on your system, and run make install.
Commands
After installation the following commands can be run from the command line:
Convert BED file to BW (bigWig) file
Assume you have a BED file of the form
# HEADER LINE 1
# HEADER LINE 2
chr1 0 1024 . 423
chr1 1024 2048 . 32
chr1 2048 3072 . 734
This BED file can be converted to a BW file with the following command
# -i input BED file path
# -o output BW file path
# -a assembly (currently only 'hg38' is supported)
# -l number of header lines in the BED file
convert-bed-to-bw -i ./PATH/input.bed \
-o ./PATH/output.bw \
-a hg38 \
-l 2
Note that bedGraphToBigWig must be installed on your system for this to work. It can be installed via conda (conda install -c bioconda ucsc-bedgraphtobigwig). You can also download the binary here: http://hgdownload.soe.ucsc.edu/admin/exe/
Create variant-level VCF for CGAP's cohort browser
This command creates a multiresolution VCF file that is compatible with CGAP's cohort browser. Typically, the input VCF is VEP-annotated and has at least the info field level_most_severe_consequence (which is one of HIGH, LOW, MODERATE, MODIFIER) and an importance value that ranks the variants. The info field used for ranking is set with the -c option.
# -i input VCF path
# -o output VCF path
# -c info field in the input VCF that ranks the variants
# -m maximal tile values per consequence. Controls how many variants are displayed at once at a certain zoom level
# -q quiet True / False. Toggles verbose output
create-cohort-vcf -i ./PATH/input.vcf \
-o ./PATH/output.vcf \
-c p_value_negative_log_10 \
-q True
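To make the role of -c and -m more concrete, here is a minimal sketch of the selection idea: within one tile, keep only the highest-ranked variants for each consequence level. This is an assumption about the general approach, not the package's actual code; the function select_top_variants and the dictionary keys consequence and importance are hypothetical.

```python
import heapq
from collections import defaultdict


def select_top_variants(variants, max_per_consequence):
    """Keep the max_per_consequence highest-importance variants per consequence.

    Hypothetical illustration of a per-tile limit; not part of cgap-higlass-data.
    """
    by_consequence = defaultdict(list)
    for variant in variants:
        by_consequence[variant["consequence"]].append(variant)
    return {
        consequence: heapq.nlargest(
            max_per_consequence, group, key=lambda v: v["importance"]
        )
        for consequence, group in by_consequence.items()
    }


# Variants that fall into a single tile (made-up values for illustration)
tile = [
    {"pos": 100, "consequence": "HIGH", "importance": 5.5},
    {"pos": 250, "consequence": "HIGH", "importance": 9.0},
    {"pos": 400, "consequence": "HIGH", "importance": 1.2},
    {"pos": 510, "consequence": "LOW", "importance": 3.3},
]

selected = select_top_variants(tile, max_per_consequence=2)
```

Here the importance value stands in for whatever info field is passed with -c (for example p_value_negative_log_10), and max_per_consequence stands in for -m.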
Create coverage BED file from VCF
Counts the number of variants in a 1024bp window and creates a BED file with the results.
# -i input VCF path
# -o output BED path
# -a assembly
# -q quiet True / False. Toggles verbose output
create-coverage-bed -i ./PATH/input.vcf \
-o ./PATH/output.bed \
-a hg38 \
-q True
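The windowed count can be sketched as follows. This is a simplified illustration rather than the package's implementation: the function name count_variants_per_window is hypothetical, it assumes windows aligned at multiples of 1024 with 1-based VCF positions mapped to 0-based BED coordinates, and it only emits windows that contain at least one variant (the real command may behave differently on any of these points).

```python
from collections import Counter

WINDOW = 1024  # window size in base pairs, as described above


def count_variants_per_window(vcf_lines, window=WINDOW):
    """Return (chrom, start, end, count) tuples for each non-empty window.

    Hypothetical helper, not part of cgap-higlass-data.
    """
    counts = Counter()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip meta lines, the column header and blank lines
        chrom, pos = line.split("\t")[:2]
        # VCF POS is 1-based; subtract 1 to get a 0-based coordinate
        index = (int(pos) - 1) // window
        counts[(chrom, index)] += 1
    return [
        (chrom, index * window, (index + 1) * window, n)
        for (chrom, index), n in sorted(counts.items())
    ]


vcf_lines = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t.\t.\t.",
    "chr1\t1024\t.\tC\tT\t.\t.\t.",
    "chr1\t1025\t.\tG\tA\t.\t.\t.",
]

windows = count_variants_per_window(vcf_lines)
for chrom, start, end, n in windows:
    print(f"{chrom}\t{start}\t{end}\t.\t{n}")
```

The printed rows use the same five-column layout as the BED example in the convert-bed-to-bw section, so the output of one command could feed the other.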