genmod 0.9.6

Annotate genetic inheritance models in variant files

GENMOD
======

Tool for annotating patterns of inheritance in Variant Call Format (VCF)
files.

Each variant in the VCF file will be annotated with the genetic models
that it follows in the family. The genetic models that are checked are
the following:

-  Autosomal Recessive, denoted 'AR'
-  Autosomal Recessive de novo, denoted 'AR\_dn'
-  Autosomal Dominant, 'AD'
-  Autosomal Dominant de novo, 'AD\_dn'
-  Autosomal Compound Heterozygote, 'AR\_comp'
-  X-linked dominant, 'XD'
-  X-linked dominant de novo, 'XD\_dn'
-  X-linked Recessive, 'XR'
-  X-linked Recessive de novo, 'XR\_dn'

**GENMOD** will add entries to the INFO column of the given VCF file.
The new entries are:

-  GM: A colon separated list with genetic models followed
-  ANN: Colon separated list with features overlapped in the annotation
   file
-  Comp: Colon separated list with compound pairs (if any). These are
   described like CHR\_POS\_REF\_ALT.
-  MS: Model Score, a phred-score based on the genotype qualities to
   describe the uncertainty of the genetic model.
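
The entries above could be read back from an annotated file roughly as follows. This is only a sketch: the function name ``parse_genmod_info`` and the example INFO string are made up for illustration and are not part of genmod.

```python
def parse_genmod_info(info_field):
    """Split a VCF INFO string and return the genmod entries."""
    entries = {}
    for item in info_field.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            entries[key] = value
    # GM, ANN and Comp are colon separated lists
    result = {
        key: entries[key].split(":")
        for key in ("GM", "ANN", "Comp")
        if key in entries
    }
    # MS is a single phred-scaled score
    if "MS" in entries:
        result["MS"] = float(entries["MS"])
    return result


# Hypothetical INFO value, invented for this example
example = "DP=40;GM=AR_comp:AD_dn;ANN=GENE1;Comp=1_12345_A_T;MS=32"
parsed = parse_genmod_info(example)
print(parsed["GM"])
```

Entries that are absent from the INFO column are simply left out of the returned dictionary.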

Installation:
-------------

genmod works with Python 2.7 and Python 3.

::

    pip install genmod

or

::

    git clone git@github.com:moonso/genmod.git
    cd genmod
    python setup.py install

USAGE:
~~~~~~

::

    genmod ped_file variant_file annotation_file

Conditions for Genetic Models
-----------------------------

Short explanation of genotype calls in VCF format:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Since we only look at humans, which are diploid, a genotype represents
what we see on both alleles at a single position. 0 represents the
reference sequence, 1 is the first of the alternative alleles, 2 the second
alternative and so on. If no phasing has been done, the genotype is an
unordered pair of the form x/x, so 0/1 means that the individual is
heterozygous at this position, with the reference base on one of
the alleles and the first of the alternatives on the other. 2/2 means
that we see the second of the alternatives on both alleles. Some
chromosomes are only present in one copy in humans; for these it is allowed
to use a single digit to show the genotype. A 0 would mean
reference and 1 the first of the alternatives.

If phasing has been done, the pairs are no longer unordered and the
delimiter is changed to '\|', so one can be heterozygous in two
ways: 0\|1 or 1\|0.
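
As a small illustration of the genotype notation, the sketch below splits a genotype string and classifies it. The function names and the class labels (``het``, ``hom_alt``, ``hom_ref``, ``nocall``) are chosen for this example and are not taken from the genmod source.

```python
def parse_genotype(gt):
    """Return (alleles, phased) for a genotype such as '0/1', '1|0' or '1'."""
    phased = "|" in gt
    alleles = gt.replace("|", "/").split("/")
    return alleles, phased


def classify(gt):
    """Classify a genotype string by zygosity."""
    alleles, _ = parse_genotype(gt)
    if "." in alleles:
        return "nocall"      # missing call, written as '.' in VCF
    if all(allele == "0" for allele in alleles):
        return "hom_ref"     # only the reference allele is seen
    if len(set(alleles)) == 1:
        return "hom_alt"     # the same alternative on every allele
    return "het"             # two different alleles, e.g. 0/1 or 1/2


for gt in ("0/1", "2/2", "0|0", "1|0", "1"):
    print(gt, classify(gt), parse_genotype(gt)[1])
```

Note that a single digit genotype such as ``1`` is reported as ``hom_alt`` here, which is a simplification for chromosomes present in one copy.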

Autosomal Recessive
~~~~~~~~~~~~~~~~~~~

For this model individuals can be carriers, so healthy individuals can be
heterozygous. Both alleles need to have the variant for an individual to
be sick, so a healthy individual cannot be homozygous alternative and a
sick individual *has* to be homozygous alternative.

-  Affected individuals have to be homozygous alternative (hom. alt.)
-  Healthy individuals cannot be hom. alt.
-  Variant is considered *de novo* if both parents are genotyped and do
   not carry the variant
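
The conditions above can be sketched as a check over a family. This is a minimal sketch under stated assumptions, not the genmod implementation: each family member is assumed to be a ``(genotype_class, affected)`` tuple with a called genotype, and a parent that has not been genotyped is represented by ``None``.

```python
def follows_autosomal_recessive(family):
    """family: list of (genotype_class, affected) tuples.

    genotype_class is one of 'hom_ref', 'het' or 'hom_alt'.
    """
    for genotype_class, affected in family:
        if affected and genotype_class != "hom_alt":
            return False   # affected individuals must be hom. alt.
        if not affected and genotype_class == "hom_alt":
            return False   # healthy individuals cannot be hom. alt.
    return True


def is_de_novo(mother_class, father_class):
    """De novo: both parents are genotyped and neither carries the variant."""
    return mother_class == "hom_ref" and father_class == "hom_ref"


# Affected child with two healthy carrier parents
trio = [("hom_alt", True), ("het", False), ("het", False)]
print(follows_autosomal_recessive(trio), is_de_novo("het", "het"))
```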

Autosomal Dominant
~~~~~~~~~~~~~~~~~~

-  Affected individuals have to be heterozygous (het.)
-  Healthy individuals cannot have the alternative variant
-  Variant is considered *de novo* if both parents are genotyped and do
   not carry the variant
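
The dominant conditions can be written as a small lookup of allowed genotype classes. Again this is only a sketch: the table name ``AD_ALLOWED`` and the ``(genotype_class, affected)`` tuples are assumptions made for this example.

```python
# Allowed genotype classes, keyed by affection status
AD_ALLOWED = {
    True: {"het"},        # affected individuals have to be het.
    False: {"hom_ref"},   # healthy individuals cannot have the variant
}


def follows_autosomal_dominant(family):
    """family: list of (genotype_class, affected) tuples."""
    return all(
        genotype_class in AD_ALLOWED[affected]
        for genotype_class, affected in family
    )


# Affected child and affected mother, healthy father
family = [("het", True), ("het", True), ("hom_ref", False)]
print(follows_autosomal_dominant(family))
```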

Autosomal Compound Heterozygote
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This model includes pairs of variants that are present within the same
feature.

1. Non-phased data:

   -  Affected individuals have to be het. for both variants
   -  Healthy individuals can be het. for one of the variants but cannot
      have both variants
   -  Variant is considered *de novo* if only one or no variant is found
      in the parents

2. Phased data:

   -  All affected individuals have to be het. for both variants and the
      variants have to be on two different alleles
   -  Healthy individuals can be heterozygous for one or both of the
      variants if they are on the same allele
   -  If only one or no variant is found in parents it is considered *de
      novo*
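
For phased data, the check on a pair of variants could look like the sketch below. It assumes biallelic, phased genotypes such as ``0|1`` and uses hypothetical function names. Healthy individuals that are homozygous alternative for one of the variants are not covered by the rules above, and this sketch does not reject them.

```python
def alt_haplotypes(gt):
    """Return the haplotype indices (0 or 1) that carry a non-reference allele."""
    return {i for i, allele in enumerate(gt.split("|")) if allele != "0"}


def compound_pair_ok(gt_a, gt_b, affected):
    """Check one individual against a candidate compound pair (phased data)."""
    hap_a = alt_haplotypes(gt_a)
    hap_b = alt_haplotypes(gt_b)
    both_het = len(hap_a) == 1 and len(hap_b) == 1
    # The two variants sit on different alleles only if both are het.
    # and the haplotype indices differ
    on_different_alleles = both_het and hap_a != hap_b
    if affected:
        return on_different_alleles
    # Healthy: may carry one or both variants, but not on different alleles
    return not on_different_alleles


print(compound_pair_ok("0|1", "1|0", affected=True))
print(compound_pair_ok("0|1", "0|1", affected=False))
```

A pair would follow the model only if this check passes for every individual in the family.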

X-Linked Dominant
~~~~~~~~~~~~~~~~~

These traits are inherited on the X chromosome, of which men have one
copy and women have two.

-  Variant has to be on chromosome X
-  Affected individuals have to be het. or hom. alt.
-  Healthy individuals cannot carry the variant
-  If sex is male the variant is considered *de novo* if mother is
   genotyped and does not carry the variant
-  If sex is female variant is considered *de novo* if none of the
   parents carry the variant
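
A sketch of the X-linked dominant conditions, including the sex-dependent *de novo* rule. The names are hypothetical, carrier status is ``True``, ``False`` or ``None`` (not genotyped), and requiring both parents to be genotyped in the female case is an assumption of this example.

```python
def follows_x_dominant(chrom, family):
    """family: list of (genotype_class, affected) tuples."""
    if chrom not in ("X", "chrX"):
        return False   # variant has to be on chromosome X
    for genotype_class, affected in family:
        carries = genotype_class in ("het", "hom_alt")
        # Affected individuals must carry the variant, healthy must not
        if affected != carries:
            return False
    return True


def x_dominant_de_novo(sex, mother_carries, father_carries):
    """sex is 'male' or 'female'; carrier status is True, False or None."""
    if sex == "male":
        # Mother is genotyped and does not carry the variant
        return mother_carries is False
    # Female: none of the parents carry the variant
    return mother_carries is False and father_carries is False


print(follows_x_dominant("X", [("het", True), ("hom_ref", False)]))
print(x_dominant_de_novo("male", False, None))
```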

X-Linked Recessive
~~~~~~~~~~~~~~~~~~

-  Variant has to be on chromosome X
-  Affected individuals have to be het. or hom. alt.
-  Healthy individuals cannot be hom. alt.
-  If sex is male the variant is considered *de novo* if mother is
   genotyped and does not carry the variant
-  If sex is female the variant is considered *de novo* unless both
   parents carry the variant
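
The recessive case differs from the dominant one in that healthy individuals may be carriers and in the *de novo* rule for females. A sketch with hypothetical names follows. Carrier status is ``True``, ``False`` or ``None`` (not genotyped), and treating an ungenotyped parent as "not a known carrier" in the female case is an assumption of this example.

```python
def follows_x_recessive(chrom, family):
    """family: list of (genotype_class, affected) tuples."""
    if chrom not in ("X", "chrX"):
        return False   # variant has to be on chromosome X
    for genotype_class, affected in family:
        if affected and genotype_class not in ("het", "hom_alt"):
            return False   # affected have to be het. or hom. alt.
        if not affected and genotype_class == "hom_alt":
            return False   # healthy cannot be hom. alt.
    return True


def x_recessive_de_novo(sex, mother_carries, father_carries):
    """sex is 'male' or 'female'; carrier status is True, False or None."""
    if sex == "male":
        # Mother is genotyped and does not carry the variant
        return mother_carries is False
    # Female: de novo unless both parents are known to carry the variant
    return not (mother_carries is True and father_carries is True)


print(follows_x_recessive("X", [("hom_alt", True), ("het", False)]))
print(x_recessive_de_novo("female", True, False))
```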

.. raw:: html

   <!-- ## Detailed Structure ##

   Here all attributes and methods of the classes will be shown:

   ### Genotype ###

   Store the genotype information of a variant that is specific for an individual

   **Attributes:**

   * genotype STRING
   * allele_1 STRING
   * allele_2 STRING
   * nocall BOOL
   * heterozygote BOOL
   * homo_alt BOOL
   * homo_ref BOOL
   * has_variant BOOL
   * filter STRING
   * ref_depth INT
   * alt_depth INT
   * phred_likelihoods TUPLE with INT
   * depth_of_coverage INT
   * genotype_quality FLOAT


   ### Variant ###

   Holds the info of a variant and its specific behaviour in a family.

   **Attributes**

   * chr STRING Have to be string since X, Y
   * start INT
   * stop INT
   * ref STRING Reference nucleotide(s)
   * alt STRING Alternative sequence
   * identity STRING dbSNP-id
   * var_info DICT A dictionary with all the info from the variant file
   * qual STRING A value for the score of the base call
   * filter STRING The filter status
   * genotypes LIST A list with the genotypes found for this variant
   * gene STRING Semicolon separated string with Ensembl gene names
   * ad BOOL If following Autosomal Dominant pattern
   * ad_dn BOOL If following Autosomal Dominant De novo pattern
   * ar BOOL If following Autosomal Recessive pattern
   * ar_dn BOOL If following Autosomal Recessive De novo pattern
   * ar_comp BOOL If following Autosomal Recessive compound pattern
   * ar_comp_dn BOOL If following Autosomal Recessive Compound De Novo pattern

   **Methods**

   * get_variant(self):
       Returns a dictionary with basic info to stdout
   * print_model_info(self):
       Print for each variant which patterns of inheritance they follow.
   * print_vcf_variant(self):
       Print the variant in vcf-format to stdout
   * print_original_version(self, header_columns):
       Prints the variant in its original format.
   * check_noncomplete_call(self):
       Check if GATK has failed to report some info.
   * get_genotype(self):
       Returns the list with genotypes for this variant.


   ### Individual ###

   Holds the information about an individual and the individual specific genotypes.

   **Attributes**

   * ind STRING Can be any id unique within the family
   * family STRING Can be any unique id within the cohort
   * mother STRING The ind_id of the mother or [0,-9] if info is missing
   * father STRING ---------||------ father --------------||---------------
   * sex INT 1=male 2=female 0=unknown
   * phenotype INT 1=unaffected, 2=affected, missing = [0,-9]
   * genotypes DICT Container with genotype information on the form {<variant_id>: <Genotype>}
   * phasing BOOL If the genotype information includes phasing for this individual

   ### Family ###

   **Attributes**

   * individuals DICT dictionary with family members on the form {<ind_id>:<Individual>}
   * variants DICT dictionary with all the variants that exists in the family on the form {<var_id>:<Variant>} -->
 
File Type Py Version Uploaded on Size
genmod-0.9.6.tar.gz (md5) Source 2014-04-15 24KB