Genotype
-
class hail.representation.Genotype(gt, ad=None, dp=None, gq=None, pl=None)
An object that represents an individual’s genotype at a genomic locus.
Parameters:
- gt (int or None) – genotype hard call
- ad (list of int or None) – allelic depth (1 element per allele including reference)
- dp (int or None) – total depth
- gq (int or None) – genotype quality
- pl (list of int or None) – phred-scaled posterior genotype likelihoods (1 element per possible genotype)
Attributes
- ad – Returns the allelic depth.
- dp – Returns the total depth.
- gp – Returns the linear-scaled genotype probabilities.
- gq – Returns the phred-scaled genotype quality.
- gt – Returns the hard genotype call.
- pl – Returns the phred-scaled genotype posterior likelihoods.
Methods
- __init__ – Initialize a Genotype object.
- dosage – Returns the expected value of the genotype based on genotype probabilities, \(\mathrm{P}(\mathrm{Het}) + 2 \mathrm{P}(\mathrm{HomVar})\).
- fraction_reads_ref – Returns the fraction of reads that are reference reads.
- is_called – True if the genotype call is non-missing.
- is_called_non_ref – True if the genotype call contains any non-reference alleles.
- is_het – True if the genotype call contains two different alleles.
- is_het_non_ref – True if the genotype call contains two different alternate alleles.
- is_het_ref – True if the genotype call contains one reference and one alternate allele.
- is_hom_ref – True if the genotype call is 0/0.
- is_hom_var – True if the genotype call contains two identical alternate alleles.
- is_not_called – True if the genotype call is missing.
- num_alt_alleles – Returns the count of non-reference alleles.
- od – Returns the difference between the total depth and the allelic depth sum.
- one_hot_alleles – Returns a list containing the one-hot encoded representation of the called alleles.
- one_hot_genotype – Returns a list containing the one-hot encoded representation of the genotype call.
- p_ab – Returns the p-value associated with finding the given allele depth ratio.
-
ad
Returns the allelic depth.
Return type: list of int or None
-
dosage()
Returns the expected value of the genotype based on genotype probabilities, \(\mathrm{P}(\mathrm{Het}) + 2 \mathrm{P}(\mathrm{HomVar})\). Genotype must be bi-allelic.
Return type: float
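The formula above can be sketched in plain Python. This is an illustration only, not the library's implementation: dosage_from_gp is a hypothetical helper, and it assumes the probabilities are ordered [P(HomRef), P(Het), P(HomVar)].

```python
def dosage_from_gp(gp):
    """Expected alternate-allele count for a biallelic genotype.

    Assumption: gp is ordered [P(HomRef), P(Het), P(HomVar)].
    """
    if len(gp) != 3:
        raise ValueError("dosage is only defined for biallelic genotypes")
    _, p_het, p_hom_var = gp
    # P(Het) contributes one alternate allele, P(HomVar) contributes two.
    return p_het + 2 * p_hom_var


example = dosage_from_gp([0.5, 0.25, 0.25])
```

With these inputs the expected value is 0.25 + 2 × 0.25 = 0.75.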
-
dp
Returns the total depth.
Return type: int or None
-
fraction_reads_ref()
Returns the fraction of reads that are reference reads.
Equivalent to:
>>> g.ad[0] / sum(g.ad)
Return type: float or None
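A standalone sketch of the same calculation on a plain list of allelic depths may help when checking values by hand. The function below is a hypothetical stand-in, not the library method. Returning None when ad is missing follows the documented return type; returning None when no reads are present is an assumption made here to avoid dividing by zero. The float() call keeps the result fractional under Python 2, where dividing two ints truncates.

```python
def fraction_reads_ref(ad):
    """Fraction of reads supporting the reference allele (ad[0])."""
    if ad is None:
        return None
    total = sum(ad)
    if total == 0:
        # Assumption: no reads means the fraction is undefined.
        return None
    return ad[0] / float(total)


example = fraction_reads_ref([15, 5])
```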
-
gp
Returns the linear-scaled genotype probabilities.
Return type: list of float or None
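For intuition about how phred-scaled values relate to linear-scaled probabilities, here is a minimal sketch. It assumes the conventional Phred conversion 10^(-PL/10) followed by normalization so the values sum to 1; whether gp is computed in exactly this way is not stated on this page, and pl_to_gp is a hypothetical name.

```python
def pl_to_gp(pl):
    """Convert phred-scaled likelihoods to normalized linear probabilities."""
    if pl is None:
        return None
    # Phred scale: a value of 10 corresponds to a factor of 0.1.
    linear = [10 ** (-p / 10.0) for p in pl]
    total = sum(linear)
    return [x / total for x in linear]


gp = pl_to_gp([0, 10, 20])
```

Here the unnormalized values are 1, 0.1 and 0.01, so the first genotype receives 1/1.11 of the probability mass.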
-
gq
Returns the phred-scaled genotype quality.
Return type: int or None
-
gt
Returns the hard genotype call.
Return type: int or None
-
is_called_non_ref()
True if the genotype call contains any non-reference alleles.
Return type: bool
-
is_het_non_ref()
True if the genotype call contains two different alternate alleles.
Return type: bool
-
is_het_ref()
True if the genotype call contains one reference and one alternate allele.
Return type: bool
-
is_hom_var()
True if the genotype call contains two identical alternate alleles.
Return type: bool
-
num_alt_alleles()
Returns the count of non-reference alleles.
This function returns None if the genotype call is missing.
Return type: int or None
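The predicates above all depend on which pair of alleles the integer gt stands for. The sketch below assumes the standard VCF ordering of diploid genotypes, in which the pair j/k with j <= k has index k(k+1)/2 + j (so 0 is 0/0, 1 is 0/1, 2 is 1/1, 3 is 0/2, and so on). The function names mirror the methods for readability, but they are illustrative stand-ins that take gt as an argument; returning False for a missing call is an assumption.

```python
def allele_pair(gt):
    """Decode a genotype index into (j, k) with j <= k, assuming VCF ordering."""
    if gt is None:
        return None
    k = 0
    # Find the largest k whose block of indices starts at or before gt.
    while (k + 1) * (k + 2) // 2 <= gt:
        k += 1
    j = gt - k * (k + 1) // 2
    return (j, k)


def is_called_non_ref(gt):
    pair = allele_pair(gt)
    return pair is not None and pair[1] > 0


def is_het_ref(gt):
    pair = allele_pair(gt)
    return pair is not None and pair[0] == 0 and pair[1] > 0


def is_het_non_ref(gt):
    pair = allele_pair(gt)
    return pair is not None and pair[0] > 0 and pair[0] != pair[1]


def is_hom_var(gt):
    pair = allele_pair(gt)
    return pair is not None and pair[0] > 0 and pair[0] == pair[1]


def num_alt_alleles(gt):
    pair = allele_pair(gt)
    if pair is None:
        return None
    # Count each called allele that is not the reference (allele 0).
    return int(pair[0] > 0) + int(pair[1] > 0)
```

Under this ordering, gt = 4 decodes to the pair 1/2, which is heterozygous non-reference and carries two alternate alleles.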
-
od()
Returns the difference between the total depth and the allelic depth sum.
Equivalent to:
g.dp - sum(g.ad)
Return type: int or None
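A small worked example of this difference, written as a hypothetical standalone function on plain values. Returning None when either input is missing is an assumption suggested by the documented return type.

```python
def other_depth(dp, ad):
    """Reads counted in the total depth but not assigned to any allele."""
    if dp is None or ad is None:
        return None
    return dp - sum(ad)


# 30 reads in total, 14 reference and 15 alternate: 1 read is unassigned.
example = other_depth(30, [14, 15])
```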
-
one_hot_alleles(num_alleles)
Returns a list containing the one-hot encoded representation of the called alleles.
This one-hot representation is the positional sum of the one-hot encoding for each called allele. For a biallelic variant, the one-hot encoding for a reference allele is [1, 0] and the one-hot encoding for an alternate allele is [0, 1]. Thus, with the following variables:
num_alleles = 2
hom_ref = Genotype(0)
het = Genotype(1)
hom_var = Genotype(2)
All of the following statements are true:
hom_ref.one_hot_alleles(num_alleles) == [2, 0]
het.one_hot_alleles(num_alleles) == [1, 1]
hom_var.one_hot_alleles(num_alleles) == [0, 2]
This function returns None if the genotype call is missing.
Parameters: num_alleles (int) – number of possible alleles, including the reference allele
Return type: list of int or None
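The positional sum described above can be sketched without the library. The function below is a hypothetical stand-in that takes the integer call directly; it assumes the standard VCF genotype ordering (pair j/k with j <= k has index k(k+1)/2 + j) to recover the two called alleles.

```python
def one_hot_alleles(gt, num_alleles):
    """Count how many times each allele appears in a diploid call."""
    if gt is None:
        return None
    # Decode the genotype index into its allele pair (j, k), j <= k.
    k = 0
    while (k + 1) * (k + 2) // 2 <= gt:
        k += 1
    j = gt - k * (k + 1) // 2
    # Sum the one-hot vectors of the two called alleles.
    counts = [0] * num_alleles
    counts[j] += 1
    counts[k] += 1
    return counts


triallelic_het = one_hot_alleles(4, 3)
```

With three alleles, genotype index 4 is the pair 1/2 under this ordering, so the result has one count for each of the two alternate alleles and none for the reference.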
-
one_hot_genotype(num_genotypes)
Returns a list containing the one-hot encoded representation of the genotype call.
A one-hot encoding is a vector with one ‘1’ and many ‘0’ values, like [0, 0, 1, 0] or [1, 0, 0, 0]. This function is useful for transforming the genotype call (gt) into a one-hot encoded array. With the following variables:
num_genotypes = 3
hom_ref = Genotype(0)
het = Genotype(1)
hom_var = Genotype(2)
All of the following statements are true:
hom_ref.one_hot_genotype(num_genotypes) == [1, 0, 0]
het.one_hot_genotype(num_genotypes) == [0, 1, 0]
hom_var.one_hot_genotype(num_genotypes) == [0, 0, 1]
This function returns None if the genotype call is missing.
Parameters: num_genotypes (int) – number of possible genotypes
Return type: list of int or None
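As a rough illustration, the encoding amounts to placing a single 1 at the position given by the genotype call. The functions below are hypothetical stand-ins that work on the integer call directly. The helper for num_genotypes uses the fact that a diploid site with n alleles has n(n+1)/2 unordered genotypes.

```python
def num_genotypes_for(num_alleles):
    """Number of unordered diploid genotypes for a given allele count."""
    return num_alleles * (num_alleles + 1) // 2


def one_hot_genotype(gt, num_genotypes):
    """Vector of zeros with a single 1 at index gt."""
    if gt is None:
        return None
    encoding = [0] * num_genotypes
    encoding[gt] = 1
    return encoding


biallelic = num_genotypes_for(2)
het = one_hot_genotype(1, biallelic)
```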
-
p_ab(theta=0.5)
Returns the p-value associated with finding the given allele depth ratio.
This function uses a one-tailed binomial test.
This function returns None if the allelic depth (ad) is missing.
Parameters: theta (float) – null reference probability for binomial model
Return type: float or None
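To make the idea of a one-tailed binomial test on allelic depths concrete, here is a sketch under stated assumptions. It assumes a biallelic site, treats theta as the null probability that a read carries the reference allele, and sums the upper tail over reference read counts. Which allele and which tail the method actually tests is not stated on this page, so treat the names and choices below as hypothetical.

```python
from math import factorial


def binom_pmf(k, n, theta):
    """Probability of exactly k successes in n trials with success rate theta."""
    n_choose_k = factorial(n) // (factorial(k) * factorial(n - k))
    return n_choose_k * theta ** k * (1 - theta) ** (n - k)


def one_tailed_p_value(ad, theta=0.5):
    """P(X >= observed reference reads) under Binomial(ref + alt, theta)."""
    if ad is None:
        return None
    ref, alt = ad[0], ad[1]
    n = ref + alt
    # Upper tail: sum the probabilities of ref, ref + 1, ..., n reference reads.
    return sum(binom_pmf(k, n, theta) for k in range(ref, n + 1))


balanced = one_tailed_p_value([5, 5])
skewed = one_tailed_p_value([10, 0])
```

A balanced call such as 5 reference and 5 alternate reads gives a large p-value (638/1024), while 10 reference reads out of 10 gives 1/1024 under theta = 0.5.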
-
pl
Returns the phred-scaled genotype posterior likelihoods.
Return type: list of int or None