Types¶
Aggregable¶
An Aggregable is a Hail data type representing a distributed row or column of a matrix. Hail exposes a number of methods to compute on aggregables depending on the data type.
Aggregable[Array[Double]]¶
- sum(): Array[Double] – Compute the sum by index. All elements in the aggregable must have the same length.
Aggregable[Array[Float]]¶
- sum(): Array[Float] – Compute the sum by index. All elements in the aggregable must have the same length.
Aggregable[Array[Int]]¶
sum(): Array[Int]
Compute the sum by index. All elements in the aggregable must have the same length.
Examples
Count the total number of occurrences of each allele across samples, per variant:
>>> vds_result = vds.annotate_variants_expr('va.AC = gs.map(g => g.oneHotAlleles(v)).sum()')
Aggregable[Array[Long]]¶
- sum(): Array[Long] – Compute the sum by index. All elements in the aggregable must have the same length.
Aggregable[Double]¶
hist(start: Double, end: Double, bins: Int): Struct{binEdges:Array[Double],binFrequencies:Array[Long],nLess:Long,nGreater:Long}
- binEdges (Array[Double]) – Array of bin cutoffs
- binFrequencies (Array[Long]) – Number of elements that fall in each bin.
- nLess (Long) – Number of elements less than the minimum bin
- nGreater (Long) – Number of elements greater than the maximum bin
Compute frequency distributions of numeric parameters.
Examples
Compute GQ-distributions per variant:
>>> vds_result = vds.annotate_variants_expr('va.gqHist = gs.map(g => g.gq).hist(0, 100, 20)')
Compute global GQ-distribution:
>>> gq_hist = vds.query_genotypes('gs.map(g => g.gq).hist(0, 100, 100)')
Notes
- The start, end, and bins params are no-scope parameters, which means that while computations like 100 / 4 are acceptable, variable references like global.nBins are not.
- Bin size is calculated from (end - start) / bins
- (bins + 1) breakpoints are generated from the range (start to end by binsize)
- Each bin is left-inclusive, right-exclusive except the last bin, which includes the maximum value. This means that if there are N total bins, there will be N + 1 elements in binEdges. For the invocation hist(0, 3, 3), binEdges would be [0, 1, 2, 3] where the bins are [0, 1), [1, 2), [2, 3].
Arguments
- start (Double) – Starting point of first bin
- end (Double) – End point of last bin
- bins (Int) – Number of bins to create.
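The binning rules in the notes above can be modelled in a few lines of plain Python. This is only a sketch of the documented behaviour, not Hail's implementation: the function name `hist` and the dictionary return value are illustrative, and skipping missing (`None`) values is an assumption.

```python
def hist(values, start, end, bins):
    """Plain-Python model of the documented binning rules (not Hail's code)."""
    bin_size = (end - start) / bins
    # bins + 1 breakpoints from start to end in steps of bin_size
    bin_edges = [start + i * bin_size for i in range(bins + 1)]
    frequencies = [0] * bins
    n_less = n_greater = 0
    for x in values:
        if x is None:  # assumption: missing values are ignored
            continue
        if x < start:
            n_less += 1
        elif x > end:
            n_greater += 1
        else:
            # left-inclusive, right-exclusive bins; the last bin also
            # includes the maximum value, hence the clamp to bins - 1
            index = min(int((x - start) // bin_size), bins - 1)
            frequencies[index] += 1
    return {"binEdges": bin_edges, "binFrequencies": frequencies,
            "nLess": n_less, "nGreater": n_greater}


result = hist([0, 0.5, 1, 2.9, 3, -1, 7], 0, 3, 3)
print(result)
```

Here the value 3 lands in the last bin [2, 3], while -1 and 7 are counted in nLess and nGreater respectively.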
max(): Double – Compute the maximum of all non-missing elements. The empty max is missing.
min(): Double – Compute the minimum of all non-missing elements. The empty min is missing.
product(): Double – Compute the product of all non-missing elements. The empty product is one.
stats(): Struct{mean:Double,stdev:Double,min:Double,max:Double,nNotMissing:Long,sum:Double}
- mean (Double) – Mean value
- stdev (Double) – Standard deviation
- min (Double) – Minimum value
- max (Double) – Maximum value
- nNotMissing (Long) – Number of non-missing values
- sum (Double) – Sum of all elements
Compute summary statistics about a numeric aggregable.
Examples
Compute the mean genotype quality score per variant:
>>> vds_result = vds.annotate_variants_expr('va.gqMean = gs.map(g => g.gq).stats().mean')
Compute summary statistics on the number of singleton calls per sample:
>>> [singleton_stats] = (vds.sample_qc()
...     .query_samples(['samples.map(s => sa.qc.nSingleton).stats()']))
Compute GQ and DP statistics stratified by genotype call:
>>> gq_dp = [
...     'va.homrefGQ = gs.filter(g => g.isHomRef()).map(g => g.gq).stats()',
...     'va.hetGQ = gs.filter(g => g.isHet()).map(g => g.gq).stats()',
...     'va.homvarGQ = gs.filter(g => g.isHomVar()).map(g => g.gq).stats()',
...     'va.homrefDP = gs.filter(g => g.isHomRef()).map(g => g.dp).stats()',
...     'va.hetDP = gs.filter(g => g.isHet()).map(g => g.dp).stats()',
...     'va.homvarDP = gs.filter(g => g.isHomVar()).map(g => g.dp).stats()']
>>> vds_result = vds.annotate_variants_expr(gq_dp)
Notes
The stats() aggregator can be used to replicate some of the values computed by variant_qc() and sample_qc() such as dpMean and dpStDev.
sum(): Double – Compute the sum of all non-missing elements. The empty sum is zero.
Aggregable[Float]¶
- max(): Float – Compute the maximum of all non-missing elements. The empty max is missing.
- min(): Float – Compute the minimum of all non-missing elements. The empty min is missing.
- sum(): Float – Compute the sum of all non-missing elements. The empty sum is zero.
Aggregable[Genotype]¶
callStats(f: Genotype => Variant): Struct{AC:Array[Int],AF:Array[Double],AN:Int,GC:Array[Int]}
- AC (Array[Int]) – Allele count. One element per allele including reference. There are two elements for a biallelic variant, or 4 for a variant with three alternate alleles.
- AF (Array[Double]) – Allele frequency. One element per allele including reference. Sums to 1.
- AN (Int) – Allele number. This is equal to the sum of AC, or 2 * the total number of called genotypes in the aggregable.
- GC (Array[Int]) – Genotype count. One element per possible genotype, including reference genotypes – 3 for biallelic, 6 for triallelic, 10 for 3 alt alleles, and so on. The sum of this array is the number of called genotypes in the aggregable.
Compute four commonly-used metrics over a set of genotypes in a variant.
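To make the four fields concrete, the following plain-Python sketch computes them from a list of calls. It is an illustration under stated assumptions, not Hail's implementation: `call_stats` is a hypothetical name, calls are given as (j, k) allele-index pairs with j <= k, the GC array is assumed to be ordered by the index k*(k+1)/2 + j documented for Call.gt, and returning None for AF when nothing is called is a guess.

```python
def call_stats(calls, n_alleles):
    """calls: list of (j, k) allele-index pairs, or None for an uncalled genotype."""
    ac = [0] * n_alleles                              # one entry per allele, incl. reference
    gc = [0] * (n_alleles * (n_alleles + 1) // 2)     # one entry per possible genotype
    for call in calls:
        if call is None:                              # uncalled genotypes contribute nothing
            continue
        j, k = call
        ac[j] += 1
        ac[k] += 1
        gc[k * (k + 1) // 2 + j] += 1                 # assumed genotype ordering
    an = sum(ac)                                      # equals 2 * number of called genotypes
    af = [count / an for count in ac] if an else None
    return {"AC": ac, "AF": af, "AN": an, "GC": gc}


stats = call_stats([(0, 0), (0, 1), (0, 1), (1, 1), None], n_alleles=2)
print(stats)
```

With four called biallelic genotypes, AN is 8, AC sums to AN, AF sums to 1, and GC sums to the number of called genotypes.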
Examples
Compute phenotype-specific call statistics:
>>> pheno_stats = [
...     'va.case_stats = gs.filter(g => sa.pheno.isCase).callStats(g => v)',
...     'va.control_stats = gs.filter(g => !sa.pheno.isCase).callStats(g => v)']
>>> vds_result = vds.annotate_variants_expr(pheno_stats)
va.case_stats.AC will be the allele count (AC) computed from individuals marked as cases.
Arguments
- f (Genotype => Variant) – Variant lambda expression such as g => v.
hardyWeinberg(): Struct{rExpectedHetFrequency:Double,pHWE:Double}
- rExpectedHetFrequency (Double) – Expected rHeterozygosity based on Hardy Weinberg Equilibrium
- pHWE (Double) – p-value
Compute Hardy-Weinberg equilibrium p-value.
Examples
Add a new variant annotation that calculates HWE p-value by phenotype:
>>> vds_result = vds.annotate_variants_expr([
...     'va.hweCase = gs.filter(g => sa.pheno.isCase).hardyWeinberg()',
...     'va.hweControl = gs.filter(g => !sa.pheno.isCase).hardyWeinberg()'])
Notes
Hail computes the exact p-value with mid-p-value correction, i.e. the probability of a less-likely outcome plus one-half the probability of an equally-likely outcome. See this document for details on the Levene-Haldane distribution and references.
inbreeding(af: Genotype => Double): Struct{Fstat:Double,nTotal:Long,nCalled:Long,expectedHoms:Double,observedHoms:Long}
- Fstat (Double) – Inbreeding coefficient
- nTotal (Long) – Number of genotypes analyzed
- nCalled (Long) – number of genotypes with non-missing calls
- expectedHoms (Double) – Expected number of homozygote calls
- observedHoms (Long) – Total number of homozygote calls observed
Compute inbreeding metric. This aggregator is equivalent to the `--het` method in PLINK.
Examples
Calculate the inbreeding metric per sample:
>>> vds_result = (vds.variant_qc()
...     .annotate_samples_expr('sa.inbreeding = gs.inbreeding(g => va.qc.AF)'))
To obtain the same answer as PLINK, use the following series of commands:
>>> vds_result = (vds.variant_qc()
...     .filter_variants_expr('va.qc.AC > 1 && va.qc.AF >= 1e-8 && va.qc.nCalled * 2 - va.qc.AC > 1 && va.qc.AF <= 1 - 1e-8 && v.isAutosomal()')
...     .annotate_samples_expr('sa.inbreeding = gs.inbreeding(g => va.qc.AF)'))
Notes
The Inbreeding Coefficient (F) is computed as follows:
- For each variant and sample with a non-missing genotype call, E, the expected number of homozygotes (computed from user-defined expression for minor allele frequency), is computed as 1.0 - (2.0*maf*(1.0-maf))
- For each variant and sample with a non-missing genotype call, O, the observed number of homozygotes, is computed as 0 = heterozygote; 1 = homozygote
- For each variant and sample with a non-missing genotype call, N is incremented by 1
- For each sample, E, O, and N are combined across variants
- F is calculated by (O - E) / (N - E)
Arguments
- af (Genotype => Double) – Lambda expression for the alternate allele frequency.
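The steps in the notes can be traced with a small plain-Python sketch for a single sample. It is a model of the description above, not Hail's code: the function name, the (n_alt, af) input format for biallelic calls, and returning None for Fstat when N equals E are all assumptions made for illustration.

```python
def inbreeding(calls):
    """calls: one (n_alt, af) pair per variant for a single sample.

    n_alt is the number of non-reference alleles (0, 1 or 2) or None when
    the genotype is missing; af is the allele frequency for that variant.
    """
    n_total = n_called = observed = 0
    expected = 0.0
    for n_alt, af in calls:
        n_total += 1
        if n_alt is None:            # only non-missing calls contribute to E, O and N
            continue
        n_called += 1
        expected += 1.0 - 2.0 * af * (1.0 - af)   # E
        observed += 0 if n_alt == 1 else 1        # O: 0 = heterozygote, 1 = homozygote
    denominator = n_called - expected             # N - E
    f_stat = (observed - expected) / denominator if denominator != 0 else None
    return {"Fstat": f_stat, "nTotal": n_total, "nCalled": n_called,
            "expectedHoms": expected, "observedHoms": observed}


result = inbreeding([(0, 0.5), (1, 0.5), (2, 0.5), (None, 0.5)])
print(result)
```

For three called variants at frequency 0.5, E is 1.5 and O is 2, so F = (2 - 1.5) / (3 - 1.5) = 1/3.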
infoScore(): Struct{score:Double,nIncluded:Int}
- score (Double) – IMPUTE info score
- nIncluded (Int) – Number of samples with non-missing genotype probability distribution
Compute the IMPUTE information score.
Examples
Calculate the info score per variant:
>>> (hc.import_gen("data/example.gen", "data/example.sample")
...     .annotate_variants_expr('va.infoScore = gs.infoScore()'))
Calculate group-specific info scores per variant:
>>> vds_result = (hc.import_gen("data/example.gen", "data/example.sample")
...     .annotate_samples_expr("sa.isCase = pcoin(0.5)")
...     .annotate_variants_expr(["va.infoScore.case = gs.filter(g => sa.isCase).infoScore()",
...                              "va.infoScore.control = gs.filter(g => !sa.isCase).infoScore()"]))
Notes
We implemented the IMPUTE info measure as described in the supplementary information from Marchini & Howie. Genotype imputation for genome-wide association studies. Nature Reviews Genetics (2010).
To calculate the info score \(I_{A}\) for one SNP:
\[\begin{split}I_{A} = \begin{cases} 1 - \frac{\sum_{i=1}^{N}(f_{i} - e_{i}^2)}{2N\hat{\theta}(1 - \hat{\theta})} & \text{when } \hat{\theta} \in (0, 1) \\ 1 & \text{when } \hat{\theta} = 0, \hat{\theta} = 1\\ \end{cases}\end{split}\]
- \(N\) is the number of samples with imputed genotype probabilities [\(p_{ik} = P(G_{i} = k)\) where \(k \in \{0, 1, 2\}\)]
- \(e_{i} = p_{i1} + 2p_{i2}\) is the expected genotype per sample
- \(f_{i} = p_{i1} + 4p_{i2}\)
- \(\hat{\theta} = \frac{\sum_{i=1}^{N}e_{i}}{2N}\) is the MLE for the population minor allele frequency
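The formula and definitions above translate directly into plain Python. The sketch below is a transcription of the equation for one SNP, not Hail's implementation; the name `info_score`, the tuple input format, and returning a score of None when no sample has a probability distribution are assumptions for illustration.

```python
def info_score(probs):
    """probs: one (p0, p1, p2) tuple per sample, or None when missing.

    Returns (score, n_included) following the formula above.
    """
    kept = [p for p in probs if p is not None]
    n = len(kept)
    if n == 0:
        return None, 0                              # assumption: no data, no score
    e = [p1 + 2 * p2 for _, p1, p2 in kept]         # expected genotype per sample
    f = [p1 + 4 * p2 for _, p1, p2 in kept]
    theta = sum(e) / (2 * n)                        # estimated allele frequency
    if theta == 0 or theta == 1:
        return 1.0, n
    numerator = sum(fi - ei ** 2 for fi, ei in zip(f, e))
    return 1 - numerator / (2 * n * theta * (1 - theta)), n


certain = info_score([(1, 0, 0), (0, 1, 0), (0, 0, 1)])
uninformative = info_score([(0.25, 0.5, 0.25), (0.25, 0.5, 0.25), None])
print(certain, uninformative)
```

Genotypes known with certainty give a score of 1, while probabilities that merely restate Hardy-Weinberg proportions at frequency 0.5 give a score of 0.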
Hail will not generate results identical to QCTOOL for the following reasons:
- The floating point number Hail stores for each genotype probability is slightly different from the original data due to rounding and normalization of probabilities.
- Hail automatically removes genotype probability distributions that do not meet certain requirements on data import with import_gen() and import_bgen().
- Hail does not use the population frequency to impute genotype probabilities when a genotype probability distribution has been set to missing.
- Hail calculates the same statistic for sex chromosomes as autosomes while QCTOOL incorporates sex information.
Warning
- The info score Hail reports will be extremely different from QCTOOL when a SNP has a high missing rate.
- If the genotype data was not imported using the import_gen() or import_bgen() commands, then the results for all variants will be score = NA and nIncluded = 0.
- It only makes sense to compute the info score for an Aggregable[Genotype] per variant. While a per-sample info score will run, the result is meaningless.
Aggregable[Int]¶
- max(): Int – Compute the maximum of all non-missing elements. The empty max is missing.
- min(): Int – Compute the minimum of all non-missing elements. The empty min is missing.
- sum(): Int – Compute the sum of all non-missing elements. The empty sum is zero.
Aggregable[Long]¶
- max(): Long – Compute the maximum of all non-missing elements. The empty max is missing.
- min(): Long – Compute the minimum of all non-missing elements. The empty min is missing.
- product(): Long – Compute the product of all non-missing elements. The empty product is one.
- sum(): Long – Compute the sum of all non-missing elements. The empty sum is zero.
Aggregable[T]¶
collect(): Array[T]
Returns an array with all of the elements in the aggregable. Order is not guaranteed.
Examples
Collect the list of sample IDs with heterozygote genotype calls per variant:
>>> vds_result = vds.annotate_variants_expr('va.hetSamples = gs.filter(g => g.isHet()).map(g => s).collect()')
va.hetSamples will have the type Array[String].
collectAsSet(): Set[T] – Returns the set of all unique elements in the aggregable.
count(): Long
Counts the number of elements in an aggregable.
Examples
Count the number of heterozygote genotype calls in an aggregable of genotypes (gs):
>>> vds_result = vds.annotate_variants_expr('va.nHets = gs.filter(g => g.isHet()).count()')
counter(): Dict[T, Long]
Counts the number of occurrences of each element in an aggregable.
Examples
Compute the number of indels in each chromosome:
>>> [indels_per_chr] = vds.query_variants(['variants.filter(v => v.altAllele().isIndel()).map(v => v.contig).counter()'])
Notes
We recommend using this function together with Python's collections.Counter:
>>> [counter] = vds.query_variants(['variants.flatMap(v => v.altAlleles).counter()'])
>>> from collections import Counter
>>> counter = Counter(counter)
>>> print(counter.most_common(5))
[(AltAllele(C, T), 129L), (AltAllele(G, A), 112L), (AltAllele(C, A), 60L), (AltAllele(A, G), 46L), (AltAllele(T, C), 44L)]
exists(expr: T => Boolean): Boolean
Returns true if any element in the aggregable satisfies the condition given by expr and false otherwise.
Arguments
- expr (T => Boolean) – Lambda expression.
filter(f: T => Boolean): Aggregable[T]
Subsets an aggregable by evaluating f for each element and keeping those elements that evaluate to true.
Examples
Compute Hardy Weinberg Equilibrium for cases only:
>>> vds_result = vds.annotate_variants_expr("va.hweCase = gs.filter(g => sa.isCase).hardyWeinberg()")
Arguments
- f (T => Boolean) – Boolean lambda expression.
flatMap(f: T => Set[U]): Aggregable[U]
Returns a new aggregable by applying a function f to each element and concatenating the resulting sets.
Examples
Compute a list of genes per sample with loss of function variants (result does not have duplicate entries):
>>> vds_result = vds.annotate_samples_expr('sa.lof_genes = gs.filter(g => va.consequence == "LOF" && g.nNonRefAlleles() > 0).flatMap(g => va.genes.toSet()).collect()')
Arguments
- f (T => Set[U]) – Lambda expression.
flatMap(a: T => Array[U]): Aggregable[U]
Returns a new aggregable by applying a function f to each element and concatenating the resulting arrays.
Examples
Compute a list of genes per sample with loss of function variants (result may have duplicate entries):
>>> vds_result = vds.annotate_samples_expr('sa.lof_genes = gs.filter(g => va.consequence == "LOF" && g.nNonRefAlleles() > 0).flatMap(g => va.genes).collect()')
forall(expr: T => Boolean): Boolean
Returns true if all elements in the aggregable satisfy the condition given by expr and false otherwise.
Arguments
- expr (T => Boolean) – Lambda expression.
fraction(a: T => Boolean): Double
Computes the ratio of the number of occurrences for which a boolean condition evaluates to true, divided by the number of included elements in the aggregable.
Examples
Keep only variants with a call rate greater than 90%:
>>> vds_result = vds.filter_variants_expr('gs.fraction(g => g.isCalled()) > 0.90')
Compute the differential missingness at SNPs and indels:
>>> exprs = ['sa.SNPmissingness = gs.filter(g => v.altAllele().isSNP()).fraction(g => g.isNotCalled())',
...     'sa.indelmissingness = gs.filter(g => v.altAllele().isIndel()).fraction(g => g.isNotCalled())']
>>> vds_result = vds.annotate_samples_expr(exprs)
map(f: T => U): Aggregable[U]
Change the type of an aggregable by evaluating f for each element.
Examples
Convert an aggregable of genotypes (gs) to an aggregable of genotype quality scores and then compute summary statistics:
>>> vds_result = vds.annotate_variants_expr("va.gqStats = gs.map(g => g.gq).stats()")
Arguments
- f (T => U) – Lambda expression.
take(n: Int): Array[T]
Take the first n items of an aggregable.
Examples
Collect the first 5 sample IDs with at least one alternate allele per variant:
>>> vds_result = vds.annotate_variants_expr("va.nonRefSamples = gs.filter(g => g.nNonRefAlleles() > 0).map(g => s).take(5)")
Arguments
- n (Int) – Number of items to take.
takeBy(f: T => String, n: Int): Array[T]
Returns the first n items of an aggregable in ascending order, ordered by the result of f. NA always appears last. If the aggregable contains fewer than n items, then the result will contain as many elements as the aggregable contains.
Arguments
- f (T => String) – Lambda expression for mapping an aggregable to an ordered value.
- n (Int) – Number of items to take.
takeBy(f: T => Double, n: Int): Array[T]
Returns the first n items of an aggregable in ascending order, ordered by the result of f. NA always appears last. If the aggregable contains fewer than n items, then the result will contain as many elements as the aggregable contains.
Note that NaN always appears after any finite or infinite floating-point numbers but before NA. For example, consider an aggregable containing these elements:
Infinity, -1, 1, 0, -Infinity, NA, NaN
The expression gs.takeBy(x => x, 7) would return the array:
[-Infinity, -1, 0, 1, Infinity, NaN, NA]
The expression gs.takeBy(x => -x, 7) would return the array:
[Infinity, 1, 0, -1, -Infinity, NaN, NA]
Arguments
- f (T => Double) – Lambda expression for mapping an aggregable to an ordered value.
- n (Int) – Number of items to take.
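The ordering rule (numbers first, then NaN, then NA) can be reproduced in plain Python with a three-level sort key. This is a sketch of the documented ordering only; `take_by` is a hypothetical helper and `None` stands in for NA.

```python
import math


def take_by(items, key, n):
    """Return the first n items ordered by key: numbers, then NaN, then NA (None)."""
    def rank(item):
        k = key(item)
        if k is None:              # NA always sorts last
            return (2, 0.0)
        if math.isnan(k):          # NaN sorts after every number but before NA
            return (1, 0.0)
        return (0, k)
    return sorted(items, key=rank)[:n]


elements = [math.inf, -1, 1, 0, -math.inf, None, math.nan]
ascending = take_by(elements, lambda x: x, 7)
descending = take_by(elements, lambda x: None if x is None else -x, 7)
print(ascending)
print(descending)
```

Negating the key reverses the numeric order but leaves NaN and NA at the end, matching the two example arrays above.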
takeBy(f: T => Float, n: Int): Array[T]
Returns the first n items of an aggregable in ascending order, ordered by the result of f. NA always appears last. If the aggregable contains fewer than n items, then the result will contain as many elements as the aggregable contains.
Note that NaN always appears after any finite or infinite floating-point numbers but before NA. For example, consider an aggregable containing these elements:
Infinity, -1, 1, 0, -Infinity, NA, NaN
The expression gs.takeBy(x => x, 7) would return the array:
[-Infinity, -1, 0, 1, Infinity, NaN, NA]
The expression gs.takeBy(x => -x, 7) would return the array:
[Infinity, 1, 0, -1, -Infinity, NaN, NA]
Arguments
- f (T => Float) – Lambda expression for mapping an aggregable to an ordered value.
- n (Int) – Number of items to take.
takeBy(f: T => Long, n: Int): Array[T]
Returns the first n items of an aggregable in ascending order, ordered by the result of f. NA always appears last. If the aggregable contains fewer than n items, then the result will contain as many elements as the aggregable contains.
Examples
Consider an aggregable gs containing these elements:
7, 6, 3, NA, 1, 2, NA, 4, 5, -1
The expression gs.takeBy(x => x, 5) would return the array:
[-1, 1, 2, 3, 4]
The expression gs.takeBy(x => -x, 5) would return the array:
[7, 6, 5, 4, 3]
The expression gs.takeBy(x => x, 10) would return the array:
[-1, 1, 2, 3, 4, 5, 6, 7, NA, NA]
Return the 10 samples with the fewest singletons:
>>> samplesFewestSingletons = (vds
...     .sample_qc()
...     .query_samples('samples.takeBy(s => sa.qc.nSingleton, 10)'))
Arguments
- f (T => Long) – Lambda expression for mapping an aggregable to an ordered value.
- n (Int) – Number of items to take.
takeBy(f: T => Int, n: Int): Array[T]
Returns the first n items of an aggregable in ascending order, ordered by the result of f. NA always appears last. If the aggregable contains fewer than n items, then the result will contain as many elements as the aggregable contains.
Examples
Consider an aggregable gs containing these elements:
7, 6, 3, NA, 1, 2, NA, 4, 5, -1
The expression gs.takeBy(x => x, 5) would return the array:
[-1, 1, 2, 3, 4]
The expression gs.takeBy(x => -x, 5) would return the array:
[7, 6, 5, 4, 3]
The expression gs.takeBy(x => x, 10) would return the array:
[-1, 1, 2, 3, 4, 5, 6, 7, NA, NA]
Return the 10 samples with the fewest singletons:
>>> samplesFewestSingletons = (vds
...     .sample_qc()
...     .query_samples('samples.takeBy(s => sa.qc.nSingleton, 10)'))
Arguments
- f (T => Int) – Lambda expression for mapping an aggregable to an ordered value.
- n (Int) – Number of items to take.
AltAllele¶
An AltAllele is a Hail data type representing an alternate allele in the Variant Dataset.
- alt: String – Alternate allele base sequence.
- category(): String – the alt allele type, i.e. one of SNP, Insertion, Deletion, Star, MNP, Complex
- isComplex(): Boolean – True if not a SNP, MNP, star, insertion, or deletion.
- isDeletion(): Boolean – True if v.ref begins with and is longer than v.alt.
- isIndel(): Boolean – True if an insertion or a deletion.
- isInsertion(): Boolean – True if v.alt begins with and is longer than v.ref.
- isMNP(): Boolean – True if v.ref and v.alt are the same length and differ in more than one position.
- isSNP(): Boolean – True if v.ref and v.alt are the same length and differ in one position.
- isStar(): Boolean – True if v.alt is *.
- isTransition(): Boolean – True if a purine-purine or pyrimidine-pyrimidine SNP.
- isTransversion(): Boolean – True if a purine-pyrimidine SNP.
- ref: String – Reference allele base sequence.
Array¶
An Array is a collection of items that all have the same data type (ex: Int, String) and are indexed. Arrays can be constructed by specifying [item1, item2, ...] and they are 0-indexed.
An example of constructing an array and accessing an element is:
let a = [1, 10, 3, 7] in a[1]
result: 10
They can also be nested such as Array[Array[Int]]:
let a = [[1, 2, 3], [4, 5], [], [6, 7]] in a[1]
result: [4, 5]
Array[Array[T]]¶
flatten(): Array[T]
Flattens a nested array by concatenating all its rows into a single array.
let a = [[1, 3], [2, 4]] in a.flatten() result: [1, 3, 2, 4]
Array[Boolean]¶
sort(ascending: Boolean): Array[Boolean]
Sort the collection with the ordering specified by ascending.
Arguments
- ascending (Boolean) – If true, sort the collection in ascending order. Otherwise, sort in descending order.
sort(): Array[Boolean] – Sort the collection in ascending order.
Array[Double]¶
max(): Double – Largest element in the collection.
mean(): Double – Mean value of the collection.
median(): Double – Median value of the collection.
min(): Double – Smallest element in the collection.
product(): Double – Product of all elements in the collection (returns 1 if empty).
sort(ascending: Boolean): Array[Double]
Sort the collection with the ordering specified by ascending.
Arguments
- ascending (Boolean) – If true, sort the collection in ascending order. Otherwise, sort in descending order.
sort(): Array[Double] – Sort the collection in ascending order.
sum(): Double – Sum of all elements in the collection.
Array[Float]¶
max(): Float – Largest element in the collection.
mean(): Double – Mean value of the collection.
median(): Float – Median value of the collection.
min(): Float – Smallest element in the collection.
product(): Float – Product of all elements in the collection (returns 1 if empty).
sort(ascending: Boolean): Array[Float]
Sort the collection with the ordering specified by ascending.
Arguments
- ascending (Boolean) – If true, sort the collection in ascending order. Otherwise, sort in descending order.
sort(): Array[Float] – Sort the collection in ascending order.
sum(): Float – Sum of all elements in the collection.
Array[Int]¶
max(): Int – Largest element in the collection.
mean(): Double – Mean value of the collection.
median(): Int – Median value of the collection.
min(): Int – Smallest element in the collection.
product(): Int – Product of all elements in the collection (returns 1 if empty).
sort(ascending: Boolean): Array[Int]
Sort the collection with the ordering specified by ascending.
Arguments
- ascending (Boolean) – If true, sort the collection in ascending order. Otherwise, sort in descending order.
sort(): Array[Int] – Sort the collection in ascending order.
sum(): Int – Sum of all elements in the collection.
Array[Long]¶
max(): Long – Largest element in the collection.
mean(): Double – Mean value of the collection.
median(): Long – Median value of the collection.
min(): Long – Smallest element in the collection.
product(): Long – Product of all elements in the collection (returns 1 if empty).
sort(ascending: Boolean): Array[Long]
Sort the collection with the ordering specified by ascending.
Arguments
- ascending (Boolean) – If true, sort the collection in ascending order. Otherwise, sort in descending order.
sort(): Array[Long] – Sort the collection in ascending order.
sum(): Long – Sum of all elements in the collection.
Array[String]¶
mkString(delimiter: String): String
Concatenates all elements of this array into a single string where each element is separated by the delimiter.
let a = ["a", "b", "c"] in a.mkString("::") result: "a::b::c"
Arguments
- delimiter (String) – String that separates each element.
sort(ascending: Boolean): Array[String]
Sort the collection with the ordering specified by ascending.
Arguments
- ascending (Boolean) – If true, sort the collection in ascending order. Otherwise, sort in descending order.
sort(): Array[String] – Sort the collection in ascending order.
Array[T]¶
[:j]: Array[T]
Returns a slice of the array from the first element until the j-th element (0-indexed). Negative indices are interpreted as offsets from the end of the array.
let a = [0, 2, 4, 6, 8, 10] in a[:4] result: [0, 2, 4, 6]
let a = [0, 2, 4, 6, 8, 10] in a[:-4] result: [0, 2]
Arguments
- j (Int) – End index of the slice (not included in result).
[:]: Array[T]
Returns a copy of the array.
let a = [0, 2, 4, 6] in a[:] result: [0, 2, 4, 6]
[i:j]: Array[T]
Returns a slice of the array from the i-th element until the j-th element (both 0-indexed). Negative indices are interpreted as offsets from the end of the array.
let a = [0, 2, 4, 6, 8, 10] in a[2:4] result: [4, 6]
let a = [0, 2, 4, 6, 8, 10] in a[-3:-1] result: [6, 8]
A handy way to understand the behavior of negative indices is to recall this rule:
s[-i:-j] == s[s.length - i : s.length - j]
Arguments
- i (Int) – Starting index of the slice.
- j (Int) – End index of the slice (not included in result).
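Python list slicing gives the same results as the slice examples in this section, so it can serve as a quick stand-in for checking the negative-index rule. The snippet below is ordinary Python, not the Hail expression language, and `len(a)` plays the role of `s.length`.

```python
a = [0, 2, 4, 6, 8, 10]

# The slice examples from this section, evaluated with Python lists.
prefix = a[:4]            # first element up to (not including) index 4
prefix_negative = a[:-4]  # up to 4 elements before the end
middle = a[2:4]
middle_negative = a[-3:-1]
suffix = a[3:]
suffix_negative = a[-5:]

# Negative indices are offsets from the end: s[-i:-j] == s[len - i : len - j]
rule_holds = a[-3:-1] == a[len(a) - 3:len(a) - 1]

print(prefix, prefix_negative, middle, middle_negative, suffix, suffix_negative, rule_holds)
```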
[i:]: Array[T]
Returns a slice of the array from the i-th element (0-indexed) to the end. Negative indices are interpreted as offsets from the end of the array.
let a = [0, 2, 4, 6, 8, 10] in a[3:] result: [6, 8, 10]
let a = [0, 2, 4, 6, 8, 10] in a[-5:] result: [2, 4, 6, 8, 10]
Arguments
- i (Int) – Starting index of the slice.
[i]: T
Returns the i-th element (0-indexed) of the array, or throws an exception if i is an invalid index.
let a = [0, 2, 4, 6, 8, 10] in a[2] result: 4
Arguments
- i (Int) – Index of the element to return.
append(a: T): Array[T] – Returns the result of adding the element a to the end of this Array.
exists(expr: T => Boolean): Boolean
Returns a boolean which is true if any element in the array satisfies the condition given by expr and false otherwise.
let a = [1, 2, 3, 4, 5, 6] in a.exists(e => e > 4) result: true
Arguments
- expr (T => Boolean) – Lambda expression.
extend(a: Array[T]): Array[T] – Returns the concatenation of this Array followed by Array a.
filter(expr: T => Boolean): Array[T]
Returns a new array subsetted to the elements where expr evaluates to true.
let a = [1, 4, 5, 6, 10] in a.filter(e => e % 2 == 0) result: [4, 6, 10]
Arguments
- expr (T => Boolean) – Lambda expression.
find(expr: T => Boolean): T
Returns the first non-missing element of the array for which expr is true. If no element satisfies the predicate, find returns NA.
let a = ["cat", "dog", "rabbit"] in a.find(e => 'bb' ~ e) result: "rabbit"
Arguments
- expr (T => Boolean) – Lambda expression.
flatMap(expr: T => Array[U]): Array[U]
Returns a new array by applying a function to each subarray and concatenating the resulting arrays.
let a = [[1, 2, 3], [4, 5], [6]] in a.flatMap(e => e + 1) result: [2, 3, 4, 5, 6, 7]
Arguments
- expr (T => Array[U]) – Lambda expression.
forall(expr: T => Boolean): Boolean
Returns a boolean which is true if all elements in the array satisfy the condition given by expr and false otherwise.
let a = [0, 2, 4, 6, 8, 10] in a.forall(e => e % 2 == 0) result: true
Arguments
- expr (T => Boolean) – Lambda expression.
groupBy(a: T => U): Dict[U, Array[T]]
head(): T – Selects the first element.
isEmpty(): Boolean – Returns true if the number of elements is equal to 0. false otherwise.
length(): Int – Number of elements in the collection.
map(expr: T => U): Array[U]
Returns a new array produced by applying expr to each element.
let a = [0, 1, 2, 3] in a.map(e => pow(2, e)) result: [1, 2, 4, 8]
Arguments
- expr (T => U) – Lambda expression.
size(): Int – Number of elements in the collection.
sortBy(f: T => String, ascending: Boolean): Array[T]
Sort the collection with the ordering specified by ascending after evaluating f for each element.
Arguments
- f (T => String) – Lambda expression.
- ascending (Boolean) – If true, sort the collection in ascending order. Otherwise, sort in descending order.
sortBy(f: T => String): Array[T]
Sort the collection in ascending order after evaluating f for each element.
Arguments
- f (T => String) – Lambda expression.
sortBy(f: T => Double, ascending: Boolean): Array[T]
Sort the collection with the ordering specified by ascending after evaluating f for each element.
Arguments
- f (T => Double) – Lambda expression.
- ascending (Boolean) – If true, sort the collection in ascending order. Otherwise, sort in descending order.
sortBy(f: T => Double): Array[T]
Sort the collection in ascending order after evaluating f for each element.
Arguments
- f (T => Double) – Lambda expression.
sortBy(f: T => Float, ascending: Boolean): Array[T]
Sort the collection with the ordering specified by ascending after evaluating f for each element.
Arguments
- f (T => Float) – Lambda expression.
- ascending (Boolean) – If true, sort the collection in ascending order. Otherwise, sort in descending order.
sortBy(f: T => Float): Array[T]
Sort the collection in ascending order after evaluating f for each element.
Arguments
- f (T => Float) – Lambda expression.
sortBy(f: T => Long, ascending: Boolean): Array[T]
Sort the collection with the ordering specified by ascending after evaluating f for each element.
Arguments
- f (T => Long) – Lambda expression.
- ascending (Boolean) – If true, sort the collection in ascending order. Otherwise, sort in descending order.
sortBy(f: T => Long): Array[T]
Sort the collection in ascending order after evaluating f for each element.
Arguments
- f (T => Long) – Lambda expression.
sortBy(f: T => Int, ascending: Boolean): Array[T]
Sort the collection with the ordering specified by ascending after evaluating f for each element.
Arguments
- f (T => Int) – Lambda expression.
- ascending (Boolean) – If true, sort the collection in ascending order. Otherwise, sort in descending order.
sortBy(f: T => Int): Array[T]
Sort the collection in ascending order after evaluating f for each element.
Arguments
- f (T => Int) – Lambda expression.
sortBy(f: T => Boolean, ascending: Boolean): Array[T]
Sort the collection with the ordering specified by ascending after evaluating f for each element.
Arguments
- f (T => Boolean) – Lambda expression.
- ascending (Boolean) – If true, sort the collection in ascending order. Otherwise, sort in descending order.
sortBy(f: T => Boolean): Array[T]
Sort the collection in ascending order after evaluating f for each element.
Arguments
- f (T => Boolean) – Lambda expression.
tail(): Array[T] – Selects all elements except the first.
toArray(): Array[T] – Convert collection to an Array.
toSet(): Set[T] – Convert collection to a Set.
Boolean¶
- max(a: Boolean): Boolean – Returns the maximum value.
- min(a: Boolean): Boolean – Returns the minimum value.
- toDouble(): Double – Convert value to a Double. Returns 1.0 if true, else 0.0.
- toFloat(): Float – Convert value to a Float. Returns 1.0 if true, else 0.0.
- toInt(): Int – Convert value to an Integer. Returns 1 if true, else 0.
- toLong(): Long – Convert value to a Long. Returns 1L if true, else 0L.
Call¶
A Call is a Hail data type representing a genotype call (ex: 0/0) in the Variant Dataset.
gt: Int – the integer gt = k*(k+1)/2 + j for call j/k (0 = 0/0, 1 = 0/1, 2 = 1/1, 3 = 0/2, etc.).
gtj(): Int – the index of allele j for call j/k (0 = ref, 1 = first alt allele, etc.).
gtk(): Int – the index of allele k for call j/k (0 = ref, 1 = first alt allele, etc.).
isCalled(): Boolean – True if the call is not ./.
isCalledNonRef(): Boolean – True if either isHet or isHomVar is true.
isHet(): Boolean – True if this call is heterozygous.
isHetNonRef(): Boolean – True if this call is j/k with j>0.
isHetRef(): Boolean – True if this call is 0/k with k>0.
isHomRef(): Boolean – True if this call is 0/0.
isHomVar(): Boolean – True if this call is j/j with j>0.
isNotCalled(): Boolean – True if the call is ./.
nNonRefAlleles(): Int – the number of called alternate alleles.
oneHotAlleles(v: Variant): Array[Int]
Produce an array of called counts for each allele in the variant (including reference). For example, calling this function with a biallelic variant on hom-ref, het, and hom-var calls will produce [2, 0], [1, 1], and [0, 2] respectively.
Arguments
- v (Variant) – Variant
oneHotGenotype(v: Variant): Array[Int]
Produces an array with one element for each possible genotype in the variant, where the called genotype is 1 and all else 0. For example, calling this function with a biallelic variant on hom-ref, het, and hom-var calls will produce [1, 0, 0], [0, 1, 0], and [0, 0, 1] respectively.
Arguments
- v (Variant) – Variant
toGenotype(): Genotype – Convert this call to a Genotype.
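The gt encoding and the two one-hot methods can be checked by hand with a short Python sketch. This is an illustration of the formulas stated above, not Hail's implementation; all function names are hypothetical, and the helpers take the number of alleles directly rather than a Variant.

```python
def gt_index(j, k):
    """Integer encoding of the unphased call j/k: k*(k+1)/2 + j with j <= k."""
    if j > k:
        j, k = k, j
    return k * (k + 1) // 2 + j


def gt_pair(gt):
    """Invert gt_index: recover (j, k) from the integer gt."""
    k = 0
    # Advance k while the first index of the next row (k+1) is still <= gt.
    while (k + 1) * (k + 2) // 2 <= gt:
        k += 1
    return gt - k * (k + 1) // 2, k


def one_hot_alleles(gt, n_alleles):
    """Count how many times each allele (reference included) is called."""
    counts = [0] * n_alleles
    j, k = gt_pair(gt)
    counts[j] += 1
    counts[k] += 1
    return counts


def one_hot_genotype(gt, n_alleles):
    """1 at the called genotype index, 0 elsewhere."""
    n_genotypes = n_alleles * (n_alleles + 1) // 2
    return [1 if i == gt else 0 for i in range(n_genotypes)]
```

For a biallelic variant (n_alleles = 2), gt values 0, 1, and 2 correspond to the hom-ref, het, and hom-var calls described above.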
Dict¶
A Dict is an unordered collection of key-value pairs. Each key can only appear once in the collection.
[k]: U
Returns the value for k, or throws an exception if the key is not found.
Arguments
- k (T) – Key in the Dict to query.
contains(k: T): Boolean
Returns true if the Dict has a key equal to k, otherwise false.
Arguments
- k (T) – Key name to query.
get(a: T): U – Returns the value of the Dict for key a, or returns NA if the key is not found.
isEmpty(): Boolean – Returns true if the number of elements is equal to 0, false otherwise.
keySet(): Set[T] – Returns a Set containing the keys of the Dict.
keys(): Array[T] – Returns an Array containing the keys of the Dict.
mapValues(expr: U => V): Dict[T, V]
Returns a new Dict produced by applying expr to each value. The keys are unmodified.
Arguments
- expr (U => V) – Lambda expression.
size(): Int – Number of elements in the collection.
values(): Array[U] – Returns an Array containing the values of the Dict.
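The difference between indexing and get, and the effect of mapValues, have close counterparts in Python dictionaries. The sketch below is only an analogy: Python's None stands in for NA, KeyError stands in for the exception, and map_values is a hypothetical helper.

```python
def map_values(d, f):
    """Apply f to every value and keep the keys unchanged (like mapValues)."""
    return {key: f(value) for key, value in d.items()}


counts = {"a": 1, "b": 2}
doubled = map_values(counts, lambda v: v * 2)

# get-style lookup: a missing key yields None instead of raising.
missing = counts.get("z")

# Index-style lookup: a missing key raises.
try:
    counts["z"]
    raised = False
except KeyError:
    raised = True
```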
Double¶
- abs(): Double – Returns the absolute value of a number.
- max(a: Double): Double – Returns the maximum value.
- min(a: Double): Double – Returns the minimum value.
- signum(): Int – Returns the sign of a number (1, 0, or -1).
- toDouble(): Double – Convert value to a Double.
- toFloat(): Float – Convert value to a Float.
- toInt(): Int – Convert value to an Integer.
- toLong(): Long – Convert value to a Long.
Float¶
- abs(): Float – Returns the absolute value of a number.
- max(a: Float): Float – Returns the maximum value.
- min(a: Float): Float – Returns the minimum value.
- signum(): Int – Returns the sign of a number (1, 0, or -1).
- toDouble(): Double – Convert value to a Double.
- toFloat(): Float – Convert value to a Float.
- toInt(): Int – Convert value to an Integer.
- toLong(): Long – Convert value to a Long.
Genotype¶
A Genotype is a Hail data type representing a genotype in the Variant Dataset. It is referred to as g in the expression language.
ad: Array[Int] – allelic depth for each allele.
call(): Call – the call for this genotype, whose integer representation is gt = k*(k+1)/2 + j for call j/k (0 = 0/0, 1 = 0/1, 2 = 1/1, 3 = 0/2, etc.).
dosage: Double – the expected number of non-reference alleles based on genotype probabilities.
dp: Int – the total number of informative reads.
fakeRef: Boolean – True if this genotype was downcoded in split_multi(). This can happen if a 1/2 call is split to 0/1, 0/1.
fractionReadsRef(): Double – the ratio of ref reads to the sum of all informative reads.
gp: Array[Double] – the linear-scaled probabilities.
gq: Int – the difference between the two smallest PL entries.
gt: Int – the integer gt = k*(k+1)/2 + j for call j/k (0 = 0/0, 1 = 0/1, 2 = 1/1, 3 = 0/2, etc.).
gtj(): Int – the index of allele j for call j/k (0 = ref, 1 = first alt allele, etc.).
gtk(): Int – the index of allele k for call j/k (0 = ref, 1 = first alt allele, etc.).
isCalled(): Boolean – True if the genotype is not ./.
isCalledNonRef(): Boolean – True if either g.isHet or g.isHomVar is true.
isHet(): Boolean – True if this call is heterozygous.
isHetNonRef(): Boolean – True if this call is j/k with j>0.
isHetRef(): Boolean – True if this call is 0/k with k>0.
isHomRef(): Boolean – True if this call is 0/0.
isHomVar(): Boolean – True if this call is j/j with j>0.
isLinearScale: Boolean – True if the data was imported from import_gen() or import_bgen().
isNotCalled(): Boolean – True if the genotype is ./.
nNonRefAlleles(): Int – the number of called alternate alleles.
od(): Int – od = dp - ad.sum.
oneHotAlleles(v: Variant): Array[Int]
Produce an array of called counts for each allele in the variant (including reference). For example, calling this function with a biallelic variant on hom-ref, het, and hom-var genotypes will produce [2, 0], [1, 1], and [0, 2] respectively.
Arguments
- v (Variant) – Variant
oneHotGenotype(v: Variant): Array[Int]
Produces an array with one element for each possible genotype in the variant, where the called genotype is 1 and all else 0. For example, calling this function with a biallelic variant on hom-ref, het, and hom-var genotypes will produce [1, 0, 0], [0, 1, 0], and [0, 0, 1] respectively.
Arguments
- v (Variant) – Variant
pAB(): Double – p-value for pulling the given allelic depth from a binomial distribution with mean 0.5. Missing if the call is not heterozygous.
pl: Array[Int]
Phred-scaled normalized genotype likelihood values. The conversion between g.pl (Phred-scaled likelihoods) and g.gp (linear-scaled probabilities) assumes a uniform prior.
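Several of the Genotype fields above are simple functions of pl, ad, and dp. The Python sketch below works through them under stated assumptions: the PL-to-GP conversion uses the standard Phred relation (likelihood proportional to 10^(-PL/10)) with a uniform prior, gq and od follow the definitions given above literally, and the dosage formula assumes a biallelic variant. Function names are hypothetical, and Hail's own rounding or capping, if any, is not modeled.

```python
def pl_to_gp(pl):
    """Phred-scaled likelihoods -> linear probabilities under a uniform prior."""
    likelihoods = [10 ** (-x / 10.0) for x in pl]
    total = sum(likelihoods)
    # With a uniform prior the posterior is the normalized likelihood.
    return [x / total for x in likelihoods]


def gq_from_pl(pl):
    """Difference between the two smallest PL entries."""
    smallest, second = sorted(pl)[:2]
    return second - smallest


def other_depth(dp, ad):
    """od = dp - ad.sum: informative reads not assigned to any listed allele."""
    return dp - sum(ad)


def biallelic_dosage(gp):
    """Expected number of non-reference alleles (biallelic case only)."""
    return gp[1] + 2 * gp[2]


gp = pl_to_gp([0, 20, 200])
```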
Int¶
- abs(): Int – Returns the absolute value of a number.
- max(a: Int): Int – Returns the maximum value.
- min(a: Int): Int – Returns the minimum value.
- signum(): Int – Returns the sign of a number (1, 0, or -1).
- toDouble(): Double – Convert value to a Double.
- toFloat(): Float – Convert value to a Float.
- toInt(): Int – Convert value to an Integer.
- toLong(): Long – Convert value to a Long.
Interval¶
An Interval is a Hail data type representing a range of genomic locations in the Variant Dataset.
contains(locus: Locus): Boolean
Returns true if the locus is in the interval.
let i = Interval(Locus("1", 1000), Locus("1", 2000)) in i.contains(Locus("1", 1500))
result: true
Arguments
- locus (Locus) – Locus
end: Locus – Locus at the end of the interval (exclusive).
start: Locus – Locus at the start of the interval (inclusive).
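Because start is inclusive and end is exclusive, membership is a half-open range test. The Python sketch below models only the single-contig case, which is an assumption made to avoid guessing at how Hail orders contigs; the Locus and Interval classes here are stand-ins, not Hail objects.

```python
from typing import NamedTuple


class Locus(NamedTuple):
    contig: str
    position: int


class Interval(NamedTuple):
    start: Locus
    end: Locus

    def contains(self, locus):
        """Half-open test: start <= locus < end, same contig only (assumption)."""
        if self.start.contig != self.end.contig:
            raise ValueError("this sketch only handles single-contig intervals")
        return (
            locus.contig == self.start.contig
            and self.start.position <= locus.position < self.end.position
        )


i = Interval(Locus("1", 1000), Locus("1", 2000))
```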
Locus¶
A Locus is a Hail data type representing a specific genomic location in the Variant Dataset.
- contig: String – String representation of contig.
- position: Int – Chromosomal position.
Long¶
- abs(): Long – Returns the absolute value of a number.
- max(a: Long): Long – Returns the maximum value.
- min(a: Long): Long – Returns the minimum value.
- signum(): Int – Returns the sign of a number (1, 0, or -1).
- toDouble(): Double – Convert value to a Double.
- toFloat(): Float – Convert value to a Float.
- toInt(): Int – Convert value to an Integer.
- toLong(): Long – Convert value to a Long.
Set¶
A Set is an unordered collection with no repeated values of a given data type (e.g. Int, String). Sets can be constructed by specifying [item1, item2, ...].toSet().
let s = ["rabbit", "cat", "dog", "dog"].toSet()
result: Set("cat", "dog", "rabbit")
They can also be nested such as Set[Set[Int]]:
let s = [[1, 2, 3].toSet(), [4, 5, 5].toSet()].toSet()
result: Set(Set(1, 2, 3), Set(4, 5))
Set[Double]¶
- max(): Double – Largest element in the collection.
- mean(): Double – Mean value of the collection.
- median(): Double – Median value of the collection.
- min(): Double – Smallest element in the collection.
- product(): Double – Product of all elements in the collection (returns 1 if empty).
- sum(): Double – Sum of all elements in the collection.
Set[Float]¶
- max(): Float – Largest element in the collection.
- mean(): Double – Mean value of the collection.
- median(): Float – Median value of the collection.
- min(): Float – Smallest element in the collection.
- product(): Float – Product of all elements in the collection (returns 1 if empty).
- sum(): Float – Sum of all elements in the collection.
Set[Int]¶
- max(): Int – Largest element in the collection.
- mean(): Double – Mean value of the collection.
- median(): Int – Median value of the collection.
- min(): Int – Smallest element in the collection.
- product(): Int – Product of all elements in the collection (returns 1 if empty).
- sum(): Int – Sum of all elements in the collection.
Set[Long]¶
- max(): Long – Largest element in the collection.
- mean(): Double – Mean value of the collection.
- median(): Long – Median value of the collection.
- min(): Long – Smallest element in the collection.
- product(): Long – Product of all elements in the collection (returns 1 if empty).
- sum(): Long – Sum of all elements in the collection.
Set[Set[T]]¶
flatten(): Set[T]
Flattens a nested set by concatenating all its elements into a single set.
let s = [[1, 2].toSet(), [3, 4].toSet()].toSet() in s.flatten()
result: Set(1, 2, 3, 4)
Set[String]¶
mkString(delimiter: String): String
Concatenates all elements of this set into a single string where each element is separated by the delimiter.
let s = ["1", "2", "3"].toSet() in s.mkString(",")
result: "1,2,3"
Arguments
- delimiter (String) – String that separates each element.
Set[T]¶
add(a: T): Set[T] – Returns the result of adding the element a to this Set.
contains(x: T): Boolean
Returns true if the element x is contained in the set, otherwise false.
let s = [1, 2, 3].toSet() in s.contains(5)
result: false
Arguments
- x (T) – Value to test.
difference(a: Set[T]): Set[T] – Returns the elements of this Set that are not in Set a.
exists(expr: T => Boolean): Boolean
Returns a boolean which is true if any element in the set satisfies the condition given by expr and false otherwise.
let s = [0, 2, 4, 6, 8, 10].toSet() in s.exists(e => e % 2 == 1)
result: false
Arguments
- expr (T => Boolean) – Lambda expression.
filter(expr: T => Boolean): Set[T]
Returns a new set subsetted to the elements where expr evaluates to true.
let s = [1, 4, 5, 6, 10].toSet() in s.filter(e => e >= 5)
result: Set(5, 6, 10)
Arguments
- expr (T => Boolean) – Lambda expression.
find(expr: T => Boolean): T
Returns the first non-missing element of the set for which expr is true. If no element satisfies the predicate, find returns NA.
let s = [1, 2, 3].toSet() in s.find(e => e % 3 == 0)
result: 3
Arguments
- expr (T => Boolean) – Lambda expression.
flatMap(expr: T => Set[U]): Set[U]
Returns a new set by applying a function to each element and concatenating the resulting sets.
let s = [["a", "b", "c"].toSet(), ["d", "e"].toSet(), ["f"].toSet()].toSet() in s.flatMap(e => e.map(x => x + "1"))
result: Set("a1", "b1", "c1", "d1", "e1", "f1")
Arguments
- expr (T => Set[U]) – Lambda expression.
forall(expr: T => Boolean): Boolean
Returns a boolean which is true if all elements in the set satisfy the condition given by expr and false otherwise.
let s = [0.1, 0.5, 0.3, 1.0, 2.5, 3.0].toSet() in s.forall(e => e > 1.0)
result: false
Arguments
- expr (T => Boolean) – Lambda expression.
groupBy(a: T => U): Dict[U, Set[T]]
head(): T – Select one element.
intersection(a: Set[T]): Set[T] – Returns the intersection of this Set and Set a.
isEmpty(): Boolean – Returns true if the number of elements is equal to 0, false otherwise.
issubset(a: Set[T]): Boolean – Returns true if this Set is a subset of Set a.
map(expr: T => U): Set[U]
Returns a new set produced by applying expr to each element.
let s = [1, 2, 3].toSet() in s.map(e => e * 3)
result: Set(3, 6, 9)
Arguments
- expr (T => U) – Lambda expression.
remove(a: T): Set[T] – Returns the result of removing the element a from this Set.
size(): Int – Number of elements in the collection.
tail(): Set[T] – Select all elements except the element returned by head.
toArray(): Array[T] – Convert collection to an Array.
toSet(): Set[T] – Convert collection to a Set.
union(a: Set[T]): Set[T] – Returns the union of this Set and Set a.
String¶
[:j]: String
Returns a slice of the string from the first unicode-codepoint until the jth unicode-codepoint (0-indexed). Negative indices are interpreted as offsets from the end of the string.
let a = "abcdef" in a[:4]
result: "abcd"
let a = "abcdef" in a[:-3]
result: "abc"
Arguments
- j (Int) – End index of the slice (not included in result).
[:]: String
Returns a copy of the string.
let a = "abcd" in a[:]
result: "abcd"
[i:j]: String
Returns a slice of the string from the ith unicode-codepoint until the jth unicode-codepoint (both 0-indexed). Negative indices are interpreted as offsets from the end of the string instead of the beginning.
let a = "abcdef" in a[2:4]
result: "cd"
let a = "abcdef" in a[-3:-1]
result: "de"
A handy way to understand the behavior of negative indices is to recall this rule, which holds for any positive integers i and j:
s[-i:-j] == s[s.length - i:s.length - j]
Arguments
- i (Int) – Starting index of the slice.
- j (Int) – End index of the slice (not included in result).
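Python string slicing follows the same conventions as the examples above (0-indexed, end-exclusive, negative indices counted from the end), so it is a convenient way to experiment with the rule. The check below is a Python illustration, not Hail code, and in Python the rule holds when i and j do not exceed the string length.

```python
s = "abcdef"

middle = s[2:4]
from_end = s[-3:-1]


def rule_holds(s, i, j):
    """Check s[-i:-j] == s[len(s) - i:len(s) - j] for 1 <= i, j <= len(s)."""
    return s[-i:-j] == s[len(s) - i:len(s) - j]


# Exhaustively verify the rule for every valid pair of positive offsets.
all_pairs_ok = all(
    rule_holds(s, i, j)
    for i in range(1, len(s) + 1)
    for j in range(1, len(s) + 1)
)
```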
[i:]: String
Returns a slice of the string from the ith unicode-codepoint (0-indexed) to the end. Negative indices are interpreted as offsets from the end of the string.
let a = "abcdef" in a[3:]
result: "def"
let a = "abcdef" in a[-5:]
result: "bcdef"
Arguments
- i (Int) – Starting index of the slice.
[i]: String
Returns the ith element (0-indexed) of the string, or throws an exception if i is an invalid index.
let s = "genetics" in s[6]
result: "c"
Arguments
- i (Int) – Index of the character to return.
entropy(): Double
Computes the Shannon entropy in bits of the character frequency distribution.
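As a reference for what this quantity measures, the Python sketch below computes the Shannon entropy in bits, H = -sum(p_c * log2(p_c)), over the character frequencies of a string. The function name is hypothetical, and returning 0.0 for the empty string is an assumption of this sketch rather than documented Hail behavior.

```python
import math
from collections import Counter


def shannon_entropy_bits(s):
    """Entropy in bits of the character frequency distribution of s."""
    if not s:
        return 0.0  # assumption: define the entropy of an empty string as 0
    n = len(s)
    counts = Counter(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

A string made of one repeated character has entropy 0, and a string in which four distinct characters appear equally often has entropy 2 bits.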
length(): Int – Length of the string.
max(a: String): String – Returns the maximum value.
min(a: String): String – Returns the minimum value.
replace(pattern1: String, pattern2: String): String
Replaces each substring of this string that matches the given regular expression (pattern1) with the given replacement (pattern2).
let s = "1kg-NA12878" in s.replace("1kg-", "")
result: "NA12878"
Arguments
- pattern1 (String) – Regular expression matching the substrings to replace.
- pattern2 (String) – Replacement string.
split(delim: String, n: Int): Array[String]
Returns an array of strings, split on the given regular expression delimiter with the pattern applied n times. See the documentation on regular expression syntax. If you need to split on special characters, escape them with a double backslash (\\).
let s = "1kg-NA12878" in s.split("-")
result: ["1kg", "NA12878"]
Arguments
- delim (String) – Regular expression delimiter.
- n (Int) – Number of times the pattern is applied. See the Java documentation for more information.
split(delim: String): Array[String]
Returns an array of strings, split on the given regular expression delimiter. See the documentation on regular expression syntax. If you need to split on special characters, escape them with a double backslash (\\).
let s = "1kg-NA12878" in s.split("-")
result: ["1kg", "NA12878"]
Arguments
- delim (String) – Regular expression delimiter.
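Two points above are easy to trip over: the delimiter is a regular expression, so characters such as . and | must be escaped, and the n argument bounds the split. The Python sketch below illustrates both. It assumes n follows the limit semantics of Java's String.split(regex, limit) for positive values (at most n pieces), which the pointer to the Java documentation suggests but does not confirm; split_with_limit is a hypothetical helper, and Python and Java regex dialects differ in some details.

```python
import re


def split_with_limit(s, delim, n):
    """Split s on regex delim into at most n pieces (positive n only)."""
    if n < 1:
        raise ValueError("this sketch only models positive n")
    if n == 1:
        # Python treats maxsplit=0 as "no limit", so handle n == 1 explicitly.
        return [s]
    return re.split(delim, s, maxsplit=n - 1)


# An unescaped "." would match every character; escape it to split on dots.
dotted = re.split(r"\.", "chr1.1000.A.T")
limited = split_with_limit("1kg-NA12878-rep2", "-", 2)
```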
toDouble(): Double – Convert value to a Double.
toFloat(): Float – Convert value to a Float.
toInt(): Int – Convert value to an Integer.
toLong(): Long – Convert value to a Long.
Struct¶
A Struct is like a Python tuple where the fields are named and the set of fields is fixed.
An example of constructing and accessing the fields in a Struct is
let s = {gene: "ACBD", function: "LOF", nHet: 12} in s.gene
result: "ACBD"
A field of the Struct can also be another Struct. For example, va.info.AC selects the struct info from the struct va, and then selects the array AC from the struct info.
Variant¶
A Variant is a Hail data type representing a variant in the Variant Dataset. It is referred to as v in the expression language.
The pseudoautosomal region (PAR) is currently defined with respect to reference GRCh37:
- X: 60001 - 2699520, 154931044 - 155260560
- Y: 10001 - 2649520, 59034050 - 59363566
Most callers assign variants in PAR to X.
- alt(): String – Alternate allele sequence. Assumes biallelic.
- altAllele(): AltAllele – The alternate allele. Assumes biallelic.
- altAlleles: Array[AltAllele] – The alternate alleles.
- contig: String – String representation of contig, exactly as imported. NB: Hail stores contigs as strings. Use double-quotes when checking contig equality.
- inXNonPar(): Boolean – True if chromosome is X and start is not in pseudoautosomal region of X.
- inXPar(): Boolean – True if chromosome is X and start is in pseudoautosomal region of X.
- inYNonPar(): Boolean – True if chromosome is Y and start is not in pseudoautosomal region of Y.
- inYPar(): Boolean – True if chromosome is Y and start is in pseudoautosomal region of Y. NB: most callers assign variants in PAR to X.
- isAutosomal(): Boolean – True if chromosome is not X, not Y, and not MT.
- isBiallelic(): Boolean – True if v has one alternate allele.
- locus(): Locus – Chromosomal locus (chr, pos) of this variant
- nAlleles(): Int – Number of alleles.
- nAltAlleles(): Int – Number of alternate alleles, equal to nAlleles - 1.
- nGenotypes(): Int – Number of genotypes.
- ref: String – Reference allele sequence.
- start: Int – SNP position or start of an indel.
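The region predicates above reduce to a contig check plus a range lookup against the GRCh37 PAR coordinates listed earlier. The Python sketch below shows that logic under two assumptions: the listed bounds are treated as inclusive, and contigs are named exactly "X", "Y", and "MT". The function names are hypothetical, and n_genotypes uses the diploid count n*(n+1)/2, which matches the gt encoding.

```python
# GRCh37 pseudoautosomal regions as listed in this section (bounds assumed inclusive).
PAR_GRCH37 = {
    "X": [(60001, 2699520), (154931044, 155260560)],
    "Y": [(10001, 2649520), (59034050, 59363566)],
}


def in_par(contig, start):
    """True if start falls inside a PAR interval for this contig."""
    return any(lo <= start <= hi for lo, hi in PAR_GRCH37.get(contig, []))


def in_x_par(contig, start):
    return contig == "X" and in_par(contig, start)


def in_x_non_par(contig, start):
    return contig == "X" and not in_par(contig, start)


def is_autosomal(contig):
    return contig not in ("X", "Y", "MT")


def n_genotypes(n_alleles):
    """Number of unordered diploid genotypes for n_alleles alleles."""
    return n_alleles * (n_alleles + 1) // 2
```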