CallExpression

class hail.expr.CallExpression[source]

Expression of type tcall.

>>> call = hl.call(0, 1, phased=False)

Attributes

`dtype`	The data type of the expression.
`phased`	True if the call is phased.
`ploidy`	Return the number of alleles of this call.

Methods

`contains_allele`	Returns true if the call has one or more called alleles of the given index.
`is_diploid`	True if the call has ploidy equal to 2.
`is_haploid`	True if the call has ploidy equal to 1.
`is_het`	Evaluate whether the call includes two different alleles.
`is_het_non_ref`	Evaluate whether the call includes two different alleles, neither of which is reference.
`is_het_ref`	Evaluate whether the call includes two different alleles, one of which is reference.
`is_hom_ref`	Evaluate whether the call includes two reference alleles.
`is_hom_var`	Evaluate whether the call includes two identical alternate alleles.
`is_non_ref`	Evaluate whether the call includes one or more non-reference alleles.
`n_alt_alleles`	Returns the number of non-reference alleles.
`one_hot_alleles`	Returns an array containing the summed one-hot encoding of the alleles.
`unphase`	Returns an unphased version of this call.
`unphased_diploid_gt_index`	Return the genotype index for unphased, diploid calls.

__eq__(other)

Returns True if the two expressions are equal.

Examples

>>> x = hl.literal(5)
>>> y = hl.literal(5)
>>> z = hl.literal(1)

>>> hl.eval(x == y)
True

>>> hl.eval(x == z)
False

Notes

This method will fail with an error if the two expressions are not of comparable types.

Parameters:: other (Expression) – Expression for equality comparison.
Returns:: BooleanExpression – True if the two expressions are equal.

__ge__(other): Return self>=value.

__getitem__(item)[source]

Get the i-th allele.

Examples

Index with a single integer:

>>> hl.eval(call[0])
0

>>> hl.eval(call[1])
1

Parameters:: item (int or Expression of type tint32) – Allele index.
Returns:: Expression of type tint32

__gt__(other): Return self>value.

__le__(other): Return self<=value.

__lt__(other): Return self<value.

__ne__(other)

Returns True if the two expressions are not equal.

Examples

>>> x = hl.literal(5)
>>> y = hl.literal(5)
>>> z = hl.literal(1)

>>> hl.eval(x != y)
False

>>> hl.eval(x != z)
True

Notes

This method will fail with an error if the two expressions are not of comparable types.

Parameters:: other (Expression) – Expression for inequality comparison.
Returns:: BooleanExpression – True if the two expressions are not equal.

collect(_localize=True)

Collect all records of an expression into a local list.

Examples

Collect all the values from C1:

>>> table1.C1.collect()
[2, 2, 10, 11]

Warning

Extremely experimental.

Warning

The list of records may be very large.

Returns:: list

contains_allele(allele)[source]

Returns true if the call has one or more called alleles of the given index.

>>> c = hl.call(0, 3)

>>> hl.eval(c.contains_allele(3))
True

>>> hl.eval(c.contains_allele(1))
False

Returns:: BooleanExpression

describe(handler=print): Print information about type, index, and dependencies.

property dtype

The data type of the expression.

Returns:: HailType

export(path, delimiter='\t', missing='NA', header=True)

Export a field to a text file.

Examples

>>> small_mt.GT.export('output/gt.tsv')
>>> with open('output/gt.tsv', 'r') as f:
...     for line in f:
...         print(line, end='')
locus   alleles 0       1       2       3
1:1     ["A","C"]       0/1     0/0     0/1     0/0
1:2     ["A","C"]       1/1     0/1     0/1     0/1
1:3     ["A","C"]       0/0     0/1     0/0     0/0
1:4     ["A","C"]       0/1     1/1     0/1     0/1

>>> small_mt.GT.export('output/gt-no-header.tsv', header=False)
>>> with open('output/gt-no-header.tsv', 'r') as f:
...     for line in f:
...         print(line, end='')
1:1     ["A","C"]       0/1     0/0     0/1     0/0
1:2     ["A","C"]       1/1     0/1     0/1     0/1
1:3     ["A","C"]       0/0     0/1     0/0     0/0
1:4     ["A","C"]       0/1     1/1     0/1     0/1

>>> small_mt.pop.export('output/pops.tsv')
>>> with open('output/pops.tsv', 'r') as f:
...     for line in f:
...         print(line, end='')
sample_idx      pop
0       1
1       2
2       2
3       2

>>> small_mt.ancestral_af.export('output/ancestral_af.tsv')
>>> with open('output/ancestral_af.tsv', 'r') as f:
...     for line in f:
...         print(line, end='')
locus   alleles ancestral_af
1:1     ["A","C"]       3.8152e-01
1:2     ["A","C"]       7.0588e-01
1:3     ["A","C"]       4.9991e-01
1:4     ["A","C"]       3.9616e-01

>>> small_mt.bn.export('output/bn.tsv')
>>> with open('output/bn.tsv', 'r') as f:
...     for line in f:
...         print(line, end='')
bn
{"n_populations":3,"n_samples":4,"n_variants":4,"n_partitions":4,"pop_dist":[1,1,1],"fst":[0.1,0.1,0.1],"mixture":false}

Notes

For entry-indexed expressions, if there is one column key field, the result of calling str() on that field is used as the column header. Otherwise, each compound column key is converted to JSON and used as a column header. For example:

>>> small_mt = small_mt.key_cols_by(s=small_mt.sample_idx, family='fam1')
>>> small_mt.GT.export('output/gt-no-header.tsv')
>>> with open('output/gt-no-header.tsv', 'r') as f:
...     for line in f:
...         print(line, end='')
locus   alleles {"s":0,"family":"fam1"} {"s":1,"family":"fam1"} {"s":2,"family":"fam1"} {"s":3,"family":"fam1"}
1:1     ["A","C"]       0/1     0/0     0/1     0/0
1:2     ["A","C"]       1/1     0/1     0/1     0/1
1:3     ["A","C"]       0/0     0/1     0/0     0/0
1:4     ["A","C"]       0/1     1/1     0/1     0/1
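The header rule described above can be sketched in plain Python. This is an illustrative model only, not Hail's implementation: `column_header` is a hypothetical helper, the column key is represented as a plain dict, and the compact JSON separators are an assumption inferred from the output shown above.

```python
import json

def column_header(col_key):
    """Hypothetical helper: build an export column header from a column key.

    col_key is a dict mapping key field names to values (an assumption
    made for this sketch; Hail uses its own key structures).
    """
    values = list(col_key.values())
    if len(values) == 1:
        # Single key field: use str() of that field's value.
        return str(values[0])
    # Compound key: serialize the whole key as compact JSON.
    return json.dumps(col_key, separators=(",", ":"))

print(column_header({"s": 0}))
print(column_header({"s": 0, "family": "fam1"}))
```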

Parameters:

path (str) – The path to which to export.
delimiter (str) – The string for delimiting columns.
missing (str) – The string to output for missing values.
header (bool) – When True include a header line.

is_diploid()[source]

True if the call has ploidy equal to 2.

Examples

>>> hl.eval(call.is_diploid())
True

Returns:: BooleanExpression

is_haploid()[source]

True if the call has ploidy equal to 1.

Examples

>>> hl.eval(call.is_haploid())
False

Returns:: BooleanExpression

is_het()[source]

Evaluate whether the call includes two different alleles.

Examples

>>> hl.eval(call.is_het())
True

Notes

In the diploid biallelic case, a 0/1 call will return True, and 0/0 and 1/1 will return False.

Returns:: BooleanExpression – True if the two alleles are different, False if they are the same.

is_het_non_ref()[source]

Evaluate whether the call includes two different alleles, neither of which is reference.

Examples

>>> hl.eval(call.is_het_non_ref())
False

Notes

A biallelic variant can never have a het-non-ref call, because it has only one alternate allele. Examples of het-non-ref calls are 1/2 and 2/4.

Returns:: BooleanExpression – True if the call includes two different alternate alleles, False otherwise.
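How the het and hom predicates partition diploid calls can be modeled in plain Python. The sketch below is not Hail code: it represents a diploid call as a tuple of two allele indices (an assumption made for illustration) and reuses the method names only to mirror the descriptions in this reference.

```python
def is_het(call):
    """Two different alleles."""
    a, b = call
    return a != b

def is_het_ref(call):
    """Two different alleles, one of which is reference (index 0)."""
    return is_het(call) and 0 in call

def is_het_non_ref(call):
    """Two different alleles, neither of which is reference."""
    return is_het(call) and 0 not in call

def is_hom_ref(call):
    """Two reference alleles."""
    return tuple(call) == (0, 0)

def is_hom_var(call):
    """Two identical alternate alleles."""
    a, b = call
    return a == b and a != 0

predicates = (is_het, is_het_ref, is_het_non_ref, is_hom_ref, is_hom_var)
for call in [(0, 0), (0, 1), (1, 1), (1, 2), (2, 4)]:
    matching = [p.__name__ for p in predicates if p(call)]
    print(call, matching)
```

In this model 1/2 and 2/4 are the only calls in the loop for which `is_het_non_ref` holds, and both need at least two alternate alleles at the site.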

is_het_ref()[source]

Evaluate whether the call includes two different alleles, one of which is reference.

Examples

>>> hl.eval(call.is_het_ref())
True

Returns:: BooleanExpression – True if the call includes one reference and one alternate allele, False otherwise.

is_hom_ref()[source]

Evaluate whether the call includes two reference alleles.

Examples

>>> hl.eval(call.is_hom_ref())
False

Returns:: BooleanExpression – True if the call includes two reference alleles, False otherwise.

is_hom_var()[source]

Evaluate whether the call includes two identical alternate alleles.

Examples

>>> hl.eval(call.is_hom_var())
False

Returns:: BooleanExpression – True if the call includes two identical alternate alleles, False otherwise.

is_non_ref()[source]

Evaluate whether the call includes one or more non-reference alleles.

Examples

>>> hl.eval(call.is_non_ref())
True

Notes

In the diploid biallelic case, a 0/0 call will return False, and 0/1 and 1/1 will return True.

Returns:: BooleanExpression – True if at least one allele is non-reference, False otherwise.

n_alt_alleles()[source]

Returns the number of non-reference alleles.

Examples

>>> hl.eval(call.n_alt_alleles())
1

Notes

For diploid biallelic calls, this method is equivalent to the alternate allele dosage. For instance, 0/0 will return 0, 0/1 will return 1, and 1/1 will return 2.

Returns:: Expression of type tint32 – The number of non-reference alleles.
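As a rough illustration of the dosage interpretation, the count can be modeled in plain Python. This is a sketch, not Hail's implementation; a call is represented as a tuple of allele indices, which is an assumption made for the example.

```python
def n_alt_alleles(call):
    """Count the alleles whose index is not 0 (the reference)."""
    return sum(1 for allele in call if allele != 0)

def is_non_ref(call):
    """True if at least one allele is non-reference."""
    return n_alt_alleles(call) > 0

for call in [(0, 0), (0, 1), (1, 1), (1, 2)]:
    print(call, n_alt_alleles(call), is_non_ref(call))
```

Note that for multiallelic sites the count does not distinguish between alternate alleles: in this model 1/2 counts as 2, the same as 1/1.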

one_hot_alleles(alleles)[source]

Returns an array containing the summed one-hot encoding of the alleles.

Examples

Compute one-hot encoding when number of total alleles is 2.

>>> hl.eval(call.one_hot_alleles(2))
[1, 1]

DEPRECATED: Compute one-hot encoding based on length of list of alleles.

>>> hl.eval(call.one_hot_alleles(['A', 'T']))
[1, 1]

Notes

This one-hot representation is the positional sum of the one-hot encoding for each called allele. For a biallelic variant, the one-hot encoding for a reference allele is [1, 0] and the one-hot encoding for an alternate allele is [0, 1]. Diploid calls would produce the following arrays: [2, 0] for homozygous reference, [1, 1] for heterozygous, and [0, 2] for homozygous alternate.

Parameters:: alleles (Int32Expression or ArrayExpression of tstr) – Number of total alleles, including the reference, or array of variant alleles.
Returns:: ArrayExpression of tint32 – An array of summed one-hot encodings of allele indices.
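The summed one-hot encoding can be reproduced with a few lines of plain Python. The function below is a minimal model for illustration, not Hail's implementation; it takes the call as a tuple of allele indices, which is an assumption of this sketch.

```python
def one_hot_alleles(call, n_alleles):
    """Positional sum of the one-hot vectors of each called allele."""
    counts = [0] * n_alleles
    for allele in call:
        # Adding a one-hot vector is the same as incrementing one position.
        counts[allele] += 1
    return counts

print(one_hot_alleles((0, 0), 2))  # [2, 0]  homozygous reference
print(one_hot_alleles((0, 1), 2))  # [1, 1]  heterozygous
print(one_hot_alleles((1, 1), 2))  # [0, 2]  homozygous alternate
```

The entries of the result always sum to the ploidy of the call.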

property phased

True if the call is phased.

Examples

>>> hl.eval(call.phased)
False

Returns:: BooleanExpression

property ploidy

Return the number of alleles of this call.

Examples

>>> hl.eval(call.ploidy)
2

Notes

Currently only ploidy 1 and 2 are supported.

Returns:: Expression of type tint32

show(n=None, width=None, truncate=None, types=True, handler=None, n_rows=None, n_cols=None)

Print the first few records of the expression to the console.

If the expression refers to a value on a keyed axis of a table or matrix table, then the accompanying keys will be shown along with the records.

Examples

>>> table1.SEX.show()
+-------+-----+
|    ID | SEX |
+-------+-----+
| int32 | str |
+-------+-----+
|     1 | "M" |
|     2 | "M" |
|     3 | "F" |
|     4 | "F" |
+-------+-----+

>>> hl.literal(123).show()
+--------+
| <expr> |
+--------+
|  int32 |
+--------+
|    123 |
+--------+

Notes

The output can be piped to another output source using the handler argument:

>>> ht.foo.show(handler=lambda x: logging.info(x))  
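The handler pattern can be tried without Hail. In the sketch below, `show_text` is a hypothetical stand-in that only mimics the handler argument (it is not part of Hail), and the logger name is arbitrary. It also illustrates a practical detail of the standard `logging` module: a logger must be set to the INFO level, and have a handler attached, before `info` messages are emitted anywhere.

```python
import io
import logging

def show_text(text, handler=print):
    """Hypothetical stand-in for show(): pass the rendered text to handler."""
    handler(text)

# Route INFO records from a dedicated logger into an in-memory buffer.
buffer = io.StringIO()
logger = logging.getLogger("show_demo")
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler(buffer))
logger.propagate = False  # keep the record out of the root logger

show_text("| 123 |", handler=logger.info)
captured = buffer.getvalue()
print(repr(captured))
```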

Parameters:

n (int) – Maximum number of rows to show.
width (int) – Horizontal width at which to break columns.
truncate (int, optional) – Truncate each field to the given number of characters. If None, truncate fields to the given width.
types (bool) – Print an extra header line with the type of each field.

summarize(handler=None): Compute and print summary information about the expression.

Danger

This functionality is experimental. It may not be tested as well as other parts of Hail and the interface is subject to change.

take(n, _localize=True)

Collect the first n records of an expression.

Examples

Take the first three rows:

>>> table1.X.take(3)
[5, 6, 7]

Warning

Extremely experimental.

Parameters:: n (int) – Number of records to take.
Returns:: list

unphase()[source]

Returns an unphased version of this call.

Returns:: CallExpression

unphased_diploid_gt_index()[source]

Return the genotype index for unphased, diploid calls.

Examples

>>> hl.eval(call.unphased_diploid_gt_index())
1

Returns:: Expression of type tint32
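For reference, the genotype ordering defined by the VCF specification assigns the unphased diploid genotype j/k (with j <= k) the index k(k+1)/2 + j, which agrees with the value 1 shown above for a 0/1 call. The plain-Python sketch below computes that index; it is a model under the assumption that this method follows the VCF ordering, not Hail's implementation.

```python
def unphased_diploid_gt_index(a, b):
    """VCF-style genotype index of the unphased diploid call a/b."""
    j, k = sorted((a, b))  # order the alleles so that j <= k
    return k * (k + 1) // 2 + j

# Enumerate every genotype of a triallelic site in index order.
for k in range(3):
    for j in range(k + 1):
        print(f"{j}/{k}", unphased_diploid_gt_index(j, k))
```

For a triallelic site this enumerates 0/0, 0/1, 1/1, 0/2, 1/2, 2/2 with indices 0 through 5.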