Expressions

Expression              Base class for Hail expressions.
ArrayExpression         Expression of type tarray.
ArrayNumericExpression  Expression of type tarray with a numeric type.
BooleanExpression       Expression of type tbool.
CallExpression          Expression of type tcall.
CollectionExpression    Expression of type tarray or tset.
DictExpression          Expression of type tdict.
IntervalExpression      Expression of type tinterval.
LocusExpression         Expression of type tlocus.
NumericExpression       Expression of numeric type.
Int32Expression         Expression of type tint32.
Int64Expression         Expression of type tint64.
Float32Expression       Expression of type tfloat32.
Float64Expression       Expression of type tfloat64.
SetExpression           Expression of type tset.
StringExpression        Expression of type tstr.
StructExpression        Expression of type tstruct.
class hail.expr.expressions.Expression(ir: hail.ir.base_ir.IR, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]

Base class for Hail expressions.

__eq__(other)[source]

Returns True if the two expressions are equal.

Examples

>>> x = hl.literal(5)
>>> y = hl.literal(5)
>>> z = hl.literal(1)
>>> hl.eval(x == y)
True
>>> hl.eval(x == z)
False

Notes

This method will fail with an error if the two expressions are not of comparable types.

Parameters:other (Expression) – Expression for equality comparison.
Returns:BooleanExpression – True if the two expressions are equal.
__ne__(other)[source]

Returns True if the two expressions are not equal.

Examples

>>> x = hl.literal(5)
>>> y = hl.literal(5)
>>> z = hl.literal(1)
>>> hl.eval(x != y)
False
>>> hl.eval(x != z)
True

Notes

This method will fail with an error if the two expressions are not of comparable types.

Parameters:other (Expression) – Expression for inequality comparison.
Returns:BooleanExpression – True if the two expressions are not equal.
collect()[source]

Collect all records of an expression into a local list.

Examples

Collect all the values from C1:

>>> table1.C1.collect()
[2, 2, 10, 11]

Warning

Extremely experimental.

Warning

The list of records may be very large.

Returns:list
describe(handler=<built-in function print>)[source]

Print information about type, index, and dependencies.

dtype

The data type of the expression.

Returns:HailType
show(n=10, width=90, truncate=None, types=True, handler=<built-in function print>)[source]

Print the first few rows of the table to the console.

Examples

>>> table1.SEX.show()
+-------+-----+
|    ID | SEX |
+-------+-----+
| int32 | str |
+-------+-----+
|     1 | M   |
|     2 | M   |
|     3 | F   |
|     4 | F   |
+-------+-----+
>>> hl.literal(123).show()
+--------+
| <expr> |
+--------+
|  int32 |
+--------+
|    123 |
+--------+

Warning

Extremely experimental.

Parameters:
  • n (int) – Maximum number of rows to show.
  • width (int) – Horizontal width at which to break columns.
  • truncate (int, optional) – Truncate each field to the given number of characters. If None, truncate fields to the given width.
  • types (bool) – Print an extra header line with the type of each field.
take(n)[source]

Collect the first n records of an expression.

Examples

Take the first three rows:

>>> first3 = table1.X.take(3)
>>> first3
[5, 6, 7]

Warning

Extremely experimental.

Parameters:n (int) – Number of records to take.
Returns:list
class hail.expr.expressions.ArrayExpression(ir: hail.ir.base_ir.IR, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]

Bases: hail.expr.expressions.typed_expressions.CollectionExpression

Expression of type tarray.

>>> names = hl.literal(['Alice', 'Bob', 'Charlie'])
__getitem__(item)[source]

Index into or slice the array.

Examples

Index with a single integer:

>>> hl.eval(names[1])
'Bob'
>>> hl.eval(names[-1])
'Charlie'

Slicing is also supported:

>>> hl.eval(names[1:])
['Bob', 'Charlie']
Parameters:item (slice or Expression of type tint32) – Index or slice.
Returns:Expression – Element or array slice.
append(item)[source]

Append an element to the array and return the result.

Examples

>>> hl.eval(names.append('Dan'))
['Alice', 'Bob', 'Charlie', 'Dan']

Note

This method does not mutate the caller, but instead returns a new array by copying the caller and adding item.

Parameters:item (Expression) – Element to append, same type as the array element type.
Returns:ArrayExpression
contains(item)[source]

Returns a boolean indicating whether item is found in the array.

Examples

>>> hl.eval(names.contains('Charlie'))
True
>>> hl.eval(names.contains('Helen'))
False
Parameters:item (Expression) – Item for inclusion test.

Warning

This method takes time proportional to the length of the array. If a pipeline uses this method on the same array several times, it may be more efficient to convert the array to a set first (hl.set()).

Returns:BooleanExpression – True if the element is found in the array, False otherwise.
extend(a)[source]

Concatenate two arrays and return the result.

Examples

>>> hl.eval(names.extend(['Dan', 'Edith']))
['Alice', 'Bob', 'Charlie', 'Dan', 'Edith']
Parameters:a (ArrayExpression) – Array to concatenate, same type as the callee.
Returns:ArrayExpression
scan(f, zero)[source]

Map each element of the array to the cumulative value of function f, with initial value zero.

Examples

>>> a = [0, 1, 2]
>>> hl.eval(hl.array_scan(lambda i, j: i + j, 0, a))
[0, 0, 1, 3]
Parameters:
  • f (function ( (Expression, Expression) -> Expression)) – Function which takes the cumulative value and the next element, and returns a new value.
  • zero (Expression) – Initial value to pass in as left argument of f.
Returns:ArrayExpression
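The cumulative behaviour of scan can be mirrored in plain Python. The sketch below does not call Hail; the helper name scan is only an illustration, built on itertools.accumulate, and it reproduces the output of the example above, including the leading initial value.

```python
from itertools import accumulate


def scan(f, zero, values):
    """Plain-Python analogue (not Hail's implementation).

    Returns [zero, f(zero, v0), f(f(zero, v0), v1), ...], so the result
    has one more element than the input.
    """
    return list(accumulate(values, f, initial=zero))


running_sums = scan(lambda i, j: i + j, 0, [0, 1, 2])
print(running_sums)  # [0, 0, 1, 3]
```

The result is one element longer than the input because the initial value is emitted first.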

class hail.expr.expressions.ArrayNumericExpression(ir: hail.ir.base_ir.IR, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]

Bases: hail.expr.expressions.typed_expressions.ArrayExpression

Expression of type tarray with a numeric type.

Numeric arrays support arithmetic both with scalar values and other arrays. Arithmetic between two numeric arrays requires that the length of each array is identical, and will apply the operation positionally (a1 * a2 will multiply the first element of a1 by the first element of a2, the second element of a1 by the second element of a2, and so on). Arithmetic with a scalar will apply the operation to each element of the array.

>>> a1 = hl.literal([0, 1, 2, 3, 4, 5])
>>> a2 = hl.literal([1, -1, 1, -1, 1, -1])
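The positional rule described above can be sketched in plain Python. This is an illustration only and does not use Hail: the helper positional is hypothetical, and raising ValueError on a length mismatch is an assumption of the sketch rather than a statement about Hail's error type.

```python
import operator
from numbers import Number


def positional(op, left, right):
    """Apply op element by element (plain-Python sketch, not Hail).

    A scalar right operand is applied to every element of left; an array
    right operand must have the same length as left.
    """
    if isinstance(right, Number):
        return [op(x, right) for x in left]
    if len(left) != len(right):
        # Assumption of this sketch: mismatched lengths are an error.
        raise ValueError("arrays must have the same length")
    return [op(x, y) for x, y in zip(left, right)]


a1 = [0, 1, 2, 3, 4, 5]
a2 = [1, -1, 1, -1, 1, -1]
print(positional(operator.add, a1, 5))   # [5, 6, 7, 8, 9, 10]
print(positional(operator.mul, a1, a2))  # [0, -1, 2, -3, 4, -5]
```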
__add__(other)[source]

Positionally add an array or a scalar.

Examples

>>> hl.eval(a1 + 5)
[5, 6, 7, 8, 9, 10]
>>> hl.eval(a1 + a2)
[1, 0, 3, 2, 5, 4]
Parameters:other (NumericExpression or ArrayNumericExpression) – Value or array to add.
Returns:ArrayNumericExpression – Array of positional sums.
__floordiv__(other)[source]

Positionally divide by an array or a scalar using floor division.

Examples

>>> hl.eval(a1 // 2)
[0, 0, 1, 1, 2, 2]
Parameters:other (NumericExpression or ArrayNumericExpression)
Returns:ArrayNumericExpression
__mod__(other)[source]

Positionally compute the left modulo the right.

Examples

>>> hl.eval(a1 % 2)
[0, 1, 0, 1, 0, 1]
Parameters:other (NumericExpression or ArrayNumericExpression)
Returns:ArrayNumericExpression
__mul__(other)[source]

Positionally multiply by an array or a scalar.

Examples

>>> hl.eval(a2 * 5)
[5, -5, 5, -5, 5, -5]
>>> hl.eval(a1 * a2)
[0, -1, 2, -3, 4, -5]
Parameters:other (NumericExpression or ArrayNumericExpression) – Value or array to multiply by.
Returns:ArrayNumericExpression – Array of positional products.
__neg__()[source]

Negate elements of the array.

Examples

>>> hl.eval(-a1)
[0, -1, -2, -3, -4, -5]
Returns:ArrayNumericExpression – Array expression of the same type.
__pow__(other)[source]

Positionally raise to the power of an array or a scalar.

Examples

>>> hl.eval(a1 ** 2)
[0.0, 1.0, 4.0, 9.0, 16.0, 25.0]
>>> hl.eval(a1 ** a2)
[0.0, 1.0, 2.0, 0.3333333333333333, 4.0, 0.2]
Parameters:other (NumericExpression or ArrayNumericExpression)
Returns:ArrayNumericExpression
__sub__(other)[source]

Positionally subtract an array or a scalar.

Examples

>>> hl.eval(a2 - 1)
[0, -2, 0, -2, 0, -2]
>>> hl.eval(a1 - a2)
[-1, 2, 1, 4, 3, 6]
Parameters:other (NumericExpression or ArrayNumericExpression) – Value or array to subtract.
Returns:ArrayNumericExpression – Array of positional differences.
class hail.expr.expressions.BooleanExpression(ir: hail.ir.base_ir.IR, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]

Bases: hail.expr.expressions.typed_expressions.NumericExpression

Expression of type tbool.

>>> t = hl.literal(True)
>>> f = hl.literal(False)
>>> na = hl.null(hl.tbool)
>>> hl.eval(t)
True
>>> hl.eval(f)
False
>>> hl.eval(na)
None
__and__(other)[source]

Return True if the left and right arguments are True.

Examples

>>> hl.eval(t & f)
False
>>> hl.eval(t & na)
None
>>> hl.eval(f & na)
False

The & and | operators have higher priority than comparison operators like ==, <, or >. Parentheses are often necessary:

>>> x = hl.literal(5)
>>> hl.eval((x < 10) & (x > 2))
True
Parameters:other (BooleanExpression) – Right-side operand.
Returns:BooleanExpression – True if both left and right are True.
__invert__()[source]

Return the boolean negation.

Examples

>>> hl.eval(~t)
False
>>> hl.eval(~f)
True
>>> hl.eval(~na)
None
Returns:BooleanExpression – Boolean negation.
__or__(other)[source]

Return True if at least one of the left and right arguments is True.

Examples

>>> hl.eval(t | f)
True
>>> hl.eval(t | na)
True
>>> hl.eval(f | na)
None

The & and | operators have higher priority than comparison operators like ==, <, or >. Parentheses are often necessary:

>>> x = hl.literal(5)
>>> hl.eval((x < 10) | (x > 20))
True
Parameters:other (BooleanExpression) – Right-side operand.
Returns:BooleanExpression – True if either left or right is True.
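The missing-value results shown for & and | follow a three-valued pattern: a missing operand only matters when the other operand does not already decide the answer. The plain-Python sketch below models that pattern with None standing for missing. The function names and3 and or3 are hypothetical and the code does not call Hail.

```python
def and3(left, right):
    """Three-valued AND; None stands for a missing value."""
    if left is False or right is False:
        return False  # one False operand decides the result
    if left is None or right is None:
        return None   # otherwise a missing operand makes the result missing
    return True


def or3(left, right):
    """Three-valued OR; None stands for a missing value."""
    if left is True or right is True:
        return True   # one True operand decides the result
    if left is None or right is None:
        return None
    return False


print(and3(True, None), and3(False, None))  # None False
print(or3(True, None), or3(False, None))    # True None
```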
class hail.expr.expressions.CallExpression(ir: hail.ir.base_ir.IR, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]

Bases: hail.expr.expressions.base_expression.Expression

Expression of type tcall.

>>> call = hl.call(0, 1, phased=False)
__getitem__(item)[source]

Get the i-th allele.

Examples

Index with a single integer:

>>> hl.eval(call[0])
0
>>> hl.eval(call[1])
1
Parameters:item (int or Expression of type tint32) – Allele index.
Returns:Expression of type tint32
is_diploid()[source]

True if the call has ploidy equal to 2.

Examples

>>> hl.eval(call.is_diploid())
True
Returns:BooleanExpression
is_haploid()[source]

True if the call has ploidy equal to 1.

Examples

>>> hl.eval(call.is_haploid())
False
Returns:BooleanExpression
is_het()[source]

Evaluate whether the call includes two different alleles.

Examples

>>> hl.eval(call.is_het())
True
Returns:BooleanExpression – True if the two alleles are different, False if they are the same.
is_het_non_ref()[source]

Evaluate whether the call includes two different alleles, neither of which is reference.

Examples

>>> hl.eval(call.is_het_non_ref())
False
Returns:BooleanExpression – True if the call includes two different alternate alleles, False otherwise.
is_het_ref()[source]

Evaluate whether the call includes two different alleles, one of which is reference.

Examples

>>> hl.eval(call.is_het_ref())
True
Returns:BooleanExpression – True if the call includes one reference and one alternate allele, False otherwise.
is_hom_ref()[source]

Evaluate whether the call includes two reference alleles.

Examples

>>> hl.eval(call.is_hom_ref())
False
Returns:BooleanExpression – True if the call includes two reference alleles, False otherwise.
is_hom_var()[source]

Evaluate whether the call includes two identical alternate alleles.

Examples

>>> hl.eval(call.is_hom_var())
False
Returns:BooleanExpression – True if the call includes two identical alternate alleles, False otherwise.
is_non_ref()[source]

Evaluate whether the call includes one or more non-reference alleles.

Examples

>>> hl.eval(call.is_non_ref())
True
Returns:BooleanExpression – True if at least one allele is non-reference, False otherwise.
n_alt_alleles()[source]

Returns the number of non-reference alleles.

Examples

>>> hl.eval(call.n_alt_alleles())
1
Returns:Expression of type tint32 – The number of non-reference alleles.
one_hot_alleles(alleles)[source]

Returns an array containing the summed one-hot encoding of the alleles.

Examples

>>> hl.eval(call.one_hot_alleles(['A', 'T']))
[1, 1]

This one-hot representation is the positional sum of the one-hot encoding for each called allele. For a biallelic variant, the one-hot encoding for a reference allele is [1, 0] and the one-hot encoding for an alternate allele is [0, 1]. Diploid calls would produce the following arrays: [2, 0] for homozygous reference, [1, 1] for heterozygous, and [0, 2] for homozygous alternate.

Parameters:alleles (ArrayStringExpression) – Variant alleles.
Returns:ArrayInt32Expression – An array of summed one-hot encodings of allele indices.
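The summed one-hot encoding described above amounts to counting how often each allele index appears in the call. A plain-Python sketch of that counting follows. It does not use Hail, and it takes the number of alleles rather than the allele strings, which is a simplification made for illustration.

```python
def one_hot_alleles(call, n_alleles):
    """Sum the one-hot encodings of each called allele index.

    call is a tuple of allele indices such as (0, 1); n_alleles is the
    number of alleles at the variant. Plain-Python sketch, not Hail.
    """
    counts = [0] * n_alleles
    for allele_index in call:
        counts[allele_index] += 1
    return counts


print(one_hot_alleles((0, 0), 2))  # [2, 0]  homozygous reference
print(one_hot_alleles((0, 1), 2))  # [1, 1]  heterozygous
print(one_hot_alleles((1, 1), 2))  # [0, 2]  homozygous alternate
```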
phased

True if the call is phased.

Examples

>>> hl.eval(call.phased)
False
Returns:BooleanExpression
ploidy

Return the number of alleles of this call.

Examples

>>> hl.eval(call.ploidy)
2
Returns:Expression of type tint32
unphased_diploid_gt_index()[source]

Return the genotype index for unphased, diploid calls.

Examples

>>> hl.eval(call.unphased_diploid_gt_index())
1
Returns:Expression of type tint32
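The text above does not spell out how the genotype index is computed. Assuming it follows the usual VCF ordering of unphased diploid genotypes (0/0, 0/1, 1/1, 0/2, 1/2, 2/2, ...), the index of a call with alleles j <= k is k * (k + 1) / 2 + j. The sketch below implements that formula in plain Python; the function name is only illustrative, and the ordering is an assumption, not something the text confirms.

```python
def unphased_diploid_gt_index(allele_a, allele_b):
    """Index of an unphased diploid genotype under the VCF ordering.

    Assumption of this sketch: genotypes are ordered 0/0, 0/1, 1/1,
    0/2, 1/2, 2/2, ... so the index is k * (k + 1) // 2 + j for j <= k.
    """
    j, k = sorted((allele_a, allele_b))
    return k * (k + 1) // 2 + j


print(unphased_diploid_gt_index(0, 1))  # 1
print(unphased_diploid_gt_index(1, 2))  # 4
```

Under this assumption the call 0/1 has index 1, which agrees with the example above.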
class hail.expr.expressions.CollectionExpression(ir: hail.ir.base_ir.IR, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]

Bases: hail.expr.expressions.base_expression.Expression

Expression of type tarray or tset.

>>> a = hl.literal([1, 2, 3, 4, 5])
>>> s3 = hl.literal({'Alice', 'Bob', 'Charlie'})
all(f)[source]

Returns True if f returns True for every element.

Examples

>>> hl.eval(a.all(lambda x: x < 10))
True

Notes

This method returns True if the collection is empty.

Parameters:f (function ( (arg) -> BooleanExpression)) – Function to evaluate for each element of the collection. Must return a BooleanExpression.
Returns:BooleanExpression – True if f returns True for every element, False otherwise.
any(f)[source]

Returns True if f returns True for any element.

Examples

>>> hl.eval(a.any(lambda x: x % 2 == 0))
True
>>> hl.eval(s3.any(lambda x: x[0] == 'D'))
False

Notes

This method always returns False for empty collections.

Parameters:f (function ( (arg) -> BooleanExpression)) – Function to evaluate for each element of the collection. Must return a BooleanExpression.
Returns:BooleanExpression – True if f returns True for any element, False otherwise.
filter(f)[source]

Returns a new collection containing elements where f returns True.

Examples

>>> hl.eval(a.filter(lambda x: x % 2 == 0))
[2, 4]
>>> hl.eval(s3.filter(lambda x: ~(x[-1] == 'e')))
{'Bob'}

Notes

Returns a same-type expression; evaluated on a SetExpression, returns a SetExpression. Evaluated on an ArrayExpression, returns an ArrayExpression.

Parameters:f (function ( (arg) -> BooleanExpression)) – Function to evaluate for each element of the collection. Must return a BooleanExpression.
Returns:CollectionExpression – Expression of the same type as the callee.
find(f)[source]

Returns the first element where f returns True.

Examples

>>> hl.eval(a.find(lambda x: x ** 2 > 20))
5
>>> hl.eval(s3.find(lambda x: x[0] == 'D'))
None

Notes

If f returns False for every element, then the result is missing.

Parameters:f (function ( (arg) -> BooleanExpression)) – Function to evaluate for each element of the collection. Must return a BooleanExpression.
Returns:Expression – Expression whose type is the element type of the collection.
flatmap(f)[source]

Map each element of the collection to a new collection, and flatten the results.

Examples

>>> hl.eval(a.flatmap(lambda x: hl.range(0, x)))
[0, 0, 1, 0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 3, 4]
>>> hl.eval(s3.flatmap(lambda x: hl.set(hl.range(0, x.length()).map(lambda i: x[i]))))
{'A', 'B', 'C', 'a', 'b', 'c', 'e', 'h', 'i', 'l', 'o', 'r'}
Parameters:f (function ( (arg) -> CollectionExpression)) – Function from the element type of the collection to the type of the collection. For instance, flatmap on a set<str> should take a str and return a set.
Returns:CollectionExpression
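For readers more familiar with plain Python, the map-then-flatten behaviour corresponds to a nested comprehension. The sketch below is an analogue for lists only and does not call Hail; the helper name flatmap is illustrative.

```python
def flatmap(f, values):
    """Apply f to each element and concatenate the resulting sequences."""
    return [item for value in values for item in f(value)]


flat = flatmap(lambda x: range(0, x), [1, 2, 3, 4, 5])
print(flat)  # [0, 0, 1, 0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 3, 4]
```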
fold(f, zero)[source]

Reduces the collection with the given function f, provided the initial value zero.

Examples

>>> a = [0, 1, 2]
>>> hl.eval(hl.fold(lambda i, j: i + j, 0, a))
3
Parameters:
  • f (function ( (Expression, Expression) -> Expression)) – Function which takes the cumulative value and the next element, and returns a new value.
  • zero (Expression) – Initial value to pass in as left argument of f.
Returns:Expression
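The reduction performed by fold has the same shape as functools.reduce with an initial value. The plain-Python sketch below shows the correspondence; it is an analogue for Python lists, not Hail code, and the helper name fold is illustrative.

```python
from functools import reduce


def fold(f, zero, values):
    """Combine values left to right, starting from zero."""
    return reduce(f, values, zero)


total = fold(lambda i, j: i + j, 0, [0, 1, 2])
print(total)  # 3
```

With an empty collection the result is simply the initial value.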

group_by(f)[source]

Group elements into a dict according to a lambda function.

Examples

>>> hl.eval(a.group_by(lambda x: x % 2 == 0))
{False: [1, 3, 5], True: [2, 4]}
>>> hl.eval(s3.group_by(lambda x: x.length()))
{3: {'Bob'}, 5: {'Alice'}, 7: {'Charlie'}}
Parameters:f (function ( (arg) -> Expression)) – Function to evaluate for each element of the collection to produce a key for the resulting dictionary.
Returns:DictExpression – Dictionary keyed by results of f.
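The grouping step can be pictured with a plain-Python dictionary of lists. The sketch below covers only the array case and does not call Hail; the helper name group_by is illustrative.

```python
from collections import defaultdict


def group_by(f, values):
    """Group values into a dict keyed by f(value), keeping input order."""
    groups = defaultdict(list)
    for value in values:
        groups[f(value)].append(value)
    return dict(groups)


by_parity = group_by(lambda x: x % 2 == 0, [1, 2, 3, 4, 5])
print(by_parity)  # {False: [1, 3, 5], True: [2, 4]}
```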
length()[source]

Returns the size of a collection.

Examples

>>> hl.eval(a.length())
5
>>> hl.eval(s3.length())
3
Returns:Expression of type tint32 – The number of elements in the collection.
map(f)[source]

Transform each element of a collection.

Examples

>>> hl.eval(a.map(lambda x: x ** 3))
[1.0, 8.0, 27.0, 64.0, 125.0]
>>> hl.eval(s3.map(lambda x: x.length()))
{3, 5, 7}
Parameters:f (function ( (arg) -> Expression)) – Function to transform each element of the collection.
Returns:CollectionExpression – Collection where each element has been transformed according to f.
size()[source]

Returns the size of a collection.

Examples

>>> hl.eval(a.size())
5
>>> hl.eval(s3.size())
3
Returns:Expression of type tint32 – The number of elements in the collection.
class hail.expr.expressions.DictExpression(ir, type, indices=Indices(axes=set(), source=None), aggregations=List())[source]

Bases: hail.expr.expressions.base_expression.Expression

Expression of type tdict.

>>> d = hl.literal({'Alice': 43, 'Bob': 33, 'Charles': 44})
__getitem__(item)[source]

Get the value associated with key item.

Examples

>>> hl.eval(d['Alice'])
43

Notes

Raises an error if item is not a key of the dictionary. Use DictExpression.get() to return missing instead of an error.

Parameters:item (Expression) – Key expression.
Returns:Expression – Value associated with key item.
contains(item)[source]

Returns whether a given key is present in the dictionary.

Examples

>>> hl.eval(d.contains('Alice'))
True
>>> hl.eval(d.contains('Anne'))
False
Parameters:item (Expression) – Key to test for inclusion.
Returns:BooleanExpression – True if item is a key of the dictionary, False otherwise.
get(item, default=None)[source]

Returns the value associated with key item, or a default value if that key is not present.

Examples

>>> hl.eval(d.get('Alice'))
43
>>> hl.eval(d.get('Anne'))
None
>>> hl.eval(d.get('Anne', 0))
0
Parameters:
  • item (Expression) – Key.
  • default (Expression) – Default value. Must be same type as dictionary values.
Returns:Expression – The value associated with item, or default.

key_set()[source]

Returns the set of keys in the dictionary.

Examples

>>> hl.eval(d.key_set())
{'Alice', 'Bob', 'Charles'}
Returns:SetExpression – Set of all keys.
keys()[source]

Returns an array with all keys in the dictionary.

Examples

>>> hl.eval(d.keys())
['Bob', 'Charles', 'Alice']
Returns:ArrayExpression – Array of all keys.
map_values(f)[source]

Transform values of the dictionary according to a function.

Examples

>>> hl.eval(d.map_values(lambda x: x * 10))
{'Alice': 430, 'Bob': 330, 'Charles': 440}
Parameters:f (function ( (arg) -> Expression)) – Function to apply to each value.
Returns:DictExpression – Dictionary with transformed values.
size()[source]

Returns the size of the dictionary.

Examples

>>> hl.eval(d.size())
3
Returns:Expression of type tint32 – Size of the dictionary.
values()[source]

Returns an array with all values in the dictionary.

Examples

>>> hl.eval(d.values())
[33, 44, 43]
Returns:ArrayExpression – All values in the dictionary.
class hail.expr.expressions.IntervalExpression(ir: hail.ir.base_ir.IR, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]

Bases: hail.expr.expressions.base_expression.Expression

Expression of type tinterval.

>>> interval = hl.interval(3, 11)
>>> locus_interval = hl.parse_locus_interval("1:53242-90543")
contains(value)[source]

Tests whether a value is contained in the interval.

Examples

>>> hl.eval(interval.contains(3))
True
>>> hl.eval(interval.contains(11))
False
Parameters:value – Object with type matching the interval point type.
Returns:BooleanExpression – True if value is contained in the interval, False otherwise.
end

Returns the end point.

Examples

>>> hl.eval(interval.end)
11
Returns:Expression
includes_end

True if the interval includes the end point.

Examples

>>> hl.eval(interval.includes_end)
False
Returns:BooleanExpression
includes_start

True if the interval includes the start point.

Examples

>>> hl.eval(interval.includes_start)
True
Returns:BooleanExpression
overlaps(interval)[source]

True if the supplied interval contains any value in common with this one.

Examples

>>> hl.eval(interval.overlaps(hl.interval(5, 9)))
True
>>> hl.eval(interval.overlaps(hl.interval(11, 20)))
False
Parameters:interval (Expression with type tinterval) – Interval object with the same point type.
Returns:BooleanExpression
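The examples above show an interval that includes its start point but not its end point. A plain-Python model of contains and overlaps under those endpoint flags is sketched below. The Interval class here is hypothetical, it assumes non-empty intervals, and it does not call Hail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Plain-Python model of an interval with endpoint flags (not Hail)."""
    start: int
    end: int
    includes_start: bool = True   # matches includes_start in the examples
    includes_end: bool = False    # matches includes_end in the examples

    def contains(self, value):
        after_start = value > self.start or (self.includes_start and value == self.start)
        before_end = value < self.end or (self.includes_end and value == self.end)
        return after_start and before_end

    def _ends_before(self, other):
        # True if every point of self lies strictly before every point of other.
        if self.end != other.start:
            return self.end < other.start
        return not (self.includes_end and other.includes_start)

    def overlaps(self, other):
        return not self._ends_before(other) and not other._ends_before(self)


interval = Interval(3, 11)
print(interval.contains(3), interval.contains(11))  # True False
print(interval.overlaps(Interval(5, 9)))            # True
print(interval.overlaps(Interval(11, 20)))          # False
```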
start

Returns the start point.

Examples

>>> hl.eval(interval.start)
3
Returns:Expression
class hail.expr.expressions.LocusExpression(ir: hail.ir.base_ir.IR, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]

Bases: hail.expr.expressions.base_expression.Expression

Expression of type tlocus.

>>> locus = hl.locus('1', 1034245)
contig

Returns the chromosome.

Examples

>>> hl.eval(locus.contig)
'1'
Returns:StringExpression – The chromosome for this locus.
global_position()[source]

Returns a zero-indexed absolute position along the reference genome.

The global position is computed as position - 1 plus the sum of the lengths of all the contigs that precede this locus’s contig in the reference genome’s ordering of contigs.

See also locus_from_global_position().

Examples

A locus with position 1 along chromosome 1 will have a global position of 0 along the reference genome GRCh37.

>>> hl.eval(hl.locus('1', 1).global_position())
0

A locus with position 1 along chromosome 2 will have a global position of (1-1) + 249250621, where 249250621 is the length of chromosome 1 on GRCh37.

>>> hl.eval(hl.locus('2', 1).global_position())
249250621

A different reference genome than the default results in a different global position.

>>> hl.eval(hl.locus('chr2', 1, 'GRCh38').global_position())
248956422
Returns:Expression of type tint64 – Global base position of locus along the reference genome.
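The arithmetic behind the global position can be checked by hand. The plain-Python sketch below applies the rule stated above (position - 1 plus the lengths of all preceding contigs). It does not call Hail, the helper name and arguments are illustrative, and it uses only the GRCh37 chromosome 1 length quoted in the text.

```python
def global_position(contig, position, contig_order, contig_lengths):
    """Zero-indexed absolute position: (position - 1) + preceding contig lengths.

    contig_order lists contigs in reference order; contig_lengths maps each
    preceding contig to its length. Plain-Python sketch, not Hail.
    """
    preceding = contig_order[:contig_order.index(contig)]
    return (position - 1) + sum(contig_lengths[c] for c in preceding)


# Only the length of chromosome 1 on GRCh37 is needed for these examples.
order = ["1", "2"]
lengths = {"1": 249250621}

print(global_position("1", 1, order, lengths))  # 0
print(global_position("2", 1, order, lengths))  # 249250621
```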
in_autosome()[source]

Returns True if the locus is on an autosome.

Notes

All contigs are considered autosomal except those designated as X, Y, or MT by ReferenceGenome.

Examples

>>> hl.eval(locus.in_autosome())
True
Returns:BooleanExpression
in_autosome_or_par()[source]

Returns True if the locus is on an autosome or a pseudoautosomal region of chromosome X or Y.

Examples

>>> hl.eval(locus.in_autosome_or_par())
True
Returns:BooleanExpression
in_mito()[source]

Returns True if the locus is on mitochondrial DNA.

Examples

>>> hl.eval(locus.in_mito())
False
Returns:BooleanExpression
in_x_nonpar()[source]

Returns True if the locus is in a non-pseudoautosomal region of chromosome X.

Examples

>>> hl.eval(locus.in_x_nonpar())
False
Returns:BooleanExpression
in_x_par()[source]

Returns True if the locus is in a pseudoautosomal region of chromosome X.

Examples

>>> hl.eval(locus.in_x_par())
False
Returns:BooleanExpression
in_y_nonpar()[source]

Returns True if the locus is in a non-pseudoautosomal region of chromosome Y.

Examples

>>> hl.eval(locus.in_y_nonpar())
False

Note

Many variant callers only generate variants on chromosome X for the pseudoautosomal region. In this case, all loci mapped to chromosome Y are non-pseudoautosomal.

Returns:BooleanExpression
in_y_par()[source]

Returns True if the locus is in a pseudoautosomal region of chromosome Y.

Examples

>>> hl.eval(locus.in_y_par())
False

Note

Many variant callers only generate variants on chromosome X for the pseudoautosomal region. In this case, all loci mapped to chromosome Y are non-pseudoautosomal.

Returns:BooleanExpression
position

Returns the position along the chromosome.

Examples

>>> hl.eval(locus.position)
1034245
Returns:Expression of type tint32 – This locus’s position along its chromosome.
sequence_context(before=0, after=0)[source]

Return the reference genome sequence at the locus.

Examples

Get the reference allele at a locus:

>>> hl.eval(locus.sequence_context())
"G"

Get the reference sequence at a locus including the previous 5 bases:

>>> hl.eval(locus.sequence_context(before=5))
"ACTCGG"

Notes

This function requires that this locus’ reference genome has an attached reference sequence. Use ReferenceGenome.add_sequence() to load and attach a reference sequence to a reference genome.

Parameters:
  • before (Expression of type tint32, optional) – Number of bases to include before the locus. Truncates at contig boundary.
  • after (Expression of type tint32, optional) – Number of bases to include after the locus. Truncates at contig boundary.
Returns:StringExpression

class hail.expr.expressions.NumericExpression(ir: hail.ir.base_ir.IR, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]

Bases: hail.expr.expressions.base_expression.Expression

Expression of numeric type.

>>> x = hl.literal(3)
>>> y = hl.literal(4.5)
__add__(other)[source]

Add two numbers.

Examples

>>> hl.eval(x + 2)
5
>>> hl.eval(x + y)
7.5
Parameters:other (NumericExpression) – Number to add.
Returns:NumericExpression – Sum of the two numbers.
__floordiv__(other)[source]

Divide two numbers with floor division.

Examples

>>> hl.eval(x // 2)
1
>>> hl.eval(y // 2)
2.0
Parameters:other (NumericExpression) – Divisor.
Returns:NumericExpression – The floor of the left number divided by the right.
__ge__(other)[source]

Greater-than-or-equals comparison.

Examples

>>> hl.eval(y >= 4)
True
Parameters:other (NumericExpression) – Right side for comparison.
Returns:BooleanExpression – True if the left side is greater than or equal to the right side.
__gt__(other)[source]

Greater-than comparison.

Examples

>>> hl.eval(y > 4)
True
Parameters:other (NumericExpression) – Right side for comparison.
Returns:BooleanExpression – True if the left side is greater than the right side.
__le__(other)[source]

Less-than-or-equals comparison.

Examples

>>> hl.eval(x <= 3)
True
Parameters:other (NumericExpression) – Right side for comparison.
Returns:BooleanExpression – True if the left side is smaller than or equal to the right side.
__lt__(other)[source]

Less-than comparison.

Examples

>>> hl.eval(x < 5)
True
Parameters:other (NumericExpression) – Right side for comparison.
Returns:BooleanExpression – True if the left side is smaller than the right side.
__mod__(other)[source]

Compute the left modulo the right number.

Examples

>>> hl.eval(32 % x)
2
>>> hl.eval(7 % y)
2.5
Parameters:other (NumericExpression) – Divisor.
Returns:NumericExpression – Remainder after dividing the left by the right.
__mul__(other)[source]

Multiply two numbers.

Examples

>>> hl.eval(x * 2)
6
>>> hl.eval(x * y)
9.0
Parameters:other (NumericExpression) – Number to multiply.
Returns:NumericExpression – Product of the two numbers.
__neg__()[source]

Negate the number (multiply by -1).

Examples

>>> hl.eval(-x)
-3
Returns:NumericExpression – Negated number.
__pow__(power, modulo=None)[source]

Raise the left to the right power.

Examples

>>> hl.eval(x ** 2)
9.0
>>> hl.eval(x ** -2)
0.1111111111111111
>>> hl.eval(y ** 1.5)
9.545941546018392
Parameters:
  • power (NumericExpression) – Exponent.
  • modulo – Unsupported argument.
Returns:Expression of type tfloat64 – Result of raising left to the right power.

__sub__(other)[source]

Subtract the right number from the left.

Examples

>>> hl.eval(x - 2)
1
>>> hl.eval(x - y)
-1.5
Parameters:other (NumericExpression) – Number to subtract.
Returns:NumericExpression – Difference of the two numbers.
class hail.expr.expressions.Int32Expression(ir: hail.ir.base_ir.IR, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]

Bases: hail.expr.expressions.typed_expressions.NumericExpression

Expression of type tint32.

class hail.expr.expressions.Int64Expression(ir: hail.ir.base_ir.IR, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]

Bases: hail.expr.expressions.typed_expressions.NumericExpression

Expression of type tint64.

class hail.expr.expressions.Float32Expression(ir: hail.ir.base_ir.IR, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]

Bases: hail.expr.expressions.typed_expressions.NumericExpression

Expression of type tfloat32.

class hail.expr.expressions.Float64Expression(ir: hail.ir.base_ir.IR, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]

Bases: hail.expr.expressions.typed_expressions.NumericExpression

Expression of type tfloat64.

class hail.expr.expressions.SetExpression(ir, type, indices=Indices(axes=set(), source=None), aggregations=List())[source]

Bases: hail.expr.expressions.typed_expressions.CollectionExpression

Expression of type tset.

>>> s1 = hl.literal({1, 2, 3})
>>> s2 = hl.literal({1, 3, 5})
add(item)[source]

Returns a new set including item.

Examples

>>> hl.eval(s1.add(10))
{1, 2, 3, 10}
Parameters:item (Expression) – Value to add.
Returns:SetExpression – Set with item added.
contains(item)[source]

Returns True if item is in the set.

Examples

>>> hl.eval(s1.contains(1))
True
>>> hl.eval(s1.contains(10))
False
Parameters:item (Expression) – Value for inclusion test.
Returns:BooleanExpression – True if item is in the set.
difference(s)[source]

Return the set of elements in the set that are not present in set s.

Examples

>>> hl.eval(s1.difference(s2))
{2}
>>> hl.eval(s2.difference(s1))
{5}
Parameters:s (SetExpression) – Set expression of the same type.
Returns:SetExpression – Set of elements not in s.
intersection(s)[source]

Return the intersection of the set and set s.

Examples

>>> hl.eval(s1.intersection(s2))
{1, 3}
Parameters:s (SetExpression) – Set expression of the same type.
Returns:SetExpression – Set of elements present in both sets.
is_subset(s)[source]

Returns True if every element is contained in set s.

Examples

>>> hl.eval(s1.is_subset(s2))
False
>>> hl.eval(s1.remove(2).is_subset(s2))
True
Parameters:s (SetExpression) – Set expression of the same type.
Returns:BooleanExpression – True if every element is contained in set s.
remove(item)[source]

Returns a new set excluding item.

Examples

>>> hl.eval(s1.remove(1))
{2, 3}
Parameters:item (Expression) – Value to remove.
Returns:SetExpression – Set with item removed.
union(s)[source]

Return the union of the set and set s.

Examples

>>> hl.eval(s1.union(s2))
{1, 2, 3, 5}
Parameters:s (SetExpression) – Set expression of the same type.
Returns:SetExpression – Set of elements present in either set.
class hail.expr.expressions.StringExpression(ir: hail.ir.base_ir.IR, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]

Bases: hail.expr.expressions.base_expression.Expression

Expression of type tstr.

>>> s = hl.literal('The quick brown fox')
__add__(other)[source]

Concatenate strings.

Examples

>>> hl.eval(s + ' jumped over the lazy dog')
'The quick brown fox jumped over the lazy dog'
Parameters:other (StringExpression) – String to concatenate.
Returns:StringExpression – Concatenated string.
__getitem__(item)[source]

Slice or index into the string.

Examples

>>> hl.eval(s[:15])
'The quick brown'
>>> hl.eval(s[0])
'T'
Parameters:item (slice or Expression of type tint32) – Slice or character index.
Returns:StringExpression – Substring or character at index item.
contains(substr)[source]

Returns whether substr is contained in the string.

Examples

>>> hl.eval(s.contains('fox'))
True
>>> hl.eval(s.contains('dog'))
False

Note

This method is case-sensitive.

Parameters:substr (StringExpression)
Returns:BooleanExpression
endswith(substr)[source]

Returns whether substr is a suffix of the string.

Examples

>>> hl.eval(s.endswith('fox'))
True

Note

This method is case-sensitive.

Parameters:substr (StringExpression)
Returns:BooleanExpression
first_match_in(regex)[source]

Returns an array containing the capture groups of the first match of regex in the given character sequence.

Examples

>>> hl.eval(s.first_match_in(r"The quick (\w+) fox"))
['brown']
>>> hl.eval(s.first_match_in(r"The (\w+) (\w+) (\w+)"))
['quick', 'brown', 'fox']
>>> hl.eval(s.first_match_in(r"(\w+) (\w+)"))
['The', 'quick']
Parameters:regex (StringExpression)
Returns:ArrayExpression with element type tstr
length()[source]

Returns the length of the string.

Examples

>>> hl.eval(s.length())
19
Returns:Expression of type tint32 – Length of the string.
lower()[source]

Returns a copy of the string, but with upper case letters converted to lower case.

Examples

>>> hl.eval(s.lower())
'the quick brown fox'
Returns:StringExpression
matches(regex)[source]

Returns True if the string contains any match for the given regex.

Examples

>>> string = hl.literal('NA12878')

The regex parameter does not need to match the entire string:

>>> hl.eval(string.matches('12'))
True

Regex motifs can be used to match sequences of characters:

>>> hl.eval(string.matches(r'NA\d+'))
True

Notes

The regex argument is a regular expression, and uses Java regex syntax.

Parameters:regex (str) – Pattern to match.
Returns:BooleanExpression – True if the string contains any match for the regex, otherwise False.
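
As a rough analogy in plain Python (not Hail code, and Python's re module differs from Java regex in some details), the "contains any match" behaviour described above corresponds to re.search rather than re.fullmatch. The variable names below are illustrative only.

```python
import re

sample = 'NA12878'

# Search semantics: a match anywhere in the string is enough.
found_anywhere = re.search(r'12', sample) is not None

# A pattern with a digit class, written as a raw string so the
# backslash reaches the regex engine unchanged.
found_motif = re.search(r'NA\d+', sample) is not None

# Full-match semantics, shown for contrast: '12' does not cover the whole string.
covers_whole = re.fullmatch(r'12', sample) is not None

print(found_anywhere, found_motif, covers_whole)
```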
replace(pattern1, pattern2)[source]

Replace substrings matching pattern1 with pattern2 using regex.

Examples

>>> hl.eval(s.replace(' ', '_'))
'The_quick_brown_fox'

Notes

The regular expressions used should follow Java regex syntax.

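Parameters:
  • pattern1 (StringExpression) – Regex pattern to match.
  • pattern2 (StringExpression) – Replacement string.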
split(delim, n=None)[source]

Returns an array of strings generated by splitting the string at delim.

Examples

>>> hl.eval(s.split(r'\s+'))
['The', 'quick', 'brown', 'fox']
>>> hl.eval(s.split(r'\s+', 2))
['The', 'quick brown fox']

Notes

The delimiter is a regex that follows Java regex syntax. To split on special characters, escape them with a double backslash (\\).

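Parameters:
  • delim (StringExpression) – Delimiter regex.
  • n (Expression of type tint32, optional) – Limit on the number of strings in the result.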
Returns:

ArrayExpression – Array of split strings.
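
The second example above suggests that n limits the number of resulting strings, not the number of splits. The sketch below reproduces that behaviour in plain Python (not Hail code). split_with_limit is a hypothetical helper, it assumes n is either None or a positive integer, and Python and Java regex splitting differ in edge cases such as trailing empty strings.

```python
import re

text = 'The quick brown fox'

def split_with_limit(pattern, string, n=None):
    """Hypothetical helper: split into at most n pieces (None means no limit)."""
    if n is None:
        return re.split(pattern, string)
    if n == 1:
        # One piece means no splitting at all; maxsplit=0 would mean "unlimited".
        return [string]
    # re.split counts splits, so n pieces correspond to n - 1 splits.
    return re.split(pattern, string, maxsplit=n - 1)

print(split_with_limit(r'\s+', text))
print(split_with_limit(r'\s+', text, 2))
```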

startswith(substr)[source]

Returns whether substr is a prefix of the string.

Examples

>>> hl.eval(s.startswith('The'))
True
>>> hl.eval(s.startswith('the'))
False

Note

This method is case-sensitive.

Parameters:substr (StringExpression)
Returns:BooleanExpression
strip()[source]

Returns a copy of the string with whitespace removed from the start and end.

Examples

>>> s2 = hl.str('  once upon a time\n')
>>> hl.eval(s2.strip())
'once upon a time'
Returns:StringExpression
upper()[source]

Returns a copy of the string, but with lower case letters converted to upper case.

Examples

>>> hl.eval(s.upper())
'THE QUICK BROWN FOX'
Returns:StringExpression
class hail.expr.expressions.StructExpression(ir, type, indices=Indices(axes=set(), source=None), aggregations=List())[source]

Bases: typing.Mapping, hail.expr.expressions.base_expression.Expression

Expression of type tstruct.

>>> struct = hl.struct(a=5, b='Foo')

Struct fields are accessible as attributes and keys. It is therefore possible to access field a of struct with dot syntax:

>>> hl.eval(struct.a)
5

However, it is recommended to use square brackets to select fields:

>>> hl.eval(struct['a'])
5

The latter syntax is safer, because fields that share their name with an existing attribute of StructExpression (keys, values, annotate, drop, etc.) will only be accessible using the StructExpression.__getitem__() syntax. This is also the only way to access fields that are not valid Python identifiers, like fields with spaces or symbols.
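
The shadowing problem can be sketched with a small plain-Python class. FieldBag is a hypothetical stand-in written for illustration only; it is not how StructExpression is implemented, but it shows why a field named like an existing method is only reachable with square brackets.

```python
class FieldBag:
    """Hypothetical stand-in exposing fields both as attributes and as keys."""

    def __init__(self, **fields):
        self._fields = dict(fields)

    def keys(self):
        # A real method; a field with the same name cannot replace it.
        return list(self._fields)

    def __getitem__(self, name):
        return self._fields[name]

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so methods win.
        fields = self.__dict__.get('_fields', {})
        if name in fields:
            return fields[name]
        raise AttributeError(name)

bag = FieldBag(a=5, keys='shadowed field')

print(bag.a)               # field reached through dot syntax
print(callable(bag.keys))  # dot syntax finds the method, not the field
print(bag['keys'])         # square brackets always reach the field
```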

__getitem__(item)[source]

Access a field of the struct by name or index.

Examples

>>> hl.eval(struct['a'])
5
>>> hl.eval(struct[1])
'Foo'
Parameters:item (str or int) – Field name or index.
Returns:Expression – Struct field.
annotate(**named_exprs)[source]

Add new fields or recompute existing fields.

Examples

>>> hl.eval(struct.annotate(a=10, c=2*2*2))
Struct(a=10, b='Foo', c=8)

Notes

If an expression in named_exprs shares a name with a field of the struct, then that field will be replaced but keep its position in the struct. New fields will be appended to the end of the struct.

Parameters:named_exprs (keyword args of Expression) – Fields to add.
Returns:StructExpression – Struct with new or updated fields.
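
The ordering rule in the note above behaves much like updating an insertion-ordered Python dict. This is a plain-Python analogy under that assumption, not Hail code; the variable names are illustrative.

```python
fields = {'a': 5, 'b': 'Foo'}

# Work on a copy so the original mapping is left untouched.
updated = dict(fields)

# Updating overwrites 'a' in its existing position and appends the new key 'c'.
updated.update(a=10, c=2 * 2 * 2)

print(list(updated.items()))
```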
drop(*fields)[source]

Drop fields from the struct.

Examples

>>> hl.eval(struct.drop('b'))
Struct(a=5)
Parameters:fields (varargs of str) – Fields to drop.
Returns:StructExpression – Struct without certain fields.
select(*fields, **named_exprs)[source]

Select existing fields and compute new ones.

Examples

>>> hl.eval(struct.select('a', c=['bar', 'baz']))
Struct(a=5, c=['bar', 'baz'])

Notes

The fields argument is a list of field names to keep. These fields will appear in the resulting struct in the order they appear in fields.

The named_exprs arguments are new field expressions.

Parameters:
  • fields (varargs of str) – Field names to keep.
  • named_exprs (keyword args of Expression) – New field expressions.
Returns:

StructExpression – Struct containing specified existing fields and computed fields.
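
The ordering described in the notes can be sketched with a plain Python dict. select_fields is a hypothetical helper, not part of Hail, and it assumes that newly computed fields follow the kept ones, as the example output above suggests.

```python
fields = {'a': 5, 'b': 'Foo'}

def select_fields(source, *names, **named_values):
    """Hypothetical helper: keep names in the order given, then add new fields."""
    result = {name: source[name] for name in names}
    result.update(named_values)
    return result

# Kept fields follow the order of the arguments, not the order in the source.
picked = select_fields(fields, 'b', 'a', c=['bar', 'baz'])

print(list(picked))
```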