Collection functions

Collection constructors

dict(collection)

Creates a dictionary.

empty_dict(key_type, value_type)

Returns an empty dictionary with key type key_type and value type value_type.

array(collection)

Construct an array expression.

empty_array(t)

Returns an empty array of elements of type t.

set(collection)

Construct a set expression.

empty_set(t)

Returns an empty set of elements of type t.

Collection functions

len(x)

Returns the size of a collection or string.

map(f, *collections)

Transform each element of a collection.

flatmap(f, collection)

Map each element of the collection to a new collection, and flatten the results.

starmap(f, collection)

Transform each element of a collection of tuples.

zip(*arrays[, fill_missing])

Zip together arrays into a single array.

enumerate(a[, start, index_first])

Returns an array of (index, element) tuples.

zip_with_index(a[, index_first])

Deprecated in favor of enumerate().

flatten(collection)

Flatten a nested collection by concatenating sub-collections.

any(*args)

Check for any True in boolean expressions or collections of booleans.

all(*args)

Check for all True in boolean expressions or collections of booleans.

filter(f, collection)

Returns a new collection containing elements where f returns True.

sorted(collection[, key, reverse])

Returns a sorted array.

find(f, collection)

Returns the first element where f returns True.

group_by(f, collection)

Group collection elements into a dict according to a lambda function.

fold(f, zero, collection)

Reduces a collection with the given function f, provided the initial value zero.

array_scan(f, zero, a)

Map each element of a to the cumulative value of function f, with initial value zero.

reversed(x)

Reverses the elements of a collection.

keyed_intersection(*arrays, key)

Compute the intersection of sorted arrays on a given key.

keyed_union(*arrays, key)

Compute the distinct union of sorted arrays on a given key.

hail.expr.functions.len(x)

Returns the size of a collection or string.

Examples

>>> a = ['The', 'quick', 'brown', 'fox']
>>> s = {1, 3, 5, 6, 7, 9}
>>> hl.eval(hl.len(a))
4
>>> hl.eval(hl.len(s))
6
>>> hl.eval(hl.len("12345"))
5
Parameters:

x (ArrayExpression or SetExpression or DictExpression or StringExpression) – String or collection expression.

Returns:

Expression of type tint32

hail.expr.functions.map(f, *collections)

Transform each element of a collection.

Examples

>>> a = ['The', 'quick', 'brown', 'fox']
>>> b = [2, 4, 6, 8]
>>> hl.eval(hl.map(lambda x: hl.len(x), a))
[3, 5, 5, 3]
>>> hl.eval(hl.map(lambda s, n: hl.len(s) + n, a, b))
[5, 9, 11, 11]
Parameters:
  • f (function ( (*arg) -> Expression)) – Function to transform each element of the collection.

  • *collections (ArrayExpression or SetExpression) – A single collection expression or multiple array expressions.

Returns:

ArrayExpression or SetExpression – Collection where each element has been transformed by f.

hail.expr.functions.flatmap(f, collection)

Map each element of the collection to a new collection, and flatten the results.

Examples

>>> a = [[0, 1], [1, 2], [4, 5, 6, 7]]
>>> hl.eval(hl.flatmap(lambda x: x[1:], a))
[1, 2, 5, 6, 7]
Parameters:
  • f (function ( (arg) -> CollectionExpression)) – Function from the element type of the collection to the type of the collection. For instance, flatmap on a set<str> should take a str and return a set.

  • collection (ArrayExpression or SetExpression) – Collection expression.

Returns:

ArrayExpression or SetExpression
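
For readers more familiar with the Python standard library, the map-then-flatten behavior can be approximated in plain Python as below. This is only an analogy operating on ordinary lists, not Hail's implementation; the helper name flatmap_py is hypothetical.

```python
from itertools import chain

def flatmap_py(f, collection):
    """Hypothetical plain-Python analogue: apply f to each element, then concatenate."""
    return list(chain.from_iterable(f(x) for x in collection))

a = [[0, 1], [1, 2], [4, 5, 6, 7]]
# Drop the first item of each sub-list, then flatten the remainders.
result = flatmap_py(lambda x: x[1:], a)
print(result)  # [1, 2, 5, 6, 7]
```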

hail.expr.functions.starmap(f, collection)

Transform each element of a collection of tuples.

Examples

>>> a = [(1, 5), (3, 2), (7, 8)]
>>> hl.eval(hl.starmap(lambda x, y: hl.if_else(x < y, x, y), a))
[1, 2, 7]
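Parameters:
  • f (function ( (*args) -> Expression)) – Function to transform each element of the collection.

  • collection (ArrayExpression or SetExpression) – Collection expression.

Returns: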

ArrayExpression or SetExpression – Collection where each element has been transformed by f.
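
The tuple-unpacking behavior resembles itertools.starmap from the Python standard library. The sketch below is a plain-Python analogy on ordinary tuples, assuming that f receives one argument per tuple field; it does not use Hail expressions.

```python
from itertools import starmap

a = [(1, 5), (3, 2), (7, 8)]
# Each tuple is unpacked into the two lambda arguments; keep the smaller value.
result = list(starmap(lambda x, y: x if x < y else y, a))
print(result)  # [1, 2, 7]
```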

hail.expr.functions.zip(*arrays, fill_missing=False)

Zip together arrays into a single array.

Examples

>>> hl.eval(hl.zip([1, 2, 3], [4, 5, 6]))
[(1, 4), (2, 5), (3, 6)]

If the arrays are different lengths, the behavior is decided by the fill_missing parameter.

>>> hl.eval(hl.zip([1], [10, 20], [100, 200, 300]))
[(1, 10, 100)]
>>> hl.eval(hl.zip([1], [10, 20], [100, 200, 300], fill_missing=True))
[(1, 10, 100), (None, 20, 200), (None, None, 300)]

Notes

The element type of the resulting array is a ttuple with a field for each array.

Parameters:
  • arrays (variable-length args of ArrayExpression) – Array expressions.

  • fill_missing (bool) – If False, return an array with length equal to the shortest length of the arrays. If True, return an array with length equal to the longest length of the arrays, by extending the shorter arrays with missing values.

Returns:

ArrayExpression
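
The two fill_missing modes correspond closely to zip and itertools.zip_longest in plain Python. The sketch below is an analogy on ordinary lists, with None standing in for a missing value; it is not how Hail evaluates the expression.

```python
from itertools import zip_longest

arrays = ([1], [10, 20], [100, 200, 300])

# Analogue of fill_missing=False: stop at the shortest input.
truncated = list(zip(*arrays))

# Analogue of fill_missing=True: pad the shorter inputs with None.
padded = list(zip_longest(*arrays, fillvalue=None))

print(truncated)  # [(1, 10, 100)]
print(padded)     # [(1, 10, 100), (None, 20, 200), (None, None, 300)]
```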

hail.expr.functions.enumerate(a, start=0, *, index_first=True)

Returns an array of (index, element) tuples.

Examples

>>> hl.eval(hl.enumerate(['A', 'B', 'C']))
[(0, 'A'), (1, 'B'), (2, 'C')]
>>> hl.eval(hl.enumerate(['A', 'B', 'C'], start=3))
[(3, 'A'), (4, 'B'), (5, 'C')]
>>> hl.eval(hl.enumerate(['A', 'B', 'C'], index_first=False))
[('A', 0), ('B', 1), ('C', 2)]
Parameters:
  • a (ArrayExpression)

  • start (Int32Expression) – The index value from which the counter is started, 0 by default.

  • index_first (bool) – If True, the index is the first value of the element tuples. If False, the index is the second value.

Returns:

ArrayExpression – Array of (index, element) or (element, index) tuples.

hail.expr.functions.zip_with_index(a, index_first=True)

Deprecated in favor of enumerate().

Returns an array of (index, element) tuples.

Examples

>>> hl.eval(hl.zip_with_index(['A', 'B', 'C']))
[(0, 'A'), (1, 'B'), (2, 'C')]
>>> hl.eval(hl.zip_with_index(['A', 'B', 'C'], index_first=False))
[('A', 0), ('B', 1), ('C', 2)]
Parameters:
  • a (ArrayExpression)

  • index_first (bool) – If True, the index is the first value of the element tuples. If False, the index is the second value.

Returns:

ArrayExpression – Array of (index, element) or (element, index) tuples.

hail.expr.functions.flatten(collection)

Flatten a nested collection by concatenating sub-collections.

Examples

>>> a = [[1, 2], [2, 3]]
>>> hl.eval(hl.flatten(a))
[1, 2, 2, 3]
Parameters:

collection (ArrayExpression or SetExpression) – Collection with element type tarray or tset.

Returns:

ArrayExpression or SetExpression

hail.expr.functions.any(*args)

Check for any True in boolean expressions or collections of booleans.

any() comes in three forms:

  1. hl.any(boolean, ...). Is at least one argument True?

  2. hl.any(collection). Is at least one element of this collection True?

  3. hl.any(function, collection). Does function return True for at least one value in this collection?

Examples

The first form:

>>> hl.eval(hl.any())
False
>>> hl.eval(hl.any(True))
True
>>> hl.eval(hl.any(False))
False
>>> hl.eval(hl.any(False, False, True, False))
True

The second form:

>>> hl.eval(hl.any([False, True, False]))
True
>>> hl.eval(hl.any([False, False, False]))
False

The third form:

>>> a = ['The', 'quick', 'brown', 'fox']
>>> s = {1, 3, 5, 6, 7, 9}
>>> hl.eval(hl.any(lambda x: x[-1] == 'x', a))
True
>>> hl.eval(hl.any(lambda x: x % 4 == 0, s))
False

Notes

any() returns False when given an empty array or empty argument list.

hail.expr.functions.all(*args)

Check for all True in boolean expressions or collections of booleans.

all() comes in three forms:

  1. hl.all(boolean, ...). Are all arguments True?

  2. hl.all(collection). Are all elements of the collection True?

  3. hl.all(function, collection). Does function return True for all values in this collection?

Examples

The first form:

>>> hl.eval(hl.all())
True
>>> hl.eval(hl.all(True))
True
>>> hl.eval(hl.all(False))
False
>>> hl.eval(hl.all(True, True, True))
True
>>> hl.eval(hl.all(False, False, True, False))
False

The second form:

>>> hl.eval(hl.all([False, True, False]))
False
>>> hl.eval(hl.all([True, True, True]))
True

The third form:

>>> a = ['The', 'quick', 'brown', 'fox']
>>> s = {1, 3, 5, 6, 7, 9}
>>> hl.eval(hl.all(lambda x: hl.len(x) > 3, a))
False
>>> hl.eval(hl.all(lambda x: x < 10, s))
True

Notes

all() returns True when given an empty array or empty argument list.

hail.expr.functions.filter(f, collection)

Returns a new collection containing elements where f returns True.

Examples

>>> a = [1, 2, 3, 4]
>>> s = {'Alice', 'Bob', 'Charlie'}
>>> hl.eval(hl.filter(lambda x: x % 2 == 0, a))
[2, 4]
>>> hl.eval(hl.filter(lambda x: ~(x[-1] == 'e'), s))
{'Bob'}

Notes

Returns an expression of the same type as collection: a SetExpression when applied to a SetExpression, and an ArrayExpression when applied to an ArrayExpression.

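Parameters:
  • f (function ( (arg) -> BooleanExpression)) – Function to evaluate for each element of the collection. Must return a BooleanExpression.

  • collection (ArrayExpression or SetExpression) – Array or set expression to filter.

Returns: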

ArrayExpression or SetExpression – Expression of the same type as collection.

hail.expr.functions.sorted(collection, key=None, reverse=False)

Returns a sorted array.

Examples

>>> a = ['Charlie', 'Alice', 'Bob']
>>> hl.eval(hl.sorted(a))
['Alice', 'Bob', 'Charlie']
>>> hl.eval(hl.sorted(a, reverse=True))
['Charlie', 'Bob', 'Alice']
>>> hl.eval(hl.sorted(a, key=lambda x: hl.len(x)))
['Bob', 'Alice', 'Charlie']

Notes

The ordered types are tstr and numeric types.

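Parameters:
  • collection (ArrayExpression or SetExpression or DictExpression) – Collection to sort.

  • key (function ( (arg) -> Expression), optional) – Function to evaluate for each element to compute the sort key.

  • reverse (BooleanExpression) – Sort in descending order.

Returns: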

ArrayExpression – Sorted array.

hail.expr.functions.find(f, collection)

Returns the first element where f returns True.

Examples

>>> a = ['The', 'quick', 'brown', 'fox']
>>> s = {1, 3, 5, 6, 7, 9}
>>> hl.eval(hl.find(lambda x: x[-1] == 'x', a))
'fox'
>>> hl.eval(hl.find(lambda x: x % 4 == 0, s))
None

Notes

If f returns False for every element, then the result is missing.

Sets are unordered. If collection is of type tset and more than one element satisfies f, there is no guarantee which of them is returned.

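Parameters:
  • f (function ( (arg) -> BooleanExpression)) – Function to evaluate for each element of the collection. Must return a BooleanExpression.

  • collection (ArrayExpression or SetExpression) – Collection expression.

Returns: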

Expression – Expression whose type is the element type of the collection.

hail.expr.functions.group_by(f, collection)

Group collection elements into a dict according to a lambda function.

Examples

>>> a = ['The', 'quick', 'brown', 'fox']
>>> hl.eval(hl.group_by(lambda x: hl.len(x), a))
{3: ['The', 'fox'], 5: ['quick', 'brown']}
Parameters:
  • f (function ( (arg) -> Expression)) – Function to evaluate for each element of the collection to produce a key for the resulting dictionary.

  • collection (ArrayExpression or SetExpression) – Collection expression.

Returns:

DictExpression – Dictionary keyed by results of f.
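
The grouping can be pictured with the plain-Python sketch below, which buckets elements by the key that f returns. It is an analogy on ordinary lists and dicts, not Hail's implementation, and the helper name group_by_py is hypothetical.

```python
from collections import defaultdict

def group_by_py(f, collection):
    """Hypothetical plain-Python analogue: bucket elements by the key f returns."""
    groups = defaultdict(list)
    for x in collection:
        groups[f(x)].append(x)
    return dict(groups)

a = ['The', 'quick', 'brown', 'fox']
result = group_by_py(len, a)
print(result)  # {3: ['The', 'fox'], 5: ['quick', 'brown']}
```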

hail.expr.functions.fold(f, zero, collection)

Reduces a collection with the given function f, provided the initial value zero.

Examples

>>> a = [0, 1, 2]
>>> hl.eval(hl.fold(lambda i, j: i + j, 0, a))
3
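Parameters:
  • f (function ( (Expression, Expression) -> Expression)) – Function which takes the cumulative value and the next element, and returns a new value.

  • zero (Expression) – Initial value to pass in to f.

  • collection (ArrayExpression or SetExpression)

Returns: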

Expression
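
The reduction follows the same pattern as functools.reduce with an initializer. The sketch below is a plain-Python analogy on an ordinary list, with the intermediate steps spelled out in comments; it does not use Hail expressions.

```python
from functools import reduce

a = [0, 1, 2]

# f is applied as f(f(f(zero, a[0]), a[1]), a[2]):
#   f(0, 0) -> 0, f(0, 1) -> 1, f(1, 2) -> 3
total = reduce(lambda i, j: i + j, a, 0)
print(total)  # 3

# In this plain-Python analogue, an empty input yields the initial value unchanged.
empty_total = reduce(lambda i, j: i + j, [], 0)
print(empty_total)  # 0
```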

hail.expr.functions.array_scan(f, zero, a)

Map each element of a to the cumulative value of function f, with initial value zero.

Examples

>>> a = [0, 1, 2]
>>> hl.eval(hl.array_scan(lambda i, j: i + j, 0, a))
[0, 0, 1, 3]
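Parameters:
  • f (function ( (Expression, Expression) -> Expression)) – Function which takes the cumulative value and the next element, and returns a new value.

  • zero (Expression) – Initial value to pass in to f.

  • a (ArrayExpression)

Returns: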

ArrayExpression
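
The result in the example above has one more element than the input because the initial value comes first, followed by each running total. The same shape is produced by itertools.accumulate with the initial keyword (Python 3.8 or later). The sketch below is a plain-Python analogy, not Hail's implementation.

```python
from itertools import accumulate

a = [0, 1, 2]

# The initial value 0 is emitted first, then each running total:
#   0, 0 + 0, 0 + 1, 1 + 2
scanned = list(accumulate(a, lambda i, j: i + j, initial=0))
print(scanned)  # [0, 0, 1, 3]
```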

hail.expr.functions.reversed(x)

Reverses the elements of a collection.

Examples

>>> a = ['The', 'quick', 'brown', 'fox']
>>> hl.eval(hl.reversed(a))
['fox', 'brown', 'quick', 'The']
Parameters:

x (ArrayExpression or StringExpression) – Array or string expression.

Returns:

Expression

hail.expr.functions.keyed_intersection(*arrays, key)

Compute the intersection of sorted arrays on a given key.

Requires sorted arrays with distinct keys.

Warning

Experimental. Does not support downstream randomness.

Parameters:
  • arrays

  • key

Returns:

ArrayExpression

hail.expr.functions.keyed_union(*arrays, key)

Compute the distinct union of sorted arrays on a given key.

Requires sorted arrays with distinct keys.

Warning

Experimental. Does not support downstream randomness.

Parameters:
  • arrays

  • key

Returns:

ArrayExpression