# SetExpression

class hail.expr.SetExpression

Expression of type tset.

>>> s1 = hl.literal({1, 2, 3})
>>> s2 = hl.literal({1, 3, 5})


Attributes

 dtype The data type of the expression.

Methods

 add Returns a new set including item.
 contains Returns True if item is in the set.
 difference Return the set of elements in the set that are not present in set s.
 intersection Return the intersection of the set and set s.
 is_subset Returns True if every element is contained in set s.
 remove Returns a new set excluding item.
 union Return the union of the set and set s.

__eq__(other)

Returns True if the two expressions are equal.

Examples

>>> x = hl.literal(5)
>>> y = hl.literal(5)
>>> z = hl.literal(1)

>>> hl.eval(x == y)
True

>>> hl.eval(x == z)
False


Notes

This method will fail with an error if the two expressions are not of comparable types.

Parameters

other (Expression) – Expression for equality comparison.

Returns

BooleanExpression – True if the two expressions are equal.

__ge__(other)

Return self>=value.

__gt__(other)

Return self>value.

__le__(other)

Return self<=value.

__lt__(other)

Return self<value.

__ne__(other)

Returns True if the two expressions are not equal.

Examples

>>> x = hl.literal(5)
>>> y = hl.literal(5)
>>> z = hl.literal(1)

>>> hl.eval(x != y)
False

>>> hl.eval(x != z)
True


Notes

This method will fail with an error if the two expressions are not of comparable types.

Parameters

other (Expression) – Expression for inequality comparison.

Returns

BooleanExpression – True if the two expressions are not equal.

add(item)

Returns a new set including item.

Examples

>>> hl.eval(s1.add(10))
{1, 2, 3, 10}

Parameters

item (Expression) – Value to add.

Returns

SetExpression – Set with item added.

all(f)

Returns True if f returns True for every element.

Examples

>>> hl.eval(a.all(lambda x: x < 10))
True


Notes

This method returns True if the collection is empty.

Parameters

f (function ( (arg) -> BooleanExpression)) – Function to evaluate for each element of the collection. Must return a BooleanExpression.

Returns

BooleanExpression – True if f returns True for every element, False otherwise.

any(f)

Returns True if f returns True for any element.

Examples

>>> hl.eval(a.any(lambda x: x % 2 == 0))
True

>>> hl.eval(s3.any(lambda x: x[0] == 'D'))
False


Notes

This method always returns False for empty collections.

Parameters

f (function ( (arg) -> BooleanExpression)) – Function to evaluate for each element of the collection. Must return a BooleanExpression.

Returns

BooleanExpression – True if f returns True for any element, False otherwise.
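The empty-collection rules noted for all and any follow the same convention as Python's built-in all() and any(). The sketch below is a plain-Python analogy, not Hail code; the value of a is an assumption inferred from the example outputs on this page.

```python
# Plain-Python analogy of the all/any rules (not Hail code).
a = [1, 2, 3, 4, 5]  # assumed value of `a`, inferred from the outputs on this page
empty = []

all_small = all(x < 10 for x in a)      # every element passes the predicate
any_even = any(x % 2 == 0 for x in a)   # at least one element passes

# On an empty collection, "all" is vacuously True and "any" is False.
all_on_empty = all(x < 10 for x in empty)
any_on_empty = any(x % 2 == 0 for x in empty)

print(all_small, any_even, all_on_empty, any_on_empty)
```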

collect(_localize=True)

Collect all records of an expression into a local list.

Examples

Collect all the values from C1:

>>> table1.C1.collect()
[2, 2, 10, 11]


Warning

Extremely experimental.

Warning

The list of records may be very large.

Returns

list

contains(item)

Returns True if item is in the set.

Examples

>>> hl.eval(s1.contains(1))
True

>>> hl.eval(s1.contains(10))
False

Parameters

item (Expression) – Value for inclusion test.

Returns

BooleanExpression – True if item is in the set.

describe(handler=<built-in function print>)

Print information about type, index, and dependencies.

difference(s)

Return the set of elements in the set that are not present in set s.

Examples

>>> hl.eval(s1.difference(s2))
frozenset({2})

>>> hl.eval(s2.difference(s1))
frozenset({5})

Parameters

s (SetExpression) – Set expression of the same type.

Returns

SetExpression – Set of elements of this set that are not in s.

property dtype

The data type of the expression.

Returns

HailType

export(path, delimiter='\t', missing='NA', header=True)

Export a field to a text file.

Examples

>>> small_mt.GT.export('output/gt.tsv')
>>> with open('output/gt.tsv', 'r') as f:
...     for line in f:
...         print(line, end='')
locus   alleles 0       1       2       3
1:1     ["A","C"]       0/1     0/1     0/0     0/0
1:2     ["A","C"]       1/1     0/1     1/1     1/1
1:3     ["A","C"]       1/1     0/1     0/1     0/0
1:4     ["A","C"]       1/1     0/1     1/1     1/1

>>> small_mt.GT.export('output/gt-no-header.tsv', header=False)
>>> with open('output/gt-no-header.tsv', 'r') as f:
...     for line in f:
...         print(line, end='')
1:1     ["A","C"]       0/1     0/1     0/0     0/0
1:2     ["A","C"]       1/1     0/1     1/1     1/1
1:3     ["A","C"]       1/1     0/1     0/1     0/0
1:4     ["A","C"]       1/1     0/1     1/1     1/1

>>> small_mt.pop.export('output/pops.tsv')
>>> with open('output/pops.tsv', 'r') as f:
...     for line in f:
...         print(line, end='')
sample_idx      pop
0       2
1       2
2       0
3       2

>>> small_mt.ancestral_af.export('output/ancestral_af.tsv')
>>> with open('output/ancestral_af.tsv', 'r') as f:
...     for line in f:
...         print(line, end='')
locus   alleles ancestral_af
1:1     ["A","C"]       5.3905e-01
1:2     ["A","C"]       8.6768e-01
1:3     ["A","C"]       4.3765e-01
1:4     ["A","C"]       7.6300e-01

>>> mt = small_mt
>>> small_mt.bn.export('output/bn.tsv')
>>> with open('output/bn.tsv', 'r') as f:
...     for line in f:
...         print(line, end='')
bn
{"n_populations":3,"n_samples":4,"n_variants":4,"n_partitions":8,"pop_dist":[1,1,1],"fst":[0.1,0.1,0.1],"mixture":false}


Notes

For entry-indexed expressions, if there is one column key field, the result of calling str() on that field is used as the column header. Otherwise, each compound column key is converted to JSON and used as a column header. For example:

>>> small_mt = small_mt.key_cols_by(s=small_mt.sample_idx, family='fam1')
>>> small_mt.GT.export('output/gt-no-header.tsv')
>>> with open('output/gt-no-header.tsv', 'r') as f:
...     for line in f:
...         print(line, end='')
locus   alleles {"s":0,"family":"fam1"} {"s":1,"family":"fam1"} {"s":2,"family":"fam1"} {"s":3,"family":"fam1"}
1:1     ["A","C"]       0/1     0/1     0/0     0/0
1:2     ["A","C"]       1/1     0/1     1/1     1/1
1:3     ["A","C"]       1/1     0/1     0/1     0/0
1:4     ["A","C"]       1/1     0/1     1/1     1/1
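The header rule described in these notes can be sketched in plain Python. The helper name column_header is hypothetical and this is not Hail's implementation; it only mirrors the rule as stated, with compact JSON separators chosen to match the output shown above.

```python
import json


def column_header(key: dict) -> str:
    """Hypothetical helper mirroring the column-header rule described above."""
    if len(key) == 1:
        # Single column key field: use str() of its value.
        (value,) = key.values()
        return str(value)
    # Compound column key: serialize the whole key as compact JSON.
    return json.dumps(key, separators=(",", ":"))


single = column_header({"sample_idx": 0})
compound = column_header({"s": 0, "family": "fam1"})
print(single)
print(compound)
```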

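Parameters
• path (str) – Path of the text file to write.

• delimiter (str) – String used to separate columns. Defaults to a tab.

• missing (str) – String written in place of missing values. Defaults to 'NA'.

• header (bool) – Whether to write a header line. Defaults to True.

filter(f)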

Returns a new collection containing elements where f returns True.

Examples

>>> hl.eval(a.filter(lambda x: x % 2 == 0))
[2, 4]

>>> hl.eval(s3.filter(lambda x: ~(x[-1] == 'e')))
frozenset({'Bob'})


Notes

Returns a same-type expression; evaluated on a SetExpression, returns a SetExpression. Evaluated on an ArrayExpression, returns an ArrayExpression.

Parameters

f (function ( (arg) -> BooleanExpression)) – Function to evaluate for each element of the collection. Must return a BooleanExpression.

Returns

CollectionExpression – Expression of the same type as the callee.
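The same-type behavior described in the notes can be pictured with a plain-Python analogy (not Hail code): filtering a list yields a list, and filtering a set yields a set. The values of a and s3 are assumptions inferred from the example outputs on this page.

```python
# Plain-Python analogy of same-type filtering (not Hail code).
a = [1, 2, 3, 4, 5]                           # assumed, inferred from the outputs
s3 = frozenset({"Alice", "Bob", "Charlie"})   # assumed, inferred from the outputs

evens = [x for x in a if x % 2 == 0]                          # list in, list out
not_ending_in_e = frozenset(x for x in s3 if x[-1] != "e")    # set in, set out

print(evens, not_ending_in_e)
```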

find(f)

Returns the first element where f returns True.

Examples

>>> hl.eval(a.find(lambda x: x ** 2 > 20))
5

>>> hl.eval(s3.find(lambda x: x[0] == 'D'))
None


Notes

If f returns False for every element, then the result is missing.

Parameters

f (function ( (arg) -> BooleanExpression)) – Function to evaluate for each element of the collection. Must return a BooleanExpression.

Returns

Expression – Expression whose type is the element type of the collection.
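As a plain-Python analogy (not Hail code), find behaves like taking the first match from a generator with a default of None, which is how a missing result appears after hl.eval in the examples above. The values of a and s3 are assumptions inferred from the outputs on this page.

```python
# Plain-Python analogy of find (not Hail code).
a = [1, 2, 3, 4, 5]                           # assumed, inferred from the outputs
s3 = frozenset({"Alice", "Bob", "Charlie"})   # assumed, inferred from the outputs

# First element whose square exceeds 20.
first_big = next((x for x in a if x ** 2 > 20), None)

# No name starts with 'D', so the default (None) is returned.
starts_with_d = next((x for x in s3 if x[0] == "D"), None)

print(first_big, starts_with_d)
```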

flatmap(f)

Map each element of the collection to a new collection, and flatten the results.

Examples

>>> hl.eval(a.flatmap(lambda x: hl.range(0, x)))
[0, 0, 1, 0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 3, 4]

>>> hl.eval(s3.flatmap(lambda x: hl.set(hl.range(0, x.length()).map(lambda i: x[i]))))
{'A', 'B', 'C', 'a', 'b', 'c', 'e', 'h', 'i', 'l', 'o', 'r'}

Parameters

f (function ( (arg) -> CollectionExpression)) – Function from the element type of the collection to the type of the collection. For instance, flatmap on a set<str> should take a str and return a set.

Returns

CollectionExpression
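The map-then-flatten behavior can be sketched in plain Python with itertools.chain. This is an analogy, not Hail code, and the values of a and s3 are assumptions inferred from the example outputs on this page.

```python
from itertools import chain

# Plain-Python analogy of flatmap (not Hail code).
a = [1, 2, 3, 4, 5]                           # assumed, inferred from the outputs
s3 = frozenset({"Alice", "Bob", "Charlie"})   # assumed, inferred from the outputs

# Each element x maps to range(0, x); the results are concatenated in order.
flat = list(chain.from_iterable(range(0, x) for x in a))

# Each name maps to the set of its characters; the results are merged into one set.
letters = frozenset(chain.from_iterable(set(name) for name in s3))

print(flat)
print(sorted(letters))
```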

fold(f, zero)

Reduces the collection with the given function f, provided the initial value zero.

Examples

>>> a = [0, 1, 2]

>>> hl.eval(hl.fold(lambda i, j: i + j, 0, a))
3
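For intuition, the same reduction can be written in plain Python with functools.reduce. This is an analogy rather than Hail code; the initial value plays the role of zero.

```python
from functools import reduce

# Plain-Python analogy of fold (not Hail code).
a = [0, 1, 2]

# Start from the initial value 0 and combine it with each element in turn.
total = reduce(lambda i, j: i + j, a, 0)

# With no elements to combine, reduce returns the initial value unchanged.
empty_total = reduce(lambda i, j: i + j, [], 0)

print(total, empty_total)
```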

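Parameters
• f (function ( (Expression, Expression) -> Expression)) – Function that takes the accumulated value and the next element and returns the new accumulated value.

• zero (Expression) – Initial value of the accumulator.

Returns

Expression – Result of the reduction.

group_by(f)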

Group elements into a dict according to a lambda function.

Examples

>>> hl.eval(a.group_by(lambda x: x % 2 == 0))
{False: [1, 3, 5], True: [2, 4]}

>>> hl.eval(s3.group_by(lambda x: x.length()))
{3: {'Bob'}, 5: {'Alice'}, 7: {'Charlie'}}

Parameters

f (function ( (arg) -> Expression)) – Function to evaluate for each element of the collection to produce a key for the resulting dictionary.

Returns

DictExpression – Dictionary keyed by results of f.
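The grouping can be pictured with a plain-Python analogy (not Hail code): each element is appended to the bucket named by the key that f returns. The value of a is an assumption inferred from the example outputs on this page.

```python
from collections import defaultdict

# Plain-Python analogy of group_by (not Hail code).
a = [1, 2, 3, 4, 5]  # assumed, inferred from the outputs on this page

groups = defaultdict(list)
for x in a:
    key = x % 2 == 0       # the grouping function decides the dictionary key
    groups[key].append(x)  # elements sharing a key land in the same bucket

grouped = dict(groups)
print(grouped)
```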

intersection(s)

Return the intersection of the set and set s.

Examples

>>> hl.eval(s1.intersection(s2))
frozenset({1, 3})

Parameters

s (SetExpression) – Set expression of the same type.

Returns

SetExpression – Set of elements present in both this set and s.

is_subset(s)

Returns True if every element is contained in set s.

Examples

>>> hl.eval(s1.is_subset(s2))
False

>>> hl.eval(s1.remove(2).is_subset(s2))
True

Parameters

s (SetExpression) – Set expression of the same type.

Returns

BooleanExpression – True if every element is contained in set s.

length()

Returns the size of a collection.

Examples

>>> hl.eval(a.length())
5

>>> hl.eval(s3.length())
3

Returns

Expression of type tint32 – The number of elements in the collection.

map(f)

Transform each element of a collection.

Examples

>>> hl.eval(a.map(lambda x: x ** 3))
[1.0, 8.0, 27.0, 64.0, 125.0]

>>> hl.eval(s3.map(lambda x: x.length()))
frozenset({3, 5, 7})

Parameters

f (function ( (arg) -> Expression)) – Function to transform each element of the collection.

Returns

CollectionExpression – Collection where each element has been transformed according to f.

remove(item)

Returns a new set excluding item.

Examples

>>> hl.eval(s1.remove(1))
frozenset({2, 3})

Parameters

item (Expression) – Value to remove.

Returns

SetExpression – Set with item removed.
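Both add and remove return a new set rather than modifying the receiver. A plain-Python analogy using frozenset (not Hail code) makes the point, with values taken from the s1 example at the top of this page.

```python
# Plain-Python analogy of add/remove returning new sets (not Hail code).
s1 = frozenset({1, 2, 3})

added = s1 | {10}    # analogous to s1.add(10)
removed = s1 - {1}   # analogous to s1.remove(1)

# The original set is unchanged by either operation.
print(sorted(added), sorted(removed), sorted(s1))
```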

show(n=None, width=None, truncate=None, types=True, handler=None, n_rows=None, n_cols=None)

Print the first few records of the expression to the console.

If the expression refers to a value on a keyed axis of a table or matrix table, then the accompanying keys will be shown along with the records.

Examples

>>> table1.SEX.show()
+-------+-----+
|    ID | SEX |
+-------+-----+
| int32 | str |
+-------+-----+
|     1 | "M" |
|     2 | "M" |
|     3 | "F" |
|     4 | "F" |
+-------+-----+

>>> hl.literal(123).show()
+--------+
| <expr> |
+--------+
|  int32 |
+--------+
|    123 |
+--------+


Notes

The output can be piped to another output source using the handler argument:

>>> ht.foo.show(handler=lambda x: logging.info(x))

Parameters
• n (int) – Maximum number of rows to show.

• width (int) – Horizontal width at which to break columns.

• truncate (int, optional) – Truncate each field to the given number of characters. If None, truncate fields to the given width.

• types (bool) – Print an extra header line with the type of each field.

size()

Returns the size of a collection.

Examples

>>> hl.eval(a.size())
5

>>> hl.eval(s3.size())
3

Returns

Expression of type tint32 – The number of elements in the collection.

summarize(handler=None)

Compute and print summary information about the expression.

Danger

This functionality is experimental. It may not be tested as well as other parts of Hail and the interface is subject to change.

take(n, _localize=True)

Collect the first n records of an expression.

Examples

Take the first three rows:

>>> table1.X.take(3)
[5, 6, 7]


Warning

Extremely experimental.

Parameters

n (int) – Number of records to take.

Returns

list

union(s)

Return the union of the set and set s.

Examples

>>> hl.eval(s1.union(s2))
frozenset({1, 2, 3, 5})

Parameters

s (SetExpression) – Set expression of the same type.

Returns

SetExpression – Set of elements present in either set.
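The outputs documented for difference, intersection, is_subset, and union agree with Python's built-in frozenset operators. The following is a plain-Python cross-check under that analogy, not Hail code, using the s1 and s2 values from the top of this page.

```python
# Plain-Python cross-check of the set algebra shown on this page (not Hail code).
s1 = frozenset({1, 2, 3})
s2 = frozenset({1, 3, 5})

diff_12 = s1 - s2              # analogous to s1.difference(s2)
diff_21 = s2 - s1              # analogous to s2.difference(s1)
inter = s1 & s2                # analogous to s1.intersection(s2)
uni = s1 | s2                  # analogous to s1.union(s2)
subset = s1 <= s2              # analogous to s1.is_subset(s2)
subset_without_2 = (s1 - {2}) <= s2  # analogous to s1.remove(2).is_subset(s2)

print(diff_12, diff_21, inter, uni, subset, subset_without_2)
```

Note that difference is not symmetric: swapping the receiver and the argument gives a different result.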