# Types

Fields and expressions in Hail have types. Throughout the documentation, you will find type descriptions like array<str> or tlocus. It is generally more important to know how to use expressions of various types than to know how to manipulate the types themselves, but some operations like null() require type arguments.

In Python, 5 is of type int while "hello" is of type str. Python is a dynamically typed language, meaning that a function like:

>>> def add_x_and_y(x, y):
...     return x + y


…can be called on any two objects which can be added, like numbers, strings, or numpy arrays.
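To make the point concrete, the short sketch below calls the same function on integers, strings, and lists. Nothing here is Hail-specific; the last call only illustrates that a type mismatch surfaces when the code runs, not before.

```python
def add_x_and_y(x, y):
    return x + y

print(add_x_and_y(2, 3))          # numeric addition
print(add_x_and_y("ha", "il"))    # string concatenation
print(add_x_and_y([1], [2, 3]))   # list concatenation

# Mixing incompatible types is only detected when the call executes.
try:
    add_x_and_y(1, "a")
except TypeError as err:
    print("TypeError:", err)
```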

Types are important in Hail because every field of a Table or MatrixTable object has a data type.

## Primitive types

Hail’s primitive data types for boolean, numeric and string objects are:

| Type | Description |
| --- | --- |
| tint | Alias for tint32. |
| tint32 | Hail type for signed 32-bit integers. |
| tint64 | Hail type for signed 64-bit integers. |
| tfloat | Alias for tfloat64. |
| tfloat32 | Hail type for 32-bit floating point numbers. |
| tfloat64 | Hail type for 64-bit floating point numbers. |
| tstr | Hail type for text strings. |
| tbool | Hail type for Boolean (True or False) values. |

## Container types

Hail’s container types are:

| Type | Description |
| --- | --- |
| tarray | Hail type for variable-length arrays of elements. |
| tndarray | Hail type for n-dimensional arrays. |
| tset | Hail type for collections of distinct elements. |
| tdict | Hail type for key-value maps. |
| ttuple | Hail type for tuples. |
| tinterval | Hail type for intervals of ordered values. |
| tstruct | Hail type for structured groups of heterogeneous fields. |

## Genetics types

Hail has two genetics-specific types:

| Type | Description |
| --- | --- |
| tlocus | Hail type for a genomic coordinate with a contig and a position. |
| tcall | Hail type for a diploid genotype. |

## When to work with types

In general, you won’t need to mention types explicitly.

There are a few situations where you may want to specify types explicitly, such as calling operations like null() that require a type argument.

## Viewing an object’s type

Hail expressions have a dtype attribute that gives their type:

>>> hl.rand_norm().dtype
dtype('float64')


Printing the representation of a Hail expression will also show the type:

>>> hl.rand_norm()
<Float64Expression of type float64>


We can see that hl.rand_norm() is of type tfloat64, but what does Expression mean? Each data type in Hail is represented by its own Expression class. Data of type tfloat64 is represented by a Float64Expression. Data of type tstruct is represented by a StructExpression.

## Collection types

Hail’s collection types (arrays, ndarrays, sets, and dicts) have homogeneous elements, meaning that all values in the collection must be of the same type. Python allows mixed collections: ['1', 2, 3.0] is a valid Python list. However, a Hail array cannot contain both tstr and tint32 values. Likewise, the dict {'a': 1, 2: 'b'} is a valid Python dictionary, but a Hail dictionary cannot contain keys of different types. An example of a valid dictionary in Hail is {'a': 1, 'b': 2}, where the keys are all strings and the values are all integers. The type of this dictionary would be dict<str, int32>.
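As a rough illustration in plain Python, the hypothetical helper shared_type below (not a Hail function) reports whether all values in a collection have the same Python type. It is deliberately stricter and simpler than whatever inference Hail itself performs, and compares Python classes rather than Hail types, so treat it only as a sketch of the homogeneity requirement.

```python
def shared_type(values):
    """Return the single Python type shared by all values, or None if mixed or empty."""
    kinds = {type(v) for v in values}
    return kinds.pop() if len(kinds) == 1 else None

print(shared_type(['1', 2, 3.0]))   # None: str, int and float are mixed
print(shared_type([1, 2, 3]))       # <class 'int'>

valid = {'a': 1, 'b': 2}
print(shared_type(valid.keys()), shared_type(valid.values()))

mixed = {'a': 1, 2: 'b'}
print(shared_type(mixed.keys()))    # None: keys are str and int
```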

## Constructing types

Constructing types can be done either by using the type objects and classes (prefixed by “t”) or by parsing from strings with dtype(). As an example, we will construct a tstruct with each option:

>>> t = hl.tstruct(a = hl.tint32, b = hl.tstr, c = hl.tarray(hl.tfloat64))
>>> t
dtype('struct{a: int32, b: str, c: array<float64>}')

>>> t = hl.dtype('struct{a: int32, b: str, c: array<float64>}')
>>> t
dtype('struct{a: int32, b: str, c: array<float64>}')


## Reference documentation

class hail.expr.types.HailType

Hail type superclass.

hail.expr.types.dtype(type_str)

Parse a type from its string representation.

Examples

>>> hl.dtype('int')
dtype('int32')

>>> hl.dtype('float')
dtype('float64')

>>> hl.dtype('array<int32>')
dtype('array<int32>')

>>> hl.dtype('dict<str, bool>')
dtype('dict<str, bool>')

>>> hl.dtype('struct{a: int32, field with spaces: int64}')
dtype('struct{a: int32, field with spaces: int64}')


Notes

This function is able to reverse str(t) on a HailType.

The grammar is defined as follows:

type = _ (array / set / dict / struct / union / tuple / interval / int64 / int32 / float32 / float64 / bool / str / call / str / locus) _
int64 = "int64" / "tint64"
int32 = "int32" / "tint32" / "int" / "tint"
float32 = "float32" / "tfloat32"
float64 = "float64" / "tfloat64" / "tfloat" / "float"
bool = "tbool" / "bool"
call = "tcall" / "call"
str = "tstr" / "str"
locus = ("tlocus" / "locus") _ "[" identifier "]"
array = ("tarray" / "array") _ "<" type ">"
array = ("tstream" / "stream") _ "<" type ">"
ndarray = ("tndarray" / "ndarray") _ "<" type, identifier ">"
set = ("tset" / "set") _ "<" type ">"
dict = ("tdict" / "dict") _ "<" type "," type ">"
struct = ("tstruct" / "struct") _ "{" (fields / _) "}"
union = ("tunion" / "union") _ "{" (fields / _) "}"
tuple = ("ttuple" / "tuple") _ "(" ((type ("," type)*) / _) ")"
fields = field ("," field)*
field = identifier ":" type
interval = ("tinterval" / "interval") _ "<" type ">"
identifier = _ (simple_identifier / escaped_identifier) _
simple_identifier = ~"\w+"
escaped_identifier = ~"([^\\\\]|\\\\.)*"
_ = ~"\s*"

Parameters

type_str (str) – String representation of type.

Returns

HailType

hail.expr.types.tint = dtype('int32')

Alias for tint32.

hail.expr.types.tint32 = dtype('int32')

Hail type for signed 32-bit integers.

Their values can range from $-2^{31}$ to $2^{31} - 1$ (approximately 2.15 billion).

In Python, these are represented as int.

hail.expr.types.tint64 = dtype('int64')

Hail type for signed 64-bit integers.

Their values can range from $-2^{63}$ to $2^{63} - 1$.

In Python, these are represented as int.
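Because Python’s int is unbounded while tint32 and tint64 are fixed-width, it can be worth checking that a Python value fits the intended Hail type. The hypothetical helper fits_signed below is plain Python, not part of Hail; it simply tests the ranges quoted above.

```python
def fits_signed(value, bits):
    """Return True if value is representable as a signed integer of the given width."""
    return -(2 ** (bits - 1)) <= value <= 2 ** (bits - 1) - 1

INT32_MAX = 2 ** 31 - 1
print(INT32_MAX)                        # 2147483647
print(fits_signed(INT32_MAX, 32))       # True
print(fits_signed(INT32_MAX + 1, 32))   # False: one past the 32-bit maximum
print(fits_signed(INT32_MAX + 1, 64))   # True: fits in 64 bits
```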

hail.expr.types.tfloat = dtype('float64')

Alias for tfloat64.

hail.expr.types.tfloat32 = dtype('float32')

Hail type for 32-bit floating point numbers.

In Python, these are represented as float.

hail.expr.types.tfloat64 = dtype('float64')

Hail type for 64-bit floating point numbers.

In Python, these are represented as float.
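Both floating point types map to Python’s float, which in CPython is a 64-bit double, so a value that passes through a 32-bit representation may come back slightly changed. The sketch below uses only the standard library struct module to mimic that rounding; as_float32 is a hypothetical helper, not a Hail function, and it says nothing about how Hail itself performs the conversion.

```python
import struct

def as_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float and back."""
    return struct.unpack('f', struct.pack('f', x))[0]

print(as_float32(0.5) == 0.5)   # True: 0.5 is exactly representable in 32 bits
print(as_float32(0.1) == 0.1)   # False: 0.1 rounds differently at 32 bits
print(as_float32(0.1))
```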

hail.expr.types.tstr = dtype('str')

Hail type for text strings.

In Python, these are represented as str.

hail.expr.types.tbool = dtype('bool')

Hail type for Boolean (True or False) values.

In Python, these are represented as bool.

class hail.expr.types.tarray(element_type)

Hail type for variable-length arrays of elements.

In Python, these are represented as list.

Notes

Arrays contain elements of only one type, which is parameterized by element_type.

Parameters

element_type (HailType) – Element type of array.

class hail.expr.types.tndarray(element_type, ndim)

Hail type for n-dimensional arrays.

Danger

This functionality is experimental. It may not be tested as well as other parts of Hail and the interface is subject to change.

In Python, these are represented as numpy.ndarray.

Notes

NDArrays contain elements of only one type, which is parameterized by element_type.

Parameters

element_type (HailType) – Element type of array.

ndim – Number of dimensions.

class hail.expr.types.tset(element_type)

Hail type for collections of distinct elements.

In Python, these are represented as set.

Notes

Sets contain elements of only one type, which is parameterized by element_type.

Parameters

element_type (HailType) – Element type of set.

class hail.expr.types.tdict(key_type, value_type)

Hail type for key-value maps.

In Python, these are represented as dict.

Notes

Dicts parameterize the type of both their keys and values with key_type and value_type.

Parameters

key_type (HailType) – Key type.

value_type (HailType) – Value type.

class hail.expr.types.tstruct(**field_types)

Hail type for structured groups of heterogeneous fields.

In Python, these are represented as Struct.

Hail’s tstruct type is commonly used to compose types together to form nested structures. Structs can contain any combination of types, and are ordered mappings from field name to field type. Each field name must be unique.

Structs are very common in Hail. Each component of a Table and MatrixTable is a struct.

Structs appear below the top-level component types as well. Consider the following join:

>>> new_table = table1.annotate(table2_fields = table2.index(table1.key))


This snippet adds a field to table1 called table2_fields. In the new table, table2_fields will be a struct containing all the non-key fields from table2.

Parameters

field_types (keyword args of HailType) – Fields.

class hail.expr.types.ttuple(*types)

Hail type for tuples.

In Python, these are represented as tuple.

Parameters

types (varargs of HailType) – Element types.

hail.expr.types.tcall = dtype('call')

Hail type for a diploid genotype.

In Python, these are represented by Call.

class hail.expr.types.tlocus(reference_genome='default')

Hail type for a genomic coordinate with a contig and a position.

In Python, these are represented by Locus.

Parameters

reference_genome (ReferenceGenome or str) – Reference genome to use.

reference_genome

Reference genome.

Returns

ReferenceGenome – Reference genome.

class hail.expr.types.tinterval(point_type)

Hail type for intervals of ordered values.

In Python, these are represented by Interval.

Parameters

point_type (HailType) – Interval point type.