String functions

format(f, *args)

Returns a formatted string using a specified format string and arguments.

json(x)

Convert an expression to a JSON string expression.

parse_json(x, dtype)

Convert a JSON string to a structured expression.

hamming(s1, s2)

Returns the Hamming distance between the two strings.

delimit(collection[, delimiter])

Joins elements of collection into a single string delimited by delimiter.

entropy(s)

Returns the Shannon entropy of the character distribution defined by the string.

parse_int(x)

Parse a string as a 32-bit integer.

parse_int32(x)

Parse a string as a 32-bit integer.

parse_int64(x)

Parse a string as a 64-bit integer.

parse_float(x)

Parse a string as a 64-bit floating point number.

parse_float32(x)

Parse a string as a 32-bit floating point number.

parse_float64(x)

Parse a string as a 64-bit floating point number.

hail.expr.functions.format(f, *args)

Returns a formatted string using a specified format string and arguments.

Examples

>>> hl.eval(hl.format('%.3e', 0.09345332))
'9.345e-02'
>>> hl.eval(hl.format('%.4f', hl.missing(hl.tfloat64)))
'null'
>>> hl.eval(hl.format('%s %s %s', 'hello', hl.tuple([3, hl.locus('1', 2453)]), True))
'hello (3, 1:2453) true'

Notes

See the Java documentation for valid format specifiers and arguments.

Missing values are printed as 'null', except with the boolean conversions 'b' and 'B', which print 'false' instead.
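
For the simple numeric specifiers used in the examples above, Python's printf-style % operator happens to produce the same text, which gives a quick local way to preview a format string. This is only an analogy for illustration: hl.format follows the Java rules, and the two dialects differ for other conversions (for example, booleans) and for missing values. The helper name preview_format is hypothetical and not part of Hail.

```python
def preview_format(f, *args):
    """Preview a printf-style format string with Python's % operator.

    Hypothetical helper, not part of Hail. Python and Java agree on simple
    numeric conversions such as %d, %.4f and %.3e, but not on everything.
    """
    return f % args


print(preview_format('%.3e', 0.09345332))  # 9.345e-02
print(preview_format('%.4f', 0.5))         # 0.5000
```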

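Parameters:

f (StringExpression) – Java format string.

args (variable-length arguments of Expression) – Arguments to format.

Returns: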

StringExpression

hail.expr.functions.json(x)

Convert an expression to a JSON string expression.

Examples

>>> hl.eval(hl.json([1,2,3,4,5]))
'[1,2,3,4,5]'
>>> hl.eval(hl.json(hl.struct(a='Hello', b=0.12345, c=[1,2], d={'hi', 'bye'})))
'{"a":"Hello","b":0.12345,"c":[1,2],"d":["bye","hi"]}'

Parameters:

x – Expression to convert.

Returns:

StringExpression – String expression with JSON representation of x.
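
The output format in the examples above (no whitespace after separators, and the set emitted as an array in sorted order) can be approximated in plain Python with the standard json module. This is a sketch for comparison only, not how hl.json is implemented; the helper name to_compact_json and its handling of sets are assumptions made for illustration.

```python
import json


def to_compact_json(value):
    """Serialize a Python value as compact JSON (hypothetical helper).

    Sets are not JSON-serializable in Python, so they are written as
    sorted lists here, mirroring the ordering seen in the example output.
    """
    def encode_set(obj):
        if isinstance(obj, (set, frozenset)):
            return sorted(obj)
        raise TypeError(f'cannot serialize {type(obj).__name__}')

    # separators=(',', ':') drops the spaces json.dumps inserts by default.
    return json.dumps(value, separators=(',', ':'), default=encode_set)


print(to_compact_json([1, 2, 3, 4, 5]))
# [1,2,3,4,5]
print(to_compact_json({'a': 'Hello', 'b': 0.12345, 'c': [1, 2], 'd': {'hi', 'bye'}}))
# {"a":"Hello","b":0.12345,"c":[1,2],"d":["bye","hi"]}
```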

hail.expr.functions.parse_json(x, dtype)

Convert a JSON string to a structured expression.

Examples

>>> json_str = '{"a": 5, "b": 1.1, "c": "foo"}'
>>> parsed = hl.parse_json(json_str, dtype='struct{a: int32, b: float64, c: str}')
>>> hl.eval(parsed.a)
5

Parameters:

x (StringExpression) – JSON string.

dtype – Type of value to parse.

Returns:

Expression

hail.expr.functions.hamming(s1, s2)

Returns the Hamming distance between the two strings.

Examples

>>> hl.eval(hl.hamming('ATATA', 'ATGCA'))
2
>>> hl.eval(hl.hamming('abcdefg', 'zzcdefz'))
3

Notes

This method will fail if the two strings have different lengths.
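
The Hamming distance counts the positions at which two equal-length strings differ. A minimal plain-Python sketch of that definition follows; the function name hamming_distance and the choice of ValueError for mismatched lengths are assumptions for illustration and say nothing about how Hail reports the failure.

```python
def hamming_distance(s1, s2):
    """Count positions where two equal-length strings differ."""
    if len(s1) != len(s2):
        # Assumption: signal the length mismatch with ValueError.
        raise ValueError('strings must have the same length')
    return sum(a != b for a, b in zip(s1, s2))


print(hamming_distance('ATATA', 'ATGCA'))      # 2
print(hamming_distance('abcdefg', 'zzcdefz'))  # 3
```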

Parameters:

s1 (StringExpression) – First string.

s2 (StringExpression) – Second string.

Returns:

Expression of type tint32

hail.expr.functions.delimit(collection, delimiter=',')

Joins elements of collection into a single string delimited by delimiter.

Examples

>>> a = ['Bob', 'Charlie', 'Alice', 'Bob', 'Bob']
>>> hl.eval(hl.delimit(a))
'Bob,Charlie,Alice,Bob,Bob'

Notes

If the element type of collection is not tstr, then the str() function will be called on each element before joining with the delimiter.
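
The behavior described in this note (convert each element to a string, then join) corresponds to the following plain-Python sketch. The name join_elements is hypothetical, and Python's str() may render some values differently from Hail's str() function, so treat this as an analogy rather than an exact equivalent.

```python
def join_elements(collection, delimiter=','):
    """Stringify every element, then join with the delimiter."""
    return delimiter.join(str(element) for element in collection)


print(join_elements(['Bob', 'Charlie', 'Alice', 'Bob', 'Bob']))
# Bob,Charlie,Alice,Bob,Bob
print(join_elements([1, 2, 3], delimiter='|'))
# 1|2|3
```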

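Parameters:

collection (ArrayExpression or SetExpression) – Collection.

delimiter (str or StringExpression) – Field delimiter.

Returns: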

StringExpression – Joined string expression.

hail.expr.functions.entropy(s)

Returns the Shannon entropy of the character distribution defined by the string.

Examples

>>> hl.eval(hl.entropy('ac'))
1.0
>>> hl.eval(hl.entropy('accctg'))
1.7924812503605778

Notes

For a string of length \(n\) with \(k\) unique characters \(\{ c_1, \dots, c_k \}\), let \(p_i\) be the probability that a randomly chosen character is \(c_i\), i.e. the number of instances of \(c_i\) divided by \(n\). Then the base-2 Shannon entropy is given by

\[H = -\sum_{i=1}^k p_i \log_2(p_i).\]
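
The definition can be checked against the examples with a short plain-Python sketch. It uses the conventional leading minus sign, which makes the result non-negative and matches the example outputs. The function name shannon_entropy and the choice to return 0.0 for an empty string are assumptions for illustration, not a statement about Hail's behavior.

```python
import math
from collections import Counter


def shannon_entropy(s):
    """Base-2 Shannon entropy of the character distribution of s."""
    n = len(s)
    if n == 0:
        return 0.0  # assumption: an empty string carries no information
    counts = Counter(s)
    # p_i = count_i / n; H = -sum(p_i * log2(p_i))
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


print(shannon_entropy('ac'))  # 1.0
```

For 'accctg' the same function gives approximately 1.79248, in agreement with the second example.
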
Parameters:

s (StringExpression)

Returns:

Expression of type tfloat64

hail.expr.functions.parse_int(x)

Parse a string as a 32-bit integer.

Examples

>>> hl.eval(hl.parse_int('154'))
154
>>> hl.eval(hl.parse_int('15.4'))
None
>>> hl.eval(hl.parse_int('asdf'))
None

Notes

If the input is an invalid integer, then the result of this call will be missing.
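
The "missing instead of an error" behavior can be pictured with a plain-Python sketch in which None stands in for a missing value. The helper name parse_int32_or_none, the accepted syntax (an optional sign followed by ASCII digits), and the treatment of out-of-range input as invalid are all assumptions for illustration; Hail's exact parsing rules may differ.

```python
import re

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
_INT_PATTERN = re.compile(r'[+-]?[0-9]+')


def parse_int32_or_none(text):
    """Return text as a 32-bit integer, or None if it is not one."""
    if _INT_PATTERN.fullmatch(text) is None:
        return None
    value = int(text)
    if not INT32_MIN <= value <= INT32_MAX:
        return None  # assumption: out-of-range input counts as invalid
    return value


print(parse_int32_or_none('154'))   # 154
print(parse_int32_or_none('15.4'))  # None
print(parse_int32_or_none('asdf'))  # None
```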

Parameters:

x (StringExpression)

Returns:

NumericExpression of type tint32

hail.expr.functions.parse_int32(x)

Parse a string as a 32-bit integer.

Examples

>>> hl.eval(hl.parse_int32('154'))
154
>>> hl.eval(hl.parse_int32('15.4'))
None
>>> hl.eval(hl.parse_int32('asdf'))
None

Notes

If the input is an invalid integer, then the result of this call will be missing.

Parameters:

x (StringExpression)

Returns:

NumericExpression of type tint32

hail.expr.functions.parse_int64(x)

Parse a string as a 64-bit integer.

Examples

>>> hl.eval(hl.parse_int64('154'))
154
>>> hl.eval(hl.parse_int64('15.4'))
None
>>> hl.eval(hl.parse_int64('asdf'))
None

Notes

If the input is an invalid integer, then the result of this call will be missing.

Parameters:

x (StringExpression)

Returns:

NumericExpression of type tint64

hail.expr.functions.parse_float(x)

Parse a string as a 64-bit floating point number.

Examples

>>> hl.eval(hl.parse_float('1.1'))
1.1
>>> hl.eval(hl.parse_float('asdf'))
None

Notes

If the input is an invalid floating point number, then the result of this call will be missing.

Parameters:

x (StringExpression)

Returns:

NumericExpression of type tfloat64

hail.expr.functions.parse_float32(x)

Parse a string as a 32-bit floating point number.

Examples

>>> hl.eval(hl.parse_float32('1.1'))
1.100000023841858
>>> hl.eval(hl.parse_float32('asdf'))
None

Notes

If the input is an invalid floating point number, then the result of this call will be missing.
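
The first example prints 1.100000023841858 rather than 1.1 because 1.1 has no exact binary representation: the nearest 32-bit float lies slightly above 1.1, and the extra digits become visible once that value is displayed as a 64-bit Python float. The effect can be reproduced without Hail using the standard struct module; the helper name round_to_float32 is hypothetical.

```python
import struct


def round_to_float32(value):
    """Round a 64-bit Python float to the nearest 32-bit float."""
    # Packing as 'f' stores the value in 4 bytes (IEEE 754 single precision);
    # unpacking widens it back to a Python float without changing its value.
    return struct.unpack('f', struct.pack('f', value))[0]


print(round_to_float32(1.1))  # 1.100000023841858
print(round_to_float32(0.5))  # 0.5 (exactly representable in 32 bits)
```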

Parameters:

x (StringExpression)

Returns:

NumericExpression of type tfloat32

hail.expr.functions.parse_float64(x)

Parse a string as a 64-bit floating point number.

Examples

>>> hl.eval(hl.parse_float64('1.1'))
1.1
>>> hl.eval(hl.parse_float64('asdf'))
None

Notes

If the input is an invalid floating point number, then the result of this call will be missing.

Parameters:

x (StringExpression)

Returns:

NumericExpression of type tfloat64