String functions

format(f, *args)

Returns a formatted string using a specified format string and arguments.

json(x)

Convert an expression to a JSON string expression.

hamming(s1, s2)

Returns the Hamming distance between the two strings.

delimit(collection[, delimiter])

Joins the elements of collection into a single string, separated by delimiter.

entropy(s)

Returns the Shannon entropy of the character distribution defined by the string.

hail.expr.functions.format(f, *args)

Returns a formatted string using a specified format string and arguments.

Examples

>>> hl.eval(hl.format('%.3e', 0.09345332))
'9.345e-02'
>>> hl.eval(hl.format('%.4f', hl.null(hl.tfloat64)))
'null'
>>> hl.eval(hl.format('%s %s %s', 'hello', hl.tuple([3, hl.locus('1', 2453)]), True))
'hello [3,1:2453] true'

Notes

See the Java documentation for valid format specifiers and arguments.

Missing values are printed as 'null' except when using the format flags ‘b’ and ‘B’ (printed as 'false' instead).

Parameters

f – Format string, using Java format specifiers.

args – Arguments to format.

Returns

StringExpression

hail.expr.functions.json(x) → hail.expr.expressions.typed_expressions.StringExpression

Convert an expression to a JSON string expression.

Examples

>>> hl.eval(hl.json([1,2,3,4,5]))
'[1,2,3,4,5]'
>>> hl.eval(hl.json(hl.struct(a='Hello', b=0.12345, c=[1,2], d={'hi', 'bye'})))  # doctest: +NOTEST
'{"a":"Hello","c":[1,2],"b":0.12345,"d":["bye","hi"]}'

Parameters

x – Expression to convert.

Returns

StringExpression – String expression with JSON representation of x.

hail.expr.functions.hamming(s1, s2) → hail.expr.expressions.typed_expressions.Int32Expression

Returns the Hamming distance between the two strings.

Examples

>>> hl.eval(hl.hamming('ATATA', 'ATGCA'))
2
>>> hl.eval(hl.hamming('abcdefg', 'zzcdefz'))
3

Notes

This method will fail if the two strings have different lengths.
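
The following plain-Python sketch illustrates the same calculation outside of Hail. The name hamming_distance and the choice of ValueError are assumptions made for illustration; this is not Hail's implementation, and Hail may report the length mismatch differently.

```python
def hamming_distance(s1: str, s2: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(s1) != len(s2):
        # The distance is only defined for strings of equal length.
        raise ValueError("strings must have the same length")
    # Compare the strings position by position and count the mismatches.
    return sum(a != b for a, b in zip(s1, s2))


print(hamming_distance("ATATA", "ATGCA"))      # 2
print(hamming_distance("abcdefg", "zzcdefz"))  # 3
```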

Parameters

s1 – First string.

s2 – Second string.

Returns

Expression of type tint32

hail.expr.functions.delimit(collection, delimiter=',') → hail.expr.expressions.typed_expressions.StringExpression

Joins the elements of collection into a single string, separated by delimiter.

Examples

>>> a = ['Bob', 'Charlie', 'Alice', 'Bob', 'Bob']
>>> hl.eval(hl.delimit(a))
'Bob,Charlie,Alice,Bob,Bob'

Notes

If the element type of collection is not tstr, then the str() function will be called on each element before joining with the delimiter.
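
The convert-then-join behavior described above can be sketched in plain Python. The helper name delimit_sketch is hypothetical, its default delimiter is chosen only to reproduce the example output above, and Python's str() may render some values differently from Hail's str().

```python
from typing import Iterable


def delimit_sketch(collection: Iterable, delimiter: str = ",") -> str:
    """Join the elements of a collection, converting each to text first."""
    # str() leaves strings unchanged, so one code path covers all element types.
    return delimiter.join(str(element) for element in collection)


names = ["Bob", "Charlie", "Alice", "Bob", "Bob"]
print(delimit_sketch(names))              # Bob,Charlie,Alice,Bob,Bob
print(delimit_sketch([1, 2, 3], " | "))   # 1 | 2 | 3
```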

Parameters

collection – Collection of elements to join.

delimiter – Delimiter placed between elements.

Returns

StringExpression – Joined string expression.

hail.expr.functions.entropy(s) → hail.expr.expressions.typed_expressions.Float64Expression

Returns the Shannon entropy of the character distribution defined by the string.

Examples

>>> hl.eval(hl.entropy('ac'))
1.0
>>> hl.eval(hl.entropy('accctg'))
1.7924812503605778

Notes

For a string of length \(n\) with \(k\) unique characters \(\{ c_1, \dots, c_k \}\), let \(p_i\) be the probability that a randomly chosen character is \(c_i\), i.e. the number of instances of \(c_i\) divided by \(n\). Then the base-2 Shannon entropy is given by

\[H = -\sum_{i=1}^k p_i \log_2(p_i).\]
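
As a cross-check on the definition, the plain-Python sketch below computes the entropy as the negated sum of \(p_i \log_2(p_i)\) over the distinct characters, which is why both example values above are positive. The function name shannon_entropy is hypothetical, and returning 0.0 for an empty string is an assumption of this sketch rather than documented Hail behavior.

```python
from collections import Counter
from math import log2


def shannon_entropy(s: str) -> float:
    """Base-2 Shannon entropy of the character distribution of s."""
    n = len(s)
    if n == 0:
        # Assumption for this sketch: an empty string has zero entropy.
        return 0.0
    # p_i = (occurrences of character c_i) / n for each distinct character.
    probabilities = [count / n for count in Counter(s).values()]
    return -sum(p * log2(p) for p in probabilities)


print(shannon_entropy("ac"))      # 1.0
print(shannon_entropy("accctg"))  # approximately 1.7925
```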

Parameters

s (StringExpression)

Returns

Expression of type tfloat64