Random functions

Hail has several functions that generate random values when invoked. The values are seeded when the function is called, so calling a random Hail function once and then using the resulting expression several times within the same expression will yield the same result each time.

Evaluating the same expression will yield the same value every time, but separate calls to the same function will yield different results. For example, let x be a random number generated with the function rand_unif():

>>> x = hl.rand_unif(0, 1)

The value of x will not change, although other calls to rand_unif() will generate different values:

>>> hl.eval(x)  
0.5562065047992025
>>> hl.eval(x)  
0.5562065047992025
>>> hl.eval(hl.rand_unif(0, 1))  
0.4678132874101748
>>> hl.eval(hl.rand_unif(0, 1))  
0.9097632224065403
>>> hl.eval(hl.array([x, x, x]))  
[0.5562065047992025, 0.5562065047992025, 0.5562065047992025]

If the three values in the last expression should be distinct, three separate calls to rand_unif() should be made:

>>> a = hl.rand_unif(0, 1)
>>> b = hl.rand_unif(0, 1)
>>> c = hl.rand_unif(0, 1)
>>> hl.eval(hl.array([a, b, c]))  
[0.8846327207915881, 0.14415148553468504, 0.8202677741734825]
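
The behaviour shown above can be mimicked in plain Python. The sketch below is a toy model only: the class ToyRandUnif and its evaluate() method are hypothetical names, it uses Python's random module rather than Hail, and it is not a description of how Hail is implemented. It only illustrates the idea that the seed is fixed once, when the function is called, and reused on every evaluation.

```python
import random


class ToyRandUnif:
    """Toy stand-in for a random expression (hypothetical, not Hail code)."""

    def __init__(self, lower, upper):
        self.lower = lower
        self.upper = upper
        # The seed is chosen once, at the moment the "function" is called.
        self.seed = random.getrandbits(64)

    def evaluate(self):
        # Every evaluation replays the same seeded stream from the start.
        return random.Random(self.seed).uniform(self.lower, self.upper)


x = ToyRandUnif(0, 1)
same = [x.evaluate(), x.evaluate(), x.evaluate()]  # one call, reused three times

# Three separate calls give three independently seeded values.
distinct = [ToyRandUnif(0, 1).evaluate() for _ in range(3)]
```

Here same holds three identical numbers, while distinct holds three independently drawn numbers. The toy does not model the per-row behaviour of tables.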

Within the rows of a Table, the same expression will yield a consistent value within each row, but different (random) values across rows:

>>> table = hl.utils.range_table(5, 1)
>>> table = table.annotate(x1=x, x2=x, rand=hl.rand_unif(0, 1))
>>> table.show()  
+-------+-------------+-------------+-------------+
|   idx |          x1 |          x2 |        rand |
+-------+-------------+-------------+-------------+
| int32 |     float64 |     float64 |     float64 |
+-------+-------------+-------------+-------------+
|     0 | 8.50369e-01 | 8.50369e-01 | 9.64129e-02 |
|     1 | 5.15437e-01 | 5.15437e-01 | 8.60843e-02 |
|     2 | 5.42493e-01 | 5.42493e-01 | 1.69816e-01 |
|     3 | 5.51289e-01 | 5.51289e-01 | 6.48706e-01 |
|     4 | 6.40977e-01 | 6.40977e-01 | 8.22508e-01 |
+-------+-------------+-------------+-------------+

The same is true of the rows, columns, and entries of a MatrixTable.

Setting a seed

All random functions accept an optional seed argument. A fixed seed guarantees that multiple invocations of the same function within the same context will return the same result. For example:

>>> hl.eval(hl.rand_unif(0, 1, seed=0))  
0.5488135008937808
>>> hl.eval(hl.rand_unif(0, 1, seed=0))  
0.5488135008937808

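This does not guarantee the same behavior across different contexts; e.g., the rows may have different values if the expression is applied to different tables. In the example below, the first two tables have one partition and agree with each other, while the third has five partitions and does not: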

>>> table = hl.utils.range_table(5, 1).annotate(x=hl.rand_unif(0, 1, seed=0))
>>> table.x.collect()
[0.5488135008937808,
 0.7151893652121089,
 0.6027633824638369,
 0.5448831893094143,
 0.42365480398481625]
>>> table = hl.utils.range_table(5, 1).annotate(x=hl.rand_unif(0, 1, seed=0))
>>> table.x.collect()
[0.5488135008937808,
 0.7151893652121089,
 0.6027633824638369,
 0.5448831893094143,
 0.42365480398481625]
>>> table = hl.utils.range_table(5, 5).annotate(x=hl.rand_unif(0, 1, seed=0))
>>> table.x.collect()  
[0.5488135008937808,
 0.9595974306263271,
 0.42205690070893265,
 0.828743805759555,
 0.6414977904324134]
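
One way to picture why a fixed seed can still give layout-dependent values is the plain-Python toy below. It rests on an assumption made purely for illustration: that each partition draws from its own random stream derived from the seed and the partition index. The function toy_column is hypothetical, uses Python's random module, and is not a description of Hail's internals.

```python
import random


def toy_column(n_rows, n_partitions, seed):
    """Toy model (assumption): one independent random stream per partition."""
    rows_per_partition = -(-n_rows // n_partitions)  # ceiling division
    # Derive one stream per partition from the seed and the partition index.
    streams = [random.Random(seed * 1_000_003 + p) for p in range(n_partitions)]
    # Row i draws the next value from the stream of the partition it lives in.
    return [streams[i // rows_per_partition].random() for i in range(n_rows)]


one_a = toy_column(5, 1, seed=0)  # five rows, one partition
one_b = toy_column(5, 1, seed=0)  # same layout, same seed
five = toy_column(5, 5, seed=0)   # same seed, five partitions
```

In this toy, one_a and one_b are identical, and five shares only its first value with them, because only the first row still reads from the start of partition 0's stream.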

The seed can also be set globally using set_global_seed(). This sets the seed for all subsequent Hail operations, so a pipeline is guaranteed to produce the same results if the global seed is set immediately beforehand:

>>> hl.set_global_seed(0)
>>> hl.eval(hl.array([hl.rand_unif(0, 1), hl.rand_unif(0, 1)]))  
[0.6830630912401323, 0.4035978197966855]
>>> hl.set_global_seed(0)
>>> hl.eval(hl.array([hl.rand_unif(0, 1), hl.rand_unif(0, 1)]))  
[0.6830630912401323, 0.4035978197966855]
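
The same reseed-then-run pattern exists in Python's standard library, which makes for a loose analogy. The sketch below uses random.seed() from Python's random module, not Hail; toy_pipeline is a hypothetical stand-in for a pipeline containing two unseeded random calls.

```python
import random


def toy_pipeline():
    # Two unseeded draws, standing in for two random function calls.
    return [random.uniform(0, 1), random.uniform(0, 1)]


random.seed(0)
first_run = toy_pipeline()

random.seed(0)  # reseed immediately before rerunning
second_run = toy_pipeline()

third_run = toy_pipeline()  # no reseed: continues the existing stream
```

first_run and second_run are equal because the seed was reset right before each run. third_run differs because nothing reset the seed in between.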

rand_bool(p[, seed]) – Returns True with probability p.
rand_beta(a, b[, lower, upper, seed]) – Samples from a beta distribution with parameters a (alpha) and b (beta).
rand_cat(prob[, seed]) – Samples from a categorical distribution.
rand_dirichlet(a[, seed]) – Samples from a Dirichlet distribution.
rand_gamma(shape, scale[, seed]) – Samples from a gamma distribution with parameters shape and scale.
rand_norm([mean, sd, seed]) – Samples from a normal distribution with mean mean and standard deviation sd.
rand_pois(lamb[, seed]) – Samples from a Poisson distribution with rate parameter lamb.
rand_unif(lower, upper[, seed]) – Samples from a uniform distribution within the interval [lower, upper].

hail.expr.functions.rand_bool(p, seed=None) → hail.expr.expressions.typed_expressions.BooleanExpression

Returns True with probability p.

Examples

>>> hl.eval(hl.rand_bool(0.5))  
True
>>> hl.eval(hl.rand_bool(0.5))  
False
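Parameters:
  • p (float or Float64Expression) – Probability between 0 and 1.
  • seed (int, optional) – Random seed.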
Returns:

BooleanExpression

hail.expr.functions.rand_beta(a, b, lower=None, upper=None, seed=None) → hail.expr.expressions.typed_expressions.Float64Expression

Samples from a beta distribution with parameters a (alpha) and b (beta).

Notes

The optional parameters lower and upper represent a truncated beta distribution with parameters a and b and support [lower, upper]. Draws are made via rejection sampling, i.e. returning the first draw from Beta(a,b) that falls in range [lower, upper]. This procedure may be slow if the probability mass of Beta(a,b) over [lower, upper] is small.
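
The rejection-sampling procedure described above can be sketched in plain Python. This is an illustration only: truncated_beta is a hypothetical helper built on random.betavariate from Python's standard library, and it is not Hail's implementation.

```python
import random


def truncated_beta(a, b, lower, upper, rng):
    """Return the first Beta(a, b) draw that falls in [lower, upper]."""
    if not 0 <= lower < upper <= 1:
        raise ValueError("need 0 <= lower < upper <= 1")
    while True:
        draw = rng.betavariate(a, b)  # untruncated draw
        if lower <= draw <= upper:    # accept only draws inside the range
            return draw


rng = random.Random(0)
samples = [truncated_beta(2, 5, 0.2, 0.6, rng) for _ in range(1000)]
```

Every rejected draw costs one more loop iteration, so the narrower the interval (or the less mass Beta(a,b) places on it), the longer each sample takes on average.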

Examples

>>> hl.eval(hl.rand_beta(0, 1))  
0.6696807666871818
>>> hl.eval(hl.rand_beta(0, 1))  
0.8512985039011525
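Parameters:
  • a (float or Float64Expression) – Alpha parameter.
  • b (float or Float64Expression) – Beta parameter.
  • lower (float or Float64Expression, optional) – Lower bound of the truncated beta distribution, if specified.
  • upper (float or Float64Expression, optional) – Upper bound of the truncated beta distribution, if specified.
  • seed (int, optional) – Random seed.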
Returns:

Float64Expression

hail.expr.functions.rand_cat(prob, seed=None) → hail.expr.expressions.typed_expressions.Int32Expression

Samples from a categorical distribution.

Notes

The categories correspond to the indices of prob, an unnormalized probability mass function. The probability of drawing index i is prob[i]/sum(prob).

Warning

This function may be slow when the number of categories is large.
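
The rule in the Notes above, that index i is drawn with probability prob[i]/sum(prob), can be sketched in plain Python. The helper toy_rand_cat is hypothetical and uses a simple linear scan over cumulative weights; it is one straightforward way to sample from unnormalized weights, not necessarily the method Hail uses.

```python
import random


def toy_rand_cat(prob, rng):
    """Draw index i with probability prob[i] / sum(prob)."""
    total = sum(prob)
    if total <= 0 or any(w < 0 for w in prob):
        raise ValueError("weights must be non-negative with a positive sum")
    threshold = rng.random() * total  # uniform position along the total weight
    running = 0.0
    for i, weight in enumerate(prob):
        running += weight
        if threshold < running:
            return i
    # Round-off guard: fall back to the last index with positive weight.
    return max(i for i, w in enumerate(prob) if w > 0)


rng = random.Random(0)
n_draws = 10_000
counts = [0, 0, 0]
for _ in range(n_draws):
    counts[toy_rand_cat([0, 1.7, 2], rng)] += 1
```

With weights [0, 1.7, 2], index 0 is never drawn, and indices 1 and 2 appear in roughly the proportions 1.7/3.7 and 2/3.7.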

Examples

>>> hl.eval(hl.rand_cat([0, 1.7, 2]))  
2
>>> hl.eval(hl.rand_cat([0, 1.7, 2]))  
1
Parameters:
  • prob (list of float or ArrayExpression of type tfloat64)
  • seed (int or None) – If not None, function will be seeded with provided seed.
Returns:

Int32Expression

hail.expr.functions.rand_dirichlet(a, seed=None) → hail.expr.expressions.typed_expressions.ArrayExpression

Samples from a Dirichlet distribution.

Examples

>>> hl.eval(hl.rand_dirichlet([1, 1, 1]))  
[0.4630197581640282, 0.18207753442497876, 0.3549027074109931]
>>> hl.eval(hl.rand_dirichlet([1, 1, 1]))
[0.20851948405364765, 0.7873859423649898, 0.004094573581362475]
Parameters:
  • a (list of float or ArrayExpression of type tfloat64) – Array of non-negative concentration parameters.
  • seed (int, optional) – Random seed.
Returns:

ArrayExpression
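
To show what a Dirichlet draw looks like (an array of non-negative values that sums to 1), the plain-Python sketch below uses the standard construction of normalizing independent gamma draws. The helper toy_rand_dirichlet is hypothetical, relies on random.gammavariate from Python's standard library, and is not Hail's implementation. As a simplification, it requires strictly positive parameters.

```python
import random


def toy_rand_dirichlet(a, rng):
    """Draw from Dirichlet(a) by normalizing independent Gamma(a_i, 1) draws."""
    if any(alpha <= 0 for alpha in a):
        raise ValueError("this sketch needs strictly positive parameters")
    gammas = [rng.gammavariate(alpha, 1.0) for alpha in a]
    total = sum(gammas)
    return [g / total for g in gammas]  # components now sum to 1


rng = random.Random(0)
sample = toy_rand_dirichlet([1, 1, 1], rng)
```

The result has one component per concentration parameter, which matches the three-element arrays in the examples above.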

hail.expr.functions.rand_gamma(shape, scale, seed=None) → hail.expr.expressions.typed_expressions.Float64Expression

Samples from a gamma distribution with parameters shape and scale.

Examples

>>> hl.eval(hl.rand_gamma(1, 1))  
2.3915947710237537
>>> hl.eval(hl.rand_gamma(1, 1))  
0.1339939936379711
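Parameters:
  • shape (float or Float64Expression) – Shape parameter.
  • scale (float or Float64Expression) – Scale parameter.
  • seed (int, optional) – Random seed.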
Returns:

Float64Expression

hail.expr.functions.rand_norm(mean=0, sd=1, seed=None) → hail.expr.expressions.typed_expressions.Float64Expression

Samples from a normal distribution with mean mean and standard deviation sd.

Examples

>>> hl.eval(hl.rand_norm())  
1.5388475315213386
>>> hl.eval(hl.rand_norm())  
-0.3006188509144124
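Parameters:
  • mean (float or Float64Expression) – Mean of the normal distribution. Defaults to 0.
  • sd (float or Float64Expression) – Standard deviation of the normal distribution. Defaults to 1.
  • seed (int, optional) – Random seed.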
Returns:

Float64Expression

hail.expr.functions.rand_pois(lamb, seed=None) → hail.expr.expressions.typed_expressions.Float64Expression

Samples from a Poisson distribution with rate parameter lamb.

Examples

>>> hl.eval(hl.rand_pois(1))  
2.0
>>> hl.eval(hl.rand_pois(1))  
3.0
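Parameters:
  • lamb (float or Float64Expression) – Rate parameter of the Poisson distribution.
  • seed (int, optional) – Random seed.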
Returns:

Float64Expression

hail.expr.functions.rand_unif(lower, upper, seed=None) → hail.expr.expressions.typed_expressions.Float64Expression

Samples from a uniform distribution within the interval [lower, upper].

Examples

>>> hl.eval(hl.rand_unif(0, 1))  
0.7983073825816226
>>> hl.eval(hl.rand_unif(0, 1))  
0.5161799497741769
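Parameters:
  • lower (float or Float64Expression) – Lower bound of the interval.
  • upper (float or Float64Expression) – Upper bound of the interval.
  • seed (int, optional) – Random seed.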
Returns:

Float64Expression