Functions

These functions are exposed at the top level of the module, e.g. hl.case.

Core language functions

eval(expression) Evaluate a Hail expression, returning the result.
literal(x[, dtype]) Captures and broadcasts a Python variable or object as an expression.
cond(condition, consequent, alternate, …) Expression for an if/else statement; tests a condition and returns one of two options based on the result.
switch(expr) Build a conditional tree on the value of an expression.
case(missing_false) Chain multiple if-else statements with a CaseBuilder.
bind(f, *exprs) Bind a temporary variable and use it in a function.
null(t) Creates an expression representing a missing value of a specified type.
is_missing(expression) Returns True if the argument is missing.
is_defined(expression) Returns True if the argument is not missing.
coalesce(*args) Returns the first non-missing value of args.
or_else(a, b) If a is missing, return b.
or_missing(predicate, value) Returns value if predicate is True, otherwise returns missing.
range(start, stop[, step]) Returns an array of integers from start to stop by step.
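
The missing-value helpers listed above can be pictured with a small plain-Python model. This is only a sketch: Python's None stands in for a Hail missing value, and the three functions below are local illustrations written for this example, not the hl.* implementations, which operate on expressions rather than plain Python values.

```python
# Plain-Python model of missing-value handling; None plays the role of "missing".
# These are illustrative stand-ins, not the hl.* functions themselves.

def coalesce(*args):
    """Return the first argument that is not None, or None if all are."""
    for value in args:
        if value is not None:
            return value
    return None


def or_else(a, b):
    """Return a unless it is missing, in which case return b."""
    return b if a is None else a


def or_missing(predicate, value):
    """Return value when predicate is True; otherwise return missing (None)."""
    return value if predicate is True else None
```

In this model, coalesce(None, None, 3, 4) yields 3 and or_missing(False, 5) yields None.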

Constructors

bool(x) Convert to a Boolean expression.
float(x) Convert to a 64-bit floating point expression.
float32(x) Convert to a 32-bit floating point expression.
float64(x) Convert to a 64-bit floating point expression.
int(x) Convert to a 32-bit integer expression.
int32(x) Convert to a 32-bit integer expression.
int64(x) Convert to a 64-bit integer expression.
interval(start, end[, includes_start, …]) Construct an interval expression.
str(x) Returns the string representation of x.
struct(**kwargs) Construct a struct expression.
tuple(iterable) Construct a tuple expression.

Collection constructors

array(collection) Construct an array expression.
empty_array(t) Returns an empty array with elements of type t.
set(collection) Construct a set expression.
empty_set(t) Returns an empty set with elements of type t.
dict(collection) Creates a dictionary.

Collection functions

len(x) Returns the size of a collection or string.
map(f, collection) Transform each element of a collection.
flatmap(f, collection) Map each element of the collection to a new collection, and flatten the results.
zip(*arrays, fill_missing) Zip together arrays into a single array.
zip_with_index(a) Returns an array of (index, element) tuples.
flatten(collection) Flatten a nested collection by concatenating sub-collections.
any(f, collection) Returns True if f returns True for any element.
all(f, collection) Returns True if f returns True for every element.
filter(f, collection) Returns a new collection containing elements where f returns True.
sorted(collection[, key, …]) Returns a sorted array.
find(f, collection) Returns the first element where f returns True.
group_by(f, collection) Group collection elements into a dict according to a lambda function.
fold(f, zero, collection) Reduces a collection with the given function f, provided the initial value zero.
array_scan(f, zero, a) Map each element of a to the cumulative value of function f, with initial value zero.
reversed(x) Reverses the elements of a collection.
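
The difference between fold, array_scan, and flatmap is easiest to see on ordinary Python lists. The sketch below uses only the standard library and is not Hail code. It assumes that f receives the accumulator first and the element second, and that the scan result starts with the initial value zero. Both points are assumptions of this sketch and should be checked against the full function documentation.

```python
from functools import reduce
from itertools import accumulate, chain


def fold(f, zero, collection):
    """Reduce the collection to one value, starting from zero."""
    return reduce(f, collection, zero)


def array_scan(f, zero, a):
    """Return every intermediate accumulator value, beginning with zero."""
    return list(accumulate(a, f, initial=zero))


def flatmap(f, collection):
    """Map each element to a list, then concatenate the resulting lists."""
    return list(chain.from_iterable(f(x) for x in collection))


def add(acc, x):
    return acc + x


total = fold(add, 0, [1, 2, 3])          # single reduced value
running = array_scan(add, 0, [1, 2, 3])  # all partial sums
expanded = flatmap(lambda x: [x, x * 10], [1, 2])
```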

Numeric functions

abs(x) Take the absolute value of a numeric value or array.
approx_equal(x, y[, tolerance, absolute, …]) Tests whether two numbers are approximately equal.
exp(x) Computes e raised to the power x.
is_nan(x) Returns True if the argument is nan (not a number).
is_finite(x) Returns True if the argument is a finite floating-point number.
is_infinite(x) Returns True if the argument is positive or negative infinity.
log(x[, base]) Take the logarithm of x with base base.
log10(x) Take the logarithm of x with base 10.
sign(x) Returns the sign of a numeric value or array.
sqrt(x) Returns the square root of x.
int(x) Convert to a 32-bit integer expression.
int32(x) Convert to a 32-bit integer expression.
int64(x) Convert to a 64-bit integer expression.
float(x) Convert to a 64-bit floating point expression.
float32(x) Convert to a 32-bit floating point expression.
float64(x) Convert to a 64-bit floating point expression.
floor(x) The largest integral value that is less than or equal to x.
ceil(x) The smallest integral value that is greater than or equal to x.
uniroot(f, min, max) Finds a root of the function f within the interval [min, max].
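
To make the idea behind uniroot concrete, here is a root finder based on plain bisection. Bisection is a technique swapped in for illustration. This listing does not say which algorithm the library uses, and the function name, the tolerance and max_iter parameters, and the choice to return None when there is no sign change are all assumptions of this sketch.

```python
def find_root_bisection(f, lo, hi, tolerance=1e-9, max_iter=200):
    """Find x in [lo, hi] with f(x) close to 0, assuming f changes sign there.

    Returns None if f(lo) and f(hi) have the same sign (hypothetical behavior
    chosen for this sketch).
    """
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0:
        return lo
    if f_hi == 0:
        return hi
    if (f_lo > 0) == (f_hi > 0):
        return None  # no sign change, so bisection cannot bracket a root

    for _ in range(max_iter):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if f_mid == 0 or (hi - lo) / 2 < tolerance:
            return mid
        # Keep the half-interval whose endpoints still have opposite signs.
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2


# The positive root of x^2 - 2 lies in [0, 2].
root = find_root_bisection(lambda x: x * x - 2, 0.0, 2.0)
```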

Numeric collection functions

min(*exprs, filter_missing) Returns the minimum of a collection or of given numeric expressions.
max(*exprs, filter_missing) Returns the maximum element of a collection or of given numeric expressions.
mean(collection, filter_missing) Returns the mean of all values in the collection.
median(collection) Returns the median value in the collection.
product(collection, filter_missing) Returns the product of values in the collection.
sum(collection, filter_missing) Returns the sum of values in the collection.
cumulative_sum(a, filter_missing) Returns an array of the cumulative sum of values in the array.
argmin(array, unique) Return the index of the minimum value in the array.
argmax(array, unique) Return the index of the maximum value in the array.
corr(x, y) Compute the Pearson correlation coefficient between x and y.
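
For reference, the Pearson correlation coefficient that corr is described as computing can be written out in a few lines of plain Python. This is a sketch of the textbook formula on ordinary lists, not the library's implementation. The function name pearson_corr and its error handling are choices made for this example.

```python
import math


def pearson_corr(x, y):
    """Pearson correlation of two equal-length, non-empty numeric sequences."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # A constant sequence has zero variance; this raises ZeroDivisionError.
    return cov / math.sqrt(var_x * var_y)
```

Perfectly linear inputs such as [1, 2, 3] and [2, 4, 6] give a coefficient of 1, and reversing one of them gives -1.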

String functions

format(f, *args) Returns a formatted string using a specified format string and arguments.
json(x) Convert an expression to a JSON string expression.
hamming(s1, s2) Returns the Hamming distance between the two strings.
delimit(collection[, delimiter]) Joins elements of collection into a single string delimited by delimiter.
entropy(s) Returns the Shannon entropy of the character distribution defined by the string.
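
The two string measures named above, Hamming distance and Shannon entropy, can be sketched in plain Python as follows. These are standard-library illustrations, not Hail code. Measuring entropy in bits (log base 2) and raising an error for strings of unequal length are assumptions of this sketch.

```python
import math
from collections import Counter


def hamming(s1, s2):
    """Count the positions at which two equal-length strings differ."""
    if len(s1) != len(s2):
        raise ValueError("strings must have the same length")
    return sum(c1 != c2 for c1, c2 in zip(s1, s2))


def entropy(s):
    """Shannon entropy, in bits, of the character frequencies in s."""
    counts = Counter(s)
    total = len(s)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

A string made of one repeated character has entropy 0, while a string with two equally frequent characters has entropy 1 bit.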

Statistical functions

chi_squared_test(c1, c2, c3, c4) Performs chi-squared test of independence on a 2x2 contingency table.
fisher_exact_test(c1, c2, c3, c4) Calculates the p-value, odds ratio, and 95% confidence interval using Fisher’s exact test for a 2x2 table.
contingency_table_test(c1, c2, c3, c4, …) Performs chi-squared or Fisher’s exact test of independence on a 2x2 contingency table.
dbeta(x, a, b) Returns the probability density at x of a beta distribution with parameters a (alpha) and b (beta).
dpois(x, lamb[, log_p]) Compute the (log) probability density at x of a Poisson distribution with rate parameter lamb.
hardy_weinberg_test(n_hom_ref, n_het, n_hom_var) Performs test of Hardy-Weinberg equilibrium.
pchisqtail(x, df) Returns the probability under the right-tail starting at x for a chi-squared distribution with df degrees of freedom.
pnorm(x) The cumulative probability function of a standard normal distribution.
ppois(x, lamb[, lower_tail, log_p]) The cumulative probability function of a Poisson distribution.
qchisqtail(p, df) Inverts pchisqtail().
qnorm(p) Inverts pnorm().
qpois(p, lamb[, lower_tail, log_p]) Inverts ppois().
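
The relationship between a cumulative distribution function and its inverse (pnorm and qnorm above) and the optional log scale of dpois can be illustrated with the Python standard library. The functions below share names with the listed ones only for readability. They are local sketches built on statistics.NormalDist and math.lgamma, not the library's implementations.

```python
import math
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def pnorm(x):
    """Cumulative probability of a standard normal at x."""
    return _STANDARD_NORMAL.cdf(x)


def qnorm(p):
    """Quantile of a standard normal; the inverse of pnorm for 0 < p < 1."""
    return _STANDARD_NORMAL.inv_cdf(p)


def dpois(x, lamb, log_p=False):
    """Poisson probability of observing x events with rate lamb > 0."""
    # Work on the log scale first: x*ln(lamb) - lamb - ln(x!).
    log_prob = x * math.log(lamb) - lamb - math.lgamma(x + 1)
    return log_prob if log_p else math.exp(log_prob)


# Round trip: qnorm undoes pnorm up to floating-point error.
round_trip = qnorm(pnorm(1.3))
```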

Randomness

rand_bool(p[, seed]) Returns True with probability p.
rand_beta(a, b[, lower, upper, seed]) Samples from a beta distribution with parameters a (alpha) and b (beta).
rand_cat(prob[, seed]) Samples from a categorical distribution.
rand_dirichlet(a[, seed]) Samples from a Dirichlet distribution.
rand_gamma(shape, scale[, seed]) Samples from a gamma distribution with parameters shape and scale.
rand_norm([mean, sd, seed]) Samples from a normal distribution with mean mean and standard deviation sd.
rand_pois(lamb[, seed]) Samples from a Poisson distribution with rate parameter lamb.
rand_unif(lower, upper[, seed]) Samples from a uniform distribution within the interval [lower, upper].

Genetics functions

locus(contig, pos, reference_genome, …) Construct a locus expression from a chromosome and position.
locus_from_global_position(global_pos, …) Constructs a locus expression from a global position and a reference genome.
locus_interval(contig, start, end[, …]) Construct a locus interval expression.
parse_locus(s, reference_genome, …) Construct a locus expression by parsing a string or string expression.
parse_variant(s, reference_genome, …) Construct a struct with a locus and alleles by parsing a string.
parse_locus_interval(s, reference_genome, …) Construct a locus interval expression by parsing a string or string expression.
call(*alleles[, phased]) Construct a call expression.
unphased_diploid_gt_index_call(gt_index) Construct an unphased, diploid call from a genotype index.
parse_call(s) Construct a call expression by parsing a string or string expression.
downcode(c, i) Create a new call by setting all alleles other than i to the reference allele.
triangle(n) Returns the triangle number of n.
is_snp(ref, alt) Returns True if the alleles constitute a single nucleotide polymorphism.
is_mnp(ref, alt) Returns True if the alleles constitute a multiple nucleotide polymorphism.
is_transition(ref, alt) Returns True if the alleles constitute a transition.
is_transversion(ref, alt) Returns True if the alleles constitute a transversion.
is_insertion(ref, alt) Returns True if the alleles constitute an insertion.
is_deletion(ref, alt) Returns True if the alleles constitute a deletion.
is_indel(ref, alt) Returns True if the alleles constitute an insertion or deletion.
is_star(ref, alt) Returns True if the alleles constitute an upstream deletion.
is_complex(ref, alt) Returns True if the alleles constitute a complex polymorphism.
is_valid_contig(contig[, reference_genome]) Returns True if contig is a valid contig name in reference_genome.
is_valid_locus(contig, position[, …]) Returns True if contig and position form a valid site in reference_genome.
allele_type(ref, alt) Returns the type of the polymorphism as a string.
pl_dosage(pl) Return expected genotype dosage from an array of Phred-scaled genotype likelihoods with uniform prior.
gp_dosage(gp) Return expected genotype dosage from an array of genotype probabilities.
get_sequence(contig, position[, before, …]) Return the reference sequence at a given locus.
mendel_error_code(locus, is_female, father, …) Compute a Mendelian violation code for genotypes.
liftover(x, dest_reference_genome[, …]) Lift over coordinates to a different reference genome.
min_rep(locus, alleles) Computes the minimal representation of a (locus, alleles) polymorphism.
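
Several of the genetics helpers rest on simple arithmetic, sketched here in plain Python. The code is illustrative and does not call the library. It assumes the conventional VCF genotype ordering, in which the unphased diploid genotype j/k with j <= k has index k(k+1)/2 + j, and it assumes biallelic input arrays of length 3 for the dosage functions. The name gt_index_to_alleles is hypothetical.

```python
def triangle(n):
    """The n-th triangle number, 0 + 1 + ... + n."""
    return n * (n + 1) // 2


def gt_index_to_alleles(gt_index):
    """Map a genotype index to allele indices (j, k) with j <= k."""
    if gt_index < 0:
        raise ValueError("gt_index must be non-negative")
    k = 0
    # Find the largest k whose triangle number does not exceed gt_index.
    while triangle(k + 1) <= gt_index:
        k += 1
    j = gt_index - triangle(k)
    return (j, k)


def gp_dosage(gp):
    """Expected alternate-allele count from probabilities [P(0/0), P(0/1), P(1/1)]."""
    return gp[1] + 2 * gp[2]


def pl_dosage(pl):
    """Expected dosage from Phred-scaled likelihoods under a uniform prior."""
    likelihoods = [10 ** (-p / 10) for p in pl]  # undo the Phred scaling
    total = sum(likelihoods)
    return gp_dosage([value / total for value in likelihoods])
```

Under this ordering, indices 0 through 5 correspond to 0/0, 0/1, 1/1, 0/2, 1/2, and 2/2.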