Functions

These functions are exposed at the top level of the hail module (conventionally imported as hl), e.g. hl.case.

Core language functions

literal(x[, dtype])

Captures and broadcasts a Python variable or object as an expression.

cond(condition, consequent, alternate[, ...])

Deprecated in favor of if_else().

if_else(condition, consequent, alternate[, ...])

Expression for an if/else statement; tests a condition and returns one of two options based on the result.

switch(expr)

Build a conditional tree on the value of an expression.

case([missing_false])

Chain multiple if-else statements with a CaseBuilder.

bind(f, *exprs[, _ctx])

Bind a temporary variable and use it in a function.

rbind(*exprs[, _ctx])

Bind a temporary variable and use it in a function; unlike bind(), the function is passed as the last argument.

null(t)

Deprecated in favor of missing().

is_missing(expression)

Returns True if the argument is missing.

is_defined(expression)

Returns True if the argument is not missing.

coalesce(*args)

Returns the first non-missing value of args.

or_else(a, b)

Returns a if it is not missing, otherwise returns b.

or_missing(predicate, value)

Returns value if predicate is True, otherwise returns missing.
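The three helpers above differ only in how they treat missing values. A rough plain-Python model may make the distinction easier to see. It assumes None stands in for Hail's missing value; the functions below are illustrative stand-ins that mirror the names above, not Hail's implementation.

```python
# Plain-Python model of missingness handling; None plays the role of "missing".
# Illustrative stand-ins only, not Hail's implementation.

def coalesce(*args):
    """Return the first argument that is not None, or None if all are None."""
    for arg in args:
        if arg is not None:
            return arg
    return None


def or_else(a, b):
    """Return a unless it is None, in which case return b."""
    return b if a is None else a


def or_missing(predicate, value):
    """Return value when predicate is truthy, otherwise None."""
    return value if predicate else None


print(coalesce(None, 0, 5))   # 0
print(or_else(None, 3))       # 3
print(or_missing(False, 7))   # None
```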

range(start[, stop, step])

Returns an array of integers from start to stop by step.

query_table(path, point_or_interval)

Query records from a table corresponding to a given point or range of keys.

Constructors

bool(x)

Convert to a Boolean expression.

float(x)

Convert to a 64-bit floating point expression.

float32(x)

Convert to a 32-bit floating point expression.

float64(x)

Convert to a 64-bit floating point expression.

int(x)

Convert to a 32-bit integer expression.

int32(x)

Convert to a 32-bit integer expression.

int64(x)

Convert to a 64-bit integer expression.

interval(start, end[, includes_start, ...])

Construct an interval expression.

str(x)

Returns the string representation of x.

struct(**kwargs)

Construct a struct expression.

tuple(iterable)

Construct a tuple expression.

Collection constructors

array(collection)

Construct an array expression.

empty_array(t)

Returns an empty array with elements of type t.

set(collection)

Construct a set expression.

empty_set(t)

Returns an empty set with elements of type t.

dict(collection)

Creates a dictionary.

empty_dict(key_type, value_type)

Returns an empty dictionary with key type key_type and value type value_type.

Collection functions

len(x)

Returns the size of a collection or string.

map(f, *collections)

Transform each element of a collection.

flatmap(f, collection)

Map each element of the collection to a new collection, and flatten the results.

zip(*arrays[, fill_missing])

Zip together arrays into a single array.

enumerate(a[, start, index_first])

Returns an array of (index, element) tuples.

zip_with_index(a[, index_first])

Deprecated in favor of enumerate().

flatten(collection)

Flatten a nested collection by concatenating sub-collections.

any(*args)

Check for any True in boolean expressions or collections of booleans.

all(*args)

Check for all True in boolean expressions or collections of booleans.

filter(f, collection)

Returns a new collection containing elements where f returns True.

sorted(collection[, key, reverse])

Returns a sorted array.

find(f, collection)

Returns the first element where f returns True.

group_by(f, collection)

Group collection elements into a dict according to a lambda function.

fold(f, zero, collection)

Reduces a collection with the given function f, provided the initial value zero.

array_scan(f, zero, a)

Map each element of a to the cumulative value of function f, with initial value zero.
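The difference between a fold and a scan can be sketched with the Python standard library. This is an analogy rather than Hail code, and it assumes the scan output begins with the initial value, so it has one more element than the input.

```python
from functools import reduce
from itertools import accumulate

values = [1, 2, 3, 4]

# Fold: combine all elements into a single result, starting from zero.
total = reduce(lambda acc, x: acc + x, values, 0)

# Scan: keep every intermediate accumulator, including the initial value.
running = list(accumulate(values, lambda acc, x: acc + x, initial=0))

print(total)    # 10
print(running)  # [0, 1, 3, 6, 10]
```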

reversed(x)

Reverses the elements of a collection.

keyed_intersection(*arrays, key)

Compute the intersection of sorted arrays on a given key.

keyed_union(*arrays, key)

Compute the distinct union of sorted arrays on a given key.

Numeric functions

abs(x)

Take the absolute value of a numeric value, array or ndarray.

approx_equal(x, y[, tolerance, absolute, ...])

Tests whether two numbers are approximately equal.

bit_and(x, y)

Bitwise and x and y.

bit_or(x, y)

Bitwise or x and y.

bit_xor(x, y)

Bitwise exclusive-or x and y.

bit_lshift(x, y)

Bitwise left-shift x by y.

bit_rshift(x, y[, logical])

Bitwise right-shift x by y.

bit_not(x)

Bitwise invert x.

bit_count(x)

Count the number of 1s in the two's complement binary representation of x.
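As a minimal plain-Python sketch of what counting bits in a two's complement representation means, assuming a 32-bit width (a 64-bit input would need a wider mask); bit_count_32 is a hypothetical helper name, not part of Hail:

```python
def bit_count_32(x: int) -> int:
    """Count set bits in the 32-bit two's complement representation of x."""
    # Masking maps negative values onto their unsigned 32-bit bit pattern.
    return bin(x & 0xFFFFFFFF).count("1")


print(bit_count_32(3))   # 2  (binary ...0011)
print(bit_count_32(-1))  # 32 (all bits set)
```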

exp(x)

expit(x)

is_nan(x)

is_finite(x)

is_infinite(x)

log(x[, base])

Take the logarithm of x with base base.

log10(x)

logit(x)

sign(x)

Returns the sign of a numeric value, array or ndarray.

sqrt(x)

int(x)

Convert to a 32-bit integer expression.

int32(x)

Convert to a 32-bit integer expression.

int64(x)

Convert to a 64-bit integer expression.

float(x)

Convert to a 64-bit floating point expression.

float32(x)

Convert to a 32-bit floating point expression.

float64(x)

Convert to a 64-bit floating point expression.

floor(x)

ceil(x)

uniroot(f, min, max, *[, max_iter, epsilon, ...])

Finds a root of the function f within the interval [min, max].

Numeric collection functions

min(*exprs[, filter_missing])

Returns the minimum element of a collection or of given numeric expressions.

nanmin(*exprs[, filter_missing])

Returns the minimum value of a collection or of given arguments, excluding NaN.

max(*exprs[, filter_missing])

Returns the maximum element of a collection or of given numeric expressions.

nanmax(*exprs[, filter_missing])

Returns the maximum value of a collection or of given arguments, excluding NaN.

mean(collection[, filter_missing])

Returns the mean of all values in the collection.

median(collection)

Returns the median value in the collection.

product(collection[, filter_missing])

Returns the product of values in the collection.

sum(collection[, filter_missing])

Returns the sum of values in the collection.

cumulative_sum(a[, filter_missing])

Returns an array of the cumulative sum of values in the array.

argmin(array[, unique])

Return the index of the minimum value in the array.

argmax(array[, unique])

Return the index of the maximum value in the array.

corr(x, y)

Compute the Pearson correlation coefficient between x and y.

binary_search(array, elem)

Binary search array for the insertion point of elem.

String functions

format(f, *args)

Returns a formatted string using a specified format string and arguments.

json(x)

Convert an expression to a JSON string expression.

parse_json(x, dtype)

Convert a JSON string to a structured expression.

hamming(s1, s2)

Returns the Hamming distance between the two strings.
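For reference, the Hamming distance is the number of positions at which two equal-length strings differ. A plain-Python sketch follows; raising on unequal lengths is a choice made for this sketch and may not match how Hail handles that case.

```python
def hamming(s1: str, s2: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(s1) != len(s2):
        raise ValueError("strings must have equal length")
    return sum(c1 != c2 for c1, c2 in zip(s1, s2))


print(hamming("ATATA", "ATGCA"))  # 2
```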

delimit(collection[, delimiter])

Joins elements of collection into a single string delimited by delimiter.

entropy(s)

Returns the Shannon entropy of the character distribution defined by the string.
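The quantity described above can be sketched in plain Python as follows. The sketch assumes entropy is measured in bits (logarithm base 2); shannon_entropy is a hypothetical helper name used only for illustration.

```python
from collections import Counter
from math import log2


def shannon_entropy(s: str) -> float:
    """Shannon entropy, in bits, of the character frequencies in s."""
    if not s:
        return 0.0
    n = len(s)
    # Sum -p * log2(p) over the observed character frequencies.
    return -sum((count / n) * log2(count / n) for count in Counter(s).values())


print(shannon_entropy("ac"))  # 1.0 (two equally likely characters)
```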

parse_int(x)

Parse a string as a 32-bit integer.

parse_int32(x)

Parse a string as a 32-bit integer.

parse_int64(x)

Parse a string as a 64-bit integer.

parse_float(x)

Parse a string as a 64-bit floating point number.

parse_float32(x)

Parse a string as a 32-bit floating point number.

parse_float64(x)

Parse a string as a 64-bit floating point number.

Statistical functions

chi_squared_test(c1, c2, c3, c4)

Performs chi-squared test of independence on a 2x2 contingency table.

fisher_exact_test(c1, c2, c3, c4)

Calculates the p-value, odds ratio, and 95% confidence interval using Fisher's exact test for a 2x2 table.

contingency_table_test(c1, c2, c3, c4, ...)

Performs chi-squared or Fisher's exact test of independence on a 2x2 contingency table.

dbeta(x, a, b)

Returns the probability density at x of a beta distribution with parameters a (alpha) and b (beta).

dpois(x, lamb[, log_p])

Compute the (log) probability density at x of a Poisson distribution with rate parameter lamb.
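The Poisson probability at x with rate lamb is lamb**x * exp(-lamb) / x!. A direct plain-Python evaluation of that formula, written as a sketch rather than as Hail's implementation (which may be computed differently for numerical stability):

```python
from math import exp, factorial, log


def dpois(x: int, lamb: float, log_p: bool = False) -> float:
    """Poisson probability of observing x events at rate lamb."""
    p = lamb ** x * exp(-lamb) / factorial(x)
    # Optionally report the natural log of the probability.
    return log(p) if log_p else p


print(round(dpois(5, 3), 6))  # 0.100819
```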

hardy_weinberg_test(n_hom_ref, n_het, n_hom_var)

Performs test of Hardy-Weinberg equilibrium.

pchisqtail(x, df[, ncp, lower_tail, log_p])

Returns the probability under the right-tail starting at x for a chi-squared distribution with df degrees of freedom.

pnorm(x[, mu, sigma, lower_tail, log_p])

The cumulative probability function of a normal distribution with mean mu and standard deviation sigma.

ppois(x, lamb[, lower_tail, log_p])

The cumulative probability function of a Poisson distribution.

qchisqtail(p, df[, ncp, lower_tail, log_p])

The quantile function of a chi-squared distribution with df degrees of freedom, inverts pchisqtail().

qnorm(p[, mu, sigma, lower_tail, log_p])

The quantile function of a normal distribution with mean mu and standard deviation sigma, inverts pnorm().
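The inverse relationship between the cumulative probability function and the quantile function can be checked with Python's standard library. This is an analogy using statistics.NormalDist, not Hail code, and it covers only the lower-tail, non-log case.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

# Lower-tail cumulative probability at x = 1.96.
p = std_normal.cdf(1.96)

# The quantile function inverts the cumulative probability function.
x = std_normal.inv_cdf(p)

print(round(p, 3))  # 0.975
print(round(x, 2))  # 1.96
```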

qpois(p, lamb[, lower_tail, log_p])

The quantile function of a Poisson distribution with rate parameter lamb, inverts ppois().

Randomness

rand_bool(p[, seed])

Returns True with probability p.

rand_beta(a, b[, lower, upper, seed])

Samples from a beta distribution with parameters a (alpha) and b (beta).

rand_cat(prob[, seed])

Samples from a categorical distribution.

rand_dirichlet(a[, seed])

Samples from a Dirichlet distribution.

rand_gamma(shape, scale[, seed])

Samples from a gamma distribution with parameters shape and scale.

rand_norm([mean, sd, seed, size])

Samples from a normal distribution with mean mean and standard deviation sd.

rand_pois(lamb[, seed])

Samples from a Poisson distribution with rate parameter lamb.

rand_unif([lower, upper, seed, size])

Samples from a uniform distribution within the interval [lower, upper].

rand_int32(a[, b, seed])

Samples from a uniform distribution of 32-bit integers.

rand_int64([a, b, seed])

Samples from a uniform distribution of 64-bit integers.

shuffle(a[, seed])

Randomly permute an array.

Genetics functions

locus(contig, pos[, reference_genome])

Construct a locus expression from a chromosome and position.

locus_from_global_position(global_pos[, ...])

Constructs a locus expression from a global position and a reference genome.

locus_interval(contig, start, end[, ...])

Construct a locus interval expression.

parse_locus(s[, reference_genome])

Construct a locus expression by parsing a string or string expression.

parse_variant(s[, reference_genome])

Construct a struct with a locus and alleles by parsing a string.

parse_locus_interval(s[, reference_genome, ...])

Construct a locus interval expression by parsing a string or string expression.

variant_str(*args)

Create a variant colon-delimited string.

call(*alleles[, phased])

Construct a call expression.

unphased_diploid_gt_index_call(gt_index)

Construct an unphased, diploid call from a genotype index.

parse_call(s)

Construct a call expression by parsing a string or string expression.

downcode(c, i)

Create a new call by setting all alleles other than i to ref.

triangle(n)

Returns the triangle number of n.
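The triangle number of n is n * (n + 1) / 2. One place it shows up in genetics is the VCF ordering of unphased diploid genotypes, where genotype j/k with j <= k has index triangle(k) + j. A plain-Python sketch follows; gt_index is a hypothetical helper name, not a Hail function.

```python
def triangle(n: int) -> int:
    """Return the n-th triangle number, 0 + 1 + ... + n."""
    return n * (n + 1) // 2


def gt_index(j: int, k: int) -> int:
    """Index of the unphased diploid genotype j/k in VCF ordering."""
    if j > k:
        j, k = k, j  # unphased: allele order does not matter
    return triangle(k) + j


print(triangle(3))     # 6
print(gt_index(1, 2))  # 4
```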

is_snp(ref, alt)

Returns True if the alleles constitute a single nucleotide polymorphism.

is_mnp(ref, alt)

Returns True if the alleles constitute a multiple nucleotide polymorphism.

is_transition(ref, alt)

Returns True if the alleles constitute a transition.

is_transversion(ref, alt)

Returns True if the alleles constitute a transversion.
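A transition swaps a purine for a purine (A and G) or a pyrimidine for a pyrimidine (C and T); any other single-base substitution is a transversion. The sketch below models that rule in plain Python. It is restricted to single-base alleles, which is a simplifying assumption of this example and may be narrower than what the Hail functions accept.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def _is_single_base_substitution(ref: str, alt: str) -> bool:
    """True when both alleles are one base long and differ."""
    return len(ref) == 1 and len(alt) == 1 and ref != alt


def is_transition(ref: str, alt: str) -> bool:
    """Purine <-> purine or pyrimidine <-> pyrimidine substitution."""
    if not _is_single_base_substitution(ref, alt):
        return False
    return {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES


def is_transversion(ref: str, alt: str) -> bool:
    """Any single-base substitution that is not a transition."""
    return _is_single_base_substitution(ref, alt) and not is_transition(ref, alt)


print(is_transition("A", "G"))    # True
print(is_transversion("A", "T"))  # True
```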

is_insertion(ref, alt)

Returns True if the alleles constitute an insertion.

is_deletion(ref, alt)

Returns True if the alleles constitute a deletion.

is_indel(ref, alt)

Returns True if the alleles constitute an insertion or deletion.

is_star(ref, alt)

Returns True if the alleles constitute an upstream deletion.

is_complex(ref, alt)

Returns True if the alleles constitute a complex polymorphism.

is_valid_contig(contig[, reference_genome])

Returns True if contig is a valid contig name in reference_genome.

is_valid_locus(contig, position[, ...])

Returns True if contig and position form a valid site in reference_genome.

contig_length(contig[, reference_genome])

Returns the length of contig in reference_genome.

allele_type(ref, alt)

Returns the type of the polymorphism as a string.

pl_dosage(pl)

Return expected genotype dosage from an array of Phred-scaled genotype likelihoods with a uniform prior.

gp_dosage(gp)

Return expected genotype dosage from an array of genotype probabilities.
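For a biallelic site with genotype probabilities ordered (hom-ref, het, hom-var), the expected dosage is P(het) + 2 * P(hom-var). Phred-scaled likelihoods are first converted with 10 ** (-PL / 10) and normalized, which corresponds to a uniform prior. The following is a plain-Python sketch of that arithmetic under those assumptions, not Hail's implementation.

```python
def gp_dosage(gp):
    """Expected alt-allele dosage from [P(hom-ref), P(het), P(hom-var)]."""
    if len(gp) != 3:
        raise ValueError("expected exactly three genotype probabilities")
    return gp[1] + 2 * gp[2]


def pl_dosage(pl):
    """Expected dosage from Phred-scaled likelihoods, assuming a uniform prior."""
    # Convert Phred scale to linear likelihoods, then normalize to probabilities.
    likelihoods = [10 ** (-p / 10) for p in pl]
    total = sum(likelihoods)
    return gp_dosage([lik / total for lik in likelihoods])


print(gp_dosage([0.0, 0.5, 0.5]))       # 1.5
print(round(pl_dosage([5, 10, 100]), 4))  # 0.2403
```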

get_sequence(contig, position[, before, ...])

Return the reference sequence at a given locus.

mendel_error_code(locus, is_female, father, ...)

Compute a Mendelian violation code for genotypes.

liftover(x, dest_reference_genome[, ...])

Lift over coordinates to a different reference genome.

min_rep(locus, alleles)

Computes the minimal representation of a (locus, alleles) polymorphism.

reverse_complement(s[, rna])

Reverses the string and translates base pairs into their complements.
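The operation can be sketched in plain Python as a reversal followed by a character translation. The sketch assumes uppercase input and leaves any character outside the translation table unchanged; both are choices made for this example rather than documented Hail behavior.

```python
# Translation tables mapping each base to its complement.
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(s: str, rna: bool = False) -> str:
    """Reverse s and replace each base with its complement."""
    table = _RNA_COMPLEMENT if rna else _DNA_COMPLEMENT
    return s[::-1].translate(table)


print(reverse_complement("ATGCC"))            # GGCAT
print(reverse_complement("AUGCC", rna=True))  # GGCAU
```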