# linalg/utils

• array_windows(a, radius) – Returns start and stop indices for a window around each array value.
• locus_windows(locus_expr, radius[, ...]) – Returns start and stop indices for a window around each locus.

array_windows(a, radius)

Returns start and stop indices for a window around each array value.

Examples

>>> hl.linalg.utils.array_windows(np.array([1, 2, 4, 4, 6, 8]), 2)
(array([0, 0, 1, 1, 2, 4]), array([2, 4, 5, 5, 6, 6]))

>>> hl.linalg.utils.array_windows(np.array([-10.0, -2.5, 0.0, 0.0, 1.2, 2.3, 3.0]), 2.5)
(array([0, 1, 1, 1, 2, 2, 4]), array([1, 4, 6, 6, 7, 7, 7]))


Notes

For an array a in ascending order, the resulting starts and stops arrays have the same length as a and the property that, for all indices i, [starts[i], stops[i]) is the maximal range of indices j such that a[i] - radius <= a[j] <= a[i] + radius.
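The window property above can be reproduced with two binary searches over the sorted array. The following is a minimal NumPy sketch of that idea, not Hail's implementation; the name `array_windows_sketch` is hypothetical, and the input is assumed to be in ascending order.

```python
import numpy as np


def array_windows_sketch(a, radius):
    """Illustrative stand-in for array_windows (hypothetical helper)."""
    a = np.asarray(a)
    # First index j with a[j] >= a[i] - radius.
    starts = np.searchsorted(a, a - radius, side="left")
    # One past the last index j with a[j] <= a[i] + radius.
    stops = np.searchsorted(a, a + radius, side="right")
    return starts, stops


starts, stops = array_windows_sketch([1, 2, 4, 4, 6, 8], 2)
print(starts.tolist())  # [0, 0, 1, 1, 2, 4]
print(stops.tolist())   # [2, 4, 5, 5, 6, 6]
```

Using `side="left"` for the lower bound and `side="right"` for the upper bound is what makes each range maximal when the array contains repeated values.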

Index ranges are start-inclusive and stop-exclusive. This function is especially useful in conjunction with BlockMatrix.sparsify_row_intervals().
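To make the start-inclusive, stop-exclusive convention concrete, the sketch below builds a dense boolean mask in which row i is true exactly on columns in [starts[i], stops[i]). It uses plain NumPy and does not call Hail; `window_mask` is a hypothetical helper, and the mask only illustrates which entries a per-row interval selects, under the assumption that row intervals are interpreted this way.

```python
import numpy as np


def window_mask(starts, stops, n_cols):
    """Dense mask of the per-row index ranges (hypothetical helper)."""
    starts = np.asarray(starts)
    stops = np.asarray(stops)
    cols = np.arange(n_cols)
    # Row i selects column j exactly when starts[i] <= j < stops[i].
    return (cols >= starts[:, None]) & (cols < stops[:, None])


# Windows from the first array_windows example above.
starts = np.array([0, 0, 1, 1, 2, 4])
stops = np.array([2, 4, 5, 5, 6, 6])
mask = window_mask(starts, stops, n_cols=6)
print(mask.astype(int))
```

Each row of the mask has `stops[i] - starts[i]` true entries, forming a band around the diagonal.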

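Parameters:
• a (numpy.ndarray of int or float) – 1-dimensional array of values in ascending order.

• radius (int or float) – Radius of window for array values.
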
Returns:

(numpy.ndarray of int, numpy.ndarray of int) – Tuple of start indices array and stop indices array.

locus_windows(locus_expr, radius[, ...])

Returns start and stop indices for a window around each locus.

Examples

Windows with 2bp radius for one contig with positions 1, 2, 3, 4, 5:

>>> starts, stops = hl.linalg.utils.locus_windows(
...     hl.balding_nichols_model(1, 5, 5).locus,
...     radius=2)
>>> starts, stops
(array([0, 0, 0, 1, 2]), array([3, 4, 5, 5, 5]))


The following examples involve three contigs.

>>> loci = [{'locus': hl.Locus('1', 1), 'cm': 1.0},
...         {'locus': hl.Locus('1', 2), 'cm': 3.0},
...         {'locus': hl.Locus('1', 4), 'cm': 4.0},
...         {'locus': hl.Locus('2', 1), 'cm': 2.0},
...         {'locus': hl.Locus('2', 1), 'cm': 2.0},
...         {'locus': hl.Locus('3', 3), 'cm': 5.0}]

>>> ht = hl.Table.parallelize(
...         loci,
...         hl.tstruct(locus=hl.tlocus('GRCh37'), cm=hl.tfloat64),
...         key=['locus'])


>>> hl.linalg.utils.locus_windows(ht.locus, 1)
(array([0, 0, 2, 3, 3, 5]), array([2, 2, 3, 5, 5, 6]))


>>> hl.linalg.utils.locus_windows(ht.locus, 1.0, coord_expr=ht.cm)
(array([0, 1, 1, 3, 3, 5]), array([1, 3, 3, 5, 5, 6]))


Notes

This function returns two 1-dimensional ndarrays of integers, starts and stops, each of size equal to the number of rows.

By default, for all indices i, [starts[i], stops[i]) is the maximal range of row indices j such that contig[i] == contig[j] and position[i] - radius <= position[j] <= position[i] + radius.

This function fails if the global_position() of locus_expr is not in ascending order. Ascending order should hold for a matrix table keyed by locus or variant (and the associated row table), or for a table that has been ordered by locus_expr.

Set coord_expr to use a value other than position to define the windows. This row-indexed numeric expression must be non-missing, non-NaN, on the same source as locus_expr, and ascending with respect to locus position within each contig; otherwise the function will fail.
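The value requirements on the coordinates can be checked on local data before building a table. The following is a rough NumPy sketch of such a pre-check, assuming the contigs and coordinates have already been collected into arrays in row order; `check_coords` is a hypothetical helper and is not part of Hail.

```python
import numpy as np


def check_coords(contigs, coords):
    """Raise ValueError if coords break the stated requirements (hypothetical)."""
    contigs = np.asarray(contigs)
    coords = np.asarray(coords, dtype=float)
    if np.isnan(coords).any():
        raise ValueError("coordinates must be non-missing and non-NaN")
    # Compare each row with its predecessor, but only within a contig.
    same_contig = contigs[1:] == contigs[:-1]
    decreasing = np.diff(coords) < 0
    if (same_contig & decreasing).any():
        raise ValueError("coordinates must be ascending within each contig")


# The centimorgan values from the example table pass the check:
# cm drops from 4.0 to 2.0 only at the boundary between contigs 1 and 2.
check_coords(["1", "1", "1", "2", "2", "3"], [1.0, 3.0, 4.0, 2.0, 2.0, 5.0])
```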

The last example above uses centimorgan coordinates, so [starts[i], stops[i]) is the maximal range of row indices j such that contig[i] == contig[j] and cm[i] - radius <= cm[j] <= cm[i] + radius.
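The per-contig window definition can be imitated on local arrays by running the same binary-search idea separately within each contig. This is a minimal NumPy sketch rather than Hail's implementation: `contig_windows_sketch` is a hypothetical name, and the rows are assumed to be sorted so that each contig forms one contiguous run with ascending coordinates.

```python
import numpy as np


def contig_windows_sketch(contigs, coords, radius):
    """Windows that never cross a contig boundary (hypothetical helper)."""
    contigs = np.asarray(contigs)
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    starts = np.empty(n, dtype=int)
    stops = np.empty(n, dtype=int)
    # Indices where the contig changes mark the start of a new run.
    boundaries = np.flatnonzero(contigs[1:] != contigs[:-1]) + 1
    run_starts = np.concatenate(([0], boundaries))
    run_stops = np.concatenate((boundaries, [n]))
    for lo, hi in zip(run_starts, run_stops):
        c = coords[lo:hi]
        # Search inside the run, then shift back to global row indices.
        starts[lo:hi] = lo + np.searchsorted(c, c - radius, side="left")
        stops[lo:hi] = lo + np.searchsorted(c, c + radius, side="right")
    return starts, stops


contigs = ["1", "1", "1", "2", "2", "3"]
positions = [1, 2, 4, 1, 1, 3]
cm = [1.0, 3.0, 4.0, 2.0, 2.0, 5.0]

by_pos = contig_windows_sketch(contigs, positions, 1)
by_cm = contig_windows_sketch(contigs, cm, 1.0)
print(by_pos[0].tolist(), by_pos[1].tolist())
print(by_cm[0].tolist(), by_cm[1].tolist())
```

On the six-row example table, the sketch reproduces the arrays shown for both the position-based and the centimorgan-based calls.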

Index ranges are start-inclusive and stop-exclusive. This function is especially useful in conjunction with BlockMatrix.sparsify_row_intervals().

Parameters:
• locus_expr (LocusExpression) – Row-indexed locus expression on a table or matrix table.

• radius (int or float) – Radius of window for row values.

• coord_expr (Float64Expression, optional) – Row-indexed numeric expression for the row value. Must be on the same table or matrix table as locus_expr. By default, the row value is given by the locus position.

Returns:

(numpy.ndarray of int, numpy.ndarray of int) – Tuple of start indices array and stop indices array.