linalg/utils
array_windows(a, radius) | Returns start and stop indices for window around each array value.
locus_windows(locus_expr, radius, coord_expr=None) | Returns start and stop indices for window around each locus.
- hail.linalg.utils.array_windows(a, radius)
Returns start and stop indices for window around each array value.
Examples
>>> hl.linalg.utils.array_windows(np.array([1, 2, 4, 4, 6, 8]), 2)
(array([0, 0, 1, 1, 2, 4]), array([2, 4, 5, 5, 6, 6]))

>>> hl.linalg.utils.array_windows(np.array([-10.0, -2.5, 0.0, 0.0, 1.2, 2.3, 3.0]), 2.5)
(array([0, 1, 1, 1, 2, 2, 4]), array([1, 4, 6, 6, 7, 7, 7]))
Notes
For an array a in ascending order, the resulting starts and stops arrays have the same length as a and the property that, for all indices i, [starts[i], stops[i]) is the maximal range of indices j such that a[i] - radius <= a[j] <= a[i] + radius.

Index ranges are start-inclusive and stop-exclusive. This function is especially useful in conjunction with BlockMatrix.sparsify_row_intervals().
- Parameters:
a (numpy.ndarray of signed integer or float values) – 1-dimensional array of values, non-decreasing with respect to index.
radius (float) – Non-negative radius of window for values.
- Returns:
(numpy.ndarray of int, numpy.ndarray of int) – Tuple of start indices array and stop indices array.
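The window property described in the notes can be reproduced with plain NumPy, which may help when reasoning about the results without a Hail session. The following is a minimal sketch, not Hail's implementation: the names array_windows_sketch and window_mask are hypothetical helpers introduced here, and the sketch assumes the input is already non-decreasing. The second helper only illustrates, as a dense boolean mask, which column indices fall inside each row's [starts[i], stops[i]) interval; it does not call BlockMatrix.sparsify_row_intervals().

```python
import numpy as np


def array_windows_sketch(a, radius):
    """Hypothetical NumPy re-creation of the documented window property.

    Assumes `a` is 1-dimensional and non-decreasing, and `radius` >= 0.
    """
    a = np.asarray(a)
    # First index j with a[j] >= a[i] - radius (start-inclusive).
    starts = np.searchsorted(a, a - radius, side="left")
    # First index j with a[j] > a[i] + radius (stop-exclusive).
    stops = np.searchsorted(a, a + radius, side="right")
    return starts, stops


def window_mask(starts, stops):
    """Dense boolean mask: entry (i, j) is True when starts[i] <= j < stops[i]."""
    starts = np.asarray(starts)
    stops = np.asarray(stops)
    cols = np.arange(len(starts))
    return (cols >= starts[:, None]) & (cols < stops[:, None])


starts, stops = array_windows_sketch(np.array([1, 2, 4, 4, 6, 8]), 2)
print(starts, stops)  # [0 0 1 1 2 4] [2 4 5 5 6 6]
print(window_mask(starts, stops).astype(int))
```

Using side="left" for the lower bound and side="right" for the upper bound is what makes tied values (the two 4s above) share the same window.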
- hail.linalg.utils.locus_windows(locus_expr, radius, coord_expr=None, _localize=True)
Returns start and stop indices for window around each locus.
Examples
Windows with 2bp radius for one contig with positions 1, 2, 3, 4, 5:
>>> starts, stops = hl.linalg.utils.locus_windows(
...     hl.balding_nichols_model(1, 5, 5).locus,
...     radius=2)
>>> starts, stops
(array([0, 0, 0, 1, 2]), array([3, 4, 5, 5, 5]))
The following examples involve three contigs.
>>> loci = [{'locus': hl.Locus('1', 1), 'cm': 1.0},
...         {'locus': hl.Locus('1', 2), 'cm': 3.0},
...         {'locus': hl.Locus('1', 4), 'cm': 4.0},
...         {'locus': hl.Locus('2', 1), 'cm': 2.0},
...         {'locus': hl.Locus('2', 1), 'cm': 2.0},
...         {'locus': hl.Locus('3', 3), 'cm': 5.0}]

>>> ht = hl.Table.parallelize(
...     loci,
...     hl.tstruct(locus=hl.tlocus('GRCh37'), cm=hl.tfloat64),
...     key=['locus'])
Windows with 1bp radius:
>>> hl.linalg.utils.locus_windows(ht.locus, 1)
(array([0, 0, 2, 3, 3, 5]), array([2, 2, 3, 5, 5, 6]))
Windows with 1 cM radius:
>>> hl.linalg.utils.locus_windows(ht.locus, 1.0, coord_expr=ht.cm)
(array([0, 1, 1, 3, 3, 5]), array([1, 3, 3, 5, 5, 6]))
Notes
This function returns two 1-dimensional ndarrays of integers, starts and stops, each of size equal to the number of rows.

By default, for all indices i, [starts[i], stops[i]) is the maximal range of row indices j such that contig[i] == contig[j] and position[i] - radius <= position[j] <= position[i] + radius.

If the global_position() on locus_expr is not in ascending order, this method will fail. Ascending order should hold for a matrix table keyed by locus or variant (and the associated row table), or for a table that has been ordered by locus_expr.

Set coord_expr to use a value other than position to define the windows. This row-indexed numeric expression must be non-missing, non-nan, on the same source as locus_expr, and ascending with respect to locus position for each contig; otherwise the function will fail.

The last example above uses centimorgan coordinates, so [starts[i], stops[i]) is the maximal range of row indices j such that contig[i] == contig[j] and cm[i] - radius <= cm[j] <= cm[i] + radius.

Index ranges are start-inclusive and stop-exclusive. This function is especially useful in conjunction with BlockMatrix.sparsify_row_intervals().
- Parameters:
locus_expr (LocusExpression) – Row-indexed locus expression on a table or matrix table.
radius (int) – Radius of window for row values.
coord_expr (Float64Expression, optional) – Row-indexed numeric expression for the row value. Must be on the same table or matrix table as locus_expr. By default, the row value is given by the locus position.
- Returns:
(numpy.ndarray of int, numpy.ndarray of int) – Tuple of start indices array and stop indices array.
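The per-contig rule in the notes (windows never cross a contig boundary) can be checked outside Hail with a small NumPy sketch. This is an illustration under stated assumptions, not the library's implementation: locus_windows_sketch is a hypothetical helper, contigs and coordinates are passed as plain sequences rather than Hail expressions, rows are assumed to be sorted by contig, and coordinates are assumed to be non-decreasing within each contig.

```python
import numpy as np


def locus_windows_sketch(contigs, coords, radius):
    """Hypothetical per-contig windowing over plain Python/NumPy inputs.

    Assumes rows are grouped by contig and `coords` is non-decreasing
    within each contig.
    """
    contigs = np.asarray(contigs)
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    starts = np.empty(n, dtype=np.int64)
    stops = np.empty(n, dtype=np.int64)

    # Row indices where the contig changes, bracketed by 0 and n.
    change = np.flatnonzero(contigs[1:] != contigs[:-1]) + 1
    bounds = np.concatenate(([0], change, [n]))

    for lo, hi in zip(bounds[:-1], bounds[1:]):
        block = coords[lo:hi]
        # Search only inside this contig, then shift back to global row indices.
        starts[lo:hi] = lo + np.searchsorted(block, block - radius, side="left")
        stops[lo:hi] = lo + np.searchsorted(block, block + radius, side="right")
    return starts, stops


contigs = ["1", "1", "1", "2", "2", "3"]
positions = [1, 2, 4, 1, 1, 3]
cm = [1.0, 3.0, 4.0, 2.0, 2.0, 5.0]

print(locus_windows_sketch(contigs, positions, 1))  # base-pair windows
print(locus_windows_sketch(contigs, cm, 1.0))       # centimorgan windows
```

With the three-contig data from the examples, the two calls reproduce the start and stop arrays shown above for the 1bp and centimorgan windows.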