Interval

class hail.representation.Interval(start, end)[source]

A genomic interval marked by start and end loci.

Parameters:
  • start (Locus) – inclusive start locus
  • end (Locus) – exclusive end locus

Attributes

end Locus object referring to the end of the interval (exclusive).
start Locus object referring to the start of the interval (inclusive).

Methods

__init__ Initializes an Interval from a start locus and an end locus.
contains True if the supplied locus is contained within the interval.
overlaps True if the supplied interval has any locus in common with this one.
parse Parses a genomic interval from string representation.
contains(locus)[source]

True if the supplied locus is contained within the interval.

This membership check is left-inclusive, right-exclusive. This means that the interval 1:100-101 includes 1:100 but not 1:101.
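The half-open rule can be illustrated without Hail installed. The following is a minimal sketch, not Hail's implementation: `Locus` here is a hypothetical named tuple standing in for the real class, and the comparison is only meaningful within a single contig, because contig names are compared as plain strings.

```python
from collections import namedtuple

# Hypothetical stand-in for illustration only; this is not hail's Locus class.
Locus = namedtuple('Locus', ['contig', 'position'])


def contains(start, end, locus):
    """Left-inclusive, right-exclusive membership test.

    Tuples compare field by field, so within one contig this reduces to
    start.position <= locus.position < end.position.
    """
    return start <= locus < end


start, end = Locus('1', 100), Locus('1', 101)   # models the interval 1:100-101
print(contains(start, end, Locus('1', 100)))    # True: the start is included
print(contains(start, end, Locus('1', 101)))    # False: the end is excluded
```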

Parameters:
  • locus (Locus) – the locus to test
Return type:bool
end

Locus object referring to the end of the interval (exclusive).

Return type:Locus
overlaps(interval)[source]

True if the supplied interval has any locus in common with this one.

The statement

>>> interval1.overlaps(interval2)

is equivalent to

>>> interval1.contains(interval2.start) or interval2.contains(interval1.start)

Parameters:
  • interval (Interval) – the interval to compare against
Return type:bool
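
The equivalence given for overlaps can be checked on a simplified model. This is a sketch under stated assumptions, not Hail's code: `Interval` below is a hypothetical named tuple whose endpoints are integer positions on a single contig, and every interval is assumed to be non-empty (start < end).

```python
from collections import namedtuple

# Hypothetical stand-in for illustration only; this is not hail's Interval class.
# Endpoints are plain integer positions on one contig.
Interval = namedtuple('Interval', ['start', 'end'])


def contains(interval, position):
    # Left-inclusive, right-exclusive membership.
    return interval.start <= position < interval.end


def overlaps(a, b):
    # Two half-open intervals share a position exactly when one of them
    # contains the other's start.
    return contains(a, b.start) or contains(b, a.start)


a = Interval(100, 200)
print(overlaps(a, Interval(150, 250)))  # True: positions 150 to 199 are shared
print(overlaps(a, Interval(200, 300)))  # False: 200 is excluded from a
print(overlaps(a, Interval(50, 100)))   # False: 100 is excluded from the other interval
```

Because the end is exclusive, two intervals that merely touch at an endpoint do not overlap.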
static parse(string)[source]

Parses a genomic interval from string representation.

Examples:

>>> interval_1 = Interval.parse('X:100005-X:150020')
>>> interval_2 = Interval.parse('16:29500000-30200000')
>>> interval_3 = Interval.parse('16:29.5M-30.2M')  # same as interval_2
>>> interval_4 = Interval.parse('16:30000000-END')
>>> interval_5 = Interval.parse('16:30M-END')  # same as interval_4
>>> interval_6 = Interval.parse('1-22')  # autosomes
>>> interval_7 = Interval.parse('X')  # all of chromosome X

There are several acceptable representations.

CHR1:POS1-CHR2:POS2 is the fully specified representation, and we use this to define the various shortcut representations.

In a POS field, start (Start, START) stands for 0.

In a POS field, end (End, END) stands for max int.

In a POS field, the qualifiers m (M) and k (K) multiply the given number by 1,000,000 and 1,000, respectively. 1.6K is short for 1600, and 29M is short for 29000000.

CHR:POS1-POS2 stands for CHR:POS1-CHR:POS2

CHR1-CHR2 stands for CHR1:START-CHR2:END

CHR stands for CHR:START-CHR:END

Note that the start locus must precede the end locus.
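
The shortcut rules above can be sketched as a small expander that rewrites any accepted string into its fully specified form. This is an illustration only, not the parser Hail uses: `parse_pos`, `expand`, and `MAX_INT` are hypothetical names, "max int" is assumed to mean the 32-bit signed maximum, keywords are matched case-insensitively (slightly more permissive than the spellings listed above), and contig names containing `-` or `:` are not handled.

```python
# Assumption: "max int" is taken to be the 32-bit signed maximum.
MAX_INT = 2**31 - 1


def parse_pos(text):
    """Convert a POS field (number, k/m qualifier, start, end) to an integer."""
    lowered = text.lower()
    if lowered == 'start':
        return 0
    if lowered == 'end':
        return MAX_INT
    multiplier = 1
    if lowered.endswith('m'):
        multiplier, lowered = 1000000, lowered[:-1]
    elif lowered.endswith('k'):
        multiplier, lowered = 1000, lowered[:-1]
    # round() absorbs floating-point error in inputs such as '30.2M'.
    return int(round(float(lowered) * multiplier))


def expand(text):
    """Return ((chr1, pos1), (chr2, pos2)) for an interval string."""
    left, dash, right = text.partition('-')
    if not dash:                      # CHR
        return (left, 0), (left, MAX_INT)
    if ':' not in left:               # CHR1-CHR2
        return (left, 0), (right, MAX_INT)
    chr1, pos1 = left.split(':')
    if ':' in right:                  # CHR1:POS1-CHR2:POS2
        chr2, pos2 = right.split(':')
    else:                             # CHR:POS1-POS2
        chr2, pos2 = chr1, right
    return (chr1, parse_pos(pos1)), (chr2, parse_pos(pos2))


print(expand('16:29.5M-30.2M'))  # (('16', 29500000), ('16', 30200000))
print(expand('1-22') == (('1', 0), ('22', MAX_INT)))  # True
```

The sketch does not enforce the ordering requirement between the start and end loci; a real parser would reject such input.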

Return type:Interval
start

Locus object referring to the start of the interval (inclusive).

Return type:Locus