Plot

Warning

Plotting functionality is in early stages and is experimental. Interfaces will change regularly.

Hail's plot functions use the Bokeh plotting library to create interactive figures. Every plotting function in this module returns a Bokeh Figure, so you can plot your data with a single call and then extend the figure by working with Bokeh directly. See the GWAS tutorial for examples.

Plot functions in Hail accept data in the form of either Python objects or Table and MatrixTable fields.

histogram – Create a histogram.
cumulative_histogram – Create a cumulative histogram.
scatter – Create a scatterplot.
qq – Create a Quantile-Quantile plot.
manhattan – Create a Manhattan plot.

hail.plot.histogram(data, range=None, bins=50, legend=None, title=None)

Create a histogram.

Parameters:
  • data (Struct or Float64Expression) – Sequence of data to plot.
  • range (Tuple[float]) – Range of x values in the histogram.
  • bins (int) – Number of bins in the histogram.
  • legend (str) – Label of data on the x-axis.
  • title (str) – Title of the histogram.
Returns:

bokeh.plotting.figure.Figure
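
To make the range and bins parameters concrete, here is a minimal pure-Python sketch of equal-width binning. It is not Hail's implementation: the helper name bin_counts is hypothetical, and dropping out-of-range values is an assumption made for illustration.

```python
def bin_counts(values, value_range, bins):
    """Count values in `bins` equal-width bins spanning `value_range`.

    Hypothetical helper for illustration only; not part of hail.plot.
    """
    start, end = value_range
    width = (end - start) / bins
    counts = [0] * bins
    for v in values:
        if v < start or v > end:
            continue  # assumption: out-of-range values are dropped
        # A value equal to the right edge is folded into the last bin.
        index = min(int((v - start) / width), bins - 1)
        counts[index] += 1
    return counts


depths = [1, 2, 2, 3, 5, 8, 9, 10, 42]
counts = bin_counts(depths, value_range=(0, 10), bins=5)
print(counts)
```

With range=(0, 10) and bins=5, each bin is 2 units wide and the value 42 falls outside the plotted range.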

hail.plot.cumulative_histogram(data, range=None, bins=50, legend=None, title=None, normalize=True, log=False)

Create a cumulative histogram.

Parameters:
  • data (Struct or Float64Expression) – Sequence of data to plot.
  • range (Tuple[float]) – Range of x values in the histogram.
  • bins (int) – Number of bins in the histogram.
  • legend (str) – Label of data on the x-axis.
  • title (str) – Title of the histogram.
  • normalize (bool) – Whether or not the cumulative data should be normalized.
  • log (bool) – Whether or not the y-axis should be of type log.
Returns:

bokeh.plotting.figure.Figure
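
The effect of normalize can be illustrated with a short pure-Python sketch that turns per-bin counts into a running total. The helper name cumulative_counts is hypothetical, and scaling by the final running total is an assumption about what normalization means here, not a description of Hail's code.

```python
from itertools import accumulate


def cumulative_counts(counts, normalize=True):
    """Running total of per-bin counts, optionally scaled to end at 1.0.

    Hypothetical helper for illustration only; not part of hail.plot.
    """
    running = list(accumulate(counts))
    # Nothing to scale when there are no bins or no observations.
    if not normalize or not running or running[-1] == 0:
        return running
    total = running[-1]
    return [c / total for c in running]


per_bin = [1, 3, 1, 0, 3]
raw = cumulative_counts(per_bin, normalize=False)
scaled = cumulative_counts(per_bin, normalize=True)
print(raw)
print(scaled)
```

The log parameter is not modeled in this sketch because, per the description above, it changes only the y-axis type.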

hail.plot.scatter(x, y, label=None, title=None, xlabel=None, ylabel=None, size=4, legend=True, collect_all=False, n_divisions=500, source_fields=None)

Create a scatterplot.

Parameters:
  • x (List[float] or Float64Expression) – List of x-values to be plotted.
  • y (List[float] or Float64Expression) – List of y-values to be plotted.
  • label (List[str] or StringExpression) – List of labels for x and y values, used to assign each point a label (e.g. population).
  • title (str) – Title of the scatterplot.
  • xlabel (str) – X-axis label.
  • ylabel (str) – Y-axis label.
  • size (int) – Size of markers in screen space units.
  • legend (bool) – Whether or not to show the legend in the resulting figure.
  • collect_all (bool) – Whether to collect all values or downsample before plotting. This parameter will be ignored if x and y are Python objects.
  • n_divisions (int) – Factor by which to downsample (default value = 500). A lower input results in fewer output datapoints.
  • source_fields (Dict[str, List[Any]]) – Extra fields for the ColumnDataSource of the plot.
Returns:

bokeh.plotting.figure.Figure
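
One way to picture why a lower n_divisions yields fewer points is grid-based thinning: split the plot area into n_divisions × n_divisions cells and keep at most one point per cell. The sketch below uses that model as an assumption. It is not taken from Hail's source, and the helper name grid_downsample is hypothetical.

```python
def grid_downsample(xs, ys, n_divisions):
    """Keep at most one (x, y) point per cell of an n_divisions x n_divisions grid.

    Hypothetical helper for illustration only; not part of hail.plot.
    """
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Guard against a zero-width extent when all values are equal.
    x_span = (x_max - x_min) or 1.0
    y_span = (y_max - y_min) or 1.0

    kept = {}
    for x, y in zip(xs, ys):
        col = min(int((x - x_min) / x_span * n_divisions), n_divisions - 1)
        row = min(int((y - y_min) / y_span * n_divisions), n_divisions - 1)
        kept.setdefault((col, row), (x, y))  # the first point seen in a cell wins
    return list(kept.values())


xs = [i / 1000 for i in range(1000)]
ys = [x * x for x in xs]
coarse = grid_downsample(xs, ys, n_divisions=10)
fine = grid_downsample(xs, ys, n_divisions=100)
print(len(xs), len(coarse), len(fine))
```

Under this model the number of retained points is bounded by the number of occupied cells, so a coarser grid keeps fewer points while preserving the overall shape of the data.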

hail.plot.qq(pvals, collect_all=False, n_divisions=500)

Create a Quantile-Quantile plot (https://en.wikipedia.org/wiki/Q-Q_plot).

Parameters:
  • pvals (List[float] or Float64Expression) – P-values to be plotted.
  • collect_all (bool) – Whether to collect all values or downsample before plotting. This parameter will be ignored if pvals is a Python object.
  • n_divisions (int) – Factor by which to downsample (default value = 500). A lower input results in fewer output datapoints.
Returns:

bokeh.plotting.figure.Figure
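
A Quantile-Quantile plot of p-values compares the sorted observed values with the quantiles expected under a uniform null distribution, usually on a -log10 scale. The sketch below computes such coordinates in plain Python. The helper name qq_points is hypothetical, and the rank / (n + 1) formula for expected quantiles is one common convention assumed here, not necessarily the one Hail uses.

```python
import math


def qq_points(pvals):
    """Return (expected, observed) -log10 p-value pairs for a QQ plot.

    Hypothetical helper for illustration only; not part of hail.plot.
    P-values must be greater than zero for the logarithm to be defined.
    """
    observed = sorted(pvals)  # smallest p-value first
    n = len(observed)
    points = []
    for rank, p in enumerate(observed, start=1):
        expected_p = rank / (n + 1)  # assumed convention for uniform quantiles
        points.append((-math.log10(expected_p), -math.log10(p)))
    return points


pvals = [0.5, 0.001, 0.2, 0.9, 0.04]
for expected, observed in qq_points(pvals):
    print(f"expected={expected:.3f} observed={observed:.3f}")
```

Points that rise well above the diagonal indicate p-values smaller than the uniform null would predict.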

hail.plot.manhattan(pvals, locus=None, title=None, size=4, hover_fields=None, collect_all=False, n_divisions=500, significance_line=5e-08)

Create a Manhattan plot (https://en.wikipedia.org/wiki/Manhattan_plot).

Parameters:
  • pvals (Float64Expression) – P-values to be plotted.
  • locus (LocusExpression) – Locus values to be plotted.
  • title (str) – Title of the plot.
  • size (int) – Size of markers in screen space units.
  • hover_fields (Dict[str, Expression]) – Dictionary of field names and values to be shown in the HoverTool of the plot.
  • collect_all (bool) – Whether to collect all values or downsample before plotting.
  • n_divisions (int) – Factor by which to downsample (default value = 500). A lower input results in fewer output datapoints.
  • significance_line (float, optional) – p-value at which to add a horizontal, dotted red line indicating genome-wide significance. If None, no line is added.
Returns:

bokeh.plotting.figure.Figure
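
A Manhattan plot conventionally places each variant at its genomic position on the x-axis and at -log10(p-value) on the y-axis, so the default significance_line of 5e-08 sits at a height of about 7.3. The pure-Python sketch below computes such coordinates. The contig lengths are invented toy values, and the helper names global_position and manhattan_points are hypothetical, not part of Hail.

```python
import math

# Toy contig lengths, invented for illustration (not a real reference genome).
CONTIG_LENGTHS = {"1": 1000, "2": 800, "3": 600}


def global_position(contig, position):
    """Offset a per-contig position so contigs sit side by side on one x-axis."""
    offset = 0
    for name, length in CONTIG_LENGTHS.items():
        if name == contig:
            return offset + position
        offset += length
    raise KeyError(contig)


def manhattan_points(loci, pvals):
    """Return (x, y) pairs: x is the global position, y is -log10(p-value)."""
    return [
        (global_position(contig, pos), -math.log10(p))
        for (contig, pos), p in zip(loci, pvals)
    ]


loci = [("1", 100), ("2", 50), ("3", 10)]
pvals = [1e-3, 1e-9, 0.5]
points = manhattan_points(loci, pvals)

significance_line = 5e-08
threshold_y = -math.log10(significance_line)  # about 7.3
significant = [pt for pt in points if pt[1] > threshold_y]
print(points)
print(round(threshold_y, 2), significant)
```

In this toy data only the variant on contig "2" rises above the threshold.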