VariantDataset

class hail.vds.VariantDataset[source]

Class for representing cohort-level genomic data.

This class facilitates a sparse, split representation of genomic data in which reference block data and variant data are contained in separate MatrixTable objects.

Parameters:

reference_data (MatrixTable) – MatrixTable containing only reference block data.
variant_data (MatrixTable) – MatrixTable containing only variant data.
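
As a rough illustration of the split representation, the sketch below summarizes the two component tables of a VDS. It assumes the object exposes them as `reference_data` and `variant_data` attributes, mirroring the constructor parameters. The helper names `summarize_vds` and `load_and_summarize` and the path argument are hypothetical, not part of Hail.

```python
def summarize_vds(vds):
    """Return basic facts about a VariantDataset-like object.

    Assumes `vds` exposes its two component MatrixTables as
    `reference_data` and `variant_data` (an assumption of this sketch).
    """
    return {
        "n_samples": vds.n_samples(),
        # Row counts are computed separately for each component table.
        "n_reference_rows": vds.reference_data.count_rows(),
        "n_variant_rows": vds.variant_data.count_rows(),
    }


def load_and_summarize(path):
    """Read a VDS from `path` (hypothetical location) and summarize it."""
    # Imported lazily so summarize_vds itself has no Hail dependency.
    import hail as hl

    return summarize_vds(hl.vds.read_vds(path))
```

Counting rows triggers computation on each table, so on large datasets this may be expensive.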

Attributes

`ref_block_max_length_field`	Name of global field that indicates max reference block length.
`reference_genome`	Dataset reference genome.

Methods

`checkpoint`	Write to path and then read from path.
`from_merged_representation`	Create a VariantDataset from a sparse MatrixTable containing variant and reference data.
`n_samples`	The number of samples present.
`union_rows`	Combine many VDSes with the same samples but disjoint variants.
`validate`	Eagerly checks necessary representational properties of the VDS.
`write`	Write to path.

checkpoint(path, **kwargs)[source]: Write to path and then read from path.

static from_merged_representation(mt, *, ref_block_indicator_field='END', ref_block_fields=(), infer_ref_block_fields=True, is_split=False)[source]: Create a VariantDataset from a sparse MatrixTable containing variant and reference data.
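
A minimal sketch of calling this method with the keyword arguments from the signature above written out explicitly. The wrapper `merged_to_vds` and its `vds_cls` parameter are hypothetical and exist only so the call can be exercised without a Hail installation. By default the wrapper uses `hl.vds.VariantDataset`.

```python
def merged_to_vds(sparse_mt, *, is_split=False, vds_cls=None):
    """Convert a sparse MatrixTable into a VariantDataset.

    `vds_cls` is a hypothetical injection point for this sketch; when it is
    None, the real hl.vds.VariantDataset class is used.
    """
    if vds_cls is None:
        import hail as hl  # lazy import keeps the sketch importable without Hail

        vds_cls = hl.vds.VariantDataset

    # Every keyword except is_split is passed at its documented default.
    return vds_cls.from_merged_representation(
        sparse_mt,
        ref_block_indicator_field="END",
        ref_block_fields=(),
        infer_ref_block_fields=True,
        is_split=is_split,
    )
```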

n_samples()[source]: The number of samples present.

ref_block_max_length_field = 'ref_block_max_length': Name of global field that indicates max reference block length.

property reference_genome

Dataset reference genome.

Returns: ReferenceGenome

union_rows()[source]

Combine many VDSes with the same samples but disjoint variants.

Examples

If a dataset is imported as a VDS in per-chromosome chunks, the following combines them into one VDS:

>>> vds_paths = ['chr1.vds', 'chr2.vds']
>>> vds_per_chrom = [hl.vds.read_vds(path) for path in vds_paths]
>>> hl.vds.VariantDataset.union_rows(*vds_per_chrom)

validate(*, check_data=True)[source]: Eagerly checks necessary representational properties of the VDS.

write(path, **kwargs)[source]

Write to path.

Any optional parameter from MatrixTable.write() can be passed as a keyword argument to this method.
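
One way to combine `validate` and `write` is sketched below, forwarding `overwrite` (an optional parameter of MatrixTable.write()) as a keyword argument. The helper name `validate_and_write` is hypothetical, and the sketch assumes `validate` raises an exception when a check fails.

```python
def validate_and_write(vds, path, *, overwrite=False):
    """Check representational properties, then persist the VDS to `path`."""
    # Assumed to raise if the VDS violates a required property, so the
    # write below is only reached for a dataset that passed the checks.
    vds.validate(check_data=True)

    # `overwrite` is forwarded unchanged to the underlying write.
    vds.write(path, overwrite=overwrite)
    return path
```

Validating first avoids writing a malformed dataset, at the cost of an extra eager pass over the data when `check_data=True`.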