DB
- class hail.experimental.DB
- Bases: object
- An annotation database instance.
- This class facilitates the annotation of genetic datasets with variant annotations. It accepts either an HTTP(S) URL to an Annotation DB configuration or a Python dict describing an Annotation DB configuration. When connecting to the default Hail Annotation DB, the user must specify the region in which the cluster is running (aws: 'us'; gcp: 'us-central1' or 'europe-west1') and the cloud platform in use ('gcp' or 'aws').
- Parameters:
- region (str) – Region cluster is running in, either 'us', 'us-central1', or 'europe-west1' (default is 'us-central1').
- cloud (str) – Cloud platform, either 'gcp' or 'aws' (default is 'gcp').
- url (str, optional) – Optional URL to annotation DB configuration, if using custom configuration (default is None).
- config (dict, optional) – Optional dict describing an annotation DB configuration, if using custom configuration (default is None).
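The region and cloud combinations stated in the class description can be checked before a DB is constructed. The following is a minimal sketch in plain Python, not Hail's own validation: `VALID_REGIONS` and `check_region_cloud` are hypothetical names introduced for illustration, and the allowed pairs are copied from the description above.

```python
# Hypothetical helper, not part of Hail's API. The mapping encodes the
# region/cloud pairs listed in the class description.
VALID_REGIONS = {
    'gcp': ('us-central1', 'europe-west1'),
    'aws': ('us',),
}


def check_region_cloud(region='us-central1', cloud='gcp'):
    """Return (region, cloud) if the pair is supported, else raise ValueError."""
    if cloud not in VALID_REGIONS:
        raise ValueError(
            f"cloud must be one of {sorted(VALID_REGIONS)}, got {cloud!r}"
        )
    if region not in VALID_REGIONS[cloud]:
        raise ValueError(
            f"region {region!r} is not available on {cloud!r}; "
            f"expected one of {VALID_REGIONS[cloud]}"
        )
    return region, cloud
```

A validated pair could then be passed on as `hl.experimental.DB(region=region, cloud=cloud)`.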
 
- Note
- The 'aws' cloud platform is currently only available for the 'us' region.
- Examples
- Create an annotation database connecting to the default Hail Annotation DB:
  >>> db = hl.experimental.DB(region='us-central1', cloud='gcp')
- Attributes
- available_datasets – List of names of available annotation datasets.
- Methods
- annotate_rows_db – Add annotations from datasets specified by name to a relational object.
- annotate_rows_db(rel, *names)
- Add annotations from datasets specified by name to a relational object.
- List datasets with available_datasets.
- An interactive query builder is available in the Hail Annotation Database documentation.
- Examples
- Annotate a MatrixTable with gnomad_lof_metrics:
  >>> db = hl.experimental.DB(region='us-central1', cloud='gcp')
  >>> mt = db.annotate_rows_db(mt, 'gnomad_lof_metrics')
- Annotate a Table with clinvar_gene_summary, CADD, and DANN:
  >>> db = hl.experimental.DB(region='us-central1', cloud='gcp')
  >>> ht = db.annotate_rows_db(ht, 'clinvar_gene_summary', 'CADD', 'DANN')
- Notes
- If a dataset is gene-keyed, the annotation will be a dictionary mapping from gene name to the annotation value. There will be one entry for each gene overlapping the given locus.
- If a dataset does not have unique rows for each key (consider the gencode genes, which may overlap, and clinvar_variant_summary, which contains many overlapping multiple nucleotide variants), then the result will be an array of annotation values, one for each row.
- Parameters:
- rel (MatrixTable or Table) – The relational object to which to add annotations.
- names (varargs of str) – The names of the datasets with which to annotate rel.
 
- Returns:
- MatrixTable or Table – The relational object rel, with the annotations from names added.
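The two result shapes described in the Notes for annotate_rows_db can be illustrated without Hail. The sketch below is a plain-Python model of those shapes, not Hail's implementation: `annotate_gene_keyed` and `annotate_non_unique` are hypothetical helpers, and the gene names, coordinates, keys, and values are made-up toy data.

```python
from collections import defaultdict


def annotate_gene_keyed(position, genes):
    """Model of a gene-keyed dataset: map each gene overlapping
    `position` to its annotation value.

    `genes` is a list of (name, start, end, value) tuples with inclusive
    coordinates on a single contig (a simplifying assumption).
    """
    return {name: value
            for name, start, end, value in genes
            if start <= position <= end}


def annotate_non_unique(keys, rows):
    """Model of a dataset without unique rows per key: collect the value
    of every matching row into a list, one element per row."""
    by_key = defaultdict(list)
    for key, value in rows:
        by_key[key].append(value)
    # In this toy model, keys with no matching row get an empty list.
    return {key: by_key.get(key, []) for key in keys}


# Toy data, invented for illustration only.
genes = [('GENE_A', 100, 200, 0.1), ('GENE_B', 150, 300, 0.9)]
rows = [('1:150', 'x'), ('1:150', 'y'), ('1:400', 'z')]

gene_result = annotate_gene_keyed(150, genes)
array_result = annotate_non_unique(['1:150', '1:500'], rows)
```

Here `gene_result` has one entry per overlapping gene, and `array_result` holds one list element per matching row. How Hail itself represents keys with no match is not covered by this model.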