Hail is an open-source, scalable framework for exploring and analyzing genomic data.
For genomics applications, Hail can, for example:
- flexibly import from and export to a variety of data and annotation formats, including VCF, BGEN, and PLINK
- generate variant annotations like call rate, Hardy-Weinberg equilibrium p-value, and population-specific allele count; and import annotations in parallel through annotation datasets, VEP, and Nirvana
- compute sample annotations like mean depth, imputed sex, and transition/transversion (Ti/Tv) ratio
- compute new annotations from existing ones as well as genotypes, and use these to filter samples, variants, and genotypes
- find Mendelian violations in trios, prune variants in linkage disequilibrium, analyze genetic similarity between samples, and compute sample scores and variant loadings using PCA
- perform variant, gene-burden and eQTL association analyses using linear, logistic, Poisson, and linear mixed regression, and estimate heritability
- lots more! Check out some of the new features in Hail 0.2.
Hail is a Python library with a scalable backend built on top of Apache Spark to efficiently analyze gigabyte-scale data on a laptop or terabyte-scale data on a cluster.
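To make two of the annotations named above concrete, here is a minimal sketch in plain Python of what a call rate and a Ti/Tv ratio measure. It is an illustration only and does not use Hail's API; the function names, the `None`-for-missing convention, and the toy data are all assumptions made for this example.

```python
# Toy illustration in plain Python; this is NOT Hail's API.
# All names and data below are hypothetical.

# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T)
# substitutions; every other single-base substitution is a transversion.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def call_rate(genotypes):
    """Fraction of genotype calls that are non-missing (None marks a missing call)."""
    if not genotypes:
        return float("nan")
    called = sum(1 for g in genotypes if g is not None)
    return called / len(genotypes)


def ti_tv_ratio(snps):
    """Ratio of transitions to transversions among (ref, alt) SNP pairs."""
    ti = sum(1 for ref, alt in snps if (ref, alt) in TRANSITIONS)
    tv = len(snps) - ti
    return ti / tv if tv else float("inf")


# One variant across five samples: alternate-allele counts, None = no call.
genotypes = [0, 1, None, 2, 0]
# Four SNPs as (reference allele, alternate allele): two transitions, two transversions.
snps = [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]

print(call_rate(genotypes))  # 0.8
print(ti_tv_ratio(snps))     # 1.0
```

In Hail itself these quantities are computed in parallel over whole datasets rather than over in-memory lists like these.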
To get started using Hail:
Hail uses a continuous deployment approach to software development, which means features, bug fixes, and performance improvements land every day. We recommend updating the software frequently.
There are many ways to get in touch with the Hail team if you need help using Hail or would like to suggest improvements or new features.
Hail is maintained by a team in the Neale lab at the Stanley Center for Psychiatric Research of the Broad Institute of MIT and Harvard and the Analytic and Translational Genetics Unit of Massachusetts General Hospital.
Contact the Hail team:
Follow Hail on Twitter: @hailgenetics.
If you use Hail for published work, please cite the software:
- Hail, https://github.com/hail-is/hail
The Hail team has several sources of funding at the Broad Institute:
- The Stanley Center for Psychiatric Research, which together with the Neale lab has provided an incredibly supportive and stimulating home.
- Principal Investigators Benjamin Neale and Daniel MacArthur, whose scientific leadership has been essential for solving the right problems.
- Jeremy Wertheimer, whose strategic advice and generous philanthropy have been essential for growing the impact of Hail.
We are grateful for generous support from:
- The National Institute of Diabetes and Digestive and Kidney Diseases
- The National Institute of Mental Health
- The National Human Genome Research Institute
- The Chan Zuckerberg Initiative
We would like to thank Zulip for supporting open source by providing free hosting, and YourKit, LLC for generously providing free licenses for YourKit Java Profiler for open-source development.