Hail

Hail is an open-source, scalable framework for exploring and analyzing genetic data. Starting from sequencing or microarray data in VCF and other formats, Hail can, for example:

  • generate variant annotations like call rate, Hardy-Weinberg equilibrium p-value, and population-specific allele count
  • generate sample annotations like mean depth, imputed sex, and TiTv ratio
  • load variant and sample annotations from text tables, JSON, VCF, VEP, and locus interval files
  • generate new annotations from existing annotations and the genotypes, and use these to filter samples, variants, and genotypes
  • find Mendelian violations in trios, analyze genetic similarity between samples via the GRM and IBD matrix, and compute sample scores and variant loadings using PCA
  • perform association analyses using linear, logistic, and linear mixed regression, and estimate heritability

All this functionality is exposed through Python and backed by distributed algorithms built on top of Apache Spark to efficiently analyze gigabyte-scale data on a laptop or terabyte-scale data on an on-prem cluster or in the cloud.
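
To make a few of the annotations named above concrete, here is a minimal plain-Python sketch of call rate, alternate allele count, and Ti/Tv ratio. It is not Hail's API and says nothing about how Hail computes these values at scale. The genotype encoding and the function names (`call_rate`, `alt_allele_count`, `ti_tv_ratio`) are assumptions made only for illustration.

```python
from typing import Optional, Sequence, Tuple

# Assumed encoding (hypothetical, for illustration only): each genotype is
# the number of alternate alleles (0, 1, or 2), or None if the call is missing.
Genotype = Optional[int]

# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T)
# substitutions; every other single-base substitution is a transversion.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def call_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of genotypes that are non-missing."""
    if not genotypes:
        return 0.0
    called = sum(g is not None for g in genotypes)
    return called / len(genotypes)


def alt_allele_count(genotypes: Sequence[Genotype]) -> int:
    """Total number of alternate alleles among the called genotypes."""
    return sum(g for g in genotypes if g is not None)


def ti_tv_ratio(snps: Sequence[Tuple[str, str]]) -> float:
    """Transition/transversion ratio over (ref, alt) pairs.

    Assumes every pair is a single-base substitution with ref != alt.
    """
    ti = sum((ref, alt) in TRANSITIONS for ref, alt in snps)
    tv = len(snps) - ti
    return ti / tv if tv else float("inf")


# One variant across five samples, one of which has a missing call.
variant_genotypes = [0, 1, None, 2, 0]
print(call_rate(variant_genotypes))         # 0.8
print(alt_allele_count(variant_genotypes))  # 3

# Three SNPs: two transitions and one transversion.
print(ti_tv_ratio([("A", "G"), ("C", "T"), ("A", "C")]))  # 2.0
```

Restricting `variant_genotypes` to the samples of one population would give a population-specific allele count in the same way.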

Hail is used in published research and as the core analysis platform of large-scale genomics efforts including ExAC v2 and gnomAD. The project began in fall 2015 and is under very active development as we work toward a stable release, so we do not guarantee forward compatibility of formats and interfaces. Want to get involved in development? Check out the GitHub repo and chat with us in the Gitter dev room.

To get started using Hail:

We encourage use of the discussion forum for user and dev support, feature requests, and sharing your Hail-powered science. Please report any suspected bugs on GitHub issues.

Hail Team

The Hail team is based in the Neale lab at the Stanley Center for Psychiatric Research of the Broad Institute of MIT and Harvard and the Analytic and Translational Genetics Unit of Massachusetts General Hospital.

Contact the Hail team at [email protected].

Citing Hail

If you use Hail for published work, please cite the software:

and either the forthcoming manuscript describing Hail (if possible):

  • Cotton Seed, Alex Bloemendal, Jonathan M Bloom, Jacqueline I Goldstein, Daniel King, Timothy Poterba, Benjamin M. Neale. Hail: An Open-Source Framework for Scalable Genetic Data Analysis. In preparation.

or the following paper which includes a brief introduction to Hail in the online methods:

  • Andrea Ganna, Giulio Genovese, et al. Ultra-rare disruptive and damaging mutations influence educational attainment in the general population. Nature Neuroscience

And we'd love to hear about your work in the Science category of the discussion forum!
