transferGWAS

transferGWAS is a method for performing genome-wide association studies on whole images. This repository provides code to run your own transferGWAS on UK Biobank or your own data. transferGWAS has three steps: (1) pretraining, (2) feature condensation, and (3) LMM association analysis. Because the steps require different compute infrastructure (GPU vs. CPU server) and some take much longer than others (e.g. pretraining can take a few days on a GPU), they are kept in separate parts.

  • quickstart: helps you create dummy genetic data on which to run a simulation with transferGWAS.

  • pretraining: provides code for training your own models on retinal fundus scans. This is mostly for reproducibility purposes: we provide our trained models in models, and if you want to train on your own data you will probably want to adapt the code to that data.

  • feature_condensation: a short script that goes from a trained model to low-dimensional condensed features. If you want to run it on your own data, you may need to write a few lines of pytorch code to read your data into a Dataset (there's no one-size-fits-all here, unfortunately).

  • lmm: a wrapper for the BOLT-LMM association analysis, including some basic preprocessing steps. If you have experience with running GWAS, you may not need this part.

  • models: here you can find our pretrained models (only after you have downloaded them via ./download_models.sh).

  • simulation: This is the code for the simulation study.

  • reproducibility: This directory contains instructions and data to reproduce results from our paper.
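For the feature_condensation part, reading your own images into a pytorch Dataset could look roughly like the sketch below. It is only an illustration, not the repository's loader: the class name ImageFolderDataset, the default size of 224 pixels, and the use of the file name stem as sample ID are all assumptions, and any normalization your chosen network expects is left out.

```python
from pathlib import Path

import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset


class ImageFolderDataset(Dataset):
    """Hypothetical dataset: yields (image_tensor, sample_id) per image file."""

    def __init__(self, img_dir, size=224, extensions=(".png", ".jpg", ".jpeg")):
        # Sort the paths so that the sample order is reproducible across runs.
        self.paths = sorted(
            p for p in Path(img_dir).iterdir() if p.suffix.lower() in extensions
        )
        self.size = size  # assumed square input resolution

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        path = self.paths[idx]
        with Image.open(path) as img:
            img = img.convert("RGB").resize((self.size, self.size))
            # H x W x C array scaled to [0, 1]
            arr = np.asarray(img, dtype=np.float32) / 255.0
        # Reorder to the C x H x W layout that pytorch CNNs expect.
        tensor = torch.from_numpy(arr).permute(2, 0, 1).contiguous()
        # Assumption: the file name (without extension) identifies the sample.
        return tensor, path.stem
```

Such a dataset can be wrapped in a torch.utils.data.DataLoader to feed batches to the network. Returning the sample ID with each image makes it easier to match the resulting features to the genetic data later.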

Getting started

This repository requires bash and was written and tested on Ubuntu 18.04.4 LTS.

Start by cloning this repo:

git clone https://github.com/mkirchler/transferGWAS.git

You can download pretrained models and BOLT-LMM via

./download_models.sh

This includes the CNN pretrained on the EyePACS dataset to predict diabetic retinopathy and the StyleGAN2 trained on retinal fundus images for the simulation study (the ImageNet-pretrained network is included in the pytorch library), as well as BOLT-LMM version 2.3.4.

Python

All parts require python 3.6+, and all deep learning parts are built in pytorch. We recommend using an up-to-date version of anaconda and then creating a new environment from the environment.yml:

conda env create --file environment.yml
conda activate transfer_gwas

If you want to run parts of the non-deep-learning code (especially BOLT-LMM) on a CPU-only machine, use the environment_cpu.yml file instead:

conda env create --file environment_cpu.yml
conda activate transfer_gwas_cpu

Note that this won't install any of the pytorch libraries, so you can only use it for run_bolt and for stages 1 and 4 of the simulation.

Installation of requirements should not take longer than a few minutes (depending on internet connection).
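After activating an environment, a quick sanity check along the following lines can confirm that the interpreter is recent enough and show which libraries are importable. This is a minimal sketch and not part of the repository; the function name check_environment and the default package list are assumptions. In the CPU-only environment, torch is expected to be reported as missing.

```python
import importlib.util
import sys


def check_environment(min_version=(3, 6), packages=("torch", "numpy")):
    """Report whether the interpreter and selected packages look usable."""
    report = {"python_ok": sys.version_info[:2] >= min_version}
    for name in packages:
        # find_spec returns None when a top-level package is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {'ok' if ok else 'missing'}")
```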

Reproducing paper results

To reproduce results from our paper, see the reproducibility directory.

Running a transferGWAS

If you don't want to train your own network, just:

  • start with feature_condensation, either with ImageNet-pretrained CNN or with the EyePACS CNN in models; then
  • run the lmm on those condensed embeddings.

If you do want to train your own network, check out the pretraining part first.
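To make the path from network embeddings to LMM input more concrete, here is a minimal sketch of the two steps on an in-memory array. It rests on several assumptions. PCA stands in for the dimensionality reduction (this README only says "low-dimensional condensed features"). The function names condense and write_pheno_file and the column names PC_0, PC_1, ... are hypothetical. The output follows the general BOLT-LMM phenotype file convention of a whitespace-delimited table whose first two columns are FID and IID. The scripts in feature_condensation and lmm may use different formats and options.

```python
import csv

import numpy as np


def condense(embeddings, n_components=10):
    """Project embeddings (n_samples x n_features) onto their top principal components."""
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def write_pheno_file(path, sample_ids, features):
    """Write FID and IID columns followed by one column per component."""
    header = ["FID", "IID"] + [f"PC_{i}" for i in range(features.shape[1])]
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh, delimiter=" ")
        writer.writerow(header)
        for sid, row in zip(sample_ids, features):
            # Assumption: family ID and individual ID are identical.
            writer.writerow([sid, sid] + [f"{v:.6f}" for v in row])
```

Given an array of embeddings and a matching list of sample IDs, calling condense and then write_pheno_file produces a file in which each component column can be analyzed as a separate phenotype, for example by passing the file to BOLT-LMM with --phenoFile and selecting a column with --phenoCol.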
