deepan's Introduction

Extracting knowledge from multi-omics and clinical datasets using effective graph autoencoders

Cluster patients based on their multi-omics data using graph autoencoders. Adapted from "Simple and Effective Graph Autoencoders with One-Hop Linear Models" (Salha et al., 2020) and implemented in PyTorch Geometric.

About the project

The project is built on PyTorch Geometric. It uses clinical EHR, gene expression, and somatic mutation data from the TCGA Study.

1. Generate a patient similarity graph.

Omics data are transformed into binary or numerical features, which are then preselected. Each patient becomes a node, and two patient nodes are connected by an edge if their distance in the feature space is below a set threshold. The feature matrix and the adjacency matrix are stored in a PyTorch Geometric Data object.
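As a minimal, dependency-free sketch of the thresholding step: the function below connects two patients when the distance between their feature vectors is below a threshold. The name build_similarity_edges, the choice of Euclidean distance, and the toy values are assumptions made for illustration and are not taken from the repository. The result uses the two-row source/target layout that PyTorch Geometric expects for edge_index.

```python
import math


def build_similarity_edges(features, threshold):
    """Return an undirected edge list in COO layout: [sources, targets].

    Hypothetical helper: Euclidean distance is an assumption here;
    the project may use a different metric.
    """
    sources, targets = [], []
    n = len(features)
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(features[i], features[j]) < threshold:
                # Store both directions so the graph is undirected.
                sources += [i, j]
                targets += [j, i]
    return [sources, targets]


# Toy feature matrix: 4 patients x 3 binary features (made-up values).
features = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 1],
    [0, 1, 1],
]
edge_index = build_similarity_edges(features, threshold=1.5)
print(edge_index)
```

With PyTorch installed, the two lists could be converted with torch.tensor(features, dtype=torch.float) and torch.tensor(edge_index, dtype=torch.long) and passed as x and edge_index to torch_geometric.data.Data.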

2. Train graph autoencoders.

Graph autoencoders (GAEs) are graph convolutional networks that integrate feature and adjacency information. The resulting latent representation is decoded to reconstruct the adjacency information, and the loss is the mean squared error between the original adjacency matrix and the reconstructed one. Various architectures from the PyTorch Geometric project are included, and all of them yield a latent representation after training. The main ones used are the simple linear AE, GAE, VGAE, and the variational simple linear AE.
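To make the encode, decode, and loss steps concrete, here is a small forward pass of a one-hop linear graph autoencoder in plain Python. It is a sketch under stated assumptions, not the repository's training code: the helper names, the toy graph, and the fixed weight matrix are hypothetical, the encoder follows the one-hop linear form Z = A_norm X W, the decoder is the usual inner product with a sigmoid, and the loss is the mean squared error described above. No training is performed.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 with added self-loops."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def encode(adj, x, weight):
    """One-hop linear encoder: Z = A_norm @ X @ W, with no activation."""
    return matmul(matmul(normalize_adjacency(adj), x), weight)


def decode(z):
    """Inner-product decoder: sigmoid(Z @ Z^T) as edge probabilities."""
    z_t = [list(col) for col in zip(*z)]
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in matmul(z, z_t)]


def mse(adj, recon):
    """Mean squared error over all entries of the adjacency matrix."""
    n = len(adj)
    return sum((adj[i][j] - recon[i][j]) ** 2 for i in range(n) for j in range(n)) / (n * n)


# Toy graph: 3 patients, 2 features each, latent dimension 2 (made-up values).
adj = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weight = [[0.5, -0.2], [0.1, 0.3]]  # would be learned during training

z = encode(adj, x, weight)   # latent representation, one row per patient
recon = decode(z)            # reconstructed adjacency
loss = mse(adj, recon)
print(f"reconstruction loss: {loss:.4f}")
```

In a real run the weight matrix would be updated by gradient descent on this loss, and the rows of z would be the patient embeddings passed on to the clustering step.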

3. Clustering analysis of the latent representation of the patients.

The latent representation can then be projected via dimensionality reduction (UMAP) and clustered (DBSCAN). A survival analysis is then performed on the clustered patients.
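The density-based clustering step can be illustrated with a compact DBSCAN written in plain Python. This is only a sketch of the algorithm on made-up 2D points standing in for a UMAP projection; the function name, the eps and min_samples values, and the data are assumptions. In practice a library implementation such as scikit-learn's DBSCAN would normally be used.

```python
import math


def dbscan(points, eps, min_samples):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    n = len(points)
    labels = [None] * n  # None means "not visited yet"

    def neighbors(i):
        # A point counts as its own neighbor, as in common implementations.
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


# Made-up 2D embedding: two dense groups of patients and one outlier.
embedding = [
    (0.0, 0.0), (0.0, 0.5), (0.5, 0.0),
    (5.0, 5.0), (5.0, 5.5), (5.5, 5.0),
    (10.0, 0.0),
]
labels = dbscan(embedding, eps=1.0, min_samples=3)
print(labels)
```

The resulting labels assign each patient to a cluster (or to noise, marked as -1), which is the grouping a subsequent survival analysis would compare.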

Getting started

For GPU usage, please check the CUDA distribution (minimum version 10.1) listed in the dependencies and in the requirements at the following links. A Conda environment is preferred. Follow the installation steps for PyTorch (minimum version 1.4.0): Pytorch Installation

Follow the installation steps for PyTorch Geometric (minimum version 1.6): PyG Docs and PyG Installation

The remaining required packages are listed under Dependencies.

Single runs can be executed by running

pytorch_linearVAE.py

Multiple runs with different parameters can be executed by running

run.sh 

The output of the runs can be visualized in TensorBoard (an HTML-based dashboard). For example, from a terminal:

tensorboard --logdir=./Deepan/runs/2021-03-18

Additional

This repository is licensed under the MIT license. The AGE graph clustering implementation can be found under ferdinand-popp/AGE and uses the PyTorch dataset generated by this repository.

Contributors

ferdinand-popp
