anjunatarajan / colorectal-cancer-survival-analysis-msk

This repository contains all machine learning and statistical models used to analyze the landscape of colorectal cancer.

Home Page: https://www.cbioportal.org/study/summary?id=crc_msk_2017

License: GNU General Public License v2.0

R 100.00%
cbioportal cancer-genomics cancer-research cancer machine-learning statistical-analysis colorectal-cancer genomics

colorectal-cancer-survival-analysis-msk's Introduction

Colorectal Cancer Survival Analysis

Description

This R repository is a comprehensive statistical analysis of clinical and genomic attributes from samples collected from 1134 colorectal cancer patients treated at Memorial Sloan Kettering Cancer Center. The data is publicly available through the cBioPortal for Cancer Genomics. In addition to the statistical analyses, machine learning models were built to predict the presence or absence of TP53 gene mutations. This was pursued as a novel study because TP53 mutations were present in 77% of the patients in the cohort and no similar studies have been conducted on TP53. The intent behind the statistical analyses and machine learning models is to enable clinicians caring for cancer patients to provide more targeted, pattern-based treatments and to build a more useful genomic and clinical profile for each patient.

R packages used

The following R packages were instrumental for this project:

readxl class dplyr RSMS ROCR tidyverse
ggplot2 survival survminer lubridate ranger ggfortify

Sample Kaplan-Meier Survival Curves for Univariate Analysis of Clinical Attributes

Median Age Masectomy Sex
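
As a rough illustration of how univariate Kaplan-Meier curves like these can be produced, here is a minimal sketch using only the survival package. It is not the repository's script. As an assumption, it uses the `colon` trial dataset bundled with survival as a stand-in for the MSK cohort, and the `age_group` column is a hypothetical name introduced here for the split at the median age.

```r
library(survival)

# Assumption: the built-in `colon` dataset stands in for the MSK cohort.
# Each patient has two rows; etype == 2 keeps the death records only.
deaths <- subset(colon, etype == 2)

# Hypothetical grouping column: split patients at the median age.
deaths$age_group <- ifelse(deaths$age >= median(deaths$age),
                           "at or above median", "below median")

# Kaplan-Meier estimates of overall survival, one curve per group.
km_age <- survfit(Surv(time, status) ~ age_group, data = deaths)
km_sex <- survfit(Surv(time, status) ~ sex, data = deaths)
print(km_age)
print(km_sex)

# Log-rank test for a difference between the two age-group curves.
print(survdiff(Surv(time, status) ~ age_group, data = deaths))

# Draw the curves when running interactively.
if (interactive()) {
  plot(km_age, col = c("firebrick", "steelblue"),
       xlab = "Days on study", ylab = "Survival probability")
  legend("bottomleft", legend = names(km_age$strata),
         col = c("firebrick", "steelblue"), lty = 1)
}
```

The survminer package listed above offers `ggsurvplot()` for more polished versions of the same plot; base `plot()` is used here only to keep the sketch dependency-free.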

Sample Kaplan-Meier Survival Curves for Multivariate Analysis of Genomic Attributes

Gene Alterations

CoxPH Survival Analysis Box Plots for Comparative, Reference-Based Multivariate Analysis of TP53 Gene Mutations

TP53 Mutations

CoxPH Survival Analysis Box Plots for Comparative, Reference-Based Multivariate Analysis of All Attributes

CoxPH Clinical and Genomic
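
To make the reference-based comparison concrete, the following is a minimal sketch of a multivariate Cox proportional hazards fit. It again assumes the `colon` dataset from the survival package as a stand-in for the MSK data, and the covariates chosen are illustrative, not the ones used in the repository. Each hazard ratio is reported relative to a reference level, which for a factor is its first level.

```r
library(survival)

# Assumption: the built-in `colon` dataset stands in for the MSK cohort.
deaths <- subset(colon, etype == 2)

# Multivariate Cox model. `rx` is a factor whose first level ("Obs") is the
# reference, so the other treatment arms are compared against it.
cox_fit <- coxph(Surv(time, status) ~ rx + age + sex + node4, data = deaths)

# Hazard ratios (exp(coef)) with 95% confidence intervals.
hr <- summary(cox_fit)$conf.int
print(round(hr, 3))

# Test of the proportional hazards assumption based on Schoenfeld residuals.
print(cox.zph(cox_fit))
```

A hazard ratio below 1 indicates lower risk than the reference level, and a confidence interval that excludes 1 suggests the difference is unlikely to be due to chance alone.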

Repository Contents

  • Non-curated public data in Excel format containing clinical and genomic attributes for the patient cohort studied.
  • Curated text file and Excel file for performing pairwise analysis of certain insightful attributes.
  • R code that executes statistical and machine learning methods on the Excel data.

Usage

  • Install RStudio, or use an online tool such as Kaggle, to execute the R code.
  • Download the Excel and text files, and alter the source code to load the data from your local path.
  • Execute the R code and have fun!

It is recommended to gain a basic understanding of survival analysis using Kaplan-Meier and CoxPH models, and of the KNN and GLM machine learning models, before trying out this script.
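
For readers new to the KNN and GLM models mentioned above, here is a small self-contained sketch of binary classification with both, scored by accuracy and AUC. Everything in it is an assumption for illustration: the data is synthetic, and the column names `x1`, `x2`, and `mutated` are hypothetical stand-ins for clinical features and TP53 mutation status. With the real data, the synthetic block would be replaced by something like `readxl::read_excel()` pointed at your local copy of the Excel file.

```r
library(class)  # provides knn()

set.seed(42)

# Synthetic stand-in data (an assumption, not the MSK cohort): two numeric
# features and a binary label playing the role of TP53 mutation status.
n <- 400
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$mutated <- rbinom(n, 1, plogis(1 + 1.5 * dat$x1 - dat$x2))

# 70/30 train/test split.
train_idx <- sample(n, size = round(0.7 * n))
train <- dat[train_idx, ]
test  <- dat[-train_idx, ]

# Logistic regression: a GLM with a binomial family.
glm_fit  <- glm(mutated ~ x1 + x2, data = train, family = binomial)
glm_prob <- predict(glm_fit, newdata = test, type = "response")
glm_acc  <- mean((glm_prob > 0.5) == test$mutated)

# AUC via the rank-sum (Mann-Whitney) identity, so no extra package is needed.
pos <- test$mutated == 1
glm_auc <- (sum(rank(glm_prob)[pos]) - sum(pos) * (sum(pos) + 1) / 2) /
  (sum(pos) * sum(!pos))

# k-nearest neighbours on the same split (k = 5 is an arbitrary choice).
knn_pred <- knn(train[, c("x1", "x2")], test[, c("x1", "x2")],
                cl = factor(train$mutated), k = 5)
knn_acc <- mean(as.character(knn_pred) == as.character(test$mutated))

cat(sprintf("GLM accuracy %.3f, AUC %.3f; KNN accuracy %.3f\n",
            glm_acc, glm_auc, knn_acc))
```

The ROCR package listed above can compute the same AUC and draw the ROC curve; the rank-based formula is used here only to keep the sketch to packages that ship with R.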

Roadmap

Future releases of this script will follow the author's vision:

  • Gain access to more data in order to develop the existing statistical analyses and to improve the accuracy and AUC of the machine learning models.
  • Obtain data on primary tumor samples, along with information on which ones later metastasized. A model trained on such data would be valuable to cancer care clinicians, who could use it for preventive and more targeted treatment by monitoring specific patients and by observing patients with similar prognoses based on how the machine learning models clustered them.
  • Submit the models for experimental use by clinicians providing cancer care, subject to review and approval by bioinformaticians.

Support

Anyone who runs into issues or has questions can open a GitHub issue, and the author will respond within 2 weeks of it being opened.
