c-sclc-prognosisml's Introduction

Project Title: Combined Small Cell Lung Cancer (C-SCLC) Survival Analysis and Prediction

Abstract

Combined Small Cell Lung Cancer (C-SCLC), a rare variant subtype of lung cancer, has distinct characteristics and poses prognostic challenges. Despite its clinical significance, there is a knowledge gap in the diagnosis, treatment, and prognosis of C-SCLC. Using the SEER database, an authoritative source of cancer data in the United States, we applied machine learning techniques to analyze the overall survival (OS) of C-SCLC patients from 2004 to 2020 across multiple staging systems. Our findings offer insight into the survival outcomes of C-SCLC patients and help distinguish this subtype within lung cancer classifications. We present a model that predicts OS for patients with this rare subtype from 2004 to 2015, using the AJCC 6th edition, alongside visualizations and code to reproduce these results. The model achieves 81% recall for high-risk patients (less than 9 months of survival); the key contributing factors are metastasis, chemotherapy, radiation, surgery, and tumor size.
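As a rough illustration of the metric quoted above, the sketch below labels patients as high risk when survival is under 9 months and computes recall for that class. It is not the project's evaluation code: the function names and the sample values are hypothetical and made up for the example, and the notebooks may define the label or compute the metric differently.

```python
HIGH_RISK_MONTHS = 9  # threshold from the abstract: high risk = survival under 9 months


def label_high_risk(survival_months):
    """Return True for each patient who survived fewer than HIGH_RISK_MONTHS."""
    return [months < HIGH_RISK_MONTHS for months in survival_months]


def recall(y_true, y_pred):
    """Recall = true positives / (true positives + false negatives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp + fn == 0:
        raise ValueError("no positive cases in y_true")
    return tp / (tp + fn)


# Made-up illustrative values, not SEER data.
survival_months = [2, 5, 8, 12, 30, 7]
predicted_high_risk = [True, True, False, False, False, True]

y_true = label_high_risk(survival_months)
print(recall(y_true, predicted_high_risk))
```

Recall is the natural metric to report here because a missed high-risk patient (a false negative) is the costly error; it says nothing about false alarms, so it is usually read alongside precision.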

How to Use the Files

  1. Data: The "Data" folder contains:

    • Raw data (as obtained from SEER)
    • Pre-processed data (ready for modeling)
  2. Models: The "Models" folder contains saved models for reproducing results.

  3. Utils: The "utils" folder contains functions used throughout the project.

  4. SEERStat Receipts: The images "data_collection1.png", etc. document data access through SEERStat.

  5. Reproducing the Analysis:

    • Start with "data_cleaning.ipynb" for initial data cleaning.
    • Proceed to "data1_preprocessing.ipynb" for further preprocessing.
    • Finally, use "data1_modeling.ipynb" to run the modeling process.

Important Notes

  • This project focuses on data from 2004-2015 for consistency with the AJCC 6th Edition staging system.
  • Other files in the repository represent exploratory work or tests.
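The year restriction noted above can be sketched as a simple filter. This is only an illustration under stated assumptions: the field name "year_of_diagnosis" and the sample records are hypothetical, and the actual column names in the SEER export and the cleaning notebooks may differ.

```python
def filter_by_diagnosis_year(records, start=2004, end=2015,
                             year_key="year_of_diagnosis"):
    """Keep records whose diagnosis year falls in [start, end], inclusive.

    `year_key` is a hypothetical field name chosen for this example.
    """
    return [r for r in records if start <= int(r[year_key]) <= end]


# Made-up records for illustration only, not SEER data.
sample = [
    {"patient_id": 1, "year_of_diagnosis": 2003},
    {"patient_id": 2, "year_of_diagnosis": 2004},
    {"patient_id": 3, "year_of_diagnosis": 2015},
    {"patient_id": 4, "year_of_diagnosis": 2018},
]

kept = filter_by_diagnosis_year(sample)
print([r["patient_id"] for r in kept])
```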

Disclaimer

This project is for research and analysis purposes only. Do not use the results to make clinical decisions. The data cannot be shared due to privacy restrictions; please create an account with SEER to access it.
