Giter VIP home page Giter VIP logo

alphapatho's Introduction

Build Status License

AlphaPatho - Powerful Computational Pathology Tool

Paper: An end-to-end and well-generalized weakly supervised whole slide image analysis for lung cancer classification using deep learning

Contents

Framework

Ecosystem

Pretrained weights Description
Fold1 Fold1 model weights
Fold2 Fold2 model weights
Fold3 Fold3 model weights
Fold4 Fold4 model weights
Fold5 Fold5 model weights

Why AlphaPatho

  • Weakly supervised learning (No expensive annotations)
  • End-to-end pipeline (Iterative sampling and feature aggregation module)
  • Data efficiency (<200 images, AUCs of >0.9)
  • Promising generalizability (six heterogenous external datasets with AUCs of 0.93-0.97)
  • Overperform state-of-the-art methods

QuickStart

Install

System: Linux(CentOS) Pytorch (version 1.7.0): model training and validation. OpenSlide (version 3.4.1): processing whole slide images. lmdb (version 1.3.0): building lmdb dataset for faster data loading. staintools: following https://github.com/Peter554/StainTools.

$ git clone https://github.com/DeepLaboratory/AlphaPatho.git # install AlphaPatho
$ cd AlphaPatho

Directory description:

├─ data.py                 // dataset code
├─ mean_pixel.pkl		   // mean value of pixel value for data augmentation
├─ model.py				   // model architecture
├─ run.py		           // main code for training and validation
├─ run.sh                  // run.py launcher
├─ test.py                 // main code for inference or evaluation
├─ run_test.sh             // test.py launcher
├─ utils.py                // tool code
├─ tmp.py                  // LMDB dataset building code

Usage

Train from scratch

  • Slide tiling: see methods in our paper. When tiling is done, we should have files like below:

    ├─ slide1_folder
    │  ├─ tile1.png
    │  ├─ tile2.png
    │  ├─ ...
    ├─ slide2_folder
    │  ├─ tile1.png
    │  ├─ tile2.png
    │  ├─ ...
    ├─ ...
    
  • csv for LMDB dataset generation: after slide tiling, we need to generate a csv used for LMDB dataset generation. The csv content is as follow:

    slides_name label
    xxx/slide1_folder 1
    xxx/slide2_folder 0
    ... ... ...
  • Building LMDB dataset

    tmp.py is the main code. We need to modify the csv path and lmdb dataset save directory. Using tmp.py, we can generate the LMDB dataset.

  • Dataset preparation: preparing csv files for 5-fold validation, the format of each csv file is like follows:

    slides_name label
    xxx/slide1_folder 1
    xxx/slide2_folder 0
    ... ... ...

    After cvs preparation, we will get train.csv, val.csv, and test.csv for each fold.

  • Model training: run run.sh and start training.

Model inference or test

Preparing datasets for testing, modifying the parameters in the run_test.sh, and runing run_test.sh for testing!

Contribution

Maintainers

alphapatho's People

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.