The alphapatho from zx333445

AlphaPatho - Powerful Computational Pathology Tool

Paper: An end-to-end and well-generalized weakly supervised whole slide image analysis for lung cancer classification using deep learning

Framework
Ecosystem
QuickStart
- Install
- Usage
Contribution
Maintainers

Framework

Ecosystem

Pretrained weights	Description
Fold1	Fold1 model weights
Fold2	Fold2 model weights
Fold3	Fold3 model weights
Fold4	Fold4 model weights
Fold5	Fold5 model weights

Why AlphaPatho

Weakly supervised learning (No expensive annotations)
End-to-end pipeline (Iterative sampling and feature aggregation module)
Data efficiency (<200 images, AUCs of >0.9)
Promising generalizability (six heterogenous external datasets with AUCs of 0.93-0.97)
Overperform state-of-the-art methods

QuickStart

Install

System: Linux(CentOS) Pytorch (version 1.7.0): model training and validation. OpenSlide (version 3.4.1): processing whole slide images. lmdb (version 1.3.0): building lmdb dataset for faster data loading. staintools: following https://github.com/Peter554/StainTools.

$ git clone https://github.com/DeepLaboratory/AlphaPatho.git # install AlphaPatho
$ cd AlphaPatho

Directory description:

├─ data.py                 // dataset code
├─ mean_pixel.pkl		   // mean value of pixel value for data augmentation
├─ model.py				   // model architecture
├─ run.py		           // main code for training and validation
├─ run.sh                  // run.py launcher
├─ test.py                 // main code for inference or evaluation
├─ run_test.sh             // test.py launcher
├─ utils.py                // tool code
├─ tmp.py                  // LMDB dataset building code

Usage

Train from scratch

Slide tiling: see methods in our paper. When tiling is done, we should have files like below:

├─ slide1_folder
│  ├─ tile1.png
│  ├─ tile2.png
│  ├─ ...
├─ slide2_folder
│  ├─ tile1.png
│  ├─ tile2.png
│  ├─ ...
├─ ...

csv for LMDB dataset generation: after slide tiling, we need to generate a csv used for LMDB dataset generation. The csv content is as follow:

slides_name label

xxx/slide1_folder 1

xxx/slide2_folder 0

... ... ...
Building LMDB dataset

tmp.py is the main code. We need to modify the csv path and lmdb dataset save directory. Using tmp.py, we can generate the LMDB dataset.
Dataset preparation: preparing csv files for 5-fold validation, the format of each csv file is like follows:

slides_name label

xxx/slide1_folder 1

xxx/slide2_folder 0

... ... ...

After cvs preparation, we will get train.csv, val.csv, and test.csv for each fold.
Model training: run run.sh and start training.

slides_name	label
xxx/slide1_folder	1
xxx/slide2_folder	0
... ...	...

slides_name	label
xxx/slide1_folder	1
xxx/slide2_folder	0
... ...	...

Model inference or test

Preparing datasets for testing, modifying the parameters in the run_test.sh, and runing run_test.sh for testing!

Contribution

Maintainers

[raycaohmu]https://github.com/raycaohmu)

zx333445 / alphapatho Goto Github PK

alphapatho's Introduction

AlphaPatho - Powerful Computational Pathology Tool

Contents

Framework

Ecosystem

Why AlphaPatho

QuickStart

Install

Usage

Train from scratch

Model inference or test

Contribution

Maintainers

alphapatho's People

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent