
Self-Supervised driven Consistency Training for Annotation Efficient Histopathology Image Analysis

Overview

We propose a self-supervised driven consistency training paradigm for histopathology image analysis that learns to leverage both task-agnostic and task-specific unlabeled data based on two strategies:

  1. A self-supervised pretext task that harnesses the underlying multi-resolution contextual cues in histology whole-slide images (WSIs) to learn a powerful supervisory signal for unsupervised representation learning.

  2. A new teacher-student semi-supervised consistency paradigm that learns to effectively transfer the pretrained representations to downstream tasks based on prediction consistency with the task-specific unlabeled data.

We carry out extensive validation experiments on three histopathology benchmark datasets across two classification tasks and one regression task, i.e., tumor metastasis detection (Breast), tissue type classification (Colorectal), and tumor cellularity quantification (Breast). We compare against state-of-the-art self-supervised pretraining methods based on generative and contrastive learning techniques: the Variational Autoencoder (VAE) and Momentum Contrast (MoCo), respectively.

1. Self-Supervised pretext task

2. Consistency training

Results

  • Predicted tumor cellularity (TC) scores on BreastPathQ test set for 10% labeled data


  • Predicted tumor probability on Camelyon16 test set for 10% labeled data

Prerequisites

Core implementation:

  • Python 3.7+
  • PyTorch 1.7+
  • openslide-python 1.1+
  • Albumentations 1.8+
  • scikit-image 0.15+
  • scikit-learn 0.22+
  • Matplotlib 3.2+
  • SciPy, NumPy (any version)

Additional packages can be installed via:

pip install -r requirements.txt
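
For reference, a requirements.txt consistent with the versions listed above might look like the following; the exact pins are illustrative and may differ from the file shipped with the repository:

torch>=1.7
openslide-python>=1.1
albumentations>=1.8
scikit-image>=0.15
scikit-learn>=0.22
matplotlib>=3.2
scipy
numpy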

Datasets

Training

Model training proceeds in three stages:

  1. Task-agnostic self-supervised pretext task (i.e., the proposed Resolution sequence prediction (RSP) task)
  2. Task-specific supervised fine-tuning (SSL)
  3. Task-specific teacher-student consistency training (SSL_CR)

1. Self-supervised pretext task: Resolution sequence prediction (RSP) in WSIs

From the file "pretrain_BreastPathQ.py / pretrain_Camelyon16.py", you can pretrain the network (ResNet18) for predicting the resolution sequence ordering in WSIs on BreastPathQ & Camelyon16 dataset, respectively. This can be easily adapted to any other dataset of choice.

  • The resolution levels used for the RSP task can also be set in dataset.py#L277 when pretraining on other datasets.
  • The argument --train_image_pth is the only required argument and should be set to the directory containing your training WSIs. There are many more arguments that can be set, and these are all explained in the corresponding files.
python pretrain_BreastPathQ.py    // Pretraining on BreastPathQ   
python pretrain_Camelyon16.py    // Pretraining on Camelyon16
  • We also provide pretrained models for BreastPathQ and Camelyon16 in the "Pretrained_models" folder. These models can also be used to study feature transferability (domain adaptation) between datasets with different tissue types/organs.
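
The snippet below is a rough, self-contained sketch of how a resolution sequence prediction task can be set up with openslide and PyTorch. It is not the repository's exact implementation; all function, class, and argument names are illustrative.

# Rough sketch of a resolution sequence prediction (RSP) style pretext task.
# All names and details here are illustrative, not the repository's exact code.
import itertools
import random
import numpy as np
import openslide
import torch
import torch.nn as nn
import torchvision.models as models

LEVELS = (2, 1, 0)                                          # WSI pyramid levels, coarse -> fine
ORDERS = list(itertools.permutations(range(len(LEVELS))))   # 6 possible orderings

def read_sequence(slide_path, coord, size=256):
    """Read the same level-0 location at each resolution level (coarse to fine)."""
    slide = openslide.OpenSlide(slide_path)
    patches = []
    for level in LEVELS:
        img = slide.read_region(coord, level, (size, size)).convert("RGB")
        arr = np.asarray(img, dtype=np.float32).transpose(2, 0, 1) / 255.0
        patches.append(torch.from_numpy(arr))
    slide.close()
    return patches                                          # list of (3, H, W) tensors

def make_rsp_example(patches):
    """Shuffle the resolution sequence and return (patch stack, ordering label)."""
    label = random.randrange(len(ORDERS))
    seq = torch.stack([patches[i] for i in ORDERS[label]])  # (seq_len, 3, H, W)
    return seq, label

class RSPNet(nn.Module):
    """Shared ResNet18 encoder per patch; a linear head predicts the ordering."""
    def __init__(self):
        super().__init__()
        self.encoder = models.resnet18(num_classes=512)
        self.head = nn.Linear(512 * len(LEVELS), len(ORDERS))

    def forward(self, seq):                                 # seq: (B, seq_len, 3, H, W)
        b, s = seq.shape[:2]
        feats = self.encoder(seq.flatten(0, 1))             # (B * seq_len, 512)
        return self.head(feats.reshape(b, s * 512))         # (B, num_orderings)

# Training reduces to cross-entropy between the predicted and true orderings.
criterion = nn.CrossEntropyLoss()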

2. Task-specific supervised fine-tuning on the downstream task

From the file "eval_BreastPathQ_SSL.py / eval_Camelyon_SSL.py / eval_Kather_SSL.py", you can fine-tune the network (i.e., task-specific supervised fine-tuning) on the downstream task with limited label data (10%, 25%, 50%). Refer to, paper for more details.

  • Arguments: --model_path - path to the self-supervised pretrained model (i.e., the trained model from Step 1). Other arguments can be set in the corresponding files.
python eval_BreastPathQ_SSL.py  // Supervised fine-tuning on BreastPathQ   
python eval_Camelyon_SSL.py    // Supervised fine-tuning on Camelyon16
python eval_Kather_SSL.py    // Supervised fine-tuning on Kather dataset (Colorectal)

Note: we did not perform self-supervised pretraining on the Kather (Colorectal) dataset because its WSIs are unavailable. Instead, we performed domain adaptation by pretraining on Camelyon16 and fine-tuning on the Kather dataset. Refer to the paper for more details.
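
As a rough illustration, fine-tuning amounts to loading the pretrained backbone weights and training a task-specific head on the small labeled subset. The checkpoint path, state-dict key names, and head size below are assumptions, not the repository's exact code.

# Rough sketch of supervised fine-tuning from a self-supervised checkpoint.
# The checkpoint path, key names, and head size are assumptions.
import torch
import torch.nn as nn
import torchvision.models as models

model = models.resnet18(num_classes=2)        # e.g., tumor vs. normal for Camelyon16
state = torch.load("Pretrained_models/camelyon16_pretrained.pt", map_location="cpu")
# Drop the pretext-task head; its shape does not match the downstream head.
backbone = {k: v for k, v in state.items() if not k.startswith("fc.")}
model.load_state_dict(backbone, strict=False)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()             # a regression loss (e.g., MSE) for BreastPathQ

def finetune_step(images, labels):
    """One supervised update on a batch from the limited labeled subset."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()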

3. Task-specific teacher-student consistency training on the downstream task

From the file "eval_BreastPathQ_SSL_CR.py / eval_Camelyon_SSL_CR.py / eval_Kather_SSL_CR.py", you can fine-tune the student network by keeping the teacher network frozen via task-specific consistency training on the downstream task with limited label data (10%, 25%, 50%). Refer to, paper for more details.

  • Arguments: --model_path_finetune - path to the SSL fine-tuned model (i.e., the model from Step 2: self-supervised pretraining followed by supervised fine-tuning), used to initialize the teacher and student networks for consistency training. Other arguments can be set in the corresponding files.
python eval_BreastPathQ_SSL_CR.py  // Consistency training on BreastPathQ   
python eval_Camelyon_SSL_CR.py    // Consistency training on Camelyon16
python eval_Kather_SSL_CR.py    // Consistency training on Kather dataset (Colorectal)
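
The sketch below shows one possible consistency-training update: the frozen teacher provides targets on unlabeled patches while the student is trained on a supervised loss plus a consistency loss. The loss weighting and the choice of augmented views are assumptions, not the repository's exact code.

# Rough sketch of a teacher-student consistency update with a frozen teacher.
# Loss weighting and augmentation details are assumptions.
import copy
import torch
import torch.nn.functional as F

def init_teacher_student(finetuned_model):
    """Both networks start from the Step-2 fine-tuned model; the teacher is frozen."""
    teacher = copy.deepcopy(finetuned_model).eval()
    for p in teacher.parameters():
        p.requires_grad_(False)
    return teacher, finetuned_model               # (frozen teacher, trainable student)

def consistency_step(student, teacher, labeled_batch, unlabeled_views, optimizer, lam=1.0):
    images, labels = labeled_batch
    weak, strong = unlabeled_views                # two augmented views of the same unlabeled patches

    with torch.no_grad():
        targets = teacher(weak)                   # frozen teacher predictions as targets

    sup_loss = F.cross_entropy(student(images), labels)
    cons_loss = F.mse_loss(student(strong), targets)   # prediction consistency on unlabeled data

    loss = sup_loss + lam * cons_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()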

Testing

The test performance is validated at two stages:

  1. Self-supervised pretraining followed by supervised fine-tuning
  • From the files "eval_BreastPathQ_SSL.py / eval_Kather_SSL.py", you can test the model by setting the argument '--mode' to 'evaluation'.
  2. Consistency training
  • From the files "eval_BreastPathQ_SSL_CR.py / eval_Kather_SSL_CR.py", you can test the model by setting the argument '--mode' to 'evaluation'.

Predictions on the Camelyon16 test set can be generated with the "test_Camelyon16.py" file.
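
For example (the '--mode' flag follows the files above; any checkpoint-path arguments are set in the corresponding files):

python eval_BreastPathQ_SSL.py --mode evaluation       // test after supervised fine-tuning (Step 2)
python eval_BreastPathQ_SSL_CR.py --mode evaluation    // test after consistency training (Step 3)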

Citation

If you use significant portions of our code or ideas from our paper in your research, please cite our work:

@article{srinidhi2021self,
  title={Self-supervised driven consistency training for annotation efficient histopathology image analysis},
  author={Srinidhi, Chetan L and Kim, Seung Wook and Chen, Fu-Der and Martel, Anne L},
  journal={arXiv preprint arXiv:2102.03897},
  year={2021}
}

Acknowledgements

We would like to acknowledge the use of Compute Canada facilities for our computing resources. This work was funded by the Canadian Cancer Society (grant #705772), the National Cancer Institute of the National Institutes of Health (grant #U24CA199374-01), and the Canadian Institutes of Health Research.

Questions or Comments

Please direct any questions or comments to me; I am happy to help in any way I can. You can email me directly at [email protected].
