Reproducibility study: Unsupervised Learning for Cell-Level Visual Representation in Histopathology Images With Generative Adversarial Networks
This repository contains the implementation of our reproducibility study of the paper Unsupervised Learning for Cell-level Visual Representation in Histopathology Images with Generative Adversarial Networks by Bo Hu♯, Ye Tang♯, Eric I-Chao Chang, Yubo Fan, Maode Lai and Yan Xu* (* corresponding author; ♯ equal contribution), available on arXiv and IEEE.
The original paper's code can be found in the nu-gan repository.
To install requirements:
conda create -n myenv python=3.8.5
conda activate myenv
conda install -c anaconda pip
# install PyTorch - https://pytorch.org/get-started/locally/
conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.1 -c pytorch
pip install -r requirements.txt
To download datasets:
- Extract all data to the desired path: Datasets
Dataset A is labeled; the new dataset is unlabeled.
To train the model(s) in the paper, select a task with the --task flag. Make sure the datasets are extracted in the current directory (cd path_to_datasets).
python /path_to_utils/nu_gan.py --task 'cell_representation'
- To change the number of clusters:
python /path_to_utils/nu_gan.py --task 'cell_representation' --dis_category 4
The default number of clusters (dis_category) is 5.
python /path_to_utils/nu_gan.py --task 'cell_representation_unlabeled'
- To change the number of clusters:
python /path_to_utils/nu_gan.py --task 'cell_representation_unlabeled' --dis_category 4
The default number of clusters (dis_category) is 5.
Make sure the current directory has the experiment folders (cd path_to_experiments).
python /path_to_utils/nu_gan.py --task 'cell_classification' --experiment_id 123456
experiment_id identifies the cell-level clustering experiment to use for cell classification. It can be retrieved from the name of the output folder produced by cell_representation or cell_representation_unlabeled.
Other hyperparameters that don't have dedicated flags can be changed in nu_gan.py.
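The flags described above can be pictured with a small argparse sketch. This is not the code in nu_gan.py: the flag names, the three task names and the default of 5 clusters come from this README, while the function names, help strings and the rule that cell_classification requires --experiment_id are assumptions made for illustration.

```python
import argparse

# Task names as they appear in the commands in this README.
TASKS = (
    "cell_representation",
    "cell_representation_unlabeled",
    "cell_classification",
)


def build_parser():
    """Build a parser mirroring the flags documented in this README (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Sketch of the nu_gan.py flags")
    parser.add_argument("--task", choices=TASKS, required=True,
                        help="which task to run")
    parser.add_argument("--dis_category", type=int, default=5,
                        help="number of clusters (default: 5)")
    parser.add_argument("--experiment_id", type=str, default=None,
                        help="clustering experiment to reuse for classification")
    return parser


def parse_args(argv=None):
    """Parse argv and apply one assumed consistency check."""
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumption: classification cannot run without a clustering experiment.
    if args.task == "cell_classification" and args.experiment_id is None:
        parser.error("--experiment_id is required for cell_classification")
    return args


if __name__ == "__main__":
    print(parse_args(["--task", "cell_representation", "--dis_category", "4"]))
```

Keeping experiment_id as a string avoids losing leading zeros if folder names ever contain them; whether the real script does the same is not stated here.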
To evaluate the model on labeled data:
python /path_to_utils/eval.py --model_path "path_to_models" --dis_category 5
The default number of clusters (dis_category) is 5.
You can download pretrained models here:
- Unsupervised Cell-level Clustering on labeled dataset, trained on the labeled Dataset A.
- Unsupervised Cell-level Clustering on unlabeled dataset, trained on the unlabeled new dataset.
Our model achieves the following performance:
| | Purity | Entropy | F-score |
---|---|---|---|
Reproduction | 0.803 | 0.914 | 0.810 |
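Purity and entropy compare cluster assignments against ground-truth labels. The following is a minimal sketch of the standard definitions, not the code in eval.py: the function name is hypothetical, and the base-2 logarithm and the absence of any normalization are assumptions that may differ from how the numbers above were produced.

```python
import math
from collections import Counter, defaultdict


def cluster_purity_entropy(cluster_ids, labels):
    """Return (purity, entropy) of a clustering against ground-truth labels.

    purity  = fraction of samples that carry the majority label of their cluster
    entropy = size-weighted average of each cluster's label entropy (in bits)
    """
    if len(cluster_ids) != len(labels) or len(labels) == 0:
        raise ValueError("cluster_ids and labels must be non-empty and equal in length")

    # Count how often each label occurs inside each cluster.
    per_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, labels):
        per_cluster[cluster][label] += 1

    n = len(labels)
    purity = sum(max(counts.values()) for counts in per_cluster.values()) / n

    entropy = 0.0
    for counts in per_cluster.values():
        size = sum(counts.values())
        # Label entropy within this cluster; zero when the cluster is pure.
        h = -sum((k / size) * math.log2(k / size) for k in counts.values())
        entropy += (size / n) * h
    return purity, entropy


if __name__ == "__main__":
    clusters = [0, 0, 0, 1, 1, 1]
    truth = ["a", "a", "b", "b", "b", "b"]
    print(cluster_purity_entropy(clusters, truth))
```

Under these definitions higher purity and lower entropy both indicate clusters that align better with the labels.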
Clusters | V(D, G) | Lq |
---|---|---|
5 | -6.08 | 0.031 |
Where V(D, G) is the value function of the discriminator and generator networks and Lq is the loss of the auxiliary network.
Precision | Recall | F-score |
---|---|---|
0.908 | 0.895 | 0.899 |
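The F-score in this table is slightly below the harmonic mean of the listed precision and recall (about 0.901), which would be consistent with averaging per-class scores, although this README does not say which averaging was used. The sketch below shows macro averaging under that assumption; the function name is hypothetical and this is not the repository's evaluation code.

```python
def macro_precision_recall_f1(y_true, y_pred):
    """Return macro-averaged (precision, recall, F1) over all observed classes."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    classes = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        # Scores default to 0.0 when their denominator would be zero.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    k = len(classes)
    return sum(precisions) / k, sum(recalls) / k, sum(f1s) / k


if __name__ == "__main__":
    print(macro_precision_recall_f1([0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 1, 1]))
```

With macro averaging, the reported F-score is the mean of per-class F-scores, so it need not equal the harmonic mean of the averaged precision and recall.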
If you'd like to contribute, or have any suggestions, you can contact us at [email protected] or open an issue on this GitHub repository.
All contributions welcome! All content in this repository is licensed under the MIT license.