
scdino's Introduction

Self-Supervised Vision Transformers for multi-channel single-cell images

Application of DINO to automated microscopy-derived fluorescent imaging datasets of single cells, with instructions on how to run subsequent downstream analyses of non-RGB multi-channel images using trained Vision Transformers (ViTs). See Emerging Properties in Self-Supervised Vision Transformers for the original DINO implementation and Self-supervised vision transformers accurately decode cellular state heterogeneity for the adaptation described here. [DINO arXiv] [scDINO bioRxiv]

DINO illustration

Check out our recent publication, Cellular Architecture Shapes the Naïve T Cell Response, in Science. We used scDINO to identify distinct T cell phenotypes by examining over 30,000 single-cell crops of CD4 and CD8 T cells from healthy donors. We trained ViT-S/16 models exclusively on CD3 single-channel images; downstream analysis of the phenotypic heterogeneity was performed by clustering the CLS token latent space and visualizing it with the TopOMetry framework [Science].

Science Sfig1A

A further demonstration of the usefulness of the DINO framework for image-based biological discovery is presented in the preprint Unbiased single-cell morphology with self-supervised vision transformers. This work shows that self-supervised vision transformers can encode cellular morphology at various scales, from subcellular to multicellular [bioRxiv].

This codebase provides:

  • Workflow to run analyses of multi-channel image datasets (non-RGB) with publicly available self-supervised Vision Transformers (DINO-ss-ViTs) from [DINO arXiv] and with scDINO (scDINO-ss-ViTs) introduced in our paper [scDINO bioRxiv]
  • Workflow to train ViTs on multi-channel single-cell images generated by automated microscopy using scDINO and subsequently run downstream analyses

Pretrained models

Publicly available ss-ViTs pretrained on ImageNet with DINO

This table is adapted from the official DINO repository. You can choose to download either the weights of the pretrained backbone used for downstream tasks, or the full checkpoint containing backbone and projection-head weights for both the student and teacher networks. Detailed arguments and training/evaluation logs are provided. Note that the names DeiT-S and ViT-S refer to the same architecture.

arch             | download
DINO-ss-ViT-S/16 | backbone only · full ckpt · args · logs
DINO-ss-ViT-S/8  | backbone only · full ckpt · args · logs
DINO-ss-ViT-B/16 | backbone only · full ckpt · args · logs
DINO-ss-ViT-B/8  | backbone only · full ckpt · args · logs

scDINO ss-ViTs pretrained on high-content imaging data of single immune cells

Here you can download the pretrained single-cell DINO (scDINO) ss-ViTs used in our article [scDINO bioRxiv]. The ViTs are pretrained on the Deep phenotyping PBMC Image Set of Y. Severin, a high-content imaging dataset containing labeled single-cell images of 8 different immune cell classes from multiple healthy donors. We provide the scDINO-ss-ViT-S/16 full checkpoint trained for 100 epochs.

arch               | download
scDINO-ss-ViT-S/16 | full ckpt · args · logs
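
The full checkpoint is a standard PyTorch .pth file. As a minimal sketch (assuming the scDINO checkpoint follows the official DINO checkpoint layout, with 'student' and 'teacher' entries whose keys may carry 'module.' and 'backbone.' prefixes; the filename below is hypothetical), the teacher backbone weights can be inspected and extracted like this:

import torch

# Inspect a DINO-style full checkpoint and pull out the teacher backbone weights.
# Filename and key layout are assumptions based on the official DINO checkpoints.
ckpt = torch.load("scdino_vits16_full_ckpt.pth", map_location="cpu")
print(ckpt.keys())  # typically includes 'student', 'teacher', 'optimizer', 'epoch', ...

teacher_backbone = {
    k.replace("module.", "").replace("backbone.", ""): v
    for k, v in ckpt["teacher"].items()
    if "backbone." in k
}
print(len(teacher_backbone), "backbone tensors")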

Requirements

This codebase was developed on a Linux machine with Python 3.8, snakemake 7.20.0, torch 1.8.1 and torchvision 0.9.1, and on an HPC cluster running the SLURM workload manager. All required Python packages and their corresponding versions for this setup are listed in the requirements.txt file.

Analyse non-RGB multi-channel images with pretrained ViTs

In Figure 1 of our manuscript [scDINO bioRxiv] we show how DINO-ss-ViTs can be applied to decipher stem cell heterogeneity using single-cell images derived from high-content imaging. These single-cell images are not RGB-based, but are composed of several separate microscopy-derived greyscale images combined into one multi-channel TIFF image. To use these multi-channel input images with ViTs, we load the values of a TIFF input file as a multidimensional PyTorch tensor in the Multichannel_dataset(datasets.ImageFolder) class in compute_CLS_features.py, which constructs the PyTorch dataset object.
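
The following is a minimal sketch of this idea (an illustrative approximation, not the repository's exact implementation; see pyscripts/compute_CLS_features.py for that): an ImageFolder subclass that reads multi-channel TIFFs with tifffile and returns channels-first float tensors.

import numpy as np
import tifffile
import torch
from torchvision import datasets


class MultichannelTiffFolder(datasets.ImageFolder):
    """ImageFolder variant that yields (C, H, W) float tensors from multi-channel TIFFs."""

    def __getitem__(self, index):
        path, label = self.samples[index]
        image = tifffile.imread(path).astype(np.float32)   # all channels of the TIFF
        tensor = torch.from_numpy(image)
        # Heuristic: treat a small trailing dimension as the channel axis (channels-last TIFF)
        if tensor.ndim == 3 and tensor.shape[-1] < tensor.shape[0]:
            tensor = tensor.permute(2, 0, 1)                # -> (C, H, W)
        return tensor, label


# Usage (hypothetical path): each sample is a multi-channel tensor plus its class index
dataset = MultichannelTiffFolder("path/to/single_cell_crops")
image_tensor, class_idx = dataset[0]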

analysis plots

Run all 3 analyses at once

To send a job to the SLURM cluster that computes the CLS token representations, visualises their embeddings using UMAP and generates example attention images all at once, use the only_downstream_snakefile snakemake file and the only_downstream_analyses.yaml configuration file.

Example submission:

snakemake -s only_downstream_snakefile all \
--configfile="configs/only_downstream_analyses.yaml" \
--keep-incomplete \
--drop-metadata \
--keep-going \
--cores 8 \
--jobs 40 \
--cluster "sbatch --time=01:00:00 \
--gpus=1 \
-n 8 \
--mem-per-cpu=9000 \
--output=slurm_output_evaluate.txt \
--error=slurm_error_evaluate.txt" \
--latency-wait 45

All configurations and parameters (metadata and hyperparameters) of the job can be set in the only_downstream_analyses.yaml file. The results will be saved in the output_dir folder. Instead of running all 3 analyses at once, you can also run them separately by specifying the target rule in the snakemake command.

Compute [CLS] Token representations

The representation of an image is given by the output of the [CLS] token in the form of a numeric vector with dimensionality d = 384 for ViT-S and d = 768 for ViT-B. To compute a [CLS] feature space for a given dataset, prepare the configuration variables in the downstream_analyses: subsection of only_downstream_analyses.yaml.

To learn more about the args in the configuration file for the computation of the features, run:

python pyscripts/compute_CLS_features.py --help
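
Conceptually, the feature computation is a forward pass through the ViT backbone, which returns one [CLS] vector per image. Below is a minimal sketch using the public 3-channel DINO-ss-ViT-S/16 from torch.hub for illustration; the repository's script additionally handles multi-channel inputs and the configuration options listed above.

import torch

model = torch.hub.load("facebookresearch/dino:main", "dino_vits16")
model.eval()

images = torch.randn(8, 3, 224, 224)    # stand-in batch; real crops come from the dataset object
with torch.no_grad():
    cls_features = model(images)        # the forward pass returns the [CLS] token embedding
print(cls_features.shape)               # torch.Size([8, 384]) for ViT-S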

Visualise CLS token space using UMAP

To get a glimpse of the feature space, we can use the UMAP algorithm to project multidimensional vectors into a 2D embedding. The UMAP parameters can be adjusted in the downstream_analyses: umap_eval: subsection of the config file.
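
For example, with the [CLS] features saved as a NumPy array, a 2D embedding can be produced with umap-learn along the following lines (file paths are hypothetical; the parameters actually used are those set in the umap_eval: subsection):

import numpy as np
import umap
import matplotlib.pyplot as plt

features = np.load("CLS_features.npy")   # (n_cells, 384) CLS vectors, hypothetical path
labels = np.load("labels.npy")           # (n_cells,) integer class labels, hypothetical path

embedding = umap.UMAP(n_neighbors=15, min_dist=0.1, metric="euclidean").fit_transform(features)

plt.scatter(embedding[:, 0], embedding[:, 1], c=labels, s=2, cmap="tab10")
plt.xlabel("UMAP 1")
plt.ylabel("UMAP 2")
plt.savefig("cls_umap.png", dpi=300)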

Run k-NN evaluation

To quantitatively evaluate label-specific clustering, we can run a k-NN evaluation to get a global clustering score across classes. The kNN parameters can be adjusted in the configuration file in the downstream_analyses: kNN: subsection.
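
As an illustration of the idea (a simplified scikit-learn version, not the repository's exact implementation), a k-NN classifier fitted on a training split of the CLS features yields a single accuracy score summarising how well the classes separate in the feature space:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

features = np.load("CLS_features.npy")   # hypothetical paths, as above
labels = np.load("labels.npy")

X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, stratify=labels, random_state=0
)

knn = KNeighborsClassifier(n_neighbors=20, weights="distance")
knn.fit(X_train, y_train)
print("k-NN accuracy:", knn.score(X_test, y_test))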

Visualisation of the CLS Token-based Self-Attention Mechanism of ss-ViTs

DINO illustration

To visualise the CLS token-based self-attention of the ss-ViTs, attention maps can be generated for each image class. Our default settings randomly pick 1 image per image class in the given dataset. The attention maps are saved in the attention_maps subfolder of the output_dir in the results folder, with each attention head saved as a separate image. Additionally, each channel of the original multi-channel input image is saved as a separate image.
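
A minimal sketch of how such attention maps can be produced, assuming a DINO-style ViT that exposes get_last_selfattention() as in the official DINO code (the 3-channel hub model is used here purely for illustration):

import torch
import torch.nn.functional as F
import matplotlib.pyplot as plt

model = torch.hub.load("facebookresearch/dino:main", "dino_vits16")
model.eval()

img = torch.randn(1, 3, 224, 224)              # stand-in image; real crops come from the dataset
with torch.no_grad():
    attn = model.get_last_selfattention(img)   # (1, n_heads, n_tokens, n_tokens)

patch = 16
h_feat, w_feat = img.shape[-2] // patch, img.shape[-1] // patch
n_heads = attn.shape[1]
# Attention of the [CLS] token (query index 0) to all patch tokens, one map per head
cls_attn = attn[0, :, 0, 1:].reshape(n_heads, h_feat, w_feat)
cls_attn = F.interpolate(cls_attn.unsqueeze(0), scale_factor=patch, mode="nearest")[0]

for head in range(n_heads):
    plt.imsave(f"attention_head_{head}.png", cls_attn[head].numpy())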

scDINO training and evaluation on greyscale multi-channel images

immunocell_plot

To train your own Vision Transformers on a given dataset from scratch and subsequently evaluate them on downstream tasks (with an automatic train and test split), use the full_pipeline_snakefile and the scDINO_full_pipeline.yaml configuration file.

Example submission:

snakemake -s full_pipeline_snakefile all \
--configfile="configs/scDINO_full_pipeline.yaml" \
--keep-incomplete \
--drop-metadata \
--cores 8 \
--jobs 40 \
-k \
--cluster "sbatch --time=04:00:00 \
--gpus=2 \
-n 8 \
--mem-per-cpu=9000 \
--output=slurm_output.txt \
--error=slurm_error.txt" \
--latency-wait 45

To reproduce the scDINO-ss-ViT-S/16 used in our manuscript, download the Deep phenotyping PBMC Image Set of Y. Severin and set the path to the dataset in the config file under dataset_dir.

License

This repository is released under the Apache 2.0 license. You can find more information on this in the LICENSE file.

Citation

If you find this adaptation useful for your research, please consider citing us:

@article {Pfaendler2023.01.16.524226,
	author = {Pfaendler, Ramon and Hanimann, Jacob and Lee, Sohyon and Snijder, Berend},
	title = {Self-supervised vision transformers accurately decode cellular state heterogeneity},
	year = {2023},
	doi = {10.1101/2023.01.16.524226},
	URL = {https://www.biorxiv.org/content/early/2023/01/18/2023.01.16.524226},
	eprint = {https://www.biorxiv.org/content/early/2023/01/18/2023.01.16.524226.full.pdf},
	journal = {bioRxiv}
}

scdino's People

Contributors

dineshpalli, jacobhanimann, mikelippincott, pfaendler


scdino's Issues

Question about downstream run yaml config file

Hi Jacob,

First, thank you for your work on scDINO. I have been using it for some of my analysis looking into cell death!

I had some questions about the parameters defined in the only_downstream_analyses.yaml config file.
I might have missed it, but I am having trouble understanding what all of the parameters are for and how to define them.
If I could get some clarification that would be amazing!
Some params in particular that I do not quite understand:

  • selected_channel_combination_per_run
    • The default is ["01234", "0", "1", "2", "3", "4"]. Does this mean certain channels are also run through the model on their own?
  • norm_per_channel
    • How would one go about calculating these normalization values?
  • custom_embedding_map (the default is "{0:2, 1:2, 2:2, 3:2, 4:2}")
    • Does using this embedding map send channels 0-4 to one embedding space? Is this a correct interpretation?

If additional documentation should be added, I can help with a pull request.

Thank you for your help and for creating this amazing tool and model.

Best,
Mike

How do I use mean_and_std_of_dataset.txt without running the full pipeline?

Thank you so much for writing scDINO; it will be of great use to my work on cell morphology! I am currently trying to run the downstream pipeline using the pre-trained scDINO model, and I am unsure which file to include for "mean_std_file_location". I noticed that a mean_and_std_of_dataset.txt file appears to be an output of the full pipeline. Is there a way to access this file without running the full pipeline? I could also set "parse_mean_std_from_file" to False while computing the CLS features, but I am unsure how or if this would affect the overall function. Thank you so much for any insights.

CLS features all NaN when passing in 4-channel images

I am using scDINO on my image sets. I have four-channel fluorescence images. To make my images compatible with the pre-trained model, I added a 5th channel of all zeros.

This is an example of how I make my blank channel(s), where image is the image loaded via the tifffile Python package:

import numpy as np

image_merge = image
if image_merge.shape[-1] < 5:
    channels_to_add = 5 - image_merge.shape[-1]
    for channel in range(channels_to_add):
        # add a new channel of all zeros
        new_channel = np.zeros((image_merge.shape[0], image_merge.shape[1], 1))
        image_merge = np.concatenate((image_merge, new_channel), axis=-1)
print(image_merge.shape)

The issue I am having in scDINO seems to arise in the normalize_numpy_0_to_1 & normalize_tensor_per_channel functions in the pyscripts/utils.py file.
When a channel has a uniform distribution of pixel values (e.g. all zeros), the normalized array values in that channel become NaN due to the division by zero that occurs on lines 78 & 108.

I have a proposed change that would help scDINO handle multi-channel images that contain uniformly distributed channels.
If the maintainers of scDINO agree with the proposed change I can happily open a PR from this issue!
The proposed change is below.

def normalize_numpy_0_to_1(x):
    print("x",x.shape)
    x_min = x.min(axis=(0,1), keepdims=True)
    x_max = x.max(axis=(0,1), keepdims=True)
    diff_min_max = x_max - x_min
    if check_nan(diff_min_max):
        print("diff_min_max is nan")
    if check_nan(x-x_min):
        print("x-x_min is nan:")
    if check_nan(x):
        print("x contains nan before normalization")
    if check_zero(diff_min_max):
        print("diff_min_max has zero")
        print("x_max",x_max)
        print("x_min",x_min)
        print("diff_min_max",diff_min_max)
        print("x",x.shape)
        # replace x_max 0 values with 1
        for i in range(len(x_max[0][0])):
            if x_max[0][0][i] == 0:
                x_max[0][0][i] = 1
    x = (x - x_min)/(x_max-x_min)
    if check_nan(x):
        print("x contains nan after normalization")
    return x

Excellent work on scDINO and implementations! I would love to talk more in the future!

reason: Missing output files: mean_and_std_of_dataset.txt

Hello, super interesting work! I would like to train it on my cell dataset, but I am not familiar with snakemake. I tried to pretrain the model using the command shared in the README.md and got this error (missing output files: mean_and_std_of_dataset.txt). Could you please share some recommendations on how to solve this and obtain the mean and std of my dataset? Thanks in advance!

Calculation of train and val indices in mean_std_dataset.py

Hi, thank you so much for writing scDINO!

I am running the full pipeline, including calculating the mean and std of my training dataset before training the model. I think the train and val indices might be defined the wrong way around in mean_std_dataset.py. Currently the train and val indices are defined as train_indices, val_indices = indices[split:], indices[:split], where split = int(numpy.floor(validation_split * dataset_size)) and validation_split = 1 - float(snakemake.params["fraction_for_mean_std"]). The default value of snakemake.params["fraction_for_mean_std"] is currently 0.2.

This means that split represents 80 % of my full dataset size, so train_indices corresponds to the final 20 % of indices. With a dataset of 1200000 images, I get this output:

length of dataset: 1200000
240000 960000
Train dataset consists of 240000 images.

which I think is the wrong way around?
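
For reference, a minimal sketch of the arithmetic described in this issue (illustrative only, not code from the repository) reproduces the reported split:

import numpy as np

dataset_size = 1_200_000
fraction_for_mean_std = 0.2
validation_split = 1 - fraction_for_mean_std              # 0.8
split = int(np.floor(validation_split * dataset_size))    # 960000

indices = np.arange(dataset_size)
train_indices, val_indices = indices[split:], indices[:split]
print(len(train_indices), len(val_indices))                # 240000 960000, i.e. train gets only 20 %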
