HIA (Histopathology Image Analysis)

This repository contains the Python version of a general workflow for end-to-end artificial intelligence on histopathology images. It is based on workflows previously described in Kather et al., Nature Medicine 2019 and Echle et al., Gastroenterology 2020. The objective is to predict a given label directly from digitized histological whole slide images (WSI). The label is defined on the level of patients, not on the level of pixels in a given WSI. Thus, the problems addressed by HIA are weakly supervised problems. Common labels are the molecular subtype of a cancer, binarized clinical outcome, or treatment response. Compared to previous MATLAB-based implementations of this framework (e.g. DeepHistology), this version is implemented in Python and PyTorch, is highly scalable, and has been extensively validated on multiple clinically relevant problems. A key feature of HIA is that it provides implementations of multiple artificial intelligence algorithms.

Note that this version contains various changes, but it follows the same steps.
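To make the weakly supervised setting more concrete, here is a minimal sketch in plain Python. It is not taken from the HIA code: the function names, the tile file names, and the choice of averaging tile scores into a patient score are assumptions made for illustration only.

```python
from collections import defaultdict
from statistics import mean


def label_tiles(tiles_by_patient, patient_labels):
    """Give every tile the label of the patient it came from (hypothetical helper)."""
    return [
        (tile, patient, patient_labels[patient])
        for patient, tiles in tiles_by_patient.items()
        for tile in tiles
    ]


def patient_scores(tile_scores):
    """Average tile-level scores into one score per patient (assumed aggregation)."""
    grouped = defaultdict(list)
    for patient, score in tile_scores:
        grouped[patient].append(score)
    return {patient: mean(scores) for patient, scores in grouped.items()}


# Made-up tiles and labels; only the patient-level label is known.
tiles_by_patient = {"P1": ["P1_a.jpg", "P1_b.jpg"], "P2": ["P2_a.jpg"]}
patient_labels = {"P1": 1, "P2": 0}
labelled = label_tiles(tiles_by_patient, patient_labels)

# Made-up tile scores standing in for the outputs of a trained model.
scores = patient_scores([("P1", 0.75), ("P1", 0.25), ("P2", 0.5)])
print(labelled)
print(scores)
```

The point of the sketch is only that no tile or pixel carries its own annotation: each tile inherits the patient label for training, and tile predictions are combined again when a patient is evaluated.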

++ These scripts are still under development. Please always use the most recent version. ++

How to use this repository:

To use this workflow, you need to modify an experiment file for your project. The experiment file is a text file, and an example can be found in this repository. Fill in the following options in this file:

Input Variable name Description
-projectDetails Optional string. Use it to write down some keywords about your experiment.
-dataDir_train Path to the directory containing the normalized tiles. For example: ["K:\TCGA-CRC-DX"].
This folder should contain a subfolder of tiles with one of the following names:
{BLOCKS_NORM_MACENKO, BLOCKS_NORM_VAHADANE, BLOCKS_NORM_REINHARD or BLOCKS}.
The clinical table and the slide table of this data set should also be stored in this folder.
An example of the structure of this folder:
K:\TCGA-CRC-DX:
{
1. BLOCKS_NORM_MACENKO
2. TCGA-CRC-DX_CLINI.xlsx
3. TCGA-CRC-DX_SLIDE.csv
}
-dataDir_test If you plan an external validation for your experiment, this variable is the path to the directory containing the normalized tiles used for external validation. This folder should have the same structure as 'dataDir_train'.
-targetLabels The list of targets you want to analyze. The clinical table should contain values for these targets. For example: ["isMSIH", "stage"].
-trainFull Set this variable to False to run cross validation. Set it to True to train on all the data and then evaluate on an external validation set.
-maxNumBlocks Integer. The maximum number of tiles used per slide. Since the number of extracted tiles per slide can vary a lot, only a limited number of tiles is used per slide. For more detail, please check the paper.
-epochs Integer. The number of epochs for training.
-batchSize Integer. The batch size for training.
-k Integer. The number of folds (k) for the cross validation experiment. It is only used if trainFull is False.
-modelName String. One of the following neural network models. The script downloads the pretrained weights for the selected model.
{resnet, alexnet, vgg, squeezenet, densenet, inception, vit, efficient}
-opt String. The name of the optimizer used for training.
{"adam" or "sgd"}
-lr Float. The learning rate for the optimizer.
-reg Float. The weight_decay for the optimizer.
-gpuNo If the computer has more than one GPU, use this variable to run the experiment on a specific GPU.
-freezeRatio Float in the range [0, 1]. The ratio of neural network layers to freeze during training.
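The options above could be collected and sanity-checked as in the following sketch. Treat it as an illustration, not as the repository's own example file: the JSON layout is an assumption (the bracketed example values above look like JSON lists), every value is a placeholder, and check_experiment is a hypothetical helper that is not part of HIA. Please compare with the example experiment file in this repository before using it.

```python
import json

# Allowed values copied from the option list above.
ALLOWED_MODELS = {"resnet", "alexnet", "vgg", "squeezenet",
                  "densenet", "inception", "vit", "efficient"}
ALLOWED_OPTIMIZERS = {"adam", "sgd"}

# Placeholder values for illustration only.
experiment = {
    "projectDetails": "MSI prediction, cross validation",
    "dataDir_train": ["K:\\TCGA-CRC-DX"],
    "dataDir_test": [],
    "targetLabels": ["isMSIH"],
    "trainFull": False,
    "maxNumBlocks": 200,
    "epochs": 8,
    "batchSize": 64,
    "k": 3,
    "modelName": "resnet",
    "opt": "adam",
    "lr": 0.0001,
    "reg": 0.00001,
    "gpuNo": 0,
    "freezeRatio": 0.5,
}


def check_experiment(exp):
    """Return a list of problems found in the options (hypothetical helper)."""
    problems = []
    if exp["modelName"] not in ALLOWED_MODELS:
        problems.append("modelName is not one of the listed models")
    if exp["opt"] not in ALLOWED_OPTIMIZERS:
        problems.append('opt must be "adam" or "sgd"')
    if not 0 <= exp["freezeRatio"] <= 1:
        problems.append("freezeRatio must lie in [0, 1]")
    if not exp["targetLabels"]:
        problems.append("targetLabels must name at least one target")
    # Assumed sanity check: cross validation needs at least two folds.
    if not exp["trainFull"] and exp["k"] < 2:
        problems.append("k must be at least 2 when trainFull is False")
    return problems


# Text that could be saved as the experiment file (assuming JSON).
text = json.dumps(experiment, indent=4)
print(check_experiment(experiment))
print(text)
```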

Run training:

To start training, use the Main.py script. The full path to the experiment file should be given as an input variable in this script.
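When trainFull is False, training runs as a k-fold cross validation. The sketch below shows one way such a split can be made on the level of patients, so that no patient appears in both the training and the test part of a fold. It is an assumption about the general technique, not the code used in Main.py; patient_folds and its seed argument are hypothetical, and the split is not stratified by label.

```python
import random


def patient_folds(patient_ids, k, seed=0):
    """Split unique patient IDs into k (train, test) pairs (hypothetical helper)."""
    ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(ids)
    # Deal the shuffled IDs into k folds of nearly equal size.
    folds = [ids[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        splits.append((train, test))
    return splits


# Made-up patient IDs for illustration.
patients = ["P%02d" % n for n in range(10)]
splits = patient_folds(patients, k=3)
for train, test in splits:
    print(len(train), "training patients,", len(test), "test patients")
```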

External Validation:

If you used trainFull = True in the experiment file and want to evaluate your model on an external data set, use the script named Deploy_Classic.py. In this script, fill in the following two inputs:
{
1. addressExp: the full path to the experiment file created for external validation. This experiment file has the same options as explained above. dataDir_test is the path to the folder of the data set used for external validation. targetLabels is the single target you want to evaluate.
2. modelAdr: the full path to the model saved in the RESULT folder of the experiment in which trainFull was set to True.
}

Contributors:

jnkather, narminghaffari
