project-monai / tutorials

MONAI Tutorials

Home Page: https://monai.io/started.html

License: Apache License 2.0

Jupyter Notebook 98.40% Python 1.48% Shell 0.11% Dockerfile 0.01%

monai monai-tutorials pytorch jupyter-notebook monai-workflows

This repository hosts the MONAI tutorials.

1. Requirements

Most of the examples and tutorials require matplotlib and Jupyter Notebook.

These can be installed with:

python -m pip install -U pip
python -m pip install -U matplotlib
python -m pip install -U notebook

Some of the examples may require optional dependencies. In case of any optional import errors, please install the relevant packages according to MONAI's installation guide. Or install all optional requirements with:

pip install -r https://raw.githubusercontent.com/Project-MONAI/MONAI/dev/requirements-dev.txt
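To see which optional packages are missing before opening a notebook, a small check along the following lines may help. This is a minimal sketch using only the Python standard library; the module names in the list are illustrative assumptions, not an official list of MONAI's optional dependencies.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Illustrative module names only; adjust to the notebook you plan to run.
    candidates = ["matplotlib", "notebook", "nibabel", "skimage"]
    print("Missing:", missing_packages(candidates))
```

Any name reported as missing can then be installed individually, or all at once with the requirements file above.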

Run the notebooks from Colab

Most of the Jupyter Notebooks have an "Open in Colab" button. Please right-click on the button, and select "Open Link in New Tab" to start a Colab page with the corresponding notebook content.

To use GPU resources through Colab, please remember to change the runtime type to GPU:

From the Runtime menu select Change runtime type
Choose GPU from the drop-down menu
Click SAVE. This will reset the notebook and may ask you if you are a robot (these instructions assume you are not).

Running:

!nvidia-smi

in a cell will verify this has worked and show you what kind of hardware you have access to.
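The same check can be done from Python, which is convenient when a notebook should fall back gracefully on a CPU-only runtime. The sketch below uses only the standard library; the helper name gpu_summary is hypothetical, and it assumes that nvidia-smi is on the PATH whenever a GPU runtime is active.

```python
import shutil
import subprocess


def gpu_summary():
    """Return the output of `nvidia-smi -L`, or a short message if unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return "nvidia-smi not found: no NVIDIA driver visible in this runtime"
    try:
        result = subprocess.run([exe, "-L"], capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired) as err:
        return f"nvidia-smi could not be run: {err}"
    if result.returncode != 0:
        return f"nvidia-smi failed: {result.stderr.strip()}"
    return result.stdout.strip()


print(gpu_summary())
```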

Data

Some notebooks will require additional data. Each user is responsible for checking the content of datasets and the applicable licenses and determining if suitable for the intended use.

2. Questions and bugs

For questions relating to the use of MONAI, please use our Discussions tab on the main repository of MONAI.
For bugs relating to MONAI functionality, please create an issue on the main repository.
For bugs relating to the running of a tutorial, please create an issue in this repository.

3. Become a contributor

You can read details about adding a tutorial in our CONTRIBUTING GUIDELINES.

4. List of notebooks and examples

2D classification

mednist_tutorial

This notebook shows how to easily integrate MONAI features into existing PyTorch programs. It's based on the MedNIST dataset which is very suitable for beginners as a tutorial. This tutorial also makes use of MONAI's in-built occlusion sensitivity functionality.

2D segmentation

torch examples

Training and evaluation examples of 2D segmentation based on UNet and synthetic dataset. The examples are standard PyTorch programs and have both dictionary-based and array-based versions.

3D classification

ignite examples

Training and evaluation examples of 3D classification based on DenseNet3D and IXI dataset. The examples are PyTorch Ignite programs and have both dictionary-based and array-based transformation versions.

torch examples

Training and evaluation examples of 3D classification based on DenseNet3D and IXI dataset. The examples are standard PyTorch programs and have both dictionary-based and array-based transformation versions.

3D regression

densenet_training_array.ipynb

Training and evaluation examples of 3D regression based on DenseNet3D and IXI dataset.

3D segmentation

ignite examples

Training and evaluation examples of 3D segmentation based on UNet3D and synthetic dataset. The examples are PyTorch Ignite programs and have both dictionary-based and array-based transformations.

torch examples

Training, evaluation and inference examples of 3D segmentation based on UNet3D and synthetic dataset. The examples are standard PyTorch programs and have both dictionary-based and array-based versions.

brats_segmentation_3d

This tutorial shows how to construct a training workflow for a multi-label segmentation task based on the MSD Brain Tumor dataset, and how to convert the PyTorch model to an ONNX model for inference and comparison.

spleen_segmentation_3d_aim

This notebook shows how MONAI may be used in conjunction with aimhubio/aim.

spleen_segmentation_3d_lightning

This notebook shows how MONAI may be used in conjunction with the PyTorch Lightning framework.

spleen_segmentation_3d

This notebook is an end-to-end training and evaluation example of 3D segmentation based on MSD Spleen dataset. The example shows the flexibility of MONAI modules in a PyTorch-based program:

Transforms for dictionary-based training data structure.
Load NIfTI images with metadata.
Scale medical image intensity with expected range.
Crop out a batch of balanced image patch samples based on positive / negative label ratio.
Cache IO and transforms to accelerate training and validation.
3D UNet, Dice loss function, Mean Dice metric for 3D segmentation task.
Sliding window inference.
Deterministic training for reproducibility.
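The sliding window inference item in the list above can be pictured as tiling the volume with overlapping ROI-sized windows and running the network on each window. The sketch below computes the window start positions along a single axis. It is a simplified illustration that assumes a fixed fractional overlap; it is not MONAI's actual implementation, and the function name window_starts is hypothetical.

```python
import math


def window_starts(image_size, roi_size, overlap=0.25):
    """Start indices of overlapping windows covering one axis of length image_size."""
    if roi_size >= image_size:
        return [0]  # a single window (the image would need padding up to roi_size)
    step = max(1, int(roi_size * (1 - overlap)))
    count = math.ceil((image_size - roi_size) / step) + 1
    # Clamp the last window so that it ends exactly at the image border.
    return [min(i * step, image_size - roi_size) for i in range(count)]


print(window_starts(226, 160))     # [0, 66]
print(window_starts(10, 4, 0.5))   # [0, 2, 4, 6]
```

In 3D, the same computation would be repeated per axis and the per-window predictions blended where windows overlap.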

unet_segmentation_3d_ignite

This notebook is an end-to-end training & evaluation example of 3D segmentation based on synthetic dataset. The example is a PyTorch Ignite program and shows several key features of MONAI, especially with medical domain specific transforms and event handlers for profiling (logging, TensorBoard, MLFlow, etc.).

COVID 19-20 challenge baseline

This folder provides a simple baseline method for training, validation, and inference for COVID-19 LUNG CT LESION SEGMENTATION CHALLENGE - 2020 (a MICCAI Endorsed Event).

unetr_btcv_segmentation_3d

This notebook demonstrates how to construct a training workflow of UNETR on multi-organ segmentation task using the BTCV challenge dataset.

unetr_btcv_segmentation_3d_lightning

This tutorial demonstrates how MONAI can be used in conjunction with PyTorch Lightning framework to construct a training workflow of UNETR on multi-organ segmentation task using the BTCV challenge dataset.

2D registration

registration using mednist

This notebook shows a quick demo for learning based affine registration of 64 x 64 X-Ray hands.

3D registration

3D registration using paired lung CT

This tutorial shows how to use MONAI to register lung CT volumes acquired at different time points for a single patient.

3D registration using unpaired brain MR

This tutorial shows how to get started on using the general-purpose registration framework VoxelMorph offered in MONAI to register unpaired brain MR volumes.

DeepAtlas

This tutorial demonstrates the use of MONAI for training of registration and segmentation models together. The DeepAtlas approach, in which the two models serve as a source of weakly supervised learning for each other, is useful in situations where one has many unlabeled images and just a few images with segmentation labels. The notebook works with 3D images from the OASIS-1 brain MRI dataset.

Deepgrow

The examples show how to train and validate a 2D/3D Deepgrow model. They also demonstrate running inference with trained Deepgrow models.

DeepEdit

This example shows how to train/test a DeepEdit model. In this tutorial there is a Notebook that shows how to run inference on a pretrained DeepEdit model.

Deployment

BentoML

This is a simple example of training and deploying a MONAI network with BentoML as a web server, either locally using the BentoML repository or as a containerized service.

Ray

This uses the previous notebook's trained network to demonstrate deployment as a web server using Ray.

Triton

This example walks through using a Triton Server and a Python client with MONAI on the MedNIST classification problem. The demo is self-contained, and the Readme explains how to use Triton "backends" to inject the MONAI code into the server. See the Triton Inference Server/python_backend documentation.

Experiment Management

Aim

An example of experiment management with Aim, using 3D spleen segmentation as an example.

MLFlow

An example of experiment management with MLFlow, using 3D spleen segmentation as an example.

MONAI bundle integrates MLFlow

An example showing how to easily enable and customize MLFlow for experiment management in a MONAI bundle.

ClearML

An example of experiment management with ClearML, using 3D Segmentation with UNet as an example.

Federated Learning

NVFlare

The examples show how to train federated learning models with NVFlare and MONAI-based trainers.

OpenFL

The examples show how to train federated learning models based on OpenFL and MONAI.

Substra

The example shows how to execute the 3D segmentation torch tutorial on a federated learning platform, Substra.

Breast Density FL Challenge

Reference implementation used in MICCAI 2022 ACR-NVIDIA-NCI Breast Density FL challenge.

Digital Pathology

Whole Slide Tumor Detection

The example shows how to train and evaluate a tumor detection model (based on patch classification) on whole-slide histopathology images.

Profiling Whole Slide Tumor Detection

The example shows how to use MONAI NVTX transforms to tag and profile pre- and post-processing transforms in the digital pathology whole slide tumor detection pipeline.

Multiple Instance Learning WSI classification

An example of Multiple Instance Learning (MIL) classification from Whole Slide Images (WSI) of prostate histopathology.

NuClick Annotation

The notebook demonstrates examples of training and inference pipelines with interactive annotation for pathology. NuClick is used for delineating nuclei and cells, and a squiggle is used for outlining glands.

HoVerNet: Nuclear segmentation and classification task

This tutorial demonstrates how to construct a training workflow of HoVerNet on nuclear segmentation and classification task using the CoNSep dataset.

Nuclei Classification

The notebook demonstrates examples of training and inference pipelines with interactive annotation for pathology. NuClick is used for delineating nuclei and cells, and a squiggle is used for outlining glands.

Acceleration

fast_model_training_guide

The document describes how to profile the training pipeline, how to analyze the dataset and select suitable algorithms, and how to optimize GPU utilization in single-GPU, multi-GPU, or even multi-node settings.

distributed_training

The examples show how to execute distributed training and evaluation based on 3 different frameworks:

PyTorch native DistributedDataParallel module with torch.distributed.launch.
Horovod APIs with horovodrun.
PyTorch ignite and MONAI workflows.

They can run on several distributed nodes with multiple GPU devices on every node.

automatic_mixed_precision

This tutorial compares the training speed and memory usage with and without automatic mixed precision (AMP).

dataset_type_performance

This notebook compares the performance of Dataset, CacheDataset and PersistentDataset. These classes differ in how data is stored (in memory or on disk), and at which moment transforms are applied.
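The difference in when transforms are applied can be illustrated with a toy example. The two classes below are hypothetical stand-ins written in plain Python, not MONAI's Dataset or CacheDataset; the sketch only assumes that the transform is deterministic, so that caching its result is safe.

```python
class PlainDataset:
    """Applies the transform every time an item is requested."""

    def __init__(self, data, transform):
        self.data, self.transform = data, transform

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        return self.transform(self.data[index])


class InMemoryCachedDataset(PlainDataset):
    """Applies the transform once per item up front and keeps the results in memory."""

    def __init__(self, data, transform):
        super().__init__(data, transform)
        self.cache = [transform(item) for item in data]

    def __getitem__(self, index):
        return self.cache[index]


calls = {"n": 0}


def slow_transform(x):
    calls["n"] += 1  # count how often the (pretend) expensive transform runs
    return x * 2


plain = PlainDataset([1, 2, 3], slow_transform)
for _ in range(2):  # two "epochs"
    _ = [plain[i] for i in range(len(plain))]
plain_calls = calls["n"]

calls["n"] = 0
cached = InMemoryCachedDataset([1, 2, 3], slow_transform)
for _ in range(2):
    _ = [cached[i] for i in range(len(cached))]
cached_calls = calls["n"]

print(plain_calls, cached_calls)  # 6 3
```

A disk-backed variant would follow the same pattern but write the transformed items to files instead of a list, trading memory for I/O.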

fast_training_tutorial

This tutorial compares the training performance of a pure PyTorch program and an optimized MONAI program on an NVIDIA GPU with the latest CUDA library. The optimization methods mainly include: AMP, CacheDataset, GPU transforms, ThreadDataLoader, DiceCELoss and SGD.

threadbuffer_performance

Demonstrates the use of the ThreadBuffer class used to generate data batches during training in a separate thread.
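The idea of producing batches in a separate thread can be sketched with the standard library alone. The generator below is a hypothetical illustration of the pattern, not MONAI's ThreadBuffer; error handling in the producer thread is deliberately omitted to keep it short.

```python
import queue
import threading


def buffered(iterable, buffer_size=2):
    """Yield items from iterable while a background thread fills a bounded queue."""
    q = queue.Queue(maxsize=buffer_size)
    sentinel = object()  # marks the end of the stream

    def producer():
        for item in iterable:
            q.put(item)  # blocks when the buffer is full
        q.put(sentinel)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = q.get()
        if item is sentinel:
            return
        yield item


print(list(buffered(range(5))))  # [0, 1, 2, 3, 4]
```

While the consumer is busy with one item (for example, a training step), the producer can already prepare the next ones, up to buffer_size items ahead.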

transform_speed

Illustrates reading NIfTI files and tests the speed of different transforms on different devices.

TensorRT_inference_acceleration

This notebook shows how to use TensorRT to accelerate the model and achieve a better inference latency.

Model Zoo

easy_integrate_bundle

This tutorial shows a straightforward ensemble application to instruct users on how to integrate existing bundles in their own projects. By simply changing the data path and the path where the bundle is located, training and ensemble inference can be performed.

Computer Assisted Intervention

video segmentation

This tutorial shows how to train a surgical tool segmentation model to locate tools in a given image. In addition, it also builds an example pipeline of an end-to-end video tool segmentation, with video input and video output.

endoscopic inbody classification

Tutorial to show the pipeline of fine tuning an endoscopic inbody classification model based on a corresponding pretrained bundle.

Modules

bundle

Get started tutorial and concrete training / inference examples for MONAI bundle features.

competitions

MONAI based solutions of competitions in healthcare imaging.

engines

Training and evaluation examples of 3D segmentation based on UNet3D and a synthetic dataset with MONAI workflows, which contain engines, event handlers, and post-transforms. Also included is a GAN training and evaluation example for a medical image generative adversarial network: an easy-to-run training script uses GanTrainer to train a 2D CT scan reconstruction network, and an evaluation script generates random samples from a trained network.

The examples are built with MONAI workflows, which mainly contain a trainer/evaluator, handlers, post_transforms, etc.

3d_image_transforms

This notebook demonstrates the transformations on volumetric images.

2d_inference_3d_volume

Tutorial that demonstrates how MONAI's SlidingWindowInferer can be used when a 3D volume input needs to be provided slice-by-slice to a 2D model and the results finally aggregated into a 3D volume.

autoencoder_mednist

This tutorial uses the MedNIST hand CT scan dataset to demonstrate MONAI's autoencoder class. The autoencoder is used with an identity encode/decode (i.e., what you put in is what you should get back), as well as demonstrating its usage for de-blurring and de-noising.

batch_output_transform

Tutorial to explain and show how to set batch_transform and output_transform of handlers to work with MONAI engines.

bending_energy_diffusion_loss_notes

This notebook demonstrates when and how to compute normalized bending energy and diffusion loss.

compute_metric

This example shows how to compute metrics from saved predictions and labels with PyTorch multi-processing support.

csv_datasets

This tutorial shows how to use CSVDataset and CSVIterableDataset to load multiple CSV files and execute postprocessing logic.

decollate_batch

This tutorial shows how to decollate batch data to simplify postprocessing transforms and enable more flexible downstream operations.
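Conceptually, decollating turns one batched structure into a list of per-item structures, so that postprocessing can treat each item independently. The plain-Python sketch below shows the idea for a dictionary of equal-length lists; the function name decollate is hypothetical and this is not MONAI's own implementation, which also handles tensors and metadata.

```python
def decollate(batch):
    """Turn a dict of equal-length lists into a list of per-item dicts."""
    keys = list(batch)
    length = len(batch[keys[0]]) if keys else 0
    return [{key: batch[key][i] for key in keys} for i in range(length)]


batch = {"pred": [0.9, 0.2], "label": [1, 0]}
items = decollate(batch)
print(items)  # [{'pred': 0.9, 'label': 1}, {'pred': 0.2, 'label': 0}]
```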

image_dataset

This notebook introduces the basic usage of the monai.data.ImageDataset module.

dynunet_tutorial

This tutorial shows how to train 3D segmentation tasks on all the 10 decathlon datasets with the reimplementation of dynUNet in MONAI.

integrate_3rd_party_transforms

This tutorial shows how to integrate 3rd party transforms into a MONAI program. It mainly shows transforms from BatchGenerator, TorchIO, Rising and ITK.

inverse transformations and test-time augmentations

This notebook demonstrates the use of invertible transforms, and then leveraging inverse transformations to perform test-time augmentations.
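The role of the inverse transform in test-time augmentation can be sketched on a 1D toy signal: the input is augmented, the model predicts on the augmented input, and the inverse transform maps the prediction back to the original orientation before averaging. Everything below is a hypothetical stand-in in plain Python (a list reversal as the invertible transform and a made-up position-dependent "model"), not MONAI code.

```python
def flip(seq):
    """A self-inverse 'transform': reverse a 1D signal."""
    return seq[::-1]


def toy_model(seq):
    """Stand-in for a network: output depends on position, so it is not flip-equivariant."""
    return [value + 0.1 * i for i, value in enumerate(seq)]


def tta_predict(seq):
    """Average the plain prediction with the un-flipped prediction on the flipped input."""
    direct = toy_model(seq)
    restored = flip(toy_model(flip(seq)))  # inverse transform maps the output back
    return [(a + b) / 2 for a, b in zip(direct, restored)]


print(tta_predict([1.0, 2.0, 3.0]))
```

Without the inverse step, the two predictions would be misaligned and averaging them element by element would be meaningless.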

layer wise learning rate

This notebook demonstrates how to select or filter out expected network layers and set customized learning rate values.

learning rate finder

This notebook demonstrates how to use LearningRateFinder API to tune the learning rate values for the network.

load_medical_images

This notebook introduces how to easily load different formats of medical images in MONAI and execute many additional operations.

mednist_GAN_tutorial

This notebook illustrates the use of MONAI for training a network to generate images from a random input tensor. A simple GAN is employed, with separate Generator and Discriminator networks.

mednist_GAN_workflow_dict

This notebook shows the GanTrainer, a MONAI workflow engine for modularized adversarial learning. It trains a medical image reconstruction network using the MedNIST hand CT scan dataset (dictionary version).

mednist_GAN_workflow_array

This notebook shows the GanTrainer, a MONAI workflow engine for modularized adversarial learning. It trains a medical image reconstruction network using the MedNIST hand CT scan dataset (array version).

cross_validation_models_ensemble

This tutorial shows how to leverage the CrossValidation, EnsembleEvaluator, MeanEnsemble and VoteEnsemble modules in MONAI to set up a cross-validation and ensemble program.
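The two ensembling strategies named above differ in what they combine: a mean ensemble averages per-model scores, while a vote ensemble takes the majority label. The plain-Python functions below are hypothetical illustrations of these two ideas on lists, not the MONAI MeanEnsemble or VoteEnsemble transforms.

```python
from collections import Counter


def mean_ensemble(predictions):
    """Element-wise mean of per-model probability lists."""
    return [sum(values) / len(values) for values in zip(*predictions)]


def vote_ensemble(predictions):
    """Element-wise majority vote over per-model label lists."""
    return [Counter(values).most_common(1)[0][0] for values in zip(*predictions)]


fold_probs = [[0.5, 0.25], [1.0, 0.75]]          # two models, two samples
fold_labels = [[1, 0, 1], [1, 1, 0], [0, 1, 1]]  # three models, three samples

print(mean_ensemble(fold_probs))   # [0.75, 0.5]
print(vote_ensemble(fold_labels))  # [1, 1, 1]
```

In a cross-validation setting, each inner list would come from the model trained on one fold.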

nifti_read_example

Illustrates reading NIfTI files and iterating over image patches of the volumes loaded from them.

network_api

This tutorial illustrates the flexible network APIs and utilities.

postprocessing_transforms

This notebook shows the usage of several postprocessing transforms based on the model output of spleen segmentation task.

public_datasets

This notebook shows how to quickly set up a training workflow based on MedNISTDataset and DecathlonDataset, and how to create a new dataset.

tcia_csv_processing

This notebook shows how to load the TCIA data with CSVDataset from CSV file and extract information for TCIA data to fetch DICOM images based on REST API.

transforms_demo_2d

This notebook demonstrates the image transformations on histology images using

UNet_input_size_constraints

This tutorial shows how to determine a reasonable spatial size of the input data for MONAI UNet, which not only supports residual units, but also can use more hyperparameters (like strides, kernel_size and up_kernel_size) than the basic UNet implementation.

TorchIO, MONAI, PyTorch Lightning

This notebook demonstrates how the three libraries from the official PyTorch Ecosystem can be used together to segment the hippocampus on brain MRIs from the Medical Segmentation Decathlon.

varautoencoder_mednist

This tutorial uses the MedNIST scan (or alternatively the MNIST) dataset to demonstrate MONAI's variational autoencoder class.

interpretability

Tutorials in this folder demonstrate model visualisation and interpretability features of MONAI. Currently, it consists of class activation mapping and occlusion sensitivity for 3D classification model visualisations and analysis.

Transform visualization

This tutorial shows several visualization approaches for 3D images during transform augmentation.

Auto3DSeg

This folder shows how to run the comprehensive Auto3DSeg pipeline with minimal inputs and customize the Auto3Dseg modules to meet different user requirements.

Self-Supervised Learning

self_supervised_pretraining

This tutorial shows how to construct a training workflow of self-supervised learning where unlabeled data is utilized. The tutorial shows how to train a model on TCIA dataset of unlabeled Covid-19 cases.

self_supervised_pretraining_based_finetuning

This tutorial shows how to utilize pre-trained weights from the self-supervised learning framework where unlabeled data is utilized. This tutorial shows how to train a model of multi-class 3D segmentation using pretrained weights.

Generative Model

3D latent diffusion model

This tutorial shows the use cases of training and validating a 3D Latent Diffusion Model.

2D latent diffusion model

This tutorial shows the use cases of training and validating a 2D Latent Diffusion Model.

tutorials's Issues

fixing the conda env

Is your feature request related to a problem? Please describe.
would be great to fix the Anaconda Python distribution with a predefined yml file, such as
https://github.com/Project-MONAI/MONAIBootcamp2020#instal-local-environment

How to handle RGB 2D images

Describe the bug
I just tried running the 2D segmentation tutorial, but on my own 2D images (a mixed dataset of TIFF, PNG, JPG and BMP images). I ran into several problems, e.g. LoadPNGd cannot handle TIFF images, the rest of the transform pipeline throws an error (I think PIL loads TIF images in a different way than others - I usually use skimage.io, which always returns a numpy array). The biggest problem though is that the transforms pipeline cannot handle the color channel in RGB images, or I am doing sth wrong when applying the Resized() transform - the latter is necessary because I need images at a fixed size of 320x240 at the end of the transform pipeline.

To Reproduce
Put a few RGB color images (maybe including at least one TIFF image ;) into a directory, then set up a simple transform pipeline like this:

train_transforms = Compose(
    [
        LoadImaged(keys=["img"]),
        LoadNumpyd(keys=["seg"]), # my segs are four channels stored as numpy array, of shape (height,width,4)
        ScaleIntensityd(keys="img"),
        Resized(keys=["img", "seg"], spatial_size=(240,320), mode='bilinear', align_corners=True),
        RandFlipd(keys=["img", "seg"], prob=0.5),
        ToTensord(keys=["img", "seg"]),
    ]
)

Then, to check the shape of the output tensors:

# define check dataset, check data loader
check_ds = monai.data.Dataset(data=train_files, transform=train_transforms)
# use batch_size=2 to load images and use RandCropByPosNegLabeld to generate 2 x 4 images for network training
check_loader = DataLoader(check_ds, batch_size=2, num_workers=4, collate_fn=list_data_collate)
check_data = monai.utils.misc.first(check_loader)
print(check_data["img"].shape, check_data["seg"].shape)
plt.imshow(np.squeeze(check_data["img"][0,0,:,:]))

Expected behavior
If the color channel is handled correctly, I expect the shape of the tensors to be [2,3,240,320].

Observed behavior
The output shape is [2,300,240,320] (please note that in my case, monai.utils.misc.first(check_loader) loads an image of shape [300,400,3]).
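One possible reading of the observed shape, assuming the pipeline treats the first axis of each image as the channel axis: a channel-last array of shape [300, 400, 3] is then seen as 300 channels with spatial size (400, 3), and resizing to (240, 320) yields [300, 240, 320] per image. The shape-only sketch below illustrates this, along with the effect of moving the channel axis to the front before resizing. The helper names are hypothetical and no MONAI calls are involved.

```python
def resized_shape(shape, spatial_size):
    """Shape after resizing, assuming axis 0 is channels and the rest is spatial."""
    return (shape[0],) + tuple(spatial_size)


def channel_last_to_first(shape):
    """Move the last axis to the front, e.g. (H, W, C) -> (C, H, W)."""
    return (shape[-1],) + tuple(shape[:-1])


loaded = (300, 400, 3)  # height, width, RGB, as reported above
print(resized_shape(loaded, (240, 320)))                         # (300, 240, 320)
print(resized_shape(channel_last_to_first(loaded), (240, 320)))  # (3, 240, 320)
```

If this reading is right, inserting a channel-reordering transform between loading and Resized would give the expected [2, 3, 240, 320] batch; the exact transform name depends on the installed MONAI version and should be checked against its documentation.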

Environment (please complete the following information):

Ubuntu 18.04
MONAI version: 0.2.0+166.g12b3fbf
Python version: 3.6.10 |Anaconda, Inc.| (default, May 8 2020, 02:54:21) [GCC 7.3.0]
Numpy version: 1.19.1
Pytorch version: 1.7.0a0+8deb4fe
Optional dependencies:
Pytorch Ignite version: 0.3.0
Nibabel version: 3.1.1
scikit-image version: 0.15.0
Pillow version: 7.2.0
Tensorboard version: 1.15.0+nv
gdown version: 3.12.2
TorchVision version: 0.8.0a0
ITK version: 5.1.1

Quality assurance of the MONAI/examples folder

Is your feature request related to a problem? Please describe.
As the size and content scope of the MONAI/examples folder increase,
it's necessary to figure out the hardware/software requirements for running the examples,
and also provide some forms of quality assurance of the example codes.

Describe the solution you'd like
could automatically run the examples as a part of the automated CI/CD pipeline?

Describe alternatives you've considered
manually verifying all the examples regularly (tedious and error-prone)

Additional context
see also https://github.com/Project-MONAI/MONAI/issues/296

3D Classifier

Dear all,

I'm adapting the 3D classifier tutorial "densenet_training" to my example files.
My nifti files have a different size, so I get this error when doing the input to the model:

Expected 5-dimensional input for 5-dimensional weight [64, 1, 7, 7, 7], but got 4-dimensional input of size [2, 224, 224, 160] instead

How can I modify the code so I can test the tutorial on my files?

Thanks
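The error message suggests that the first convolution has weight shape [64, 1, 7, 7, 7], i.e. it expects a batch laid out as (batch, channel, depth, height, width) with one channel, while the input [2, 224, 224, 160] has no channel axis. The shape-only sketch below shows where the missing axis would go. The helper name is hypothetical; with actual PyTorch tensors the equivalent operation would be unsqueeze(1), and in a transform pipeline a channel-adding transform would typically be used instead.

```python
def add_channel_dim(shape, channel_axis=1):
    """Insert a singleton channel axis, e.g. (B, D, H, W) -> (B, 1, D, H, W)."""
    shape = tuple(shape)
    return shape[:channel_axis] + (1,) + shape[channel_axis:]


print(add_channel_dim((2, 224, 224, 160)))  # (2, 1, 224, 224, 160)
```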

show the dicom loading usage

would be great to extend the https://github.com/Project-MONAI/Tutorials/blob/master/load_medical_images.ipynb to load DICOM

Spleen example crashes if I modify it to use different input dataset

Describe the bug
The example crashes. I tried different roi_sizes, but setting e.g. (-1, -1, -1) just postpones the crash for later in the process.

To Reproduce
Run my notebook which is a modified copy of the spleen example.

Expected behavior
Training finishes after a while.

Environment (please complete the following information):
Windows 10
MONAI version: 0.2.0
Python version: 3.7.9 (tags/v3.7.9:13c94747c7, Aug 17 2020, 18:58:18) [MSC v.1900 64 bit (AMD64)]
Numpy version: 1.19.1
Pytorch version: 1.4.0+cpu

Optional dependencies:
Pytorch Ignite version: 0.3.0
Nibabel version: 3.1.1
scikit-image version: 0.17.2
Pillow version: 7.2.0
Tensorboard version: 2.3.0

Additional context

----------
epoch 1/10
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-12-8f78531d9ef2> in <module>
     19         )
     20         optimizer.zero_grad()
---> 21         outputs = model(inputs)
     22         loss = loss_function(outputs, labels)
     23         loss.backward()

c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

c:\dev\monai\pyenv\lib\site-packages\monai\networks\nets\unet.py in forward(self, x)
    125 
    126     def forward(self, x):
--> 127         x = self.model(x)
    128         return x
    129 

c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\container.py in forward(self, input)
     98     def forward(self, input):
     99         for module in self:
--> 100             input = module(input)
    101         return input
    102 

c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

c:\dev\monai\pyenv\lib\site-packages\monai\networks\layers\simplelayers.py in forward(self, x)
     31 
     32     def forward(self, x):
---> 33         return torch.cat([x, self.submodule(x)], self.cat_dim)
     34 
     35 

c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\container.py in forward(self, input)
     98     def forward(self, input):
     99         for module in self:
--> 100             input = module(input)
    101         return input
    102 

c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

c:\dev\monai\pyenv\lib\site-packages\monai\networks\layers\simplelayers.py in forward(self, x)
     31 
     32     def forward(self, x):
---> 33         return torch.cat([x, self.submodule(x)], self.cat_dim)
     34 
     35 

c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\container.py in forward(self, input)
     98     def forward(self, input):
     99         for module in self:
--> 100             input = module(input)
    101         return input
    102 

c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

c:\dev\monai\pyenv\lib\site-packages\monai\networks\layers\simplelayers.py in forward(self, x)
     31 
     32     def forward(self, x):
---> 33         return torch.cat([x, self.submodule(x)], self.cat_dim)
     34 
     35 

RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 7 and 8 in dimension 3 at C:\w\1\s\windows\pytorch\aten\src\TH/generic/THTensor.cpp:612
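A mismatch such as "Got 7 and 8" in a skip connection is typical when a spatial size is not divisible by two at some level of an encoder-decoder network: a size of 7 is downsampled to 4, upsampled back to 8, and can no longer be concatenated with the 7-sized skip tensor. The sketch below reproduces this arithmetic for one axis. It rests on assumptions not stated in the report: stride-2 convolutions with kernel 3 and padding 1 on the way down, upsampling that exactly doubles the size, and four downsampling stages. The function name is hypothetical.

```python
def first_mismatch(size, num_down):
    """Return (skip_size, upsampled_size) at the first failing level, or None."""
    sizes = [size]
    for _ in range(num_down):
        # stride-2 conv, kernel 3, padding 1: out = floor((in - 1) / 2) + 1
        sizes.append((sizes[-1] - 1) // 2 + 1)
    # The innermost skip connection is concatenated first, so check deepest first.
    for level in range(num_down, 0, -1):
        upsampled = 2 * sizes[level]  # assumed: upsampling exactly doubles the size
        if upsampled != sizes[level - 1]:
            return sizes[level - 1], upsampled
    return None


print(first_mismatch(56, 4))  # (7, 8)
print(first_mismatch(96, 4))  # None
```

Under these assumptions, every spatial size of the network input (here, the cropped patch) needs to be divisible by 2 to the power of the number of downsampling stages, for example a multiple of 16 for four stages.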

Tutorial for the image readers and the LoadImage transform

As the experimental new APIs have been implemented for MONAI I/O Project-MONAI/MONAI#909
this ticket looks for a tutorial to show that:

the image reader APIs could be used independently as file format specific loaders
LoadImage transform could be used as a format-agnostic module, typically as the first transform in a 'transform chain'

(might be useful to briefly mention the optional import feature of MONAI)

TypeError: __init__() got an unexpected keyword argument 'to_onehot_y'

Describe the bug
The code from "A U-Net model for lung lesion segmentation from CT images" could not be run.

The error information is: TypeError: __init__() got an unexpected keyword argument 'to_onehot_y'.

To Reproduce
Steps to reproduce the behavior:

Go to https://github.com/Project-MONAI/tutorials/tree/master/3d_segmentation/challenge_baseline/
Install MONAI by pip install "git+https://github.com/Project-MONAI/MONAI#egg=monai[nibabel,ignite,tqdm]"
Run commands python run_net.py train --data_folder "COVID-19-20_v2/Train" --model_folder "runs"

Expected behavior
Start training of the model.

Screenshots

Environment (please complete the following information):

OS: CentOS
Python version: 3.6
MONAI version [e.g. git commit hash]: 0.3.0
CUDA/cuDNN version:
GPU models and configuration

Additional context
https://covid-segmentation.grand-challenge.org/Resource/

AttributeError: module 'monai.networks.nets' has no attribute 'BasicUNet'

Describe the bug
The code from "A U-Net model for lung lesion segmentation from CT images" could not be run.

The error information is: TypeError: __init__() got an unexpected keyword argument 'to_onehot_y'.

To Reproduce
Steps to reproduce the behavior:

Go to https://github.com/Project-MONAI/tutorials/tree/master/3d_segmentation/challenge_baseline/
Install MONAI by pip install monai. NOTE: this is important. Different install methods lead to different errors #60
Run commands python run_net.py train --data_folder "COVID-19-20_v2/Train" --model_folder "runs"

Expected behavior
Start training of the model.

Screenshots

Environment (please complete the following information):

OS: CentOS
Python version: 3.6
MONAI version [e.g. git commit hash]: 0.3.0
CUDA/cuDNN version:
GPU models and configuration

Additional context
https://covid-segmentation.grand-challenge.org/Resource/

Develop FL example based on Clara FL

Is your feature request related to a problem? Please describe.
(Originally from Project-MONAI/MONAI#498.) We can use MONAI to build many FL examples based on different FL architectures. This issue tracks the development of an example based on NVIDIA Clara FL.

Model fine tuning

I trained the spleen segmentation model for 200 epochs on the Decathlon dataset. When I evaluated it on my own dataset, the segmentation performance was extremely poor. How can I fine-tune the model parameters on my own dataset, and for how many epochs should I train? (My dataset comprises 20 manually segmented spleens.)

Thanks
Aymen

automate the testing of the jupyter notebooks

this ticket looks for an automated CI setup to ensure the quality of the notebooks.
see also discussions:

rename the modules/workflows folder to modules/engines

the folder mainly demonstrates MONAI's engines and handlers implementation
cc @pfjaeger @zephyrie @Nic-Ma

Develop a tutorial about how to develop networks based on MONAI APIs

Is your feature request related to a problem? Please describe.
We have a rich set of network layers, blocks, etc. that support both 2D and 3D, and a layer factory to generate common layers. But currently we don't have a step-by-step tutorial showing how to use these APIs to develop networks.

Crashed (Baseline Unet model training for Covid-19 segmentation challenge)

(pytorch) rasho@rasho-WS-E500-G5-WS690T:~/covid-19_3D_Segmentation$ python run_net.py train --data_folder "COVID-19-20_v2/Train" --model_folder "runs"
MONAI version: 0.3.0+81.g62b0bbb
Python version: 3.7.9 (default, Aug 31 2020, 12:42:55) [GCC 7.3.0]
OS version: Linux (5.4.0-53-generic)
Numpy version: 1.19.2
Pytorch version: 1.7.0
MONAI flags: HAS_EXT = False, USE_COMPILED = False

Optional dependencies:
Pytorch Ignite version: 0.4.2
Nibabel version: 3.2.0
scikit-image version: NOT INSTALLED or UNKNOWN VERSION.
Pillow version: 8.0.1
Tensorboard version: NOT INSTALLED or UNKNOWN VERSION.
gdown version: NOT INSTALLED or UNKNOWN VERSION.
TorchVision version: 0.8.1
ITK version: NOT INSTALLED or UNKNOWN VERSION.
tqdm version: 4.52.0

For details about installing the optional dependencies, please visit:
https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies

INFO:root:training: image/label (199) folder: COVID-19-20_v2/Train
INFO:root:training: train 160 val 39, folder: COVID-19-20_v2/Train
INFO:root:batch size 2
Load and cache transformed data: 100%|█████████████████████████████████████████████████████████████████████| 160/160 [04:13<00:00, 1.58s/it]
Load and cache transformed data: 100%|███████████████████████████████████████████████████████████████████████| 39/39 [00:58<00:00, 1.51s/it]
BasicUNet features: (32, 32, 64, 128, 256, 32).
INFO:root:epochs 500, lr 0.0001, momentum 0.95
INFO:ignite.engine.engine.SupervisedTrainer:Engine run resuming from iteration 0, epoch 0 until 500 epochs
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 1/80 -- train_loss: 1.4053
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 2/80 -- train_loss: 1.3833
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 3/80 -- train_loss: 1.3598
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 4/80 -- train_loss: 1.3268
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 5/80 -- train_loss: 1.3438
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 6/80 -- train_loss: 1.3146
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 7/80 -- train_loss: 1.3164
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 8/80 -- train_loss: 1.3118
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 9/80 -- train_loss: 1.2970
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 10/80 -- train_loss: 1.2957
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 11/80 -- train_loss: 1.2779
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 12/80 -- train_loss: 1.2499
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 13/80 -- train_loss: 1.2641
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 14/80 -- train_loss: 1.2634
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 15/80 -- train_loss: 1.2439
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 16/80 -- train_loss: 1.2206
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 17/80 -- train_loss: 1.2209
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 18/80 -- train_loss: 1.2143
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 19/80 -- train_loss: 1.1976
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 20/80 -- train_loss: 1.1950
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 21/80 -- train_loss: 1.1833
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 22/80 -- train_loss: 1.1747
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 23/80 -- train_loss: 1.1739
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 24/80 -- train_loss: 1.1676
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 25/80 -- train_loss: 1.1586
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 26/80 -- train_loss: 1.1585
Killed

tkinter issue: RuntimeError: main thread is not in main loop

Describe the bug
I got a tkinter runtime error related to threads when running spleen_segmentation_3d.ipynb locally, in the epoch cell.

epoch 12/600
1/16, train_loss: 0.5761
2/16, train_loss: 0.5969
3/16, train_loss: 0.4487
4/16, train_loss: 0.5615
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
5/16, train_loss: 0.5464
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Variable.__del__ at 0x7f1d2ec583a0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 351, in __del__
    if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
RuntimeError: main thread is not in main loop
Exception ignored in: <function Variable.__del__ at 0x7f1d2ec583a0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 351, in __del__
    if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
RuntimeError: main thread is not in main loop
Exception ignored in: <function Variable.__del__ at 0x7f1d2ec583a0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 351, in __del__
    if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
RuntimeError: main thread is not in main loop
Exception ignored in: <function Variable.__del__ at 0x7f1d2ec583a0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 351, in __del__
    if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Variable.__del__ at 0x7f1d2ec583a0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 351, in __del__
    if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
RuntimeError: main thread is not in main loop
Exception ignored in: <function Variable.__del__ at 0x7f1d2ec583a0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 351, in __del__
    if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
RuntimeError: main thread is not in main loop
Exception ignored in: <function Variable.__del__ at 0x7f1d2ec583a0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 351, in __del__
    if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
RuntimeError: main thread is not in main loop
Exception ignored in: <function Variable.__del__ at 0x7f1d2ec583a0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 351, in __del__
    if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Tcl_AsyncDelete: async handler deleted by the wrong thread
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Tcl_AsyncDelete: async handler deleted by the wrong thread
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Tcl_AsyncDelete: async handler deleted by the wrong thread
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Tcl_AsyncDelete: async handler deleted by the wrong thread
Traceback (most recent call last):
  File "/home/phc/.virtualenvs/monai/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 872, in _try_get_data
    data = self._data_queue.get(timeout=timeout)
  File "/usr/lib/python3.8/multiprocessing/queues.py", line 107, in get
    if not self._poll(timeout):
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 257, in poll
    return self._poll(timeout)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 424, in _poll
    r = wait([self], timeout)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 931, in wait
    ready = selector.select(timeout)
  File "/usr/lib/python3.8/selectors.py", line 415, in select
    fd_event_list = self._selector.poll(timeout)
  File "/home/phc/.virtualenvs/monai/lib/python3.8/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 355179) is killed by signal: Aborted.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "spleen_segmentation_3d.py", line 268, in <module>
    for batch_data in train_loader:
  File "/home/phc/.virtualenvs/monai/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
    data = self._next_data()
  File "/home/phc/.virtualenvs/monai/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1068, in _next_data
    idx, data = self._get_data()
  File "/home/phc/.virtualenvs/monai/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1034, in _get_data
    success, data = self._try_get_data()
  File "/home/phc/.virtualenvs/monai/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 885, in _try_get_data
    raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str)) from e
RuntimeError: DataLoader worker (pid(s) 355179) exited unexpectedly

Environment (please complete the following information):

OS: Linux (arch)
Python version: 3.8.6
MONAI version [e.g. git commit hash] 0.4.0
CUDA/cuDNN version: 11.1
GPU models and configuration: 3090

Additional context
Sorry for the brevity of the report. The notebook is run as a Python script using jupytext (which converts ipynb to py).

Sliding Window Inference giving error on 0.4.0

Command used for sliding window inference (fails on MONAI 0.4.0, but works fine on 0.3.0)

Code Snippet:

for val_data in self.val_loader:
    val_step_start_time = time.time()
    val_images, val_labels = val_data["image"].to(self.device), val_data["label"].to(self.device)
    roi_size = (128, 128, 128)
    sw_batch_size = 6
    if amp:
        with torch.cuda.amp.autocast():
            val_outputs = sliding_window_inference(val_images, roi_size, sw_batch_size, self.network, sw_device=self.device, device=self.device)
    else:
        val_outputs = sliding_window_inference(val_images, roi_size, sw_batch_size, self.network)

Getting same error with and without amp

Error:

AttributeError Traceback (most recent call last)
/data/archit/Liver/Experiments/monai/main.py in <module>
25
26 if __name__ == "__main__":
---> 27 main()

/data/archit/Liver/Experiments/monai/main.py in main()
19 if args.continue_training == True:
20 trainer.load_best_checkpoint()
---> 21 trainer.trainProcess(amp=True)
22
23

/data/archit/Liver/Experiments/monai/TeraReconAI/train/segmentationTrainer.py in trainProcess(self, amp)
71 self.initialize_network()
72 amp_start = time.time()
---> 73 super()._trainProcess(amp)
74 amp_total_time = time.time() - amp_start
75 print(f"Total training time with AMP: {amp_total_time:.4f}")

/data/archit/Liver/Experiments/monai/TeraReconAI/train/trainer.py in _trainProcess(self, amp)
168 # else:
169 self.network = self.network.to(self.device)
--> 170 val_outputs = sliding_window_inference(val_images, roi_size, sw_batch_size, self.network, sw_device=self.device, device=self.device)
171
172

/data/archit/Software/anaconda3/envs/monai/lib/python3.8/site-packages/monai/inferers/utils.py in sliding_window_inference(inputs, roi_size, sw_batch_size, predictor, overlap, mode, sigma_scale, padding_mode, cval, sw_device, device, *args, **kwargs)
127 ]
128 window_data = torch.cat([inputs[win_slice] for win_slice in unravel_slice]).to(sw_device)
--> 129 seg_prob = predictor(window_data, *args, **kwargs).to(device) # batched patch segmentation
130
131 if not _initialized: # init. buffer at the first iteration

AttributeError: 'list' object has no attribute 'to'
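
The last frame of the traceback shows predictor(...) returning a list, on which .to(device) is then called. One possible workaround, assuming the network returns several outputs and the first one is the main segmentation map (the post does not say this), is to wrap the network so that the predictor hands back a single object. The helper first_output and the stand-in classes below are hypothetical and only there so the sketch runs without torch:

```python
from typing import Any, Callable


def first_output(predictor: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap a predictor so a list/tuple result is reduced to its first element.

    Assumption: the first element is the main prediction. Hypothetical helper,
    not part of MONAI.
    """

    def wrapped(*args: Any, **kwargs: Any) -> Any:
        out = predictor(*args, **kwargs)
        if isinstance(out, (list, tuple)):
            return out[0]
        return out

    return wrapped


# Stand-ins so the sketch runs without torch: a "tensor" exposing .to(), and
# a "network" that returns a list of outputs, like the one in the traceback.
class FakeTensor:
    def __init__(self, name: str) -> None:
        self.name = name
        self.device = "cpu"

    def to(self, device: str) -> "FakeTensor":
        moved = FakeTensor(self.name)
        moved.device = device
        return moved


def multi_output_network(x: FakeTensor) -> list:
    return [FakeTensor("main"), FakeTensor("auxiliary")]


safe_predictor = first_output(multi_output_network)
result = safe_predictor(FakeTensor("input")).to("cuda:0")
print(result.name, result.device)
```

With a real model, the wrapped callable would be passed in place of self.network as the predictor argument. Whether the first element is the right one depends on the network definition.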

ValueError while training the model for Brain Tumor Segmentation

Hi there -

I'm new to MONAI and doing some learning of the brain tumor segmentation code - referring to the file brats_segmentation_3d.ipynb under tutorials/3d_segmentation. I'm using this code AS-IS in my Jupyter Notebook. While training on the Medical Decathlon dataset, exactly after epoch 2 I see the following error:

epoch 2 average loss: 0.8960

ValueError Traceback (most recent call last)
in
54 # metric_sum += value.item() * not_nans
55 # compute mean dice for TC
---> 56 value_tc, not_nans = dice_metric(y_pred=val_outputs[:, 0:1], y=val_labels[:, 0:1])
57 not_nans = not_nans.item()
58 metric_count_tc += not_nans

ValueError: not enough values to unpack (expected 2, got 1)

Can you please suggest anything to rectify this problem?

Many thanks,
Sekhar H.

ROI size sliding window

Hi,
I was wondering how UNet deals with the sliding window input, because the ROI you set is bigger than the patches UNet is trained on.
How does this work?

Thanks.
Kirsten
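
As background for the question above: a fully convolutional network such as a U-Net can usually accept spatial sizes other than its training patch size (subject to divisibility by its downsampling factors), and sliding window inference tiles the volume into ROI-sized windows that are predicted in batches and then stitched together. A minimal sketch of the tiling along one axis, assuming a fixed fractional overlap; window_starts is a hypothetical helper, not MONAI's implementation:

```python
from typing import List


def window_starts(image_len: int, roi_len: int, overlap: float = 0.25) -> List[int]:
    """Start indices of windows of length roi_len covering [0, image_len)."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    if roi_len >= image_len:
        return [0]  # one window covers the axis (padding is needed if larger)
    step = max(1, int(roi_len * (1.0 - overlap)))
    starts = list(range(0, image_len - roi_len, step))
    starts.append(image_len - roi_len)  # last window is flush with the end
    return starts


starts = window_starts(image_len=300, roi_len=128, overlap=0.25)
print(starts)
```

Each start index defines one window of the ROI length. In 3D the same computation is applied per axis, and the overlapping predictions are blended when they are written back.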

mkdtemp missing brackets

In a few places (e.g., 3rd cell here), we have:

root_dir = tempfile.mkdtemp if directory is None else directory

which is missing the brackets:

root_dir = tempfile.mkdtemp() if directory is None else directory

Might be worth grepping and replacing all mkdtemp[space] with mkdtemp()[space].
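
A small sketch of the difference, using only the standard library. The variable directory stands in for whatever the notebook reads its data directory setting from (an assumption here):

```python
import os
import shutil
import tempfile

directory = None  # stands in for a user-supplied data directory setting

# Buggy form: without parentheses this stores the function object itself.
buggy_root = tempfile.mkdtemp if directory is None else directory
print(callable(buggy_root))  # a function, not a path

# Fixed form: calling mkdtemp() creates a directory and returns its path.
root_dir = tempfile.mkdtemp() if directory is None else directory
print(os.path.isdir(root_dir))

shutil.rmtree(root_dir)  # clean up the temporary directory
```

The buggy form only fails later, when the value is first used as a path, which makes the error easy to miss when the directory is normally supplied.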

Do data transforms happen on every yield from the train loader or once at load time?

QUESTION 1:

When I apply a list of transforms as in the Spleen tutorial notebook, do they happen once here:

train_ds = CacheDataset(data=train_files, transform=train_trans, cache_rate=1.0, num_workers=8)

Note that it says

Load and cache transformed data: 100%|██████████| 41/41 [00:15<00:00, 2.65it/s]

The past tense "transformed" seems to indicate that transformations only happen once. Or, after defining

train_loader = DataLoader(train_ds, batch_size=2, shuffle=True, num_workers=loader_workers)

do the transformations actually happen on every reference to an item in the training queue, specifically here:

for batch_data in train_loader:

This is the ideal case for me. In the former case, should I repeat my data 100 times before running it through CacheDataset to get my augmentations? Is that standard? It seems it would be a lot better to do the transformations on the fly. Also very necessary for a subsampling transformation like RandCropByPosNegLabeld.

This could be a dumb question, I just don't see it spelled out in the docs and the logging printed out by CacheDataset.

NOTE: I'm guessing this happens with every train_loader yield, because my training loop has slowed way down. This leads to

QUESTION 2: Would it be possible to do these transforms in the GPU? I'm assuming the slowdown happens because they are on CPU, as shown by the attached picture, which depicts a very lightly loaded GPU and 1 hammered CPU core. This leads to

QUESTION 3: Can I speed up the train loader transformations by adding workers? I'm guessing Yes. If not, should be Yes. I'll try it now.

ANSWER 1&3: Yes it must be happening for each train_loader yield, yes adding workers helps. NOTE: A comment in the tutorial notebook says "because this is cached in memory, you only need one work". This is misleading. And on Question 2: The 8 cores I added are 100% active. The GPU is 10% to 25% loaded max. These transforms should happen on the GPU!! Most of the compute time is spent in the transforms. Very little in the training.
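
A simplified sketch of the distinction discussed above, assuming that deterministic transforms are run once when the dataset is built and that random transforms are re-applied on every item access (which matches the slowdown observed in the loader). CachedPrefixDataset is a hypothetical toy class, not MONAI's implementation:

```python
import random
from typing import Any, Callable, List, Sequence


class CachedPrefixDataset:
    """Toy dataset: run deterministic transforms once, random ones per access."""

    def __init__(
        self,
        data: Sequence[Any],
        deterministic: List[Callable[[Any], Any]],
        randomized: List[Callable[[Any], Any]],
    ) -> None:
        self.randomized = randomized
        self.cache: List[Any] = []
        for item in data:  # done once, at construction ("load and cache")
            for t in deterministic:
                item = t(item)
            self.cache.append(item)

    def __len__(self) -> int:
        return len(self.cache)

    def __getitem__(self, index: int) -> Any:
        item = self.cache[index]
        for t in self.randomized:  # done on every access, so epochs differ
            item = t(item)
        return item


calls = {"load": 0}


def load(x: int) -> int:
    calls["load"] += 1  # count how often the expensive step runs
    return x * 10


def random_shift(x: int) -> int:
    return x + random.randint(0, 5)


ds = CachedPrefixDataset([1, 2, 3], deterministic=[load], randomized=[random_shift])
samples = [ds[0] for _ in range(4)]
print(calls["load"], samples)
```

Under this model there is no need to repeat the data to get augmentations: the expensive loading step runs once per item, while the random step still runs on every yield and is the part that benefits from more loader workers.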

ITK version: NOT INSTALLED or UNKNOWN VERSION

Dear all,

After installing ITK or SimpleITK with pip, the output of "print_config()" still does not report an installed ITK version.
Moreover, while executing the "densenet_training.array.ipynb" tutorial I get this error:

Load and cache transformed data: 0%| | 0/9 [00:00<?, ?it/s]

OptionalImportError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/monai/transforms/utils.py in apply_transform(transform, data, map_items)
308 return [transform(item) for item in data]
--> 309 return transform(data)
310 except Exception as e:

35 frames
OptionalImportError: import itk (No module named 'itk').

For details about installing the optional dependencies, please visit:
https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies

During handling of the above exception, another exception occurred:

OptionalImportError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/monai/utils/module.py in optional_import(module, version, version_checker, name, descriptor, version_args, allow_namespace_pkg)
165 actual_cmd = f"import {module}"
166 try:
--> 167 pkg = __import__(module) # top level module
168 the_module = import_module(module)
169 if not allow_namespace_pkg:

OptionalImportError: Applying transform <monai.transforms.io.dictionary.LoadImaged object at 0x7f7d33d66860>.

Regards,
Sebastian
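
One way to narrow this down is to check whether the interpreter that runs the notebook can import the package at all, since pip may have installed it into a different environment. A minimal sketch of an optional-import check using only the standard library; try_import is a hypothetical helper, not MONAI's optional_import, and the missing package name is made up for the demonstration:

```python
import importlib
import sys
from typing import Any, Tuple


def try_import(module_name: str) -> Tuple[Any, bool]:
    """Return (module, True) if importable, else (None, False)."""
    try:
        return importlib.import_module(module_name), True
    except ImportError:
        return None, False


# Standard-library module: always importable.
json_mod, has_json = try_import("json")
# Made-up name standing in for a package that is not installed.
missing, has_missing = try_import("no_such_pkg_for_demo")

print(sys.executable)  # the interpreter whose site-packages is searched
print(has_json, has_missing)
```

Calling try_import("itk") in the notebook and comparing sys.executable with the interpreter that pip reports would show whether the two environments match.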

U-Net model for lung lesion segmentation model does not run using colab

I tried running the command python run_net.py train --data_folder "COVID-19-20_v2/Train" --model_folder "runs" per the instructions and got the output in example A below. The model does not seem to be training. I also tried running the inference command python run_net.py infer --data_folder "COVID-19-20_v2/Validation" --model_folder "runs" and got the error in example B.

When I check the runs folder, I do not see any indication that the model ran or that checkpoints were saved.

I am using Google Colab to train the model.

example A

MONAI version: 0.3.0+57.g70650b8
Python version: 3.6.9 (default, Oct  8 2020, 12:12:24)  [GCC 8.4.0]
OS version: Linux (4.19.112+)
Numpy version: 1.18.5
Pytorch version: 1.7.0+cu101
MONAI flags: HAS_EXT = False, USE_COMPILED = False

Optional dependencies:
Pytorch Ignite version: 0.4.2
Nibabel version: 3.0.2
scikit-image version: 0.16.2
Pillow version: 7.0.0
Tensorboard version: 2.3.0
gdown version: 3.6.4
TorchVision version: 0.8.1+cu101
ITK version: NOT INSTALLED or UNKNOWN VERSION.
tqdm version: 4.51.0

For details about installing the optional dependencies, please visit:
    https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies

INFO:root:training: image/label (199) folder: COVID-19-20_v2/Train
INFO:root:training: train 160 val 39, folder: COVID-19-20_v2/Train
INFO:root:batch size 2
Load and cache transformed data: 100% 160/160 [05:18<00:00,  1.99s/it]
Load and cache transformed data: 100% 39/39 [01:21<00:00,  2.10s/it]
BasicUNet features: (32, 32, 64, 128, 256, 32).
^C

example B

MONAI version: 0.3.0+57.g70650b8
Python version: 3.6.9 (default, Oct  8 2020, 12:12:24)  [GCC 8.4.0]
OS version: Linux (4.19.112+)
Numpy version: 1.18.5
Pytorch version: 1.7.0+cu101
MONAI flags: HAS_EXT = False, USE_COMPILED = False

Optional dependencies:
Pytorch Ignite version: 0.4.2
Nibabel version: 3.0.2
scikit-image version: 0.16.2
Pillow version: 7.0.0
Tensorboard version: 2.3.0
gdown version: 3.6.4
TorchVision version: 0.8.1+cu101
ITK version: NOT INSTALLED or UNKNOWN VERSION.
tqdm version: 4.51.0

For details about installing the optional dependencies, please visit:
    https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies

Traceback (most recent call last):
  File "run_net.py", line 264, in <module>
    infer(data_folder=data_folder, model_folder=args.model_folder)
  File "run_net.py", line 179, in infer
    ckpt = ckpts[-1]
IndexError: list index out of range
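
The IndexError in example B comes from indexing an empty list with ckpts[-1], which is consistent with no checkpoint having been saved in the model folder. A minimal sketch of a guard that reports this more clearly; latest_checkpoint and the "*.pt" file pattern are assumptions for illustration, not the actual code in run_net.py:

```python
import glob
import os
import tempfile


def latest_checkpoint(model_folder: str, pattern: str = "*.pt") -> str:
    """Return the last checkpoint path (sorted by name) or raise a clear error."""
    ckpts = sorted(glob.glob(os.path.join(model_folder, pattern)))
    if not ckpts:
        raise FileNotFoundError(
            f"no checkpoint matching {pattern!r} in {model_folder!r}; "
            "has training finished at least one save?"
        )
    return ckpts[-1]


# Demonstration on a temporary folder: first empty, then with one file.
with tempfile.TemporaryDirectory() as folder:
    try:
        latest_checkpoint(folder)
    except FileNotFoundError as err:
        message = str(err)
    open(os.path.join(folder, "net_epoch_1.pt"), "w").close()
    found = os.path.basename(latest_checkpoint(folder))

print(message)
print(found)
```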

Explicit for-loop optimisation or SupervisedTrainer

It seems that in the majority of tutorials, the optimisation for loop is given explicitly. In relatively few places, the SupervisedTrainer is used, despite existing for this reason.

I can see why having the explicit for loop is beneficial for tutorials - so that people are more aware of the inner workings. However, for the sake of conciseness, I would be in favour of having just one notebook (named suitably) in which the explicit for loop is given, and then from there on, using the SupervisedTrainer. Notebooks using SupervisedTrainer could then refer to the explicit notebook.

I think @ericspod is in favour of leaving the notebooks as they are, so as not to hide anything (which I understand). Anyone else have an opinion?

covid challenge evaluation

Hello, I am a participant in the COVID challenge. Submissions are now closed, but I have a new prediction and would like to know its Dice score for my own research. Could you please reopen the evaluation website for me or share the evaluation method? It would be even better if the organizers could release the ground truth labels of the test and validation datasets. Thank you so much!

Update to use LoadImage transform

Is your feature request related to a problem? Please describe.
Since LoadImage is now the recommended loading method, we need to update all the examples and tutorials to use it.

MONAI flags: HAS_EXT = False, USE_COMPILED = False

What does this line mean?
I wanted to reproduce the baseline lesion network.
Every time I start training, this line appears. Is there something wrong?

autoencoder_mednist transformations failure

Describe the bug
When running the current version of the autoencoder_mednist tutorial, it crashes while trying to perform transformations on the data, specifically while creating the CacheDataset.

To Reproduce
Steps to reproduce the behavior:
Simply run all the cells until you reach the one creating the CacheDataset; that is where it crashes.

Expected behavior
It should perform the transformations

Additional context

A simple solution I found is to add the "reader" parameter to the LoadImageD transformation. For the MedNIST Hand dataset (the default in this tutorial), it should be reader="PILReader", as all the images are .jpg.

How to modify the loss function as Dice + CE loss?

Hi,
I am working on a segmentation task with only one target structure. I am trying to change the loss function to Dice + CE loss, so I changed the code as shown here:

import monai
import torch
from torch import nn


class CrossEntropyLoss(nn.Module):
    def __init__(self):
        super().__init__()
        self.loss = nn.CrossEntropyLoss()

    def forward(self, y_pred, y_true):
        # CrossEntropyLoss target needs to have shape (B, D, H, W)
        # Target from pipeline has shape (B, 1, D, H, W)
        y_true = torch.squeeze(y_true, dim=1).long()
        return self.loss(y_pred, y_true)


class DiceCELoss(nn.Module):
    def __init__(self):
        super().__init__()
        self.dice = monai.losses.DiceLoss(sigmoid=True)
        self.cross_entropy = CrossEntropyLoss()

    def forward(self, y_pred, y_true):
        dice = self.dice(y_pred, y_true)
        cross_entropy = self.cross_entropy(y_pred, y_true)
        return dice + cross_entropy

loss_function = DiceCELoss()

However, after changing this part, the code no longer works and reports the following error. Could you please help me find what is wrong?

/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [34,0,0], thread: [893,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [34,0,0], thread: [894,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [34,0,0], thread: [895,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [34,0,0], thread: [384,0,0] Assertion `t >= 0 && t < n_classes` failed.
Traceback (most recent call last):
  File "2D_UNet.py", line 259, in <module>
    main()
  File "2D_UNet.py", line 191, in main
    loss.backward()
  File "/home/anaconda3/envs/monai/lib/python3.7/site-packages/torch/tensor.py", line 221, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/anaconda3/envs/monai/lib/python3.7/site-packages/torch/autograd/__init__.py", line 132, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED

Thanks a lot.
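
The assertion `t >= 0 && t < n_classes` in the log above fires when a target value is not a valid class index for `nn.CrossEntropyLoss`, which expects integer targets in `[0, C)` for a prediction with `C` channels. A minimal sketch of that check in plain Python follows. It assumes (the network definition is not shown) that the model has a single output channel, which would be consistent with `sigmoid = True` in the Dice term; `out_of_range_labels` is a hypothetical helper:

```python
def out_of_range_labels(labels, n_classes):
    """Return the sorted label values that are not valid class indices."""
    return sorted({t for t in labels if not 0 <= t < n_classes})


# Flattened binary mask: 0 = background, 1 = target structure.
mask = [0, 0, 1, 1, 0, 1]

# With one output channel, only class index 0 is valid, so label 1 is rejected.
print(out_of_range_labels(mask, n_classes=1))
# With two output channels (background + target), every label is valid.
print(out_of_range_labels(mask, n_classes=2))
```

Under that assumption, two common options are to give the network two output channels and use softmax for both terms, or to keep one channel and replace the cross-entropy term with `nn.BCEWithLogitsLoss` on a float target. Masks stored as 0/255 trigger the same assertion and need rescaling to 0/1 first.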

review notebooks with reviewnb

Install ReviewNB (https://www.reviewnb.com/) on this repo for diffing and commenting on pull requests that contain Jupyter notebooks.

Pip installs at start of each notebook?

Should we be pip installing monai and its dependencies at the start of each notebook?

Discussion continued from #47.

My personal feeling is that we should lift all pip installs from our notebooks as the relevant instructions are already in our README.md. It also saves us from having to update as our notebooks/dependencies change.

a tutorial of GradCAM

The GradCAM module is in place; it would be great to have a 3D classification model demo with a medical-image-related task.

Project-MONAI/MONAI#1303

missing colab button and install part in ThreadBuffer notebook

Describe the bug
Hi @ericspod, could you please help add the Colab button and installation from the latest MONAI code (as we haven't released 0.4 yet) to the ThreadBuffer notebook?
Thanks.

LR SCHEDULER and LR Finder

It would be nice to have examples of how to use learning rate schedulers with the MONAI classes. It would also be nice to have an LR finder like the fastai one-cycle one.
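
As a rough illustration of the learning-rate finder idea mentioned above, the sketch below builds the exponentially increasing sequence of learning rates used in a learning-rate range test. It is plain Python rather than a MONAI or fastai API; `lr_range_test_schedule` and its parameters are hypothetical names:

```python
def lr_range_test_schedule(lr_min, lr_max, num_steps):
    """Return num_steps learning rates spaced geometrically from lr_min to lr_max."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    ratio = lr_max / lr_min
    return [lr_min * ratio ** (i / (num_steps - 1)) for i in range(num_steps)]


schedule = lr_range_test_schedule(lr_min=1e-6, lr_max=1.0, num_steps=7)
for step, lr in enumerate(schedule):
    print(f"step {step}: lr = {lr:.0e}")
```

In a range test, one mini-batch is trained at each of these rates and the loss is recorded; a rate somewhat below the point where the loss starts to diverge is then picked as the starting learning rate.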

Fail to load state dict

Hi,

I am loading the state dict into the same model generated by monai.networks.nets.UNet. However, it reports the error below. I have loaded this state dict successfully before, so I am not sure what is wrong this time. It seems that the weight names differ between 'act' and 'adn.A'.

Thank you.

RuntimeError: Error(s) in loading state_dict for UNet:
        Missing key(s) in state_dict: "model.0.conv.unit0.act.weight", "model.0.conv.unit1.act.weight", "model.1.submodule.0.conv.unit0.act.weight", "model.1.submodule.0.conv.unit1.act.weight", "model.1.submodule.1.submodule.0.conv.unit0.act.weight", "model.1.submodule.1.submodule.0.conv.unit1.act.weight", "model.1.submodule.1.submodule.1.submodule.0.conv.unit0.act.weight", "model.1.submodule.1.submodule.1.submodule.0.conv.unit1.act.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.0.conv.unit0.act.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.0.conv.unit1.act.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.1.submodule.conv.unit0.act.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.1.submodule.conv.unit1.act.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.2.0.act.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.2.1.conv.unit0.act.weight", "model.1.submodule.1.submodule.1.submodule.2.0.act.weight", "model.1.submodule.1.submodule.1.submodule.2.1.conv.unit0.act.weight", "model.1.submodule.1.submodule.2.0.act.weight", "model.1.submodule.1.submodule.2.1.conv.unit0.act.weight", "model.1.submodule.2.0.act.weight", "model.1.submodule.2.1.conv.unit0.act.weight", "model.2.0.act.weight". 
        Unexpected key(s) in state_dict: "model.0.conv.unit0.adn.A.weight", "model.0.conv.unit1.adn.A.weight", "model.1.submodule.0.conv.unit0.adn.A.weight", "model.1.submodule.0.conv.unit1.adn.A.weight", "model.1.submodule.1.submodule.0.conv.unit0.adn.A.weight", "model.1.submodule.1.submodule.0.conv.unit1.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.0.conv.unit0.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.0.conv.unit1.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.0.conv.unit0.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.0.conv.unit1.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.1.submodule.conv.unit0.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.1.submodule.conv.unit1.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.2.0.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.2.1.conv.unit0.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.2.0.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.2.1.conv.unit0.adn.A.weight", "model.1.submodule.1.submodule.2.0.adn.A.weight", "model.1.submodule.1.submodule.2.1.conv.unit0.adn.A.weight", "model.1.submodule.2.0.adn.A.weight", "model.1.submodule.2.1.conv.unit0.adn.A.weight", "model.2.0.adn.A.weight".
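
The two key lists above differ only in the activation sub-module name: the checkpoint stores `adn.A` where the model expects `act`. This pattern usually means the checkpoint was saved and loaded with different MONAI versions, so using the same version for both is the simplest fix. Failing that, a minimal sketch of renaming the keys before loading follows; `rename_activation_keys` is a hypothetical helper, and it assumes the activation weights are the only entries that differ, as in the error above:

```python
def rename_activation_keys(state_dict):
    """Return a copy of state_dict with '.adn.A.' replaced by '.act.' in each key."""
    return {key.replace(".adn.A.", ".act."): value for key, value in state_dict.items()}


# Stand-in for a loaded checkpoint; real values would be tensors,
# and the second key is an illustrative example of an unaffected entry.
checkpoint = {
    "model.0.conv.unit0.adn.A.weight": 0.25,
    "model.0.conv.unit0.conv.weight": 1.0,
    "model.2.0.adn.A.weight": 0.25,
}

for key in rename_activation_keys(checkpoint):
    print(key)
```

The renamed dictionary can then be passed to `model.load_state_dict(...)`.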

Runtime error in the code. I am using Colab with reduced data in the train and validation sets, and I set the number of epochs to 100, but this happens:

INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 5/100, Iter: 21/66 -- train_loss: 0.8563
ERROR:ignite.engine.engine.SupervisedTrainer:Current run is terminating due to exception: DataLoader worker (pid 1476) is killed by signal: Killed. .
ERROR:ignite.engine.engine.SupervisedTrainer:Exception: DataLoader worker (pid 1476) is killed by signal: Killed.
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 811, in _run_once_on_dataset
self.state.output = self._process_function(self, self.state.batch)
File "/usr/local/lib/python3.6/dist-packages/monai/engines/trainer.py", line 156, in _iteration
self.scaler.step(self.optimizer)
File "/usr/local/lib/python3.6/dist-packages/torch/cuda/amp/grad_scaler.py", line 320, in step
if not sum(v.item() for v in optimizer_state["found_inf_per_device"].values()):
File "/usr/local/lib/python3.6/dist-packages/torch/cuda/amp/grad_scaler.py", line 320, in
if not sum(v.item() for v in optimizer_state["found_inf_per_device"].values()):
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 1476) is killed by signal: Killed.
ERROR:ignite.engine.engine.SupervisedTrainer:Engine run is terminating due to exception: DataLoader worker (pid 1476) is killed by signal: Killed. .
ERROR:ignite.engine.engine.SupervisedTrainer:Exception: DataLoader worker (pid 1476) is killed by signal: Killed.
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 730, in _internal_run
time_taken = self._run_once_on_dataset()
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 828, in _run_once_on_dataset
self._handle_exception(e)
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 465, in _handle_exception
self._fire_event(Events.EXCEPTION_RAISED, e)
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 423, in _fire_event
func(*first, *(event_args + others), **kwargs)
File "/usr/local/lib/python3.6/dist-packages/monai/handlers/stats_handler.py", line 145, in exception_raised
raise e
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 811, in _run_once_on_dataset
self.state.output = self._process_function(self, self.state.batch)
File "/usr/local/lib/python3.6/dist-packages/monai/engines/trainer.py", line 156, in _iteration
self.scaler.step(self.optimizer)
File "/usr/local/lib/python3.6/dist-packages/torch/cuda/amp/grad_scaler.py", line 320, in step
if not sum(v.item() for v in optimizer_state["found_inf_per_device"].values()):
File "/usr/local/lib/python3.6/dist-packages/torch/cuda/amp/grad_scaler.py", line 320, in
if not sum(v.item() for v in optimizer_state["found_inf_per_device"].values()):
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 1476) is killed by signal: Killed.
Traceback (most recent call last):
File "run_net.py", line 301, in
train(data_folder=data_folder, model_folder=args.model_folder)
File "run_net.py", line 211, in train
trainer.run()
File "/usr/local/lib/python3.6/dist-packages/monai/engines/trainer.py", line 46, in run
super().run()
File "/usr/local/lib/python3.6/dist-packages/monai/engines/workflow.py", line 163, in run
super().run(data=self.data_loader, max_epochs=self.state.max_epochs)
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 691, in run
return self._internal_run()
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 762, in _internal_run
self._handle_exception(e)
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 465, in _handle_exception
self._fire_event(Events.EXCEPTION_RAISED, e)
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 423, in _fire_event
func(*first, *(event_args + others), **kwargs)
File "/usr/local/lib/python3.6/dist-packages/monai/handlers/stats_handler.py", line 145, in exception_raised
raise e
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 730, in _internal_run
time_taken = self._run_once_on_dataset()
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 828, in _run_once_on_dataset
self._handle_exception(e)
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 465, in _handle_exception
self._fire_event(Events.EXCEPTION_RAISED, e)
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 423, in _fire_event
func(*first, *(event_args + others), **kwargs)
File "/usr/local/lib/python3.6/dist-packages/monai/handlers/stats_handler.py", line 145, in exception_raised
raise e
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 811, in _run_once_on_dataset
self.state.output = self._process_function(self, self.state.batch)
File "/usr/local/lib/python3.6/dist-packages/monai/engines/trainer.py", line 156, in _iteration
self.scaler.step(self.optimizer)
File "/usr/local/lib/python3.6/dist-packages/torch/cuda/amp/grad_scaler.py", line 320, in step
if not sum(v.item() for v in optimizer_state["found_inf_per_device"].values()):
File "/usr/local/lib/python3.6/dist-packages/torch/cuda/amp/grad_scaler.py", line 320, in
if not sum(v.item() for v in optimizer_state["found_inf_per_device"].values()):
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 1476) is killed by signal: Killed.

The code used for training stops

!python run_net.py train --data_folder "COVID-19-20_v2/Train" --model_folder "runs"

The code cell stops after showing this result. I don't know what I should do next. How can I find the models?

MONAI version: 0.3.0+87.ge94e243
Python version: 3.6.9 (default, Oct 8 2020, 12:12:24) [GCC 8.4.0]
OS version: Linux (4.19.112+)
Numpy version: 1.18.5
Pytorch version: 1.7.0+cu101
MONAI flags: HAS_EXT = False, USE_COMPILED = False

Optional dependencies:
Pytorch Ignite version: 0.4.2
Nibabel version: 3.0.2
scikit-image version: 0.16.2
Pillow version: 7.0.0
Tensorboard version: 2.3.0
gdown version: 3.6.4
TorchVision version: 0.8.1+cu101
ITK version: NOT INSTALLED or UNKNOWN VERSION.
tqdm version: 4.53.0

For details about installing the optional dependencies, please visit:
https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies

INFO:root:training: image/label (199) folder: COVID-19-20_v2/Train
INFO:root:training: train 160 val 39, folder: COVID-19-20_v2/Train
INFO:root:batch size 2
Load and cache transformed data: 100% 160/160 [05:30<00:00, 2.06s/it]
Load and cache transformed data: 100% 39/39 [01:24<00:00, 2.18s/it]
BasicUNet features: (32, 32, 64, 128, 256, 32).
^C
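
`DataLoader worker ... is killed by signal: Killed` typically means the operating system terminated the worker process, most often because the machine ran out of RAM; that is a likely explanation here, not a confirmed one. The log shows 160 + 39 volumes being fully cached, so a back-of-the-envelope estimate of the cache size can help decide whether to reduce the cache rate or the number of workers. The volume shape and channel count below are placeholder assumptions, not values from the dataset:

```python
def cache_size_gib(num_items, shape, channels=2, bytes_per_voxel=4):
    """Estimate RAM (GiB) needed to cache num_items arrays of the given shape."""
    voxels = 1
    for dim in shape:
        voxels *= dim
    return num_items * channels * voxels * bytes_per_voxel / 2**30


# 160 training + 39 validation items, as in the log above.
# Shape (512, 512, 64) is a hypothetical volume size; channels=2 counts image + label,
# and 4 bytes per voxel assumes float32 storage.
estimate = cache_size_gib(num_items=160 + 39, shape=(512, 512, 64))
print(f"approximate cache size: {estimate:.1f} GiB")
```

If the estimate is close to or above the available RAM, lowering the cache rate or `num_workers` is a reasonable first experiment. As for finding the models, the command passes `--model_folder "runs"`, so that folder is the first place to look for any checkpoints saved before the crash.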

Soft labels to reflect uncertainty on boundary of ground truth annotation

Hey all,

I am currently using MONAI to participate in the grand challenge for COVID segmentation. As a baseline model I use the DynUNet with parameters adapted from nnU-Net. This works great and gave me a validation Dice of around 0.7. To further improve the results I wanted to focus on handling the noisy ground truth annotations. Since the ground truth annotations from this project are not really clean, I want to implement some form of 'soft labels'. By Gaussian smoothing the masks, the probability drops below 1 on the borders of the lesions, reflecting the uncertainty of the ground truth annotation.

I tried implementing this with MONAI building blocks, but I got stuck while using the Dice loss, since the one_hot function that is called in there expects binary mask input and doesn't work as expected for probabilistic masks. I have now written my own 'soft_label_dice' that handles probabilistic labels in the case of only 2 class labels. I thought this might be an interesting feature for MONAI since many segmentation problems have uncertain ground truth boundaries.

I was wondering what you guys think of this soft labeling strategy. I know other methods exist for increasing noise robustness, but it seemed my model was being punished too hard for making mistakes during training on regions that are only coarsely annotated.
Below I added a snippet with my soft_label_dice function.

Kind regards,
Joris Wuts

def soft_label_dice(preds, label):
    preds = torch.softmax(preds, 1)
    # label is of shape (B, 1, H[, W, D]) with float values ranging from 0 to 1
    label = torch.cat((label, (1 - label)), 1)

    reduce_axis = list(range(2, len(preds.shape)))
    nom = torch.sum(torch.pow((preds - label), 2), dim=reduce_axis)

    ground_o = torch.sum(preds, dim=reduce_axis)
    pred_o = torch.sum(label, dim=reduce_axis)

    denominator = ground_o + pred_o + 0.00001
    f: torch.Tensor = nom / denominator
    f = torch.mean(f)
    return f
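
To make the behaviour of the loss above concrete, here is a plain-Python sketch of the same formula for a single flattened sample with two channels (foreground, background), taking foreground probabilities directly instead of applying softmax. `soft_label_dice_1d` is a hypothetical name and the numbers are made up for illustration:

```python
def soft_label_dice_1d(fg_probs, fg_labels, eps=1e-5):
    """Mean over two channels of sum((p - l)^2) / (sum(p) + sum(l) + eps)."""
    channels = [
        # Foreground channel.
        (fg_probs, fg_labels),
        # Background channel is the complement of the foreground.
        ([1 - p for p in fg_probs], [1 - l for l in fg_labels]),
    ]
    ratios = []
    for probs, labels in channels:
        numerator = sum((p - l) ** 2 for p, l in zip(probs, labels))
        denominator = sum(probs) + sum(labels) + eps
        ratios.append(numerator / denominator)
    return sum(ratios) / len(ratios)


soft_labels = [1.0, 0.5]  # second voxel lies on a smoothed lesion border
print(soft_label_dice_1d([1.0, 0.5], soft_labels))  # prediction equals the soft label
print(soft_label_dice_1d([0.8, 0.2], soft_labels))  # prediction deviates from it
```

The value is 0 when the prediction matches the soft label exactly and grows with the squared difference.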

Wasserstein distance

Hello, I wanted to ask whether anyone has used the Wasserstein distance for segmenting different brain structures, because I have some issues. For example, the argument I should pass to my pipeline is a distance matrix, and I would like to know how to construct this matrix; I don't know where those numbers come from. My other question is whether the Wasserstein distance loss is finished and has been validated in any experiment (other than brain tumor segmentation, because there the labels are continuous, whereas my labels are separate).

Thank you.

verify notebooks using the latest monai 0.4.0 pypi release

subtask of Project-MONAI/MONAI#1318
need to

check that pip install monai[all] works for all the notebooks and demos
tag a 0.4.0 version of this repo

double check all examples/tutorials

Since there have been breaking changes since v0.2, we need to rerun and double-check all the examples/tutorials for v0.3.

Adopt PEP8 in MONAI/examples/notebooks/

Is your feature request related to a problem? Please describe.
Some notebooks are not following the PEP8 style guide.

Describe the solution you'd like
Please, consider following the PEP8 style guide in the notebooks from MONAI/examples/notebooks/.
For example, in examples/notebooks/mednist_tutorial.ipynb, cell 4 has variables named using the CamelCase style instead of snake_case (https://www.python.org/dev/peps/pep-0008/#id45), for example:

dataDir = './MedNIST/'
classNames = os.listdir(dataDir)
numClass = len(classNames)

Later, in the same notebook, snake_case is adopted:

train_ds = MedNISTDataset(trainX, trainY, train_transforms)
train_loader = DataLoader(train_ds, batch_size=300, shuffle=True, num_workers=10)

val_ds = MedNISTDataset(valX, valY, val_transforms)
val_loader = DataLoader(val_ds, batch_size=300, num_workers=10)
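
For comparison, here is the first snippet rewritten with snake_case names, as the request suggests. A temporary directory stands in for `./MedNIST/` so the sketch runs anywhere; the class folder names created here are placeholders:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as data_dir:
    # Create placeholder class folders in place of the real dataset.
    for name in ("ClassA", "ClassB"):
        os.mkdir(os.path.join(data_dir, name))

    # sorted() is added here only to make the order deterministic.
    class_names = sorted(os.listdir(data_dir))
    num_class = len(class_names)

print(class_names, num_class)
```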

Log and plotting result for training process

Thank you for the tutorials!

I can't find any log files saved in ./run, and it seems that this part is not included in the code (./3d_segmentation/baseline).
It would be much clearer if the training information were saved and plotted.

Another question: do the images under the 'Validation' folder have labels (ground truth), and if so, where are they?

Thank you!

Colab links point to main repository

Tutorials have a link for opening them with Colab:

But this points to their location prior to being moved into a separate repository:

Crash in spleen 3D segmentation tutorial

Describe the bug
Trying to follow https://github.com/Project-MONAI/tutorials/blob/17bf2ec91e2871898198084f4ba5e968c2bef47e/3d_segmentation/spleen_segmentation_3d.ipynb I run into a traceback at step "Execute a typical PyTorch training process".

To Reproduce
I installed everything using pip, in a virtual environment. I needed to allow the CPU back-end, as my laptop has an AMD GPU: device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

Screenshots

----------
epoch 1/600
---------------------------------------------------------------------------
PicklingError                             Traceback (most recent call last)
<ipython-input-13-ab25791c97e3> in <module>
     12     epoch_loss = 0
     13     step = 0
---> 14     for batch_data in train_loader:
     15         step += 1
     16         inputs, labels = (

c:\dev\monai\pyenv\lib\site-packages\torch\utils\data\dataloader.py in __iter__(self)
    277             return _SingleProcessDataLoaderIter(self)
    278         else:
--> 279             return _MultiProcessingDataLoaderIter(self)
    280 
    281     @property

c:\dev\monai\pyenv\lib\site-packages\torch\utils\data\dataloader.py in __init__(self, loader)
    717             #     before it starts, and __del__ tries to join but will get:
    718             #     AssertionError: can only join a started process.
--> 719             w.start()
    720             self._index_queues.append(index_queue)
    721             self._workers.append(w)

C:\Dev\Python3.7.9\lib\multiprocessing\process.py in start(self)
    110                'daemonic processes are not allowed to have children'
    111         _cleanup()
--> 112         self._popen = self._Popen(self)
    113         self._sentinel = self._popen.sentinel
    114         # Avoid a refcycle if the target function holds an indirect

C:\Dev\Python3.7.9\lib\multiprocessing\context.py in _Popen(process_obj)
    221     @staticmethod
    222     def _Popen(process_obj):
--> 223         return _default_context.get_context().Process._Popen(process_obj)
    224 
    225 class DefaultContext(BaseContext):

C:\Dev\Python3.7.9\lib\multiprocessing\context.py in _Popen(process_obj)
    320         def _Popen(process_obj):
    321             from .popen_spawn_win32 import Popen
--> 322             return Popen(process_obj)
    323 
    324     class SpawnContext(BaseContext):

C:\Dev\Python3.7.9\lib\multiprocessing\popen_spawn_win32.py in __init__(self, process_obj)
     87             try:
     88                 reduction.dump(prep_data, to_child)
---> 89                 reduction.dump(process_obj, to_child)
     90             finally:
     91                 set_spawning_popen(None)

C:\Dev\Python3.7.9\lib\multiprocessing\reduction.py in dump(obj, file, protocol)
     58 def dump(obj, file, protocol=None):
     59     '''Replacement for pickle.dump() using ForkingPickler.'''
---> 60     ForkingPickler(file, protocol).dump(obj)
     61 
     62 #

PicklingError: Can't pickle <function CropForegroundd.<lambda> at 0x000001E8B1AED318>: attribute lookup CropForegroundd.<lambda> on monai.transforms.croppad.dictionary failed

Optional dependencies:
Pytorch Ignite version: 0.3.0
Nibabel version: 3.1.1
scikit-image version: 0.17.2
Pillow version: 7.2.0
Tensorboard version: 2.3.0
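
The traceback above ends in `popen_spawn_win32`: on Windows, `DataLoader` workers are started with the spawn method, which pickles the dataset and its transforms, and a lambda cannot be pickled. A minimal standard-library sketch of that limitation follows; `can_pickle` is a hypothetical helper:

```python
import pickle


def can_pickle(obj):
    """Return True if obj can be serialized with pickle, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


print(can_pickle(len))  # built-in function, pickled by reference to its name
print(can_pickle(lambda x: x > 0))  # anonymous function, no importable name
```

Setting `num_workers=0` in the `DataLoader` avoids spawning worker processes and is a common workaround on Windows until the transform no longer holds a lambda.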

TensorboardImageHandler for challenge_baseline

Please can anybody tell me how to use the TensorBoardImageHandler for the challenge_baseline script?
What is the output_transform to use?

[Question] augmentation tutorial, how to deal with labels?

I am trying to implement torchio and batchgenerators augmentations following this tutorial:
https://github.com/Project-MONAI/Tutorials/blob/master/integrate_3rd_party_transforms.ipynb

The spatial transformations should also affect my label maps. However, linear or B-spline interpolation, which makes sense for image data, is not something I want to use for my label maps. What is the best way to implement that?
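
Label maps are usually resampled with nearest-neighbour interpolation so that only the original class values appear in the output. The 1D sketch below contrasts it with linear interpolation; it is plain Python for illustration and does not use the torchio, batchgenerators, or MONAI APIs. `resample_1d` is a hypothetical helper:

```python
def resample_1d(values, new_length, mode):
    """Resample a 1D sequence to new_length using 'nearest' or 'linear' interpolation."""
    if new_length < 2:
        raise ValueError("new_length must be at least 2")
    old_length = len(values)
    out = []
    for i in range(new_length):
        # Position of output sample i in input coordinates (endpoints aligned).
        pos = i * (old_length - 1) / (new_length - 1)
        if mode == "nearest":
            out.append(values[int(pos + 0.5)])
        elif mode == "linear":
            lo = int(pos)
            hi = min(lo + 1, old_length - 1)
            frac = pos - lo
            out.append(values[lo] * (1 - frac) + values[hi] * frac)
        else:
            raise ValueError(f"unknown mode: {mode}")
    return out


labels = [0, 0, 2, 2]  # class indices: background and class 2
print(resample_1d(labels, 7, "nearest"))  # contains only 0 and 2
print(resample_1d(labels, 7, "linear"))   # contains 1.0, which is not a class in the input
```

In MONAI's dictionary-based spatial transforms the interpolation mode can typically be given per key, for example bilinear for the image and nearest for the label; how the third-party libraries expose this should be checked in their documentation.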

Error adapting Spleen example to different shaped dataset

Describe the bug
I am trying to adapt the spleen_segmentation_3d.ipynb notebook to imaging data with a slightly different shape. The images in the Spleen set are 226x257 with 113 planes in the stack. My images are 1200x340 with 20 planes in the stack. The notebook samples the data in cubes of size 96x96x96. To get the example notebook to work, I have to duplicate my data on the planes to be 20+20+20+20+16 = 96. Otherwise it breaks, for the obvious reason that you can't get 96 slices out of 20.

Suppose however that I change the cube size to 20x20x20, so I don't duplicate planes to match the exact setup of the notebook. I still get a problem. Here is the problem, please let me know how to resolve it:

RuntimeError                              Traceback (most recent call last)
<ipython-input-15-26b65d7e4120> in <module>
     19         )
     20         optimizer.zero_grad()
---> 21         outputs = model(inputs)
     22         loss = loss_function(outputs, labels)
     23         loss.backward()

~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

~/anaconda3/envs/mona/lib/python3.8/site-packages/monai/networks/nets/unet.py in forward(self, x)
    190 
    191     def forward(self, x: torch.Tensor) -> torch.Tensor:
--> 192         x = self.model(x)
    193         return x
    194 

~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/container.py in forward(self, input)
     98     def forward(self, input):
     99         for module in self:
--> 100             input = module(input)
    101         return input
    102 

~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

~/anaconda3/envs/mona/lib/python3.8/site-packages/monai/networks/layers/simplelayers.py in forward(self, x)
     37 
     38     def forward(self, x: torch.Tensor) -> torch.Tensor:
---> 39         return torch.cat([x, self.submodule(x)], self.cat_dim)
     40 
     41 

~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/container.py in forward(self, input)
     98     def forward(self, input):
     99         for module in self:
--> 100             input = module(input)
    101         return input
    102 

~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

~/anaconda3/envs/mona/lib/python3.8/site-packages/monai/networks/layers/simplelayers.py in forward(self, x)
     37 
     38     def forward(self, x: torch.Tensor) -> torch.Tensor:
---> 39         return torch.cat([x, self.submodule(x)], self.cat_dim)
     40 
     41 

~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/container.py in forward(self, input)
     98     def forward(self, input):
     99         for module in self:
--> 100             input = module(input)
    101         return input
    102 

~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

~/anaconda3/envs/mona/lib/python3.8/site-packages/monai/networks/layers/simplelayers.py in forward(self, x)
     37 
     38     def forward(self, x: torch.Tensor) -> torch.Tensor:
---> 39         return torch.cat([x, self.submodule(x)], self.cat_dim)
     40 
     41 

RuntimeError: Sizes of tensors must match except in dimension 2. Got 4 and 3

To Reproduce
Here is the code:

import glob, os, torch
from monai.data import CacheDataset, DataLoader, Dataset
from monai.inferers import sliding_window_inference
from monai.losses import DiceLoss
from monai.metrics import compute_meandice
from monai.networks.layers import Norm
from monai.networks.nets import UNet
from monai.utils import first, set_determinism
data_dir='nf1_monai'
os.environ['MONAI_DATA_DIRECTORY']=data_dir
directory = os.environ.get("MONAI_DATA_DIRECTORY")
root_dir = directory
train_images = sorted(glob.glob(os.path.join(data_dir, "imagesTr", "*.npy")))
train_labels = sorted(glob.glob(os.path.join(data_dir, "labelsTr", "*.npy")))
data_dicts = [
    {"image": image_name, "label": label_name}
    for image_name, label_name in zip(train_images, train_labels)
]
train_files, val_files = data_dicts[:-10], data_dicts[-10:]
set_determinism(seed=0)
from monai.transforms import (
    AddChanneld,
    Compose,
    LoadNumpyd,
    RandCropByPosNegLabeld,
    ToTensord,
)
train_transforms = Compose(
    [
        LoadNumpyd(keys=["image", "label"]),
        AddChanneld(keys=["image", "label"]),
        RandCropByPosNegLabeld(
            keys=["image", "label"],
            label_key="label",
            spatial_size=(20,20,20),
            pos=1,
            neg=1,
            num_samples=4,
            image_key="image",
            image_threshold=0,
        ),
        ToTensord(keys=["image", "label"]),
    ]
)
val_transforms = Compose(
    [
        LoadNumpyd(keys=["image", "label"]),
        AddChanneld(keys=["image", "label"]),
        ToTensord(keys=["image", "label"]),
    ]
)
device = torch.device("cuda:0")
model = UNet(
    dimensions=3,
    in_channels=1,
    out_channels=2,
    channels=(16, 32, 64, 128, 256),
    strides=(2, 2, 2, 2),
    num_res_units=2,
    norm=Norm.BATCH,
).to(device)
loss_function = DiceLoss(to_onehot_y=True, softmax=True)
optimizer = torch.optim.Adam(model.parameters(), 1e-4)
train_ds = CacheDataset(data=train_files, transform=train_transforms, cache_rate=1.0, num_workers=1)
train_loader = DataLoader(train_ds, batch_size=6, shuffle=True, num_workers=16)
val_ds = CacheDataset(data=val_files, transform=val_transforms, cache_rate=1.0, num_workers=16)
val_loader = DataLoader(val_ds, batch_size=1, num_workers=1)
epoch_num = 600
val_interval = 2
best_metric = -1
best_metric_epoch = -1
epoch_loss_values = list()
metric_values = list()
for epoch in range(epoch_num):
    print("-" * 10)
    print(f"epoch {epoch + 1}/{epoch_num}")
    model.train()
    epoch_loss = 0
    step = 0
    for batch_data in train_loader:
        step += 1
        inputs, labels = (
            batch_data["image"].to(device),
            batch_data["label"].to(device),
        )
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = loss_function(outputs, labels)
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
        print(f"{step}/{len(train_ds) // train_loader.batch_size}, train_loss: {loss.item():.4f}")
    epoch_loss /= step
    epoch_loss_values.append(epoch_loss)
    print(f"epoch {epoch + 1} average loss: {epoch_loss:.4f}")

Expected behavior
The UNet should train and not break.

Environment (please complete the following information):
OS: Ubuntu 20.04LTS
MONAI version: 0.3.0
Python version: 3.8.2 (default, Mar 26 2020, 15:53:00) [GCC 7.3.0]
OS version: Linux (5.4.0-52-generic)
Numpy version: 1.18.1
Pytorch version: 1.5.0
MONAI flags: HAS_EXT = False, USE_COMPILED = False

Optional dependencies:
Pytorch Ignite version: 0.3.0
Nibabel version: 3.1.0
scikit-image version: 0.16.2
Pillow version: 7.1.2
Tensorboard version: 2.2.1
gdown version: 3.12.2
TorchVision version: 0.6.0a0+82fd1c8
ITK version: 5.1.1
tqdm version: 4.50.2

Additional context
I am trying to do tumor detection on whole-body MRI scans. The tumors are small and the body is large. So far this is giving me an average F1 score of 0.17 using this library, training with the 20+20+20+20+16 stacking workaround.
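
The error `Got 4 and 3` is consistent with how a patch of size 20 passes through four stride-2 stages: each downsampling roughly halves the size with rounding up (20, 10, 5, 3, 2), each upsampling doubles it, and the upsampled 2 x 2 = 4 no longer matches the skip connection of size 3. The sketch below reproduces that arithmetic in plain Python; the ceil-halving and exact-doubling model is an assumption about the padded stride-2 convolutions, not code taken from MONAI, and `find_skip_mismatch` is a hypothetical helper:

```python
def find_skip_mismatch(size, strides):
    """Trace one spatial dimension through a U-Net.

    Returns (skip_size, upsampled_size) at the first mismatch, or None.
    """
    sizes = [size]
    for stride in strides:
        # Assumed: a padded strided convolution yields ceil(size / stride).
        sizes.append(-(-sizes[-1] // stride))
    current = sizes[-1]
    # Walk back up: upsample, then compare with the stored skip size.
    for stride, skip in zip(reversed(strides), reversed(sizes[:-1])):
        current *= stride
        if current != skip:
            return skip, current
    return None


strides = (2, 2, 2, 2)
print(find_skip_mismatch(20, strides))  # mismatch at the deepest skip connection
print(find_skip_mismatch(32, strides))  # divisible by 2**4, so no mismatch
print(find_skip_mismatch(96, strides))  # the tutorial's patch size
```

Under this model, each spatial dimension of the patch needs to be divisible by the product of the strides (16 here), so a 20-slice stack would need padding to 32 slices or a network with fewer downsampling stages.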

3d classification no data download

Describe the bug
All 3D classification tutorials assume that the data is in 'workspace/data/medical/ixi/IXI-T1/', but none of them download it.

To Reproduce

Load any 3D classification tutorial and run.

Expected behavior
Tutorial should be able to run the whole way through without user intervention.

Environment (please complete the following information):
N/A