
ViT-V-Net: Vision Transformer for Volumetric Medical Image Registration


Please also check out our newly proposed registration model 👉 TransMorph
The pretrained model and the quantitative results of ViT-V-Net on the IXI dataset are available here: IXI_dataset.
Additionally, we have made our preprocessed IXI dataset publicly available!

keywords: vision transformer, convolutional neural networks, image registration

This is a PyTorch implementation of my short paper:

Chen, Junyu, et al. "ViT-V-Net: Vision Transformer for Unsupervised Volumetric Medical Image Registration." Medical Imaging with Deep Learning (MIDL), 2021.

train.py is the training script, and models.py contains the ViT-V-Net model.

Pretrained ViT-V-Net: pretrained model

Dataset: Due to restrictions, we cannot distribute our brain MRI data. However, several brain MRI datasets are publicly available online: IXI, ADNI, OASIS, ABIDE, etc. Note that those datasets may not contain labels (segmentations). To generate labels, you can use FreeSurfer, an open-source software suite for processing and segmenting brain MRI images. Here are some useful FreeSurfer commands: Brain MRI preprocessing and subcortical segmentation using FreeSurfer.
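For reference, a minimal sketch (the input file name and subject ID are illustrative, not from this repository) of invoking FreeSurfer's recon-all pipeline from Python to produce a subcortical segmentation:

import subprocess

# Run FreeSurfer's full recon-all pipeline on a T1-weighted volume.
# Requires FreeSurfer to be installed and SUBJECTS_DIR to be set;
# the subcortical segmentation ends up in the subject's mri/aseg.mgz.
subprocess.run(
    ['recon-all', '-i', 'subject01_T1.nii.gz', '-s', 'subject01', '-all'],
    check=True,  # raise an error if FreeSurfer exits abnormally
)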

Model Architecture:

Vision Transformer Architecture:

Example Results:

Quantitative Results:

Reference:

TransUnet

ViT-pytorch

VoxelMorph

If you find this code useful in your research, please consider citing:

@inproceedings{chen2021vitvnet,
  title={ViT-V-Net: Vision Transformer for Unsupervised Volumetric Medical Image Registration},
  author={Junyu Chen and Yufan He and Eric Frey and Ye Li and Yong Du},
  booktitle={Medical Imaging with Deep Learning},
  year={2021},
  url={https://openreview.net/forum?id=h3HC1EU7AEz}
}


Issues

How to use the pretrained model?

I'm glad to find that you have open-sourced your pretrained model. I downloaded it from Google Drive, but how can I apply it in infer.py?
What do the files in the pretrained model's directory mean?
Thanks for your time and reply.
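For reference, a minimal sketch of loading such a checkpoint (the class name, config key, checkpoint file name, and output layout below are assumptions based on typical PyTorch conventions; check models.py and train.py for the actual names):

import torch
from models import ViTVNet, CONFIGS  # assumed to be defined in models.py

config = CONFIGS['ViT-V-Net']  # assumed config key
model = ViTVNet(config, img_size=(160, 192, 224))  # volume size used in the paper

# Checkpoints saved as {'state_dict': ...} keep the weights under that key;
# fall back to the raw object if the file stores the state dict directly.
checkpoint = torch.load('pretrained_vit-v-net.pth.tar', map_location='cpu')
state_dict = checkpoint.get('state_dict', checkpoint)
model.load_state_dict(state_dict)
model.eval()

# Inference: the network takes the moving and fixed volumes concatenated
# along the channel axis; it is assumed here to return the warped moving
# image and the displacement (flow) field.
moving = torch.rand(1, 1, 160, 192, 224)
fixed = torch.rand(1, 1, 160, 192, 224)
with torch.no_grad():
    warped, flow = model(torch.cat((moving, fixed), dim=1))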

test the model

Thank you very much for your code!
Using train.py, we can train our own model. We look forward to your uploading the test code.

The process of making and loading datasets

Hi,

I'm sorry to bother you. I want to run this architecture on my own datasets, but I ran into some difficulties. Before asking, I read the existing issues and the code but could not solve the problem independently, so I would like to ask the following questions:

  • What structure does the ".pkl" file use to encapsulate a dataset?

  • Further, how do I wrap ".nii/.nii.gz" files into the above ".pkl" structure? I have the preprocessed LPBA40 dataset (".nii.gz" format, 40 MRI images in total with their corresponding ground truth), but I do not know how to convert these data into correctly formatted ".pkl" files. I tried to find relevant information but could not understand or implement it. I hope to get your instructions, thank you!


In view of these two questions, I reflected on them together with issues opened by others, but failed to solve the two problems independently. I will describe my ideas so that you can quickly clarify my questions. My partial understanding and guesses are as follows:

  • For the data encapsulation of medical images, my first guess was that one preprocessed MRI image and its corresponding segmentation (label/ground truth) are packed into a single "***.pkl" file, read with data = pkload("***.pkl"), so that data[0] is the preprocessed MRI image and data[1] is its corresponding ground truth.
    But according to the code in datasets.py, this does not seem to be the case: in class JHUBrainDataset, __getitem__ does "x, y = pkload(path)", while in class JHUBrainInferDataset it does "x, y, x_seg, y_seg = pkload(path)". So it looks like one "***.pkl" file contains two MRI images and, for inference, their two corresponding ground truths.

  • According to the data-loading code for the training and validation stages in datasets.py, I speculate as follows: in the training stage, only preprocessed MRI images are used, without ground truth, because there are no "x_seg" and "y_seg" in the __getitem__ of class JHUBrainDataset; in the validation stage, the preprocessed MRI images and their corresponding segmentations (labels/ground truth) are used together. (A hedged packing sketch below mirrors this layout.)


Thanks
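Regarding the .pkl layout above, here is a hedged sketch (not from the repository; it only mirrors the tuple unpacking visible in datasets.py, and the use of nibabel and the file names are assumptions) of how such a pair file could be written:

import pickle
import nibabel as nib
import numpy as np

def save_pair(x_path, y_path, out_path, x_seg_path=None, y_seg_path=None):
    """Write (x, y) for training, or (x, y, x_seg, y_seg) for validation."""
    x = nib.load(x_path).get_fdata().astype(np.float32)
    y = nib.load(y_path).get_fdata().astype(np.float32)
    if x_seg_path is None:
        data = (x, y)  # JHUBrainDataset: x, y = pkload(path)
    else:
        x_seg = nib.load(x_seg_path).get_fdata().astype(np.int16)
        y_seg = nib.load(y_seg_path).get_fdata().astype(np.int16)
        data = (x, y, x_seg, y_seg)  # JHUBrainInferDataset
    with open(out_path, 'wb') as f:
        pickle.dump(data, f)

# Example: pack one moving/fixed pair for training.
save_pair('sub01.nii.gz', 'sub02.nii.gz', 'pair01.pkl')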

How were the visualization results produced?

Hi, I really wonder how the registration results were visualized, especially the deformed grid. Many papers discuss registration, but few mention how the results are visualized.
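One common approach (a sketch of the general technique, not necessarily the authors' method; the channel order of the flow is an assumption) is to take a 2D slice of the predicted displacement field, add it to a regular grid, and draw the warped grid lines with matplotlib:

import numpy as np
import matplotlib.pyplot as plt

def plot_deformed_grid(flow2d, step=8):
    """flow2d: displacement slice of shape (2, H, W); channel 0 is assumed
    to be the y-displacement and channel 1 the x-displacement."""
    _, H, W = flow2d.shape
    gy, gx = np.meshgrid(np.arange(H), np.arange(W), indexing='ij')
    # Warped positions = identity grid + displacement.
    wy = gy + flow2d[0]
    wx = gx + flow2d[1]
    fig, ax = plt.subplots()
    for i in range(0, H, step):   # horizontal grid lines
        ax.plot(wx[i, :], wy[i, :], 'k-', lw=0.5)
    for j in range(0, W, step):   # vertical grid lines
        ax.plot(wx[:, j], wy[:, j], 'k-', lw=0.5)
    ax.invert_yaxis()             # image coordinates: y grows downward
    ax.set_aspect('equal')
    plt.show()

# Usage: pick an axial slice of a 3D flow, e.g. flow[[0, 1], :, :, 60].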

How do you set up your data in order to run python train.py?

Please help me set up the data for training, or point me to simple data I can train on (I also downloaded the data you mentioned, but I could not use it).
I assume I should use the BraTS_2018 dataset.


When I set the paths this way, an error occurs.

Thank you very much.

The process of making data

Hi!
I'm sorry to bother you.
I followed your suggestion to convert the files to .pkl format, but it still doesn't work properly. Could you please help me see where I went wrong?
I'm very grateful for your help.

Why doesn't the model use the VecInt function?

Thank you for providing open source code!

My question is why VecInt is commented out in the VitNet class in model.py. What is the reason?

I see that VoxelMorph uses it and it improves results, hence my question.
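For context: VecInt in VoxelMorph integrates a stationary velocity field by scaling and squaring, which makes the resulting deformation approximately diffeomorphic; with it commented out, the network presumably predicts a displacement field directly. A minimal sketch of the scaling-and-squaring idea (this mirrors VoxelMorph's layer and assumes this repository's SpatialTransformer, which warps one field by another, is importable from models.py):

import torch.nn as nn
from models import SpatialTransformer  # assumed available in models.py

class VecInt(nn.Module):
    """Integrate a stationary velocity field via scaling and squaring."""
    def __init__(self, inshape, nsteps=7):
        super().__init__()
        self.nsteps = nsteps
        self.scale = 1.0 / (2 ** nsteps)
        self.transformer = SpatialTransformer(inshape)

    def forward(self, vec):
        # phi = exp(v): start from v / 2^N, then compose the field with
        # itself N times (phi <- phi o phi, approximated by warping).
        vec = vec * self.scale
        for _ in range(self.nsteps):
            vec = vec + self.transformer(vec, vec)
        return vec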

train error

x shape: torch.Size([1, 1, 304, 224, 112])
y shape: torch.Size([1, 1, 304, 224, 112])
model_in shape: torch.Size([1, 2, 304, 224, 112])
Traceback (most recent call last):
File "/home/yuanpeng/Registration/Vit/ViT-V-Net/train.py", line 171, in
main()
File "/home/yuanpeng/Registration/Vit/ViT-V-Net/train.py", line 94, in main
model_out = model(model_in)
File "/home/yuanpeng/anaconda3/envs/python37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/yuanpeng/Registration/Vit/ViT-V-Net/models.py", line 425, in forward
x = self.decoder(x, features)
File "/home/yuanpeng/anaconda3/envs/python37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/yuanpeng/Registration/Vit/ViT-V-Net/models.py", line 295, in forward
x = decoder_block(x, skip=skip)
File "/home/yuanpeng/anaconda3/envs/python37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/yuanpeng/Registration/Vit/ViT-V-Net/models.py", line 254, in forward
x = torch.cat([x, skip], dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 18 but got size 19 for tensor number 1 in the list.
x shape: torch.Size([1, 512, 18, 14, 6])
skip shape: torch.Size([1, 32, 19, 14, 7])

Process finished with exit code 1

Excuse me, the size of the image I input is 304 x 224 x 112, and an error is reported. Are there any requirements on the size of the input image?
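A likely cause: the convolutional encoder and the ViT patch embedding downsample the volume several times, so each spatial dimension must be divisible by the network's total downsampling factor, or the decoder's skip connections will not line up (note that the default size of 160 x 192 x 224 is divisible by 32 in every dimension). A hedged workaround (the exact factor is an assumption; check the model configuration) is to zero-pad the input first:

import numpy as np

def pad_to_multiple(vol, factor=32):
    """Zero-pad a 3D volume so every dimension is a multiple of `factor`."""
    pads = [(0, (-s) % factor) for s in vol.shape]
    return np.pad(vol, pads, mode='constant')

x = np.zeros((304, 224, 112), dtype=np.float32)
print(pad_to_multiple(x).shape)  # (320, 224, 128)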

training error

Why, after training for 100 epochs, is my accuracy still around 0.4?
