
tianchi-medical-lungtumordetect's Introduction

Alibaba Cloud Tianchi Medical Competition · Lung Nodule Detection

Features

  • 3D Segmentation & Classification with Keras
  • Fine preprocessing with scikit-image
  • Fine visualization for clarification
  • Modified UNet for segmentation
  • Modified VGG/Inception/ResNet/DenseNet for classification ensemble
  • Fine hyperparameter tuning of both the models and the training process.

Code Hierarchy

- config.py # good practice to centralize hyper parameters

- preprocess.py # Step 1, preprocess, store numpy/meta 'cache' at ./preprocess/

- train_segmentation.py # Step 2, segmentation with UNet Model
- model_UNet.py # UNet model definition

- train_classificaion.py # Step 3, classification with VGG/Inception/ResNet/DenseNet
- model_VGG.py # VGG model definition
- model_Inception.py # Inception model definition
- model_ResNet.py # ResNet model definition
- model_DenseNet.py # DenseNet model definition

- generators.py # generators for segmentation & classification models
- visual_utils.py # 3D visual tools

- dataset/ # dataset, changed in config.py
- preprocess/ # 'cache' preprocessed numpy/meta data, changed in config.py

- train_ipynbs # training process notebooks
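A minimal sketch of what centralizing hyperparameters in config.py could look like. All constant names and the run_log_dir helper are hypothetical, not taken from the repository; the values (batch size 16, learning rates 3e-5 and 1e-4, the log dir pattern) are the ones quoted in this README.

```python
import os
from datetime import datetime

# Hypothetical names; only the values come from this README.
DATASET_DIR = "./dataset/"
PREPROCESS_DIR = "./preprocess/"
TRAIN_LOG_ROOT = "./train_logs"

BATCH_SIZE = 16                      # limited by GPU memory
LEARNING_RATE_UNET = 3e-5            # segmentation
LEARNING_RATE_CLASSIFICATION = 1e-4  # VGG/Inception/ResNet/DenseNet


def run_log_dir(model_name, now=None):
    """Build a per-run log dir like <root>/<model-name>-run-<hour>-<minute>."""
    now = now or datetime.now()
    return os.path.join(TRAIN_LOG_ROOT, f"{model_name}-run-{now:%H}-{now:%M}")
```

Training scripts would then import these values instead of hard-coding them, so a single edit changes every model.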

Preprocess

  • Use SimpleITK to read the CT files, process them, and store the results in the cache as numpy arrays.
  • Process with the scikit-image library, trying many parameter settings to find the best cut:
    • binarize
    • clear border
    • label
    • regions
    • closing
    • dilation
  • Collect all meta information (seriesuid, shape, file_path, origin, spacing, coordinates, cover_ratio, etc.) and store it in ONE cache file for fast training initialization.
  • See the preprocessing steps in /train_ipynbs/preprocess.ipynb.
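The scikit-image steps listed above can be sketched on a single 2D slice as follows. This is an illustration rather than the repository's code: the function name lung_mask_2d, the -400 HU threshold, the structuring element sizes, and keeping the two largest regions are all assumptions.

```python
import numpy as np
from skimage import measure, morphology, segmentation


def lung_mask_2d(slice_hu, threshold=-400, keep=2):
    """Rough lung mask for one CT slice given in Hounsfield units."""
    binary = slice_hu < threshold                 # binarize: air and lung are dark
    cleared = segmentation.clear_border(binary)   # drop air connected to the image border
    labels = measure.label(cleared)               # label connected components
    regions = sorted(measure.regionprops(labels), key=lambda r: r.area, reverse=True)
    mask = np.isin(labels, [r.label for r in regions[:keep]])     # keep the largest regions
    mask = morphology.binary_closing(mask, morphology.disk(2))    # fill small holes
    mask = morphology.binary_dilation(mask, morphology.disk(1))   # include the lung boundary
    return mask


# Synthetic slice: air outside a round body, with two rectangular "lungs" inside.
yy, xx = np.mgrid[:64, :64]
slice_hu = np.full((64, 64), -1000.0)
slice_hu[(yy - 32) ** 2 + (xx - 32) ** 2 <= 28 ** 2] = 0.0
slice_hu[20:44, 12:28] = -800.0
slice_hu[20:44, 36:52] = -800.0
mask = lung_mask_2d(slice_hu)
```

On real scans the threshold and structuring element sizes are exactly the parameters the list above says were tuned.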

Distribution of the fraction of a whole CT scan occupied by the lung region.

Tumor size distribution

Segmentation

  • Both a simplified and a full UNet were tested.
  • dice_coef_loss is used as the loss function.
  • The model is evaluated periodically with many metrics, which helps a lot in understanding its behavior.
  • 30% of the samples are negative (contain no tumor), for better generalization.
  • Due to memory limitations, a batch size of 16 is used.
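For reference, a Dice-based loss can be sketched in NumPy as below. The name dice_coef_loss comes from this README, but the exact form used in the repository is not shown here; the smoothing term and the "1 - dice" formulation are assumptions (another common variant is the negated Dice coefficient).

```python
import numpy as np


def dice_coef(y_true, y_pred, smooth=1.0):
    """Dice coefficient between two masks (binary or soft), flattened."""
    y_true = np.asarray(y_true, dtype=np.float64).ravel()
    y_pred = np.asarray(y_pred, dtype=np.float64).ravel()
    intersection = np.sum(y_true * y_pred)
    return (2.0 * intersection + smooth) / (np.sum(y_true) + np.sum(y_pred) + smooth)


def dice_coef_loss(y_true, y_pred, smooth=1.0):
    """Assumed form: 0 for a perfect overlap, close to 1 for no overlap."""
    return 1.0 - dice_coef(y_true, y_pred, smooth)


# Tiny 3D example: an 8-voxel cube compared with itself and with a disjoint voxel.
cube = np.zeros((4, 4, 4))
cube[1:3, 1:3, 1:3] = 1.0
other = np.zeros((4, 4, 4))
other[3, 3, 3] = 1.0
loss_same = dice_coef_loss(cube, cube)
loss_disjoint = dice_coef_loss(cube, other)
```

The smoothing term keeps the ratio defined for all-negative samples, which matters when part of each batch contains no tumor.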

Classification

VGG

  • Both a simplified and a full VGG model were tested. The simplified VGG serves as the baseline.

The pictures show that hyperparameter tuning really matters.

Inception

  • A simplified Inception-module-based network, in which each block has 4-5 different types of conv branches:
    • 1*1*1 depth-wise separable conv
    • 1*1*1 depth-wise separable conv, then 3*3*3 conv_bn_relu
    • 1*1*1 depth-wise separable conv, then 2 3*3*3 conv_bn_relu
    • AveragePooling3D, then 1*1*1 depth-wise separable conv
    • (optional in config) 1*1*1 depth-wise separable conv, then (5, 1, 1), (1, 5, 1), (1, 1, 5) spatially separable convolutions
    • Concatenate the above.

ResNet

  • Use bottleneck blocks instead of basic_block in the implementation.
  • A bottleneck residual block consists of:
    • (1, 1, 1) conv_bn_relu
    • (3, 3, 3) conv_bn_relu
    • (1, 1, 1) conv_bn_relu
    • (optional in config) kernel_size=(3, 3, 3), strides=(2, 2, 2) conv_bn_relu for compression.
    • Add (not Concatenate) with the input
  • RESNET_BLOCKS is left in the config for tuning.
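The bottleneck block described above might look roughly like this in Keras. It is a sketch under assumptions: the tensorflow.keras import path, the filter counts (a 1/4 bottleneck ratio), and applying the optional strided conv after the Add so that the shortcut and the residual branch keep identical shapes are illustrative choices, not confirmed details of model_ResNet.py.

```python
from tensorflow.keras import Input, Model, layers


def conv_bn_relu(x, filters, kernel_size, strides=(1, 1, 1)):
    """Conv3D followed by batch normalization and ReLU."""
    x = layers.Conv3D(filters, kernel_size, strides=strides,
                      padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    return layers.Activation("relu")(x)


def bottleneck_block(x, filters, compress=False):
    """Residual bottleneck: 1x1x1 -> 3x3x3 -> 1x1x1, then Add with the input.

    Assumes the input already has `filters` channels so Add is shape-compatible.
    """
    shortcut = x
    y = conv_bn_relu(x, filters // 4, (1, 1, 1))   # reduce channels
    y = conv_bn_relu(y, filters // 4, (3, 3, 3))   # spatial conv on fewer channels
    y = conv_bn_relu(y, filters, (1, 1, 1))        # restore channels
    y = layers.Add()([shortcut, y])                # Add, not Concatenate
    if compress:
        # Optional downsampling, placed after the Add in this sketch (assumption).
        y = conv_bn_relu(y, filters, (3, 3, 3), strides=(2, 2, 2))
    return y


inputs = Input(shape=(16, 16, 16, 32))
outputs = bottleneck_block(inputs, filters=32, compress=True)
model = Model(inputs, outputs)
```

Because Add keeps the channel count fixed while Concatenate would grow it, stacking a configurable number of these blocks does not change the tensor width.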

DenseNet

  • DenseNet draws heavily on the original paper: https://arxiv.org/abs/1608.06993
    • 3 dense_block, each with 5 bn_relu_conv layers, following the paper.
    • A transition_block after every dense_block, except the last one.
    • Optional config for DenseNet-BC (as the paper calls it): 1*1*1 depth-wise separable conv, and transition_block compression.
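To see how the channel count evolves through this layout, here is a small pure-Python trace. The initial channel count and growth rate are placeholder values, not the repository's settings; only the structure (3 dense blocks of 5 layers, a transition after all but the last, optional compression) follows the description above.

```python
def densenet_channels(initial=32, growth_rate=16, blocks=3,
                      layers_per_block=5, compression=1.0):
    """Trace feature-map channels through dense and transition blocks."""
    channels = initial
    trace = []
    for block in range(blocks):
        # Each layer in a dense block concatenates growth_rate new channels.
        channels += layers_per_block * growth_rate
        trace.append(("dense_block", channels))
        if block != blocks - 1:
            # The transition block optionally compresses channels (DenseNet-BC style).
            channels = int(channels * compression)
            trace.append(("transition_block", channels))
    return trace


plain_trace = densenet_channels()
bc_trace = densenet_channels(compression=0.5)
```

With these placeholder values the uncompressed network ends at 272 channels and the compressed one at 148, which shows why compression helps under a tight memory budget.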

Fine Tuning & Lessons Learned

  • Learning rate: 3e-5 works well for UNet, 1e-4 works well for the classification models.
  • Due to memory limitations, a batch size of 16 is used.
  • Data augmentation: shift, rotate, etc.
  • The importance of visualization cannot be overstated!!!
  • coord (x, y, z) corresponds to (width, height, depth); mixing up the order causes nasty bugs.
  • Putting all config in one file saves tons of time. Keep everything clean and tidy.
  • Disk reads are the bottleneck. Read from an SSD.
  • Each run gets its own log dir for better TensorBoard visualization, named like /train_logs/<model-name>-run-<hour>-<minute>.
  • The config file has lots of debug options.
  • Sampling probability is multiplied by 4 for tumors < 10mm and by 3 for tumors > 10mm and < 30mm, and left unchanged for tumors > 30mm. This puts more focus on small tumors.
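The size-dependent oversampling in the last item can be sketched as weighted sampling. The function names are hypothetical, and how the exact boundaries (10mm and 30mm) are treated is an assumption, since the README does not say.

```python
import random


def sampling_weight(diameter_mm):
    """Relative sampling weight by tumor diameter (boundary handling is assumed)."""
    if diameter_mm < 10:
        return 4   # small tumors: 4x
    if diameter_mm < 30:
        return 3   # medium tumors: 3x
    return 1       # large tumors: unchanged


def sample_nodules(diameters_mm, k, seed=None):
    """Draw k nodule indices with probability proportional to their weight."""
    rng = random.Random(seed)
    weights = [sampling_weight(d) for d in diameters_mm]
    return rng.choices(range(len(diameters_mm)), weights=weights, k=k)


weights = [sampling_weight(d) for d in (5, 20, 40)]
picked = sample_nodules([5, 20, 40], k=10, seed=0)
```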

tianchi-medical-lungtumordetect's People

Contributors

wikke

tianchi-medical-lungtumordetect's Issues

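Why are nodules smaller than 12 dropped?

I find quite a lot of the author's code confusing. Everything runs in my tests, but I found the following problem:
1. The UNet is trained only on nodules larger than 12 with a lung-area ratio greater than 0.1, so many nodules are left out.
I hope the author can reply after seeing this.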

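Test dataset

Hello, thank you for sharing. How did you test on the test dataset? And is the data segmented by train_segmentation not used when training train_classification? Those are my two questions, thank you. Looking forward to your reply.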

Question about get_block and get_mask in generators not matching during UNet segmentation

mask[coord[0] - radius[0]:coord[0] + radius[0] + 1,
coord[1] - radius[1]:coord[1] + radius[1] + 1,
coord[2] - radius[2]:coord[2] + radius[2] + 1] = 1.0
This statement uses the nodule radius to build a cube at the exact center of the mask, with the nodule region set to 1.
w, h, d = int(coord[0] - shape[0] // 2), int(coord[1] - shape[1] // 2), int(coord[2] - shape[2] // 2)
w, h, d = max(w, 0), max(h, 0), max(d, 0)
w, h, d = min(w, W - shape[0] - 1), min(h, H - shape[1] - 1), min(d, D - shape[2] - 1)
block = hf['img'][w:w + shape[0], h:h + shape[1], d:d + shape[2]]
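For the block generated by these lines: if w, h, d = max(w, 0), max(h, 0), max(d, 0) takes a 0 value, the nodule in the generated block is no longer at the exact center. Won't the corresponding mask then fail to match?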
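The concern can be illustrated with a small NumPy sketch. crop_origin reproduces the clamping from the lines quoted above; mask_for_block is a hypothetical helper that places the cube at the nodule's position relative to the clamped origin instead of at the block center. Whether the repository's get_mask already handles this is not established in this thread.

```python
import numpy as np


def crop_origin(coord, shape, volume_shape):
    """Start index of the crop, clamped to the volume as in the quoted code."""
    origin = []
    for c, s, size in zip(coord, shape, volume_shape):
        start = int(c - s // 2)
        start = max(start, 0)
        start = min(start, size - s - 1)
        origin.append(start)
    return tuple(origin)


def mask_for_block(coord, radius, shape, volume_shape):
    """Cube mask placed where the nodule actually lands inside the cropped block."""
    origin = crop_origin(coord, shape, volume_shape)
    mask = np.zeros(shape, dtype=np.float32)
    slices = []
    for c, r, o, s in zip(coord, radius, origin, shape):
        local = int(c) - o  # nodule position relative to the block
        slices.append(slice(max(local - r, 0), min(local + r + 1, s)))
    mask[tuple(slices)] = 1.0
    return mask


# A nodule near the volume border: the crop is clamped on the first axis.
origin = crop_origin((5, 30, 30), (32, 32, 32), (64, 64, 64))
mask = mask_for_block((5, 30, 30), (2, 2, 2), (32, 32, 32), (64, 64, 64))
```

Here the clamped origin is (0, 14, 14), so the nodule sits at (5, 16, 16) inside the block rather than at the center (16, 16, 16), and a mask centered in the block would not cover it.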

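I am also using the author's code and am working on visualization. Does the author have visualization code? I wrote some myself but have not managed to get it working for a month, and I do not know why. Could I refer to your visualization code for the trained model? Many thanks. [email protected]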

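About prediction

Hello,
I am a first-year graduate student who has just started working on nodule segmentation. I have been studying your code for a long time and have run into some questions; I would be grateful for a reply:
1. I do not quite understand what the predicted result image is. During training I only plot the ground-truth mask. The prediction I plotted separately looks like the original image, whereas I understood that the prediction should be a predicted mask.
2. Can I predict the true shape of the nodule?
These are the questions I ran into while learning. Looking forward to your reply.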

Can't find dataset!!!

Hello everyone,
Can anyone help me download the dataset for training?
Also, if anyone has a video demonstration of this project, please share the link for better understanding.

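Question about using the segmented data for testing

Hello, thank you very much for your code. I have been studying it for several months but am still stuck on using the segmented data for classification; I have tried for a long time without success. Two separate networks do not seem very meaningful, so I want to complete the pipeline, but my own ability is limited. I would be grateful for some advice.
If anyone has already done this, please share your suggestions as well.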
qq1970427510
Thanks!

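dice_coef_loss does not decrease

When I train the semantic segmentation network, why does dice_coef_loss not decrease? It blows up after a few epochs. Any pointers would be appreciated.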
