
ShelfNet

Results

  • We tested ShelfNet with ResNet50 and ResNet101 backbones: they achieved 59 FPS and 42 FPS respectively on a GTX 1080Ti GPU with a 512x512 input image.
  • On the PASCAL VOC 2012 test set, it achieved 84.2% mIoU with a ResNet101 backbone and 82.8% mIoU with a ResNet50 backbone.
  • It achieved 75.8% mIoU with a ResNet50 backbone on the Cityscapes dataset.

Differences from results reported in the paper

  • The result of ShelfNet differs slightly between this implementation and the paper (75.4% in this implementation vs. 75.8% in the paper).
  • The paper trains for 500 epochs, while this implementation trains for 240 epochs.
  • The paper does not use synchronized batch normalization, while this implementation uses synchronized batch normalization across multiple GPUs.
  • For training on coarse labelled data, this implementation sets the learning rate to 0.01 and keeps it constant; the results in the paper use a poly decay schedule with the total number of epochs set to 500, but I stopped the training manually at epoch 35, so the learning rate decays only very slightly instead of staying constant (see the sketch below).
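
For reference, a minimal sketch of the two learning-rate schedules discussed above, written in plain Python. The poly exponent of 0.9 is an assumption (a common default), not a value taken from this repository.

    # Constant schedule used here for coarse data vs. poly decay used for the paper.
    def constant_lr(base_lr=0.01, epoch=0, total_epochs=240):
        return base_lr

    def poly_lr(base_lr=0.01, epoch=0, total_epochs=500, power=0.9):
        # Decays towards 0 at total_epochs; stopping at epoch 35 of 500 leaves
        # the learning rate only slightly below base_lr.
        return base_lr * (1 - epoch / total_epochs) ** power

    print(poly_lr(epoch=35))  # ~0.0094, i.e. almost constant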

Citation

Please cite our paper

@article{zhuang2018shelfnet,
  title={ShelfNet for Real-time Semantic Segmentation},
  author={Zhuang, Juntang and Yang, Junlin},
  journal={arXiv preprint arXiv:1811.11254},
  year={2018}
}

Requirements

  • Please refer to torch-encoding for the implementation of the synchronized batch-normalization layer.
  • PyTorch 0.4.1
  • Python 3.6
  • requests
  • nose
  • scipy
  • tqdm
  • Other dependencies required by torch-encoding.

How to run

Environment setup

  • run python setup.py install to install torch-encoding
  • make sure the dataset path is the same in /scripts/prepare_xx.py and /encoding/datasets/xxx.py; the default path is ~/.encoding/data, which is a hidden folder. You will need to press Ctrl + H to show it in Files (a quick check is sketched below)
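
If in doubt about the dataset root, a quick sanity check in Python (the default path below is the one mentioned above; adjust it if you changed it in the prepare scripts):

    # Check that the default dataset root exists and list what is inside it.
    import os

    root = os.path.expanduser('~/.encoding/data')
    print('exists:', os.path.isdir(root))
    print('contents:', sorted(os.listdir(root)) if os.path.isdir(root) else [])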

PASCAL dataset preparation

  • run cd scripts
  • run python prepare_xx.py to prepare the datasets, including MS COCO, PASCAL VOC, PASCAL Aug and PASCAL Context
  • Download the test dataset from the official PASCAL evaluation server, then extract and merge it with the training data folder, e.g. ~/.encoding/data/VOCdevkit

Cityscapes dataset preparation

  • The data preparation code is modified from fyu's implementation
  • The scripts are in folder scripts/prepare_citys
  • Step 1, download the Cityscapes and Cityscapes Coarse datasets from the official Cityscapes website: you need the fine annotations and images (gtFine_trainvaltest.zip, leftImg8bit_trainvaltest.zip) as well as the coarse annotations and extra images (gtCoarse.zip, leftImg8bit_trainextra.zip), and unzip them all into one folder
  • Step 2, prepare fine labelled dataset:
    • convert the original segmentation ids into the 19 training ids (see the sketch after this list): python3 scripts/prepare_citys/prepare_data.py <cityscape folder>/gtFine/
    • run sh create_lists_citys.sh in the Cityscapes data folder, and move info.json into the data folder
  • Step 3, prepare coarse labelled dataset:
    • convert the original segmentation ids into the 19 training ids: python3 scripts/prepare_citys/prepare_data.py <cityscape folder>/gtCoarse/
    • run sh create_lists_citys_coarse.sh in the Cityscapes data folder, and move info.json into the data folder
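
For reference, the id conversion above maps the original Cityscapes label ids to 19 training ids, with everything else treated as ignore. The sketch below follows the standard Cityscapes label definitions and is only an assumption about what prepare_data.py does, not code copied from it.

    # Illustrative id -> trainId conversion following the standard Cityscapes labels;
    # anything not listed is mapped to the ignore value 255.
    import numpy as np

    ID_TO_TRAINID = {7: 0, 8: 1, 11: 2, 12: 3, 13: 4, 17: 5, 19: 6, 20: 7,
                     21: 8, 22: 9, 23: 10, 24: 11, 25: 12, 26: 13, 27: 14,
                     28: 15, 31: 16, 32: 17, 33: 18}

    def convert_to_train_ids(label):
        # label: HxW array of original Cityscapes label ids
        out = np.full_like(label, 255)
        for label_id, train_id in ID_TO_TRAINID.items():
            out[label == label_id] = train_id
        return out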

Configurations (refer to /experiments/option.py)

  • --diflr: default is True. If True, the head uses a 10x larger learning rate than the backbone; otherwise the head and backbone use the same learning rate.
  • --model: which model to use; default is shelfnet, other options include pspnet, encnet, fcn
  • --backbone: backbone of the model, resnet50 or resnet101
  • --dataset: which dataset to train on: coco for MS COCO, pascal_aug for augmented PASCAL, pascal_voc for PASCAL VOC, pcontext for PASCAL Context.
  • --aux: if --aux is passed, the model uses an auxiliary layer, which is an FCN head based on the final block of the backbone.
  • --se_loss: a context module based on the final block of the backbone; its output has shape 1xm, where m is the number of categories. It penalizes predictions of whether each category is present or not (see the sketch after this list).
  • --resume: default is None. It specifies the checkpoint to load.
  • --ft: fine-tune flag. If set, the code resumes from the checkpoint but discards the optimizer state.
  • --checkname: folder name to store trained weights
  • Other parameters are trivial; please refer to /experiments/segmentation/option.py for more details
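
As a rough illustration of the --se_loss idea, the sketch below penalizes a predicted category-presence vector with binary cross-entropy against which categories actually appear in the ground-truth mask. The actual head in this repository may differ in detail.

    # Hedged sketch of a semantic-encoding (se) loss on per-image category presence.
    import torch
    import torch.nn.functional as F

    def se_loss(presence_logits, target_mask, num_classes, ignore_index=255):
        # presence_logits: (batch, num_classes); target_mask: (batch, H, W) of class ids
        batch = target_mask.shape[0]
        present = torch.zeros(batch, num_classes, device=target_mask.device)
        for b in range(batch):
            ids = torch.unique(target_mask[b])
            ids = ids[ids != ignore_index]
            present[b, ids] = 1.0
        return F.binary_cross_entropy_with_logits(presence_logits, present)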

Training scripts on PASCAL VOC

  • run cd /experiments/segmentation
  • pre-train ShelfNet50 on COCO,
    python train.py --backbone resnet50 --dataset coco --aux --se-loss --checkname ShelfNet50_aux
  • fine-tune ShelfNet50 on PASCAL_aug, you may need to double-check the path for --resume.
    python train.py --backbone resnet50 --dataset pascal_aug --aux --se-loss --checkname ShelfNet50_aux --resume ./runs/coco/shelfnet/ShelfNet50_aux_se/model_best.pth.tar --ft
  • fine-tune ShelfNet50 on PASCAL VOC, you may need to double-check the path for --resume (the effect of --ft is sketched after this list).
    python train.py --backbone resnet50 --dataset pascal_voc --aux --se-loss --checkname ShelfNet50_aux --resume ./runs/pascal_aug/shelfnet/ShelfNet50_aux_se/model_best.pth.tar --ft
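
The --ft flag used above resumes the model weights from the checkpoint but starts with a fresh optimizer. Conceptually it amounts to the sketch below; the checkpoint key name 'state_dict' is an assumption based on common PyTorch conventions, not verified against this repository.

    # Load pretrained weights only, leaving optimizer state and epoch counter fresh.
    import torch

    def load_for_finetune(model, checkpoint_path):
        checkpoint = torch.load(checkpoint_path, map_location='cpu')
        model.load_state_dict(checkpoint['state_dict'])
        return model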

Training scripts on Cityscapes

  • run cd /experiments/segmentation
  • pre-train ShelfNet50 on coarse labelled dataset,
    python train.py --diflr False --backbone resnet50 --dataset citys_coarse --checkname ShelfNet50_citys_coarse --lr-schedule step
  • fine-tune ShelfNet50 on the fine labelled dataset, you may need to double-check the path for --resume.
    python train.py --diflr False --backbone resnet50 --dataset citys --checkname citys_coarse --resume ./runs/citys_coarse/shelfnet/ShelfNet50_citys_coarse/model_best.pth.tar --ft

Test scripts on PASCAL VOC

  • To test on PASCAL_VOC with multi-scale inputs [0.5, 0.75, 1.0, 1.25, 1.5, 1.75] (the idea is sketched after this list):
    python test.py --backbone resnet50 --dataset pascal_voc --resume ./runs/pascal_voc/shelfnet/ShelfNet50_aux_se/model_best.pth.tar
  • To test on PASCAL_VOC with single-scale input
    python test_single_scale.py --backbone resnet50 --dataset pascal_voc --resume ./runs/pascal_voc/shelfnet/ShelfNet50_aux_se/model_best.pth.tar
  • Similar experiments can be performed on ShelfNet with a ResNet101 backbone, and experiments on Cityscapes can be performed by changing the dataset to --dataset citys
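
Multi-scale testing runs the network on several resized copies of each image, resizes the logits back to the original resolution, and averages them before taking the argmax. A minimal sketch of the idea (a simplified stand-in, not the repository's test.py):

    import torch
    import torch.nn.functional as F

    SCALES = [0.5, 0.75, 1.0, 1.25, 1.5, 1.75]

    def multi_scale_predict(model, image):
        # image: (1, 3, H, W) tensor; returns per-pixel class predictions
        _, _, h, w = image.shape
        total = 0
        for s in SCALES:
            scaled = F.interpolate(image, scale_factor=s, mode='bilinear',
                                   align_corners=True)
            logits = model(scaled)
            total = total + F.interpolate(logits, size=(h, w), mode='bilinear',
                                          align_corners=True)
        return (total / len(SCALES)).argmax(dim=1)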

Evaluation scripts

  • You can use the following script to generate ground truth - prediction pairs on PASCAL VOC validation set.
    python evaluate_and_save.py --backbone resnet50 --dataset pascal_voc --resume ./runs/pascal_voc/shelfnet/ShelfNet50_aux_se/model_best.pth.tar --eval
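
Once ground truth - prediction pairs are saved, mIoU can be computed by accumulating a confusion matrix over all pairs and averaging the per-class IoU. The sketch below is illustrative; the repository's own evaluation may differ in detail.

    import numpy as np

    def mean_iou(pairs, num_classes, ignore_index=255):
        # pairs: iterable of (ground_truth, prediction) integer arrays of class ids
        conf = np.zeros((num_classes, num_classes), dtype=np.int64)
        for gt, pred in pairs:
            valid = gt != ignore_index
            conf += np.bincount(num_classes * gt[valid].astype(np.int64) + pred[valid],
                                minlength=num_classes ** 2).reshape(num_classes, num_classes)
        inter = np.diag(conf).astype(float)
        union = conf.sum(0) + conf.sum(1) - np.diag(conf)
        iou = inter / np.maximum(union, 1)
        return iou[union > 0].mean()  # average over classes that actually occur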

Measure running speed

  • Measure the running speed of ShelfNet on a 512x512 image.
    python test_speed.py --model shelfnet --backbone resnet101
    python test_speed.py --model pspnet --backbone resnet101
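
The speed test amounts to timing repeated forward passes on a fixed-size input. An illustrative sketch of how such a measurement is usually done (warm-up, then synchronize the GPU around a timed loop); the repository's test_speed.py may differ in detail:

    import time
    import torch

    def measure_fps(model, iterations=100, size=512):
        model.eval().cuda()
        x = torch.randn(1, 3, size, size, device='cuda')
        with torch.no_grad():
            for _ in range(10):            # warm-up
                model(x)
            torch.cuda.synchronize()
            start = time.time()
            for _ in range(iterations):
                model(x)
            torch.cuda.synchronize()
        return iterations / (time.time() - start)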

Pre-trained weights

Structure of ShelfNet

(figure: structure of ShelfNet)

Examples on Pascal VOC datasets

(figure: example segmentation results on PASCAL VOC)

Video Demo on Cityscapes datasets

  • Video demo of ShelfNet50 on Cityscapes
  • Video demo of ShelfNet101 on Cityscapes

Numerical results on Pascal VOC test set

(figure: numerical results on the PASCAL VOC test set)

