DenseASPP for Semantic Segmentation in Street Scenes

Introduction

Semantic image segmentation is a basic street scene understanding task in autonomous driving, where each pixel in a high-resolution image is assigned one of a set of semantic labels. Unlike other scenarios, objects in autonomous driving scenes exhibit very large scale changes, which poses great challenges for high-level feature representation, in the sense that multi-scale information must be correctly encoded.

To remedy this problem, atrous convolution[2, 3] was introduced to generate features with larger receptive fields without sacrificing spatial resolution. Building on atrous convolution, Atrous Spatial Pyramid Pooling (ASPP)[3] was proposed to concatenate multiple atrous-convolved features, computed with different dilation rates, into a final feature representation. Although ASPP is able to generate multi-scale features, we argue that the feature resolution along the scale axis is not dense enough for the autonomous driving scenario. To this end, we propose Densely connected Atrous Spatial Pyramid Pooling (DenseASPP), which connects a set of atrous convolutional layers in a dense way, so that it generates multi-scale features that not only cover a larger scale range but also cover that range densely, without significantly increasing the model size. We evaluate DenseASPP on the street scene benchmark Cityscapes[4] and achieve state-of-the-art performance.
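The difference in scale coverage can be illustrated with a little receptive-field arithmetic. The sketch below is only an illustration, not the repository's code: the dilation rates (3, 6, 12, 18, 24), the 3x3 kernel size, and all function names are assumptions chosen for the example. It relies on two standard facts: an atrous convolution with kernel size k and dilation d has a receptive field of (k - 1) * d + 1, and stacking layers adds their receptive fields minus one per extra layer. In ASPP the branches run in parallel, so each output sees one dilation rate; with dense connections, a feature can pass through any subset of the layers.

```python
from itertools import combinations


def atrous_receptive_field(kernel_size, dilation):
    """1-D receptive field of a single atrous (dilated) convolution."""
    return (kernel_size - 1) * dilation + 1


def stacked_receptive_field(fields):
    """Receptive field of several convolutions applied one after another."""
    return sum(fields) - (len(fields) - 1)


def aspp_scales(rates, kernel_size=3):
    """Parallel branches: one receptive field per dilation rate."""
    return sorted(atrous_receptive_field(kernel_size, d) for d in rates)


def dense_aspp_scales(rates, kernel_size=3):
    """Dense connections: a feature may traverse any non-empty subset of layers."""
    fields = [atrous_receptive_field(kernel_size, d) for d in rates]
    scales = set()
    for n in range(1, len(fields) + 1):
        for subset in combinations(fields, n):
            scales.add(stacked_receptive_field(subset))
    return sorted(scales)


# Illustrative dilation rates (an assumption for this sketch).
RATES = (3, 6, 12, 18, 24)

if __name__ == "__main__":
    print("ASPP scales:     ", aspp_scales(RATES))
    print("DenseASPP scales:", dense_aspp_scales(RATES))
```

Under these assumed rates, the parallel arrangement yields 5 distinct receptive-field sizes with a maximum of 49, while the dense arrangement yields 21 distinct sizes with a maximum of 127. These numbers count only the pyramid layers and ignore the backbone network.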

Usage

1. Clone the repository:

git clone https://github.com/DeepMotionAIResearch/DenseASPP.git

2. Download a pretrained model:

Put the model in the weights folder. We provide some checkpoints to run the code:

DenseNet161-based model: Google Drive

MobileNet v2-based model: Coming soon.

Performance of these checkpoints:

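| Checkpoint name | Multi-scale inference | Cityscapes mIOU (val) | Cityscapes mIOU (test) | File Size |
|---|---|---|---|---|
| DenseASPP161 | False | 78.8% | - | 142.7 MB |
| DenseASPP161 | True | 79.8% | 79.0% | 142.7 MB |
| MobileNetDenseASPP | False | 74.5% | - | 10.2 MB |
| MobileNetDenseASPP | True | 75.0% | - | 10.2 MB |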

Please note that the performance of these checkpoints can be further improved by fine-tuning. Also note that these models were trained with PyTorch 0.3.1.
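This README does not describe how multi-scale inference is implemented. One common scheme, sketched below purely as an assumption, is to run the model on several rescaled copies of the image, resize each score map back to the original size, and average them. Everything here is hypothetical and simplified for illustration: the function names, the scale set, the nearest-neighbour resizing, and the use of a single-channel score map stored as nested lists. `score_fn` stands in for the network and must return a map of the same size as its input.

```python
def resize_nearest(grid, out_h, out_w):
    """Nearest-neighbour resize of a 2-D list of numbers."""
    in_h, in_w = len(grid), len(grid[0])
    return [
        [grid[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def multi_scale_predict(image, score_fn, scales=(0.5, 1.0, 2.0)):
    """Average score maps predicted at several input scales.

    image    -- 2-D list (H x W) of numbers
    score_fn -- hypothetical stand-in for the model: maps an h x w grid
                to an h x w grid of scores
    scales   -- illustrative scale factors (an assumption, not from the repo)
    """
    h, w = len(image), len(image[0])
    total = [[0.0] * w for _ in range(h)]
    for s in scales:
        sh, sw = max(1, round(h * s)), max(1, round(w * s))
        scores = score_fn(resize_nearest(image, sh, sw))
        restored = resize_nearest(scores, h, w)  # back to the original size
        for y in range(h):
            for x in range(w):
                total[y][x] += restored[y][x]
    return [[v / len(scales) for v in row] for row in total]
```

A real segmentation model produces one score map per class, so in practice the averaging would be applied per class before taking the argmax.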

3. Inference

First cd to your code root, then run:

 python demo.py  --model_name DenseASPP161 --model_path <your checkpoint path> --img_dir <your img directory>

4. Evaluate the results

Please cd to ./utils, then run:

 python transfer.py

Then evaluate the results with the official Cityscapes evaluation code (the cityscapesScripts package).
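For reference, the mIOU metric reported in the table above is the mean over classes of the intersection-over-union, TP / (TP + FP + FN). The sketch below is a minimal illustration of that definition and is not the official Cityscapes script. It assumes flattened lists of integer class labels and an ignore label of 255 for unlabeled pixels; the function name and parameters are hypothetical.

```python
def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean intersection-over-union over flattened label lists.

    pred, target -- equal-length sequences of integer class ids
    ignore_label -- target value to skip (assumed to be 255 here)
    """
    # confusion[t][p] counts pixels of true class t predicted as class p
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both prediction and target
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0
```

To score a whole validation set with this sketch, the labels of all images would be concatenated (or the confusion matrix accumulated) before the per-class ratios are taken, rather than averaging per-image scores.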

References

  1. DenseASPP for Semantic Segmentation in Street Scenes
    Maoke Yang, Kun Yu, Chi Zhang, Zhiwei Li, Kuiyuan Yang.
    In CVPR, 2018.

  2. Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs
    Liang-Chieh Chen+, George Papandreou+, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille (+ equal contribution).
    In ICLR, 2015.

  3. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
    Liang-Chieh Chen+, George Papandreou+, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille (+ equal contribution).
    In TPAMI, 2017.

  4. The Cityscapes Dataset for Semantic Urban Scene Understanding
    Marius Cordts, Mohamed Omran, Sebastian Ramos, Timo Rehfeld, Markus Enzweiler, Rodrigo Benenson, Uwe Franke, Stefan Roth, Bernt Schiele.
    In CVPR, 2016.
