Tree-structured Kronecker Convolutional Network for Semantic Segmentation

Introduction

Most existing semantic segmentation methods employ atrous convolution to enlarge the receptive field of filters, but in doing so they neglect important local contextual information. To tackle this issue, we first propose a novel Kronecker convolution, which uses the Kronecker product to expand its kernel so that it takes into account the feature vectors neglected by atrous convolution. It can therefore capture local contextual information and enlarge the field of view of filters simultaneously, without introducing extra parameters. Second, we propose the Tree-structured Feature Aggregation (TFA) module, which expands by a recursive rule and forms a hierarchical structure. It can thus naturally learn representations of multi-scale objects and encode hierarchical contextual information in complex scenes. Finally, we design the Tree-structured Kronecker Convolutional Network (TKCN), which employs Kronecker convolution and the TFA module. Extensive experiments on three datasets, PASCAL VOC 2012, PASCAL-Context, and Cityscapes, verify the effectiveness of the proposed approach.
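To make the kernel-expansion idea concrete, here is a minimal pure-Python sketch of expanding a small kernel with a Kronecker product. The form of the second factor (an r1 x r1 matrix whose top-left r2 x r2 block is ones) and the names `transform_matrix` and `expand_kernel` are assumptions made for illustration; they are not taken from this README or from the repository code. Under that assumption, r2 = 1 gives an ordinary dilated (atrous) pattern, while r2 > 1 fills in neighbouring positions that the dilated kernel would skip, reusing the same weights.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        [a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
         for j in range(len(a[0]) * cols_b)]
        for i in range(len(a) * rows_b)
    ]


def transform_matrix(r1, r2):
    """Hypothetical r1 x r1 matrix: ones in the top-left r2 x r2 block, zeros elsewhere."""
    if not 1 <= r2 <= r1:
        raise ValueError("expected 1 <= r2 <= r1")
    return [[1 if (i < r2 and j < r2) else 0 for j in range(r1)]
            for i in range(r1)]


def expand_kernel(kernel, r1, r2):
    """Expand `kernel` by a Kronecker product; no new learnable weights are created."""
    return kron(kernel, transform_matrix(r1, r2))


kernel = [[1, 2],
          [3, 4]]

# r2 = 1: each weight is followed by zeros, i.e. a dilated pattern with rate r1.
atrous_like = expand_kernel(kernel, r1=2, r2=1)
# [[1, 0, 2, 0], [0, 0, 0, 0], [3, 0, 4, 0], [0, 0, 0, 0]]

# r2 = 2: each weight is shared over a 2 x 2 block, so skipped positions are covered.
dense_local = expand_kernel(kernel, r1=2, r2=2)
# [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```

In both cases the expanded kernel is built from the same four values, which illustrates how the field of view can grow without adding parameters.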

Approach

Performance

For VOC 2012, we evaluate the proposed TKCN model on the test set without external data such as the COCO dataset.

For Cityscapes, the proposed TKCN is trained only on the fine-labeled set.

Method	Conference	Backbone	PASCAL VOC 2012 test set	Cityscapes test set	PASCAL-Context val set
DeepLabv2	-	ResNet-101	79.7	70.4	45.7
RefineNet	CVPR2017	ResNet-101	82.4	73.6	47.1
SAC	ICCV2017	ResNet-101	-	78.1	-
PSPNet	CVPR2017	ResNet-101	82.6	78.4	47.8
DUC-HDC	WACV2018	ResNet-101	-	77.6	-
AAF	ECCV2018	ResNet-101	82.2	79.1	-
BiSeNet	ECCV2018	ResNet-101	-	78.9	-
PSANet	ECCV2018	ResNet-101	-	80.1	-
DeepLabv3+	ECCV2018	Xception	89.0	-	-
DFN	CVPR2018	ResNet-101	82.7	79.3	-
DSSPN	CVPR2018	ResNet-101	-	77.8	-
CCL	CVPR2018	ResNet-101	-	-	51.6
EncNet	CVPR2018	ResNet-101	82.9	-	51.7
DenseASPP	CVPR2018	DenseNet	-	80.6	-
TKCN	-	ResNet-101	83.2	79.5	51.8

Note: "-" indicates that the approach does not report the corresponding result. DeepLabv3+ employs a more powerful network (Xception) as the backbone and is pretrained on MS-COCO and JFT. DenseASPP also employs a more powerful backbone network (DenseNet).

Installation

Install PyTorch

The code was developed with Python 3.6.6 on Ubuntu 16.04 (GPU: Tesla K80; PyTorch: 0.5.0a0+a24163a; CUDA: 8.0).

Clone the repository

git clone https://github.com/wutianyiRosun/TKCN.git 
cd TKCN
python setup.py install

Pretrained model

The pretrained model ImageNet_ResNet-101 is available here. Put it under the folder "./TKCN/tkcn/pretrained_models".

Dataset Configuration

Download the Cityscapes dataset and convert it to 19 categories. It should have the following basic structure:

├── cityscapes_test_list.txt
├── cityscapes_train_list.txt
├── cityscapes_trainval_list.txt
├── cityscapes_val_list.txt
├── cityscapes_val.txt
├── gtCoarse
│   ├── train
│   ├── train_extra
│   └── val
├── gtFine
│   ├── test
│   ├── train
│   └── val
├── leftImg8bit
│   ├── test
│   ├── train
│   └── val
├── license.txt

These .txt files can be downloaded from here.
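Before training, it can help to confirm that the dataset folder matches the layout above. The following is a small sketch that only checks for the files and directories shown in the listing; the helper name `missing_entries` is hypothetical and is not part of the TKCN code.

```python
from pathlib import Path

# Entries copied from the directory listing above.
EXPECTED_FILES = [
    "cityscapes_test_list.txt",
    "cityscapes_train_list.txt",
    "cityscapes_trainval_list.txt",
    "cityscapes_val_list.txt",
    "cityscapes_val.txt",
    "license.txt",
]
EXPECTED_DIRS = [
    "gtCoarse/train", "gtCoarse/train_extra", "gtCoarse/val",
    "gtFine/test", "gtFine/train", "gtFine/val",
    "leftImg8bit/test", "leftImg8bit/train", "leftImg8bit/val",
]


def missing_entries(root):
    """Return the expected files and directories that are absent under `root`."""
    root = Path(root)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing


# Example use (the path is a placeholder for your own dataset location):
#     for name in missing_entries("/path/to/cityscapes"):
#         print("missing:", name)
```

An empty list means every entry from the listing was found; the check does not validate the contents of the list files or the label conversion.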

Train your own model

For Cityscapes

training on train+val set

cd tkcn
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5 python train.py --model tkcnet --backbone resnet101

single-scale testing (on test set)

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python eval.py --model tkcnet --backbone resnet101 --resume-dir cityscapes/model/tkcnet_model_resnet101_cityscapes_gpu6bs6epochs240/TKCNet101 --resume-file checkpoint_240.pth.tar

multi-scale testing (on test set)

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python eval.py --model tkcnet --backbone resnet101 --multi-scales --resume-dir cityscapes/model/tkcnet_model_resnet101_cityscapes_gpu6bs6epochs240/TKCNet101 --resume-file checkpoint_240.pth.tar

For testing, the pretrained model file can be downloaded here: tkcn_cityscapes_checkpoint_240_ontrainval.pth

Citation

If TKCN is useful for your research, please consider citing:

@article{wu2018tree,
  title={Tree-structured Kronecker Convolutional Networks for Semantic Segmentation},
  author={Wu, Tianyi and Tang, Sheng and Zhang, Rui and Li, Jintao},
  journal={arXiv preprint arXiv:1812.04945},
  year={2018}
}
