
IJCAI2020 & IJCV2021 :city_sunrise: Unsupervised Scene Adaptation with Memory Regularization in vivo

Home Page: https://arxiv.org/abs/1912.11164

License: MIT License

Topics: ijcai, domain-adaptation, semantic-segmentation, pytorch, ijcai2020, pytorch-implementation, gta5, cityscapes, synthia, robotcar, self-driving-car, domainadaptation, transfer-learning, mrnet, ijcv

seg-uncertainty's Introduction

Seg_Uncertainty


Zhedong Zheng, Yi Yang

In this repo, we provide the code for the two papers:

  • Unsupervised Scene Adaptation with Memory Regularization in vivo (IJCAI 2020)
  • Rectifying Pseudo Label Learning via Uncertainty Estimation for Domain Adaptive Semantic Segmentation (IJCV 2021)

Initial Model

The original DeepLab model link hosted at ucmerced no longer works. Please use one of the following links instead.

[Google Drive] https://drive.google.com/file/d/1BMTTMCNkV98pjZh_rU0Pp47zeVqF3MEc/view?usp=share_link

[One Drive] https://1drv.ms/u/s!Avx-MJllNj5b3SqR7yurCxTgIUOK?e=A1dq3m

or use

pip install gdown
pip install --upgrade gdown
gdown 1BMTTMCNkV98pjZh_rU0Pp47zeVqF3MEc

News

  • [19 Jan 2024] We further apply uncertainty to composed image retrieval. The paper is accepted by ICLR'24 [code].

  • [27 Jan 2023] You are welcome to check out our new transformer-based work, PiPa, which achieves 75.6 mIoU on GTA5->Cityscapes.

  • [5 Sep 2021] Zheng et al. apply uncertainty to domain-adaptive re-ID and also achieve good performance: "Exploiting Sample Uncertainty for Domain Adaptive Person Re-Identification", Kecheng Zheng, Cuiling Lan, Wenjun Zeng, Zhizheng Zhang, and Zheng-Jun Zha. AAAI 2021.

  • [13 Aug 2021] We release a new method based on Adaptive Boosting (AdaBoost) for domain adaptation. You can find the project at https://github.com/layumi/AdaBoost_Seg

Common Q&A

  1. Why is the KL divergence always non-negative (>= 0)?

Please check the Wikipedia entry (https://en.wikipedia.org/wiki/Kullback–Leibler_divergence#Properties), which gives a good proof.
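
In short, non-negativity follows from Jensen's inequality applied to the concave logarithm (Gibbs' inequality):

\mathrm{KL}(p \| q) = \sum_i p_i \log\frac{p_i}{q_i}
                    = -\sum_i p_i \log\frac{q_i}{p_i}
                    \ge -\log \sum_i p_i \cdot \frac{q_i}{p_i}
                    = -\log \sum_i q_i = 0,

with equality if and only if p = q.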

  2. Why are both log_sm and sm used?

You may check the PyTorch doc at https://pytorch.org/docs/stable/generated/torch.nn.KLDivLoss.html?highlight=nn%20kldivloss#torch.nn.KLDivLoss. I follow the discussion at https://discuss.pytorch.org/t/kl-divergence-loss/65393
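
As a minimal sketch of that convention (illustrative tensors, not code from this repo): nn.KLDivLoss expects its input in log-space and its target as plain probabilities, which is why both log_softmax (log_sm) and softmax (sm) appear.

import torch
import torch.nn as nn
import torch.nn.functional as F

kl = nn.KLDivLoss(reduction='batchmean')
logits_a = torch.randn(4, 19)  # e.g., outputs of one classifier
logits_b = torch.randn(4, 19)  # e.g., outputs of the other classifier

# input must be log-probabilities, target must be probabilities
loss = kl(F.log_softmax(logits_a, dim=1), F.softmax(logits_b, dim=1))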

The Core Code

The core code is relatively simple and can be applied directly to other works.
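
For illustration, here is a minimal sketch of the memory-regularization KL term, reconstructed from the loss_kl line quoted in the issues below (the definition of mean_pred as the average of the two softmax outputs is an assumption):

import torch.nn as nn

kl_loss = nn.KLDivLoss(reduction='sum')
log_sm = nn.LogSoftmax(dim=1)
sm = nn.Softmax(dim=1)

def memory_regularization(pred_target1, pred_target2):
    # pred_target1 / pred_target2: logits (N, C, H, W) from the two
    # classifiers on the same target image
    n, c, h, w = pred_target1.shape
    mean_pred = (sm(pred_target1) + sm(pred_target2)) / 2  # assumed definition
    loss_kl = (kl_loss(log_sm(pred_target2), mean_pred)
               + kl_loss(log_sm(pred_target1), mean_pred)) / (n * h * w)
    return loss_kl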

Prerequisites

  • Python 3.6
  • GPU memory >= 11 GB (e.g., RTX 2080 Ti or GTX 1080 Ti)
  • PyTorch or PaddlePaddle

Prepare Data

Download [GTA5] and [Cityscapes] to run the basic code. Alternatively, you can download two extra datasets, [SYNTHIA] and [OxfordRobotCar].

The data folder is structured as follows:

├── data/
│   ├── Cityscapes/
│   │   └── data/
│   │       ├── gtFine/
│   │       └── leftImg8bit/
│   ├── GTA5/
│   │   ├── images/
│   │   ├── labels/
│   │   └── ...
│   ├── synthia/
│   │   ├── RGB/
│   │   ├── GT/
│   │   ├── Depth/
│   │   └── ...
│   └── Oxford_Robot_ICCV19/
│       ├── train/
│       └── ...
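
A quick sanity check for the layout above (a sketch; it assumes the tree is rooted in the working directory):

import os

required = [
    'data/Cityscapes/data/gtFine',
    'data/Cityscapes/data/leftImg8bit',
    'data/GTA5/images',
    'data/GTA5/labels',
]
for path in required:
    print(path, 'OK' if os.path.isdir(path) else 'MISSING')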

Training

Stage-I:

python train_ms.py --snapshot-dir ./snapshots/SE_GN_batchsize2_1024x512_pp_ms_me0_classbalance7_kl0.1_lr2_drop0.1_seg0.5  --drop 0.1 --warm-up 5000 --batch-size 2 --learning-rate 2e-4 --crop-size 1024,512 --lambda-seg 0.5  --lambda-adv-target1 0.0002 --lambda-adv-target2 0.001   --lambda-me-target 0  --lambda-kl-target 0.1  --norm-style gn  --class-balance  --only-hard-label 80  --max-value 7  --gpu-ids 0,1  --often-balance  --use-se  

Generate Pseudo Label:

python generate_plabel_cityscapes.py  --restore-from ./snapshots/SE_GN_batchsize2_1024x512_pp_ms_me0_classbalance7_kl0.1_lr2_drop0.1_seg0.5/GTA5_25000.pth

Stage-II (with rectified pseudo labels):

python train_ft.py --snapshot-dir ./snapshots/1280x640_restore_ft_GN_batchsize9_512x256_pp_ms_me0_classbalance7_kl0_lr1_drop0.2_seg0.5_BN_80_255_0.8_Noaug --restore-from ./snapshots/SE_GN_batchsize2_1024x512_pp_ms_me0_classbalance7_kl0.1_lr2_drop0.1_seg0.5/GTA5_25000.pth --drop 0.2 --warm-up 5000 --batch-size 9 --learning-rate 1e-4 --crop-size 512,256 --lambda-seg 0.5 --lambda-adv-target1 0 --lambda-adv-target2 0 --lambda-me-target 0 --lambda-kl-target 0 --norm-style gn --class-balance --only-hard-label 80 --max-value 7 --gpu-ids 0,1,2 --often-balance  --use-se  --input-size 1280,640  --train_bn  --autoaug False

*** If you want to run the code without rectifying the pseudo labels, please change [this line] to 'from trainer_ms import AD_Trainer', which applies conventional pseudo-label learning. ***
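
Concretely, the swap at the top of the Stage-II training script looks like this (a sketch of the one-line change described above):

# default Stage-II trainer, with uncertainty-rectified pseudo labels:
# from trainer_ms_variance import AD_Trainer

# conventional pseudo-label learning instead:
from trainer_ms import AD_Trainer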

Testing

python evaluate_cityscapes.py --restore-from ./snapshots/1280x640_restore_ft_GN_batchsize9_512x256_pp_ms_me0_classbalance7_kl0_lr1_drop0.2_seg0.5_BN_80_255_0.8_Noaug/GTA5_25000.pth

Trained Model

The trained model is available at https://drive.google.com/file/d/1smh1sbOutJwhrfK8dk-tNvonc0HLaSsw/view?usp=sharing

  • The folder with 'SY' in its name is for SYNTHIA-to-Cityscapes.
  • The folder with 'RB' in its name is for Cityscapes-to-RobotCar.

One Note for SYNTHIA-to-Cityscapes

Note that the evaluation code provided for SYNTHIA-to-Cityscapes still averages the IoU over all 19 classes. You need to re-calculate the mean over the 16 classes shared between SYNTHIA and Cityscapes; the result then matches the value reported in the paper.
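
For example, a small post-processing sketch (it assumes the common 16-class protocol that drops terrain, truck, and train from the 19 Cityscapes classes; verify the indices against your class order):

import numpy as np

# per-class IoUs printed by evaluate_cityscapes.py, in the 19-class
# Cityscapes order (placeholder values here)
ious = np.random.rand(19)

# assumed excluded classes: terrain (9), truck (14), train (16)
shared = [i for i in range(19) if i not in (9, 14, 16)]
print('mIoU over 16 shared classes: %.2f' % (100 * ious[shared].mean()))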

Related Works

We would also like to thank the following great works:

Citation

@inproceedings{zheng2020unsupervised,
  title={Unsupervised Scene Adaptation with Memory Regularization in vivo},
  author={Zheng, Zhedong and Yang, Yi},
  booktitle={IJCAI},
  year={2020}
}
@article{zheng2021rectifying,
  title={Rectifying Pseudo Label Learning via Uncertainty Estimation for Domain Adaptive Semantic Segmentation},
  author={Zheng, Zhedong and Yang, Yi},
  journal={International Journal of Computer Vision (IJCV)},
  doi={10.1007/s11263-020-01395-y},
  year={2021}
}

Contributors

layumi, royalvane, solacex


seg-uncertainty's Issues

Question about stage2

Nice work! I am a novice in the domain adaptation field, and I have a question about Stage-II.
In my understanding of your paper, uncertainty estimation is used on target-domain images.
But in your code, trainer_ms_variance.py, it seems to be calculated on source-domain images.

def gen_update(self, images, images_t, labels, labels_t, i_iter):
    self.gen_opt.zero_grad()

    pred1, pred2 = self.G(images)
    pred1 = self.interp(pred1)
    pred2 = self.interp(pred2)

    if self.class_balance:
        self.seg_loss = self.update_class_criterion(labels)

    loss_seg1 = self.update_variance(labels, pred1, pred2)
    loss_seg2 = self.update_variance(labels, pred2, pred1)

    loss = loss_seg2 + self.lambda_seg * loss_seg1

Batch size

How much does performance decrease if batch size 1 is used in the first stage? Thanks.

Question about pseudo-labels being applied to the source domain but not the target domain in the code

Hi, thanks for the nice code!
I have some questions about the pseudo-label training part of the code.

This is the pseudo-labeling loss on the source data, as in Eq. 11 of the paper:
loss_seg1 = self.update_variance(labels, pred1, pred2), in which the labels come from the source domain.

The target domain does not use pseudo-labels but rather entropy minimization:
loss_kl = (self.kl_loss(self.log_sm(pred_target2), mean_pred) + self.kl_loss(self.log_sm(pred_target1), mean_pred)) / (n*h*w)

Cannot get good performance

Hi, I have some problems.
1. Your released models: for the Stage-I model (SE_GN_batchsize2_1024x512_pp_ms_me0_classbalance7_kl0.1_lr2_drop0.1_seg0.5), I tested it using your code and the mIoU is only 38.07, much lower than 45.5 (the Stage-I mIoU in the MRNet paper). The Stage-II model (1280x640_restore_ft_GN_batchsize9_512x256_pp_ms_me0_classbalance7_kl0_lr1_drop0.2_seg0.5_BN_80_255_0.8_Noaug) gives 50.34, which matches the result reported in the second paper.
2. Training results: I used your released Stage-I model to generate pseudo labels and then trained Stage-II. However, the performance is low: 43.89 for 25000.pth, 42.69 for 50000.pth, and 41.98 for 100000.pth, much lower than the results in your paper. I did not change any code except the variable DATA_DIRECTORY.

Problems about training

Why do you train the model in Stage-I and Stage-III for 100,000 iterations, but choose the 25,000th-iteration checkpoint for fine-tuning and final evaluation?

Question about stage 3

Hi, thanks for your great work!
Recently I've been reading your code, and I have a question about Stage-II (rectifying). You set lambda_adv_target1 and lambda_adv_target2 to 0, which means there is no adversarial training in Stage-II (right?). But you keep training the generator with false guidance from the discriminator (the discriminator weights are not loaded in Stage-III); you annotated here that you keep training G, but here you never update D. Is this the intended behavior, or have I misunderstood something?

load state dict error

When running Stage-I training, the state dict of the self.G model is inconsistent with the pretrained model at 'http://vllab.ucmerced.edu/ytsai/CVPR18/DeepLab_resnet_pretrained_init-f81d91e8.pth'. For example, self.G has the keys 'layer5.conv2d_list.0.0.weight' and 'layer6.bottleneck.0.se.0.weight', but the pretrained model has the key 'layer5.conv2d_list.0.weight' and no module 'layer6.bottleneck.0.se.0.weight'. Should I set strict=False in load_state_dict()?
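
For reference, non-strict loading looks like this (a sketch; the torchvision model is only a stand-in for self.G):

import torch
import torchvision

model = torchvision.models.resnet101()  # stand-in for self.G
saved = torch.load('DeepLab_resnet_pretrained_init-f81d91e8.pth', map_location='cpu')

# with strict=False, keys that exist on only one side are reported, not fatal
result = model.load_state_dict(saved, strict=False)
print('missing keys:', len(result.missing_keys))
print('unexpected keys:', len(result.unexpected_keys))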

Out of memory

I use two TITAN X GPUs. Can you tell me what type of GPU you used?

Question about SYNTHIA dataset classes num

Hello! As we know, the SYNTHIA dataset shares only 16 classes with Cityscapes. So when training the model on the SYNTHIA dataset, do I need to change the model's classification layer to 16 channels? Does this matter?

Without rectifying pseudo labels in Stage-II

If I want to run the code without rectifying the pseudo labels, I am confused about why the loss weight lambda-kl-target is set to 0 in Stage-II (from the README).
From Eq. 9 of the IJCAI 2020 paper, I understand the KL-divergence loss should be used.
Thanks

About data loading

Hello, thanks for sharing the great work. I have a quick question. I find that "random scale" and "mirror" are used for preprocessing the data, which is a little different from previous works.

  • For what reason did you adopt these strategies?

  • Have you conducted experiments to verify how much performance improvement they bring?

Thanks!

About rectification in stage 2

Hi, thanks for the code!
I was wondering why in stage 2 the loss weight lambda-kl-target is set to 0.
From my understanding, the rectification is done by using this term.
Thanks in advance.

Prediction variance and Prediction confidence

Hello,

Could you explain the difference between prediction variance and prediction confidence? In the paper, Figure 5 visualizes both, but I'm not sure how you calculate them. I assume the KL-divergence term is used as the variance; then how do you get the confidence?

Thanks in advance.

Discriminator problem

Thank you for sharing this wonderful code!
I have a small question about the discriminator. I find the adversarial loss of AdaptSegNet very unstable because of the global alignment on the segmentation output. I added non-local attention to the discriminator, but the performance dropped dramatically. Regarding your discriminator, you say 'we follow the PatchGAN and deploy the multi-scale discriminator model'. What motivated this strategy, and did you run an experiment to measure how much performance improvement this approach brings?

How to change class balance parameters according to input size

Hi, thank you for your great work on domain adaptation!
The original input image is 1024x512, but now I want to train at a higher resolution, like 1920x1080.
How should I change the class-balance parameters, like max_value and 64*64*n?
In my opinion, it would be better to use a ratio threshold instead of a hard-coded number like 64, like this:

for i in range(self.num_classes):
    count[i] = torch.sum(labels == i)
    if count[i] / (n * h * w) < 1 / 128.0:  # small classes; original train size is 1024x512
        weight[i] = self.max_value

since 64*64 / (512*1024) = 1/128.
But I get worse results than the default settings trained with input size 1024x512.
Is there anything that I missed?
Any help will be appreciated.

Where is the code for MC-Dropout?

Hi @layumi, I noticed that in your paper the uncertainty is obtained by computing both MC-Dropout and the difference between the two classifiers, but I cannot find the code for MC-Dropout. Could you please point me to it?

Tabular data / noisy instances

Hi,
Thanks for sharing your implementation. I have two questions about it:

  1. Does it also work on tabular data?
  2. Is it possible to identify the noisy instances (return the noisy IDs or the clean set)?

Thanks!
