
mic's People

Contributors

krumo, lhoyer


mic's Issues

Question about reproducing DAFormer for Cityscapes-to-DarkZurich

Hi, Thanks for your great work.
 
I attempted to reproduce your previous work (DAFormer) and achieved similar results for the GTA to Cityscapes and Synthia to Cityscapes tasks. However, I did not achieve the performance levels stated in the paper for the Cityscapes to DarkZurich (Test) and Cityscapes to ACDC (Test) tasks (mIoU=53.8, 55.4).

Regarding the implementation details, the only information I could find for DAFormer without MIC was the usage of half resolution. Therefore, I only made a change in the configuration file from '../_base_/datasets/uda_gta_to_cityscapes_512x512.py' to '../_base_/datasets/uda_cityscapes_to_darkzurich_512x512.py'. I wonder if there's anything I missed.
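For reference, a minimal sketch of what the modified experiment config could look like in an mmseg-style layout (the base files other than the dataset line are assumptions, not the exact DAFormer config):

# Hypothetical config excerpt; only the dataset entry reflects the change described above.
_base_ = [
    '../_base_/default_runtime.py',
    '../_base_/models/daformer_sepaspp_mitb5.py',
    # replaces '../_base_/datasets/uda_gta_to_cityscapes_512x512.py'
    '../_base_/datasets/uda_cityscapes_to_darkzurich_512x512.py',
    '../_base_/uda/dacs.py',
    '../_base_/schedules/adamw.py',
    '../_base_/schedules/poly10warm.py',
]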

Here are the details of my environment:

CUDA_HOME: /usr/local/cuda-11.3
NVCC: Build cuda_11.3.r11.3/compiler.29920130_0
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.10.1+cu113
PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX512
  - CUDA Runtime 11.3
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  - CuDNN 8.2
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.10.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 

TorchVision: 0.11.2+cu113
OpenCV: 4.4.0
MMCV: 1.4.2
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.3
MMSegmentation: 0.16.0+81782e0

Here is the result (Cityscapes to DarkZurich (Test)):

IoU road            : 88.17
IoU sidewalk        : 48.98
IoU building        : 69.75
IoU wall            : 29.99
IoU fence           : 16.46
IoU pole            : 50.43
IoU traffic light   : 39.95
IoU traffic sign    : 39.59
IoU vegetation      : 54.96
IoU terrain         : 29.87
IoU sky             : 59.57
IoU person          : 53.39
IoU rider           : 53.64
IoU car             : 79.65
IoU truck           : 60.29
IoU bus             : 4.12
IoU train           : 89.25
IoU motorcycle      : 40.03
IoU bicycle         : 34.07
------------------------------------
Mean IoU over 19 classes: 49.59

Thank you once again for your excellent work.

About the code implementation

Hi, authors

Thanks for your great recent work MIC in DA, I am interested in it.

I want to train the model myself, but I ran into the issue below. Could you please help?

Traceback (most recent call last):
  File "/home/Desktop/MIC/seg/run_experiments.py", line 79, in
    f.write(child_cfg.pretty_text)
AttributeError: 'dict' object has no attribute 'pretty_text'
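For context, pretty_text is a property of mmcv Config objects rather than of plain Python dicts, so one possible workaround (an assumption, not necessarily the intended fix) is to wrap the dict before writing it:

from mmcv import Config

child_cfg = dict(data=dict(samples_per_gpu=2))  # placeholder dict standing in for the generated config
cfg = Config(child_cfg)      # plain dicts have no pretty_text; mmcv Config objects do
print(cfg.pretty_text)       # serializes the config as readable Python-style text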

Thanks

questions about mask

Hi!
Your work is excellent! But in Section 4.5 of your paper, "In-Depth Analysis of MIC", you say "Therefore, we also apply MIC to the source domain, in addition to the default target domain, for clear-to-adverse-weather and day-to-nighttime adaptation"; however, in your config you use mask_mode: separatetrgaug. Did you use the mask module on the source domain for Cityscapes-to-ACDC?

About the unstable performance of the class train on seed 2

Dear author,

First of all, thanks for your contributions to the deep learning community :)
After reproducing the experiments, I was able to achieve the same results as reported in the paper with some random seeds, but not with others:

  • 76.0 (seed=1)
  • 72.47 (seed=2)
  • 75.86 (seed=3)

I found that the bad performance with random seed=2 is because of the low IoU of the "train" class, which is near 0 at 40k iterations and only 30.48 at 60k iterations.
Have you faced this issue before?
If so, what do you think might be the reason for it?

Thank you!
YuKai Chen

How can I run the training code?

Dear Lukas,

First of all, thank you for providing valuable experiment results and code. I attempted to train the classification UDA model, but encountered a "No such file or directory" error.

I followed the installation instructions in the README and executed the command "python run_experiments.py --exp 2". However, I received the following error message after some initial output:

"Run config 1/13: {'exp': 1, 'name': 'office_ar2cl_cdan_mcc_sdat_masking_m64-07-a09-plw-cj02-02-b_vit', 'subfolder': 'examples', 'NGPUS': 1, 'NCPUS': 8, 'gpu_model': 'NVIDIATITANRTX', 'EXEC_CMD': 'python cdan_mcc_sdat_masking.py data/office-home -d OfficeHome -s Ar -t Cl --epochs 30 -b 24 --no-pool --lr 0.002 --seed 0 -a vit_base_patch16_224 --gpu 0 --rho 0.02 --alpha 0.9 --pseudo_label_weight prob --mask_block_size 64 --mask_ratio 0.7 --mask_color_jitter_s 0.2 --mask_color_jitter_p 0.2 --mask_blur True --log logs/cdan_mcc_sdat_vit/office_ar2cl --log_name office_ar2cl_cdan_mcc_sdat_masking_m64-07-a09-plw-cj02-02-b_vit --log_results'}: No such file or directory"

Is there anything else I need to do in order to run the code successfully?

Thank you.

Discussion about "mask"

Hi lhoyer,
This is a very solid and interesting piece of work; however, after reading your paper carefully, I have some questions. It should be noted that the following questions do not negate the contribution of your work at all, but rather discuss the debatable aspects of the "mask".

Regarding the masking of images, this technique is supposed to have been pioneered by SimCLR. Since then, this method has been widely used as a data augmentation in semi-supervised object detection (e.g., UBT) and unsupervised object detection (e.g., PT), as well as in image classification. But there seems to be no comparison with these methods in the paper, and I am not sure whether you have conducted any relevant experiments. The following image shows the results of a pair using the masking approach in UBT.
[image]

And this is the code of the mask adopted in some papers (e.g., UBT, SimCLR):
[image]

I have experimented with the random mask method for UBT, and there is a 2% AP impact in the Cityscapes-to-FoggyCityscapes experiment.

Is it fair to use FPN in the model?

Hi, thanks for your work! I want to ask whether it is fair to use FPN, because previous works mostly used Faster R-CNN without FPN. What do you think about this? Thank you very much!

COCO format label conversion

Hello,

I would appreciate it if you could provide me with specific instructions on how to convert the Cityscapes labels to COCO format.

With the many annotation files available in Cityscapes, I am uncertain about which ones to use and how to use them.

Or perhaps you can share the link for the COCO format label, as it would be more convenient for me to use.

Thank you.

A question about reproducing the result (environment setup)

Dear authors,

I am interested in your work and would like to learn more about it! I would like to ask whether I need to change the config file to reproduce the result reported in your MIC paper.
I tried once using the command python run_experiments.py --config configs/mic/gtaHR2csHR_mic_hrda.py and got a result of around 73 mIoU after 40,000 iterations, which is a bit lower. My environment is shown below.

file

Thanks a lot!
Kai

BatchNorm dimension mismatch

Dear authors,

I'm trying to reproduce the office-home experiments in the cls directory with python run_experiments.py --exp 1. However, I commented out Line 117 ('VisDA2017', 'Synthetic', 'Real'). When running the experiment, I got the following error:

[INFORMATION] The bottleneck dim is  256
[Masking] Use color augmentation.
lr_bbone: 0.0002
lr_btlnck: 0.002
Traceback (most recent call last):
  File "cdan_mcc_sdat_masking.py", line 393, in <module>
    main(args)
  File "cdan_mcc_sdat_masking.py", line 171, in main
    train(train_source_iter, train_target_iter, classifier, teacher,
  File "cdan_mcc_sdat_masking.py", line 235, in train
    pseudo_label_t, pseudo_prob_t = teacher(x_t)
  File "/hpc/home/zj63/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/hpc/home/zj63/.local/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "../dalib/modules/teacher.py", line 58, in forward
    logits, _ = self.ema_model(target_img)
  File "/hpc/home/zj63/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "../common/modules/classifier.py", line 80, in forward
    f = self.bottleneck(f)
  File "/hpc/home/zj63/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/hpc/home/zj63/.local/lib/python3.8/site-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "/hpc/home/zj63/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/hpc/home/zj63/.local/lib/python3.8/site-packages/torch/nn/modules/batchnorm.py", line 168, in forward
    return F.batch_norm(
  File "/hpc/home/zj63/.local/lib/python3.8/site-packages/torch/nn/functional.py", line 2282, in batch_norm
    return torch.batch_norm(
RuntimeError: running_mean should contain 197 elements not 256

Could you please give me some instructions on how to specify the hyperparameters correctly to reproduce only the office-home experiments?

Thanks and best regards

Questions about rare_class_sampling

Hi @lhoyer ,
While using run_experiments.py --exp to generate the config files, an error was reported: 'TypeError: UDADataset: __init__() got an unexpected keyword argument "rare_class_sampling"'. I hope you can give me some guidance on this. Thank you!

RuntimeError: Error compiling objects for extension

I am using PyTorch 1.9.0 and CUDA 11.1, which should be similar to the requirement (PyTorch 1.12.0 and CUDA 11.3).

However, when I run python setup.py build develop, I get the following error:

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "setup.py", line 59, in <module>
    setup(
  File "/root/miniconda3/lib/python3.8/site-packages/setuptools/__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "/root/miniconda3/lib/python3.8/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/root/miniconda3/lib/python3.8/distutils/dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "/root/miniconda3/lib/python3.8/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/root/miniconda3/lib/python3.8/distutils/command/build.py", line 135, in run
    self.run_command(cmd_name)
  File "/root/miniconda3/lib/python3.8/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/root/miniconda3/lib/python3.8/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/root/miniconda3/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 79, in run
    _build_ext.run(self)
  File "/root/miniconda3/lib/python3.8/site-packages/Cython/Distutils/old_build_ext.py", line 186, in run
    _build_ext.build_ext.run(self)
  File "/root/miniconda3/lib/python3.8/distutils/command/build_ext.py", line 340, in run
    self.build_extensions()
  File "/root/miniconda3/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 709, in build_extensions
    build_ext.build_extensions(self)
  File "/root/miniconda3/lib/python3.8/site-packages/Cython/Distutils/old_build_ext.py", line 195, in build_extensions
    _build_ext.build_ext.build_extensions(self)
  File "/root/miniconda3/lib/python3.8/distutils/command/build_ext.py", line 449, in build_extensions
    self._build_extensions_serial()
  File "/root/miniconda3/lib/python3.8/distutils/command/build_ext.py", line 474, in _build_extensions_serial
    self.build_extension(ext)
  File "/root/miniconda3/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 196, in build_extension
    _build_ext.build_extension(self, ext)
  File "/root/miniconda3/lib/python3.8/distutils/command/build_ext.py", line 528, in build_extension
    objects = self.compiler.compile(sources,
  File "/root/miniconda3/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 530, in unix_wrap_ninja_compile
    _write_ninja_file_and_compile_objects(
  File "/root/miniconda3/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1355, in _write_ninja_file_and_compile_objects
    _run_ninja_build(
  File "/root/miniconda3/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1682, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension

I'm uncertain if this installation problem is caused by a mismatched torch and cuda version because I couldn't find any hard constraints for the code related to torch and cuda versions.

Getting “Loss is NaN” when training with R-50-FPN-RETINANET

Hi, thanks for your excellent work! I want to train with another P6P7 backbone, so I changed the config file and modified it as follows:

  BACKBONE:     
      CONV_BODY: "R-50-FPN-RETINANET"

I want to know why it gets a NaN loss at around 400 iterations. Thank you for your reply!

2023-04-20 15:38:41,454 maskrcnn_benchmark.trainer INFO: eta: 2:49:17 iter: 380 loss: 14.9084 (13.1246) loss_classifier: 0.3986 (0.6124) loss_box_reg: 0.0224 (0.0164) loss_objectness: 0.4440 (0.6667) loss_rpn_box_reg: 12.3700 (10.2623) loss_da_image: 0.6944 (0.6936) loss_da_instance: 0.6991 (0.6960) loss_da_consistency: 0.2380 (0.1773) time: 0.1650 (0.1704) data: 0.0422 (0.0489) lr: 0.002103 max mem: 2533
2023-04-20 15:38:44,755 maskrcnn_benchmark.trainer INFO: eta: 2:48:58 iter: 400 loss: 14.6111 (13.1887) loss_classifier: 0.4045 (0.6058) loss_box_reg: 0.0159 (0.0186) loss_objectness: 0.4932 (0.6650) loss_rpn_box_reg: 11.5461 (10.3287) loss_da_image: 0.6944 (0.6936) loss_da_instance: 0.7034 (0.6968) loss_da_consistency: 0.2248 (0.1802) time: 0.1650 (0.1701) data: 0.0417 (0.0485) lr: 0.002170 max mem: 2533
2023-04-20 15:38:48,067 maskrcnn_benchmark.trainer INFO: eta: 2:48:42 iter: 420 loss: 13.8691 (13.3205) loss_classifier: 0.3920 (0.6000) loss_box_reg: 0.0038 (0.0190) loss_objectness: 0.6271 (0.6678) loss_rpn_box_reg: 11.1613 (10.4623) loss_da_image: 0.6944 (0.6937) loss_da_instance: 0.7033 (0.6970) loss_da_consistency: 0.1586 (0.1807) time: 0.1655 (0.1699) data: 0.0422 (0.0482) lr: 0.002237 max mem: 2533
2023-04-20 15:38:49,089 maskrcnn_benchmark.trainer CRITICAL: Loss is NaN, exiting...
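As a generic debugging aid that is independent of this repository, NaN losses appearing after a few hundred iterations are often tamed by lowering the learning rate or by clipping gradients, roughly as sketched below (names are illustrative):

import torch

def safe_update(model, optimizer, loss, max_norm=10.0):
    # Skip the step entirely if the loss is already non-finite,
    # otherwise clip the gradient norm before applying the update.
    if not torch.isfinite(loss):
        optimizer.zero_grad()
        return False
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)
    optimizer.step()
    return True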

Installation Instructions

I noticed that there are some issues in the directory structure in the installation instructions (the module we need for convert_cityscapes_to_coco.py doesn't get installed correctly into cityscapesScripts).

Questions about another domain adaptation dataset for segmentation

Dear Author,
Thank you for your excellent work from DAFormer to HRDA and now MIC!

I want to use another domain adaptation dataset with the MIC seg task. I have generated 3 JSON files for the new dataset.
I added an exp in experiments.py.
experiments.txt

And I created /MIC-master/seg/configs/_base_/datasets/uda_potsdam_to_isprs_1024x1024.py
uda_potsdam_to_isprs_1024x1024.txt

Unfortunately, something goes wrong and I don't know how to deal with it. I'm really hoping to get your advice. Thank you very much.


Question about reproducing the result from cityscapes to acdc

Dear authors:

I am interested in your nice work and am trying to reproduce the results in your paper. I set up the environment according to issue 8 and can almost reproduce the result from GTA to Cityscapes, but I cannot reproduce the result from Cityscapes to ACDC. Here are the training logs for GTA to Cityscapes and for Cityscapes to ACDC. Could you please help me identify any issues during the training process?

Thanks a lot!
Weihao

20230508_074602.log
20230518_144316.log

A question about DACS implementation

Dear @lhoyer,

While I'm studying your research in detail, I have a question about your code implementation.

In DACS, the algorithm selects half of the classes for a given image.

However, your implementation selects half of the classes pooled over all images in the batch:

classes = torch.unique(labels)

I think torch.unique(labels) should be torch.unique(label).
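To make the difference concrete, here is a small sketch (shapes and names are illustrative only):

import torch

labels = torch.randint(0, 19, (2, 512, 512))  # a batch of two segmentation label maps

# Per-batch selection (what the quoted line does): classes are pooled over the whole batch.
classes = torch.unique(labels)
chosen = classes[torch.randperm(len(classes))[:len(classes) // 2]]

# Per-image selection (how I read the DACS paper): each image samples half of its own classes.
for label in labels:
    classes_i = torch.unique(label)
    chosen_i = classes_i[torch.randperm(len(classes_i))[:len(classes_i) // 2]]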

Although my understanding of DACS could be incorrect, could you give me any advice about this?

Best regards,
Jeongkee

Question about running the code twice: ModuleNotFoundError: No module named 'torch._six'

When I run python run_experiments.py --config configs/mic/gtaHR2csHR_mic_hrda.py for the first time, it runs successfully. Here is the result:

n a future version of Python.
  mem_mb = torch.tensor([mem / (1024 * 1024)],
2023-06-19 01:14:18,488 - mmseg - INFO - Iter [39950/40000]	lr: 7.650e-08, eta: 0:02:25, time: 2.888, data_time: 0.035, memory: 22371, decode.loss_seg: 0.1006, decode.acc_seg: 88.0206, decode.hr.loss_seg: 0.0143, decode.hr.acc_seg: 88.1527, src.loss_imnet_feat_dist: nan, mix.decode.loss_seg: 0.0874, mix.decode.acc_seg: 90.3488, mix.decode.hr.loss_seg: 0.0113, mix.decode.hr.acc_seg: 90.4039, masked.decode.loss_seg: 0.1380, masked.decode.acc_seg: 91.3917, masked.decode.hr.loss_seg: 0.0231, masked.decode.hr.acc_seg: 88.0851
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 500/500, 1.1 task/s, elapsed: 452s, ETA:     0s2023-06-19 01:24:59,256 - mmseg - INFO - per class results:
2023-06-19 01:24:59,257 - mmseg - INFO - 
+---------------+-------+-------+
|     Class     |  IoU  |  Acc  |
+---------------+-------+-------+
|      road     | 96.96 | 98.93 |
|    sidewalk   | 77.87 | 87.97 |
|    building   | 91.44 | 96.48 |
|      wall     | 61.14 | 68.31 |
|     fence     | 51.73 | 60.98 |
|      pole     | 59.01 | 64.47 |
| traffic light | 65.67 |  78.1 |
|  traffic sign | 72.44 | 77.06 |
|   vegetation  | 91.67 | 96.35 |
|    terrain    | 50.95 | 57.63 |
|      sky      | 93.72 | 98.74 |
|     person    | 79.91 | 87.21 |
|     rider     | 55.39 | 76.59 |
|      car      | 94.53 | 96.29 |
|     truck     | 85.63 | 91.15 |
|      bus      | 88.46 | 91.15 |
|     train     | 61.48 | 85.21 |
|   motorcycle  | 65.66 | 80.18 |
|    bicycle    | 68.82 | 79.65 |
+---------------+-------+-------+
2023-06-19 01:24:59,258 - mmseg - INFO - Summary:
2023-06-19 01:24:59,258 - mmseg - INFO - 
+-------+-------+-------+
|  aAcc |  mIoU |  mAcc |
+-------+-------+-------+
| 95.18 | 74.34 | 82.76 |
+-------+-------+-------+
2023-06-19 01:24:59,263 - mmseg - INFO - Saving checkpoint at 40000 iterations
/home/ailab/anaconda3/envs/SMIC/lib/python3.8/site-packages/mmcv/runner/hooks/logger/text.py:55: DeprecationWarning: an integer is required (got type float).  Implicit conversion to integers using __int__ is deprecated, and may be removed in a future version of Python.
  mem_mb = torch.tensor([mem / (1024 * 1024)],
2023-06-19 01:25:01,556 - mmseg - INFO - Exp name: 230617_1656_gtaHR2csHR_mic_hrda_s2_7894c
2023-06-19 01:25:01,556 - mmseg - INFO - Iter [500/40000]	lr: 1.500e-09, eta: 0:00:00, time: 2.904, data_time: 0.036, memory: 22371, aAcc: 0.9518, mIoU: 0.7434, mAcc: 0.8276, IoU.road: 0.9696, IoU.sidewalk: 0.7787, IoU.building: 0.9144, IoU.wall: 0.6114, IoU.fence: 0.5173, IoU.pole: 0.5901, IoU.traffic light: 0.6567, IoU.traffic sign: 0.7244, IoU.vegetation: 0.9167, IoU.terrain: 0.5095, IoU.sky: 0.9372, IoU.person: 0.7991, IoU.rider: 0.5539, IoU.car: 0.9453, IoU.truck: 0.8563, IoU.bus: 0.8846, IoU.train: 0.6148, IoU.motorcycle: 0.6566, IoU.bicycle: 0.6882, Acc.road: 0.9893, Acc.sidewalk: 0.8797, Acc.building: 0.9648, Acc.wall: 0.6831, Acc.fence: 0.6098, Acc.pole: 0.6447, Acc.traffic light: 0.7810, Acc.traffic sign: 0.7706, Acc.vegetation: 0.9635, Acc.terrain: 0.5763, Acc.sky: 0.9874, Acc.person: 0.8721, Acc.rider: 0.7659, Acc.car: 0.9629, Acc.truck: 0.9115, Acc.bus: 0.9115, Acc.train: 0.8521, Acc.motorcycle: 0.8018, Acc.bicycle: 0.7965, decode.loss_seg: 0.0802, decode.acc_seg: 89.6686, decode.hr.loss_seg: 0.0093, decode.hr.acc_seg: 91.4197, src.loss_imnet_feat_dist: nan, mix.decode.loss_seg: 0.0924, mix.decode.acc_seg: 91.6247, mix.decode.hr.loss_seg: 0.0115, mix.decode.hr.acc_seg: 91.5063, masked.decode.loss_seg: 0.1476, masked.decode.acc_seg: 90.4134, masked.decode.hr.loss_seg: 0.0244, masked.decode.hr.acc_seg: 86.8202

But when I run the same command again,

python run_experiments.py --config configs/mic/gtaHR2csHR_mic_hrda.py

it fails with

ModuleNotFoundError: No module named 'torch._six'

Do you know why? The terminal output of the second run is below.

(SMIC) ailab@ailab:/media/ailab/data/yy/S_MIC/MIC/seg$ python run_experiments.py --config configs/mic/gtaHR2csHR_mic_hrda.py
Traceback (most recent call last):
  File "run_experiments.py", line 23, in <module>
    from tools import train
  File "/media/ailab/data/yy/S_MIC/MIC/seg/tools/train.py", line 21, in <module>
    from mmseg.apis import set_random_seed, train_segmentor
  File "/media/ailab/data/yy/S_MIC/MIC/seg/mmseg/apis/__init__.py", line 3, in <module>
    from .inference import inference_segmentor, init_segmentor, show_result_pyplot
  File "/media/ailab/data/yy/S_MIC/MIC/seg/mmseg/apis/inference.py", line 12, in <module>
    from mmseg.models import build_segmentor
  File "/media/ailab/data/yy/S_MIC/MIC/seg/mmseg/models/__init__.py", line 1, in <module>
    from .backbones import *  # noqa: F401,F403
  File "/media/ailab/data/yy/S_MIC/MIC/seg/mmseg/models/backbones/__init__.py", line 4, in <module>
    from .mix_transformer import (MixVisionTransformer, mit_b0, mit_b1, mit_b2,
  File "/media/ailab/data/yy/S_MIC/MIC/seg/mmseg/models/backbones/mix_transformer.py", line 16, in <module>
    from timm.models.layers import DropPath, to_2tuple, trunc_normal_
  File "/home/ailab/anaconda3/envs/SMIC/lib/python3.8/site-packages/timm/__init__.py", line 2, in <module>
    from .models import create_model, list_models, is_model, list_modules, model_entrypoint, \
  File "/home/ailab/anaconda3/envs/SMIC/lib/python3.8/site-packages/timm/models/__init__.py", line 1, in <module>
    from .cspnet import *
  File "/home/ailab/anaconda3/envs/SMIC/lib/python3.8/site-packages/timm/models/cspnet.py", line 20, in <module>
    from .helpers import build_model_with_cfg
  File "/home/ailab/anaconda3/envs/SMIC/lib/python3.8/site-packages/timm/models/helpers.py", line 17, in <module>
    from .layers import Conv2dSame, Linear
  File "/home/ailab/anaconda3/envs/SMIC/lib/python3.8/site-packages/timm/models/layers/__init__.py", line 7, in <module>
    from .cond_conv2d import CondConv2d, get_condconv_initializer
  File "/home/ailab/anaconda3/envs/SMIC/lib/python3.8/site-packages/timm/models/layers/cond_conv2d.py", line 16, in <module>
    from .helpers import to_2tuple
  File "/home/ailab/anaconda3/envs/SMIC/lib/python3.8/site-packages/timm/models/layers/helpers.py", line 6, in <module>
    from torch._six import container_abcs
ModuleNotFoundError: No module named 'torch._six'
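A workaround that is often suggested for this timm/PyTorch incompatibility (my assumption, not an official recommendation of the authors) is to patch the failing import in the installed timm package, or to pin a timm version that matches the installed PyTorch:

# In timm/models/layers/helpers.py, replace the line
#     from torch._six import container_abcs
# with the standard-library equivalent, since newer PyTorch no longer provides it:
import collections.abc as container_abcs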

Got an error for a multi-resolution target domain

Hi, thank you for your amazing work 👍
I got an error when training the model on my custom dataset with "MIC_ON: True".
My custom dataset has multi-resolution images, and during training I got an error in the file "MIC/det/maskrcnn_benchmark/structures/boxlist_ops.py" at lines 113-114:

113: size = bboxes[0].size
114: assert all(bbox.size == size for bbox in bboxes)

I just printed "size" and "bbox.size", and it looks like there is a mismatch here:
size (1088, 800)
(1088, 800)
(1066, 800)

Do you have any idea how I can resolve this problem?

The full error is here:

Traceback (most recent call last):
  File "tools/train_net.py", line 270, in
    main()
  File "tools/train_net.py", line 263, in main
    model = train(cfg, args.local_rank, args.distributed)
  File "tools/train_net.py", line 114, in train
    do_mask_da_train(
  File "/content/MIC/det/maskrcnn_benchmark/engine/trainer.py", line 246, in do_mask_da_train
    masked_loss_dict = model(masked_images, masked_taget, use_pseudo_labeling_weight=cfg.MODEL.PSEUDO_LABEL_WEIGHT, with_DA_ON=False)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/MIC/det/maskrcnn_benchmark/modeling/detector/generalized_rcnn.py", line 54, in forward
    proposals, proposal_losses = self.rpn(images, features, targets, use_pseudo_labeling_weight=use_pseudo_labeling_weight)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/MIC/det/maskrcnn_benchmark/modeling/rpn/rpn.py", line 100, in forward
    return self._forward_train(anchors, objectness, rpn_box_regression, targets, use_pseudo_labeling_weight)
  File "/content/MIC/det/maskrcnn_benchmark/modeling/rpn/rpn.py", line 115, in _forward_train
    boxes = self.box_selector_train(
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/MIC/det/maskrcnn_benchmark/modeling/rpn/inference.py", line 148, in forward
    boxlists = self.add_gt_proposals(boxlists, targets)
  File "/content/MIC/det/maskrcnn_benchmark/modeling/rpn/inference.py", line 67, in add_gt_proposals
    proposals = [
  File "/content/MIC/det/maskrcnn_benchmark/modeling/rpn/inference.py", line 68, in
    cat_boxlist((proposal, gt_box))
  File "/content/MIC/det/maskrcnn_benchmark/structures/boxlist_ops.py", line 121, in cat_boxlist
    assert all(bbox.size == size for bbox in bboxes)
AssertionError

By the way, there is no issue when training the model with "MIC_ON: False".

Train on labelled target images

Currently, MIC is trained using labeled source images as well as unlabeled target images.

Have you tried training using labeled source images and labeled target images?

How can I modify the code to do that?

Thanks!

The Training Data Format

Hi, I want to ask: can the source domain use VOC format and the target domain COCO format, or must both be in COCO format? Thank you!

How to modify the files under convert_datasets to adapt to another UDA task

Dear @lhoyer,
Thanks for your outstanding work on UDA. I have successfully reproduced the GTA->Cityscapes results you give in the paper. Now I want to try to use this method for a remote sensing UDA task.

I have added a new id in experiments.py, as the picture shows:
[screenshot]

And I have designed uda_potsdam_to_isprs_1024x1024.py.
uda_potsdam_to_isprs_1024x1024.txt

But I have no idea how to design potsdam.py (like seg/tools/convert_datasets/cityscapes.py), because of its dependency on cityscapesscripts.preparation.json2labelImg.

Can you give me some advice about how to realize a Potsdam->Vaihingen MIC task?
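For reference, a rough sketch of what such a conversion script could do, assuming the standard ISPRS Potsdam RGB color coding (all paths, names, and the ignore value below are placeholders):

# potsdam_to_trainid.py -- hypothetical sketch, not part of the repository
import numpy as np
from PIL import Image

COLOR_TO_ID = {
    (255, 255, 255): 0,  # impervious surfaces
    (0, 0, 255): 1,      # building
    (0, 255, 255): 2,    # low vegetation
    (0, 255, 0): 3,      # tree
    (255, 255, 0): 4,    # car
    (255, 0, 0): 5,      # clutter / background
}

def rgb_label_to_trainid(rgb_path, out_path):
    rgb = np.array(Image.open(rgb_path).convert('RGB'))
    out = np.full(rgb.shape[:2], 255, dtype=np.uint8)  # 255 = ignore index
    for color, train_id in COLOR_TO_ID.items():
        out[np.all(rgb == np.array(color), axis=-1)] = train_id
    Image.fromarray(out).save(out_path)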

Best,
jetitime

Consistency Loss not used for detection

In the context of object detection, the original call to the consistency_loss function is replaced with a calculation of an L1 loss. The rationale behind this modification is not clear to me.

Besides, the idea of enforcing consistency between detection results for a target image and its masked version appears counterintuitive. It is expected that the detection performance for a completely masked object would be compromised, making the pursuit of consistency between the two scenarios challenging.

Could you explain it a little bit? Thanks!
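For reference, the kind of consistency term being discussed could be sketched as follows (a minimal illustration of the idea, not the repository's actual loss):

import torch.nn.functional as F

def masked_consistency_l1(student_logits_masked, teacher_logits):
    # The teacher prediction on the unmasked target image acts as the target;
    # it is detached so that only the student branch receives gradients.
    return F.l1_loss(student_logits_masked, teacher_logits.detach())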

How to configure the dataset

Hi, authors

I have a question about using MIC for the domain adaptation object detection task. Say I have a dataset that is a subset of the Cityscapes dataset, just like the one used in your paper, and I organize it as follows:

data

• cityscapes (source)
  - train image folder (with labeling)
  - val image folder (with labeling)
• foggy_cityscapes (target)
  - train image folder (without labeling)
  - val image folder (with labeling)

The problem is that when I revise the corresponding paths in the DatasetCatalog class, I notice that it requires the user to specify the annotation path even when the target training set is under consideration.

If I’m not misunderstanding the method described in your paper, the target samples should not have any ground truth, and the pseudo-labels are generated by the EMA teacher model. The question is: why do we still need to give annotations to the target training set?

Could you please give me some insight about how to configure the paths to fit the data I have? Thank you
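For context, the catalog entries have roughly this shape (simplified from upstream maskrcnn_benchmark; the dataset names and paths below are illustrative, not the repository's exact values):

class DatasetCatalog:
    DATA_DIR = "datasets"
    DATASETS = {
        "cityscapes_train_cocostyle": {
            "img_dir": "cityscapes/leftImg8bit/train",
            "ann_file": "cityscapes/annotations/instancesonly_filtered_gtFine_train.json",
        },
        "foggy_cityscapes_train_cocostyle": {
            "img_dir": "foggy_cityscapes/leftImg8bit_foggy/train",
            # an ann_file entry is still expected here, which is exactly what the
            # question above is about
            "ann_file": "foggy_cityscapes/annotations/instancesonly_filtered_gtFine_train.json",
        },
    }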

Implementation Details of Masked Images

Great work by leveraging masked image modeling (MIM) to learn target context!

I am asking about the implementation details of the masked images.
Specifically, how does the encoder deal with masked images? Did you introduce extra mask tokens? Or did you simply follow MAE and take only the visible patches as input (which cannot be done with CNN-based models)?

Looking forward to your reply.
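For what it's worth, my reading of the paper is that the masked patches are simply set to zero and the full masked image is passed through the unmodified encoder, which is why it also works with CNN backbones. A minimal sketch of such block-wise masking (parameter names are illustrative):

import torch

def random_block_mask(img, block_size=32, mask_ratio=0.7):
    # img: (C, H, W); assumes H and W are divisible by block_size.
    _, H, W = img.shape
    keep = (torch.rand(H // block_size, W // block_size) > mask_ratio).float()
    mask = keep.repeat_interleave(block_size, 0).repeat_interleave(block_size, 1)
    return img * mask  # masked blocks become zero; the full image is still fed to the encoder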

(Detection, DAInsHead) Loop ends at the first step

Hi, would you mind helping me with this code?

Inside the DAInsHead for instance-level domain classification, the loop over levels ends at the first step.

for level, (fc1_da, fc2_da, fc3_da) in \
        enumerate(zip(self.da_ins_fc1_layers,
                      self.da_ins_fc2_layers, self.da_ins_fc3_layers)):
    idx_in_level = torch.nonzero(levels == level).squeeze(1)
    if len(idx_in_level) > 0:
        xs = x[idx_in_level, :]
        xs = F.relu(getattr(self, fc1_da)(xs))
        xs = F.dropout(xs, p=0.5, training=self.training)
        xs = F.relu(getattr(self, fc2_da)(xs))
        xs = F.dropout(xs, p=0.5, training=self.training)
        result[idx_in_level] = getattr(self, fc3_da)(xs)
    return result

This is different from the imghead for image-level domain classification, which operates over all features from the FPN, and I wonder if this is a bug.

Additionally, does levels indicate the levels of the FPN output?

Thank you!

Question about the ResNet in MIM

Hi! In traditional MIM research, Transformer-based architectures are frequently used because the unit of an image is a patch, not a pixel. But this is a little contradictory for convolutional networks such as ResNet, because a CNN is based on pixel-wise convolution rather than a patch-wise style. Plus, the masked patches participate in the convolution, which will lead to a distribution shift, and the ignored pixels will introduce irrelevant information. So I wonder how we can adapt a CNN to MIM, and I am quite looking forward to your reply. Thanks!

Training on Custom Dataset

I want to train the model for semantic segmentation on a custom dataset. What steps do I need to take to do it?
I see that I'll have to write some scripts for my dataset and put them under "MIC/seg/mmseg/datasets". I have done this, but when I start training I get:
"KeyError: 'MyDataset is not in the dataset registry'"

Are there any more steps I have to do? Is there a detailed guide for using MIC on a Custom Dataset?
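In case it helps others hitting the same KeyError, the usual mmseg 0.x pattern (sketched below with placeholder class names and a placeholder palette; not specific to this repository) is to register the dataset class and make sure it is imported by the package so the registry sees it:

# mmseg/datasets/my_dataset.py -- hypothetical sketch
from .builder import DATASETS
from .custom import CustomDataset


@DATASETS.register_module()
class MyDataset(CustomDataset):
    CLASSES = ('background', 'object')          # placeholder class names
    PALETTE = [[0, 0, 0], [255, 255, 255]]      # placeholder colors

    def __init__(self, **kwargs):
        super().__init__(img_suffix='.png', seg_map_suffix='.png', **kwargs)

The new class then also needs to be imported and listed in __all__ in mmseg/datasets/__init__.py, and referenced via type='MyDataset' in the dataset config.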

Validation dataset is same as test data?

Hi,

I noticed that the validation data appears to be the same as the test data in the image classification code, in the get_dataset function (MIC/cls/examples/utils.py#L85). Is this correct?

Is there a mistake in Figure1?

Thanks for your impressive work!
I want to kindly ask whether the stop-gradient mark should be put on the student branch rather than the teacher branch in Figure 1(c).

Discussion: why does distribution shift (caused by the large ratio of masking) not affect the UDA?

This is a really nice work, showing significant improvement with such a simple solution.

I am thinking of applying a similar idea to other research topics such as semi-supervised learning. However, I found that it actually has a negative effect on performance, or at best brings no improvement at all. I was trying to figure out the reasons; here is my initial guess.

Unlike the MAE method, the masked blocks in MIC are also used in training. Hence, the introduced masking brings a distribution shift to the input and to each layer thereafter: the labeled data and the unlabeled data will follow different distributions. Even though it may make sense, I don't get why such a distribution shift is not an issue for the UDA tasks. Could you please offer your thoughts on this? Or is the reason in another direction entirely?

Thank you.

Some questions about performance on other domain adaptation datasets in object detection

Thank you for your excellent work!
As shown in the paper, only the result for the Cityscapes->FoggyCityscapes dataset is given, and recently I have used some other datasets, such as Sim10k->Cityscapes and KITTI->Cityscapes. The first one gives a good result and clearly shows the benefit of the mask augmentation and the domain adaptation head: the car AP reaches 60%, which is state of the art compared to other works. But when I use the second one (KITTI->Cityscapes), the car AP with raw MIC reaches about 46%, whereas when I set the three da_heads parameters to 0.0, the car AP rises to 48%. I don't know how to explain this phenomenon. What's more, the result for KITTI->Cityscapes does not reach the state of the art like the other two (Cityscapes->FoggyCityscapes and Sim10k->Cityscapes). I look forward to your reply, thank you!

Missing instancesonly_filtered_gtFine_train_poly File

I'm getting the following error when I try to train the model:

No such file or directory: 'datasets/cityscapes/annotations/instancesonly_filtered_gtFine_train_poly.json'

However, I can't locate where this file is supposed to come from.
If you could provide some help I would highly appreciate it. Thanks.

Can I train the model using target domain data without semantic segmentation annotations?

Hello, @lhoyer

I'm interested in using your model for my project, and I have a question regarding the training data. I have two datasets:

  • Source domain: I have collected this dataset using the Carla simulator, and it comes with proper semantic segmentation annotations.
  • Target domain: For this dataset, I do not have semantic segmentation annotations available.

Is it possible to train the model using this combination of datasets?

Please let me know if there are any workarounds or alternative approaches that I can consider to make use of the data I have. Your guidance would be much appreciated.

Is supervised loss L^s applied to both source and target data?

Hi, I read your paper with great interest and am currently following your code implementation.

Unlike the classification branch, which doesn't use the target domain's label information, the code of the detection branch seems to use the target domain's ground-truth bboxes. Is this intentional, or did I miss something?

Thank you.

The cls branch:

x_s, labels_s = next(train_source_iter)
x_t, _ = next(train_target_iter)

At the det branch:

for iteration, ((source_images, source_targets, idx1), (target_images, target_targets, idx2)) in enumerate(zip(source_data_loader, target_data_loader), start_iter):
    data_time = time.time() - end
    arguments["iteration"] = iteration
    scheduler.step()
    images = (source_images+target_images).to(device)
    targets = [target.to(device) for target in list(source_targets+target_targets)]
    loss_dict = model(images, targets)

MIC for detection with another model

Hi, thank you for publishing this amazing work. I just have one question, and maybe you can guide me a little. If I would like to experiment with other models, like a DETR for example, is there a "simple" way of doing this? I've seen that the code is tightly integrated with the maskrcnn code, so I have some trouble thinking of the best approach for this.

Thank you again for your contribution.

About the code

Hi, your work is awesome! So I'm eager to understand the implementation of MIC, but I have a couple of questions:
1. How can I get the config of the Adversarial [86] model from Table 1? Can you provide it? Adversarial [86] is a simple UDA model, and by debugging it I could more easily understand the idea and implementation details of MIC.
2. This code is built on mmcv, and I haven't used the mmcv framework before. It is difficult for me to understand and debug the code and to clearly see the implementation details of MIC. Others have also mentioned in the issues that mmcv is difficult to debug. Could you provide concise standalone code for MIC? I believe this would help the adoption and use of MIC.
I'm looking forward to your reply! Thank you very much!

About the test process

Hi, thanks for your great work and for sharing the code. I ran into a problem after finishing training, when I wanted to run an evaluation. I followed the guideline but got this error:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp6admysin/tmpjdmakb9n.json'
To be specific, this was in the seg task.

Do I get normal results when I run the det task code you provided?

I ran the det task with the command you provided: python tools/train_net.py --config-file configs/da_faster_rcnn/e2e_da_faster_rcnn_R_50_FPN_masking_cs.yaml
The result of this command is:


> 2023-02-23 04:45:55,436 maskrcnn_benchmark.trainer INFO: eta: 0:00:27 iter: 59960 loss: 1.3695 (1.3960) loss_classifier: 0.0041 (0.0161) loss_box_reg: 0.0011 (0.0056) loss_objectness: 0.0001 (0.0008) loss_rpn_box_reg: 0.0001 (0.0014) loss_da_image: 0.6725 (0.6725) loss_da_instance: 0.6370 (0.6485) loss_da_consistency: 0.0328 (0.0328) time: 0.6736 (0.6803) data: 0.1442 (0.1458) loss_classifier_mask: 0.0108 (0.0181) loss_box_reg_mask: 0.0000 (0.0000) loss_objectness_mask: 0.0000 (0.0002) loss_rpn_box_reg_mask: 0.0000 (0.0000) lr: 0.000025 max mem: 4830
> 2023-02-23 04:46:09,082 maskrcnn_benchmark.trainer INFO: eta: 0:00:13 iter: 59980 loss: 1.3549 (1.3960) loss_classifier: 0.0036 (0.0161) loss_box_reg: 0.0014 (0.0056) loss_objectness: 0.0001 (0.0008) loss_rpn_box_reg: 0.0001 (0.0014) loss_da_image: 0.6720 (0.6725) loss_da_instance: 0.6311 (0.6485) loss_da_consistency: 0.0372 (0.0328) time: 0.6808 (0.6803) data: 0.1469 (0.1458) loss_classifier_mask: 0.0086 (0.0181) loss_box_reg_mask: 0.0000 (0.0000) loss_objectness_mask: 0.0000 (0.0002) loss_rpn_box_reg_mask: 0.0000 (0.0000) lr: 0.000025 max mem: 4830
> 2023-02-23 04:46:22,004 maskrcnn_benchmark.utils.checkpoint INFO: Saving checkpoint to ./model_final.pth
> 2023-02-23 04:46:22,641 maskrcnn_benchmark.utils.checkpoint INFO: Saving checkpoint to ./model_final_teacher.pth
> 2023-02-23 04:46:23,656 maskrcnn_benchmark.trainer INFO: Total training time: 9:26:56.134488 (0.5669 s / it)
> 2023-02-23 04:46:25,150 maskrcnn_benchmark.inference INFO: Start evaluation on foggy_cityscapes_fine_instanceonly_seg_val_cocostyle dataset(500 images).
> 2023-02-23 04:47:37,854 maskrcnn_benchmark.inference INFO: Total inference time: 0:01:12.702893 (0.1454057869911194 s / img per device, on 1 devices)
> 2023-02-23 04:47:37,953 maskrcnn_benchmark.inference INFO: Preparing results for COCO format
> 2023-02-23 04:47:37,954 maskrcnn_benchmark.inference INFO: Preparing bbox results
> 2023-02-23 04:47:37,993 maskrcnn_benchmark.inference INFO: Evaluating predictions
> 2023-02-23 04:47:46,284 maskrcnn_benchmark.inference INFO: OrderedDict([('bbox', OrderedDict([('AP', 0.046721092567063256), ('AP50', 0.10801086282918171), ('AP75', 0.03347308765404664), ('APs', 0.011136878321116622), ('APm', 0.042912148009011025), ('APl', 0.1141302621395987), (1, {'AP': 0.04754311477581737, 'AP50': 0.12631306393273378, 'AP75': 0.024579608891169252, 'APs': 0.011413596466218806, 'APm': 0.07694720138951046, 'APl': 0.11525433263017916}), (2, {'AP': 0.02694019627095777, 'AP50': 0.07169054732782362, 'AP75': 0.012810392494757676, 'APs': 0.007123212321232124, 'APm': 0.03868854942648566, 'APl': 0.15393689918037945}), (3, {'AP': 0.1916884024998948, 'AP50': 0.3854402289058304, 'AP75': 0.16949702514062337, 'APs': 0.02424258502374857, 'APm': 0.16762120810606068, 'APl': 0.4920231630199325}), (4, {'AP': 0.018656371866422525, 'AP50': 0.039666138540764374, 'AP75': 0.010554180418041806, 'APs': 0.0, 'APm': 0.00019801980198019803, 'APl': 0.04544799416470635}), (5, {'AP': 0.04388717514060653, 'AP50': 0.09847948190013352, 'AP75': 0.034641713223155436, 'APs': 0.0, 'APm': 0.0, 'APl': 0.07181533926874291}), (6, {'AP': 0.0, 'AP50': 0.0, 'AP75': 0.0, 'APs': 0.0, 'APm': 0.0, 'APl': 0.0}), (7, {'AP': 0.019533559852592444, 'AP50': 0.05752011177493723, 'AP75': 0.0033003300330033, 'APs': 0.038648436272198654, 'APm': 0.014975492992880094, 'APl': 0.0019801980198019802}), (8, {'AP': 0.025519920130214577, 'AP50': 0.08497733025123083, 'AP75': 0.01240145103162234, 'APs': 0.007667196485534841, 'APm': 0.0448667123551711, 'APl': 0.03258417083304729})]))])

Then I ran the next command you provide: python tools/test_net.py --config-file "configs/da_faster_rcnn/e2e_da_faster_rcnn_R_50_FPN_masking_cs.yaml" MODEL.WEIGHT <path_to_store_weight>/model_final.pth, and the result is:

2023-02-23 09:30:58,591 maskrcnn_benchmark.utils.model_serialization INFO: rpn.anchor_generator.cell_anchors.0              loaded from rpn.anchor_generator.cell_anchors.1              of shape (3, 4)
2023-02-23 09:30:58,591 maskrcnn_benchmark.utils.model_serialization INFO: rpn.head.bbox_pred.bias                          loaded from rpn.head.bboxlogits.bias                         of shape (3,)
2023-02-23 09:30:58,591 maskrcnn_benchmark.utils.model_serialization INFO: rpn.head.cls_logits.weight                       loaded from rpn.head.cls_logits.weight                       of shape (3, 256, 1, 1)
2023-02-23 09:30:58,591 maskrcnn_benchmark.utils.model_serialization INFO: rpn.head.conv.weight                             loaded from rpn.head.conv
index created!
2023-02-23 09:31:00,847 maskrcnn_benchmark.inference INFO: Start evaluation on foggy_cityscapes_fine_instanceonly_seg_val_cocostyle dataset(500 images).
  0%|                                                                                                                        | 0/500 [00:00<?, ?it/s]
/home/cuiyiming/.local/lib/python3.7/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3190.)
Loading and preparing results...
index created!
DONE (t=0.05s).
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.214
Category: 2 {'id': 2, 'name': 'rider'}
Loading and preparing results...
index created!
DONE (t=0.02s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.027
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.072
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.013
Category: 3 {'id': 3, 'name': 'car'}
Loading and preparing results...
index created!
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.570
Category: 4 {'id': 4, 'name': 'truck'}
Loading and preparing results...
index created!
Category: 5 {'id': 5, 'name': 'bus'}
Loading and preparing results...
index created!
DONE (t=0.01s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.044
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.098
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.035
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Category: 6 {'id': 6, 'name': 'train'}
Loading and preparing results...
index created!
DONE (t=0.01s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
Category: 7 {'id': 7, 'name': 'motorcycle'}
Loading and preparing results...
index created!
Category: 8 {'id': 8, 'name': 'bicycle'}
creating index...
Accumulating evaluation results...
DONE (t=0.03s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.026
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.085
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.012
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.008
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.045
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.033
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.064
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=3.06s).
Accumulating evaluation results...
DONE (t=0.32s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.047
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.108
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.033
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.011
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.043
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.114
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.043
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.079
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.080
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.017
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.062
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.195
2023-02-23 09:32:12,628 maskrcnn_benchmark.inference INFO: OrderedDict([('bbox', OrderedDict([('AP', 0.046721092567063256), ('AP50', 0.10801086282918171), ('AP75', 0.03347308765404664), ('APs', 0.011136878321116622), ('APm', 0.042912148009011025), ('APl', 0.1141302621395987), (1, {'AP': 0.04754311477581737, 'AP50': 0.12631306393273378, 'AP75': 0.024579608891169252, 'APs': 0.011413596466218806, 'APm': 0.07694720138951046, 'APl': 0.11525433263017916}), (2, {'AP': 0.02694019627095777, 'AP50': 0.07169054732782362, 'AP75': 0.012810392494757676, 'APs': 0.007123212321232124, 'APm': 0.03868854942648566, 'APl': 0.15393689918037945}), (3, {'AP': 0.1916884024998948, 'AP50': 0.3854402289058304, 'AP75': 0.16949702514062337, 'APs': 0.02424258502374857, 'APm': 0.16762120810606068, 'APl': 0.4920231630199325}), (4, {'AP': 0.018656371866422525, 'AP50': 0.039666138540764374, 'AP75': 0.010554180418041806, 'APs': 0.0, 'APm': 0.00019801980198019803, 'APl': 0.04544799416470635}), (5, {'AP': 0.04388717514060653, 'AP50': 0.09847948190013352, 'AP75': 0.034641713223155436, 'APs': 0.0, 'APm': 0.0, 'APl': 0.07181533926874291}), (6, {'AP': 0.0, 'AP50': 0.0, 'AP75': 0.0, 'APs': 0.0, 'APm': 0.0, 'APl': 0.0}), (7, {'AP': 0.019533559852592444, 'AP50': 0.05752011177493723, 'AP75': 0.0033003300330033, 'APs': 0.038648436272198654, 'APm': 0.014975492992880094, 'APl': 0.0019801980198019802}), (8, {'AP': 0.025519920130214577, 'AP50': 0.08497733025123083, 'AP75': 0.01240145103162234, 'APs': 0.007667196485534841, 'APm': 0.0448667123551711, 'APl': 0.03258417083304729})]))])

So I want to ask you: is my result normal?
Thank you very much!
