czczup / vit-adapter Goto Github PK

View Code? Open in Web Editor NEW

1.1K 16.0 128.0 1.82 MB

[ICLR 2023 Spotlight] Vision Transformer Adapter for Dense Predictions

Home Page: https://arxiv.org/abs/2205.08534

License: Apache License 2.0

Python 95.60% Shell 0.33% C++ 0.37% Cuda 3.70%

adapter object-detection semantic-segmentation vision-transformer

vit-adapter's Introduction

ViT-Adapter

The official implementation of the paper "Vision Transformer Adapter for Dense Predictions".

Segmentation Colab Notebook | Detection Colab Notebook (thanks @IamShubhamGupto, @dudifrid)

News

2024/01/19: Train ViT-Adapter with frozen InternViT-6B, see here!
2023/12/23: 🚀🚀🚀 We release a ViT-based vision foundation model with 6B parameters, see here!
2023/08/31: 🚀🚀 DINOv2 released the ViT-g-based segmentor with ViT-Adapter, see here.
2023/07/10: 🚀 Support the weights of DINOv2 for object detection, see here!
2023/06/26: ViT-Adapter is adopted by the champion solution NVOCC in Track 3 (3D Occupancy Prediction) of the CVPR 2023 Autonomous Driving Challenge.
2023/06/07: ViT-Adapter is used by ONE-PEACE and they created new SOTA of 63.0 mIoU on ADE20K.
2023/04/14: ViT-Adapter is used in EVA and DINOv2!
2023/01/21: Our paper is accepted by ICLR 2023!
2023/01/17: We win the champion of WSDM Cup 2023 Toloka VQA Challenge using ViT-Adapter.
2022/10/20: ViT-Adapter is adopted by Zhang et al. and they ranked 1st in the UVO Challenge 2022.
2022/08/22: ViT-Adapter is adopted by BEiT-3 and created new SOTA of 62.8 mIoU on ADE20K.
2022/06/09: ViT-Adapter-L achieves 60.4 box AP and 52.5 mask AP on COCO test-dev without Objects365.
2022/06/04: Code and models are released.
2022/05/12: ViT-Adapter-L reaches 85.2 mIoU on Cityscapes test set without coarse data.
2022/05/05: ViT-Adapter-L achieves the SOTA on ADE20K val set with 60.5 mIoU!

Highlights

ViT-Adapter supports various dense prediction tasks, including object detection, instance segmentation, semantic segmentation, visual grounding, panoptic segmentation, etc.
This codebase includes many SOTA detectors and segmenters to achieve top performance, such as HTC++, Mask2Former, DINO.

results.mp4

Abstract

This work investigates a simple yet powerful dense prediction task adapter for Vision Transformer (ViT). Unlike recently advanced variants that incorporate vision-specific inductive biases into their architectures, the plain ViT suffers inferior performance on dense predictions due to weak prior assumptions. To address this issue, we propose the ViT-Adapter, which allows plain ViT to achieve comparable performance to vision-specific transformers. Specifically, the backbone in our framework is a plain ViT that can learn powerful representations from large-scale multi-modal data. When transferring to downstream tasks, a pre-training-free adapter is used to introduce the image-related inductive biases into the model, making it suitable for these tasks. We verify ViT-Adapter on multiple dense prediction tasks, including object detection, instance segmentation, and semantic segmentation. Notably, without using extra detection data, our ViT-Adapter-L yields state-of-the-art 60.9 box AP and 53.0 mask AP on COCO test-dev. We hope that the ViT-Adapter could serve as an alternative for vision-specific transformers and facilitate future research. The code and models will be released.

Method

Catalog

Awesome Competition Solutions with ViT-Adapter

1st Place Solution for the 5th LSVOS Challenge: Video Instance Segmentation
Tao Zhang, Xingye Tian, Yikang Zhou, Yuehua Wu, Shunping Ji, Cilin Yan, Xuebo Wang, Xin Tao, Yuanhui Zhang, Pengfei Wan
[Code]
August 28, 2023

2nd place solution in Scene Understanding for Autonomous Drone Delivery (SUADD'23) competition
Mykola Lavreniuk, Nivedita Rufus, Unnikrishnan R Nair
[Code]
July 18, 2023

Champion solution in Track 3 (3D Occupancy Prediction) of the CVPR 2023 Autonomous Driving Challenge
FB-OCC: 3D Occupancy Prediction based on Forward-Backward View Transformation
Zhiqi Li, Zhiding Yu, David Austin, Mingsheng Fang, Shiyi Lan, Jan Kautz, Jose M. Alvarez
[Code]
June 26, 2023

3rd Place Solution for PVUW Challenge 2023: Video Panoptic Segmentation
Jinming Su, Wangwang Yang, Junfeng Luo, Xiaolin Wei
June 6, 2023

Champion solution in the Video Scene Parsing in the Wild Challenge at CVPR 2023
Semantic Segmentation on VSPW Dataset through Contrastive Loss and Multi-dataset Training Approach
Min Yan, Qianxiong Ning, Qian Wang
June 3, 2023

2nd place in the Video Scene Parsing in the Wild Challenge at CVPR 2023
Recyclable Semi-supervised Method Based on Multi-model Ensemble for Video Scene Parsing
Biao Wu, Shaoli Liu, Diankai Zhang, Chengjian Zheng, Si Gao, Xiaofeng Zhang, Ning Wang
June 2, 2023

Champion Solution for the WSDM2023 Toloka VQA Challenge
Shengyi Gao, Zhe Chen, Guo Chen, Wenhai Wang, Tong Lu
[Code]
January 9, 2023

1st Place Solutions for the UVO Challenge 2022
Jiajun Zhang, Boyu Chen, Zhilong Ji, Jinfeng Bai, Zonghai Hu
October 9, 2022

Citation

If this work is helpful for your research, please consider citing the following BibTeX entry.

@article{chen2022vitadapter,
  title={Vision Transformer Adapter for Dense Predictions},
  author={Chen, Zhe and Duan, Yuchen and Wang, Wenhai and He, Junjun and Lu, Tong and Dai, Jifeng and Qiao, Yu},
  journal={arXiv preprint arXiv:2205.08534},
  year={2022}
}

License

This repository is released under the Apache 2.0 license as found in the LICENSE file.

vit-adapter's People

Contributors

Stargazers

Watchers

Forkers

sunsmarterjie satpalsr hwijune ander002 zhizhangxian luo77123 xzjzsa rane7 zhiqi-li celsopitta pauls3 akito-shine abhay5991 fangwudi nazirnayal8 nbardy dl-vit chenzhutian zengwang430521 one-green-bird prophet-c hehongjie chandsome nassim12 ymcidence gauenk arthasmil ibilibili-m madhawav zhangsiqigit josharoon baimon664 dq-soulie okotaku kwyoung04 s-inno kaveric algorithmlover2016 jonah1234567 ovipaul tb5z035 jadgardner lonely-geese lzh990711 xie-muxi hyu-zhang jdg105 raywoo1 weijingmin2000 mdciri andreimihalea lanternjx steven-xiong anywayany dbaranchuk maomao0819 wstchhwp fonzen1 aliman80 alexyiningliu slngen aroojsyed likeboo nameongithub hammad-abid lyf0801 jsnarvasa pipizhum a-dalal yjhuasheng nieqiang001 iq-scm vayzenb mrbjkk jhr123321 wanggcong 170744039 sheffieldcao bohatyrov yudiprtm wangzifeibo robin-karlsson0 saraazm husseinmansourmohd susanbao kyegomez anh-vunguyen hyggge okkzzh cxz pzhren hebeifast yanxingliu lxiu-yu yingjun-zhang herb1999 wanggrun xuanjiawang theinfamouswayne beandkay

vit-adapter's Issues

How to modified configs files for tiny or small model

I want to use small or tiny model to train myself datasets because of GPU memory is too small. but I do not how to modify configs.
I do not find some parameters where is set. for example , set of adapter (N, CFFN, head).

请问base.py获取的batch_data对应的mmseg版本是什么呢？

我用的之前可以成功跑通的mmseg包，打印出来的batch_data的keys()为：

现在训练博主的代码的时候，把mmseg和mmcv都移动到segmentation下，并make.sh了环境。发现batch_data超过except 2之后，我将batch_data去掉了多余的key，目前报如下错误：

请问我应该更换mmseg的版本吗，我是用的是0.24版本

I think the deformable attention is not compiled successfully.

    I think the deformable attention is not compiled successfully.

You can try this:
replace line 11

import MultiScaleDeformableAttention as MSDA

in the ms_deform_attn_func.py with

from mmcv.ops.multi_scale_deform_attn import ext_module as MSDA

and then run python test.py again.

Originally posted by @czczup in #41 (comment)

CUDA "index out of bounds"` failed.

Hi, I am working to learn with ViT-Adapter as below.

After setup all required packages installed, do training with this command makes error.

bash dist_train.sh configs/cityscapes/mask2former_beit_adapter_large_896_80k_cityscapes_ss.py 2

I am using conda env for cuda and pytorch version (CUDA 11.1.1, Pytorch 1.9.0)

Could you guess any error reason? Thank you.

[Error log]

2022-06-28 09:44:52,221 - mmseg - INFO - Loaded 500 images
2022-06-28 09:44:52,222 - mmseg - INFO - load checkpoint from local path: pretrained/mask2former_beit_adapter_large_896_80k_mapillary.pth.tar
2022-06-28 09:44:54,014 - mmseg - WARNING - The model and loaded state dict do not match exactly

missing keys in source state_dict: backbone.blocks.0.attn.relative_position_index, backbone.blocks.1.attn.relative_position_index, backbone.blocks.2.attn.relative_position_index, backbone.blocks.3.attn.relative_position_index, backbone.blocks.4.attn.relative_position_index, backbone.blocks.5.attn.relative_position_index, backbone.blocks.6.attn.relative_position_index, backbone.blocks.7.attn.relative_position_index, backbone.blocks.8.attn.relative_position_index, backbone.blocks.9.attn.relative_position_index, backbone.blocks.10.attn.relative_position_index, backbone.blocks.11.attn.relative_position_index, backbone.blocks.12.attn.relative_position_index, backbone.blocks.13.attn.relative_position_index, backbone.blocks.14.attn.relative_position_index, backbone.blocks.15.attn.relative_position_index, backbone.blocks.16.attn.relative_position_index, backbone.blocks.17.attn.relative_position_index, backbone.blocks.18.attn.relative_position_index, backbone.blocks.19.attn.relative_position_index, backbone.blocks.20.attn.relative_position_index, backbone.blocks.21.attn.relative_position_index, backbone.blocks.22.attn.relative_position_index, backbone.blocks.23.attn.relative_position_index

2022-06-28 09:44:54,053 - mmseg - INFO - Start running, host: ldg810@LVEF2, work_dir: /home/ldg810/git/ViT-Adapter/segmentation/work_dirs/mask2former_beit_adapter_large_896_80k_cityscapes_ss
2022-06-28 09:44:54,053 - mmseg - INFO - Hooks will be executed in the following order:
before_run:
(VERY_HIGH   ) PolyLrUpdaterHook
(NORMAL      ) CheckpointHook
(LOW         ) DistEvalHook
(VERY_LOW    ) TextLoggerHook
 --------------------
before_train_epoch:
(VERY_HIGH   ) PolyLrUpdaterHook
(LOW         ) IterTimerHook
(LOW         ) DistEvalHook
(VERY_LOW    ) TextLoggerHook
 --------------------
before_train_iter:
(VERY_HIGH   ) PolyLrUpdaterHook
(LOW         ) IterTimerHook
(LOW         ) DistEvalHook
 --------------------
after_train_iter:
(ABOVE_NORMAL) OptimizerHook
(NORMAL      ) CheckpointHook
(LOW         ) IterTimerHook
(LOW         ) DistEvalHook
(VERY_LOW    ) TextLoggerHook
 --------------------
after_train_epoch:
(NORMAL      ) CheckpointHook
(LOW         ) DistEvalHook
(VERY_LOW    ) TextLoggerHook
 --------------------
before_val_epoch:
(LOW         ) IterTimerHook
(VERY_LOW    ) TextLoggerHook
 --------------------
before_val_iter:
(LOW         ) IterTimerHook
 --------------------
after_val_iter:
(LOW         ) IterTimerHook
 --------------------
after_val_epoch:
(VERY_LOW    ) TextLoggerHook
 --------------------
after_run:
(VERY_LOW    ) TextLoggerHook
 --------------------
2022-06-28 09:44:54,053 - mmseg - INFO - workflow: [('train', 1)], max: 80000 iters
2022-06-28 09:44:54,054 - mmseg - INFO - Checkpoints will be saved to /home/ldg810/git/ViT-Adapter/segmentation/work_dirs/mask2former_beit_adapter_large_896_80k_cityscapes_ss by HardDiskBackend.
/home/ldg810/anaconda3/envs/vit-adapter/lib/python3.7/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  /opt/conda/conda-bld/pytorch_1623448265233/work/c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
/home/ldg810/anaconda3/envs/vit-adapter/lib/python3.7/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  /opt/conda/conda-bld/pytorch_1623448265233/work/c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
/home/ldg810/anaconda3/envs/vit-adapter/lib/python3.7/site-packages/torch/nn/functional.py:3658: UserWarning: The default behavior for interpolate/upsample with float scale_factor changed in 1.6.0 to align with other frameworks/libraries, and now uses scale_factor directly, instead of relying on the computed output size. If you wish to restore the old behavior, please set recompute_scale_factor=True. See the documentation of nn.Upsample for details.
  "The default behavior for interpolate/upsample with float scale_factor changed "
/home/ldg810/anaconda3/envs/vit-adapter/lib/python3.7/site-packages/torch/nn/functional.py:3658: UserWarning: The default behavior for interpolate/upsample with float scale_factor changed in 1.6.0 to align with other frameworks/libraries, and now uses scale_factor directly, instead of relying on the computed output size. If you wish to restore the old behavior, please set recompute_scale_factor=True. See the documentation of nn.Upsample for details.
  "The default behavior for interpolate/upsample with float scale_factor changed "
/home/ldg810/anaconda3/envs/vit-adapter/lib/python3.7/site-packages/torch/_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at  /opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/BinaryOps.cpp:467.)
  return torch.floor_divide(self, other)
/home/ldg810/anaconda3/envs/vit-adapter/lib/python3.7/site-packages/torch/_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at  /opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/BinaryOps.cpp:467.)
  return torch.floor_divide(self, other)
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [4,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [5,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [6,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [11,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [12,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [13,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [18,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [19,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [20,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [25,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [26,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [27,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [96,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [97,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [102,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [103,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [104,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [109,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [110,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [111,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [116,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [117,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [118,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [123,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [124,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [125,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [32,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [33,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [34,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [39,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [40,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [41,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [46,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [47,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [48,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [53,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [54,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [55,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [60,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [61,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [62,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [96,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [101,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [102,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [103,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [108,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [109,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [110,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [115,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [116,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [117,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [122,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [123,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [124,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [67,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [68,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [69,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [74,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [75,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [76,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [81,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [82,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [83,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [88,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [89,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [90,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [95,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [32,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [33,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [38,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [39,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [40,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [45,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [46,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [47,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [52,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [53,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [54,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [59,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [60,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [61,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [3,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [4,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [5,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [10,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [11,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [12,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [17,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [18,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [19,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [24,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [25,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [26,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [31,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [66,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [67,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [68,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [73,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [74,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [75,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [80,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [81,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [82,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [87,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [88,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [89,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [94,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [1,0,0], thread: [95,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
Traceback (most recent call last):
  File "./train.py", line 215, in <module>
    main()
  File "./train.py", line 211, in main
    meta=meta)
  File "/home/ldg810/anaconda3/envs/vit-adapter/lib/python3.7/site-packages/mmseg/apis/train.py", line 167, in train_segmentor
    runner.run(data_loaders, cfg.workflow)
  File "/home/ldg810/anaconda3/envs/vit-adapter/lib/python3.7/site-packages/mmcv/runner/iter_based_runner.py", line 134, in run
    iter_runner(iter_loaders[i], **kwargs)
  File "/home/ldg810/anaconda3/envs/vit-adapter/lib/python3.7/site-packages/mmcv/runner/iter_based_runner.py", line 61, in train
    outputs = self.model.train_step(data_batch, self.optimizer, **kwargs)
  File "/home/ldg810/anaconda3/envs/vit-adapter/lib/python3.7/site-packages/mmcv/parallel/distributed.py", line 52, in train_step
    output = self.module.train_step(*inputs[0], **kwargs[0])
  File "/home/ldg810/anaconda3/envs/vit-adapter/lib/python3.7/site-packages/mmseg/models/segmentors/base.py", line 138, in train_step
    losses = self(**data_batch)
  File "/home/ldg810/anaconda3/envs/vit-adapter/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ldg810/anaconda3/envs/vit-adapter/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 98, in new_func
    return old_func(*args, **kwargs)
  File "/home/ldg810/anaconda3/envs/vit-adapter/lib/python3.7/site-packages/mmseg/models/segmentors/base.py", line 108, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/home/ldg810/git/ViT-Adapter/segmentation/mmseg_custom/models/segmentors/encoder_decoder_mask2former.py", line 145, in forward_train
    **kwargs)
  File "/home/ldg810/git/ViT-Adapter/segmentation/mmseg_custom/models/segmentors/encoder_decoder_mask2former.py", line 88, in _decode_head_forward_train
    gt_semantic_seg, **kwargs)
  File "/home/ldg810/git/ViT-Adapter/segmentation/mmseg_custom/models/decode_heads/mask2former_head.py", line 553, in forward_train
    img_metas)
  File "/home/ldg810/anaconda3/envs/vit-adapter/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 186, in new_func
    return old_func(*args, **kwargs)
  File "/home/ldg810/git/ViT-Adapter/segmentation/mmseg_custom/models/decode_heads/mask2former_head.py", line 387, in loss
    all_gt_labels_list, all_gt_masks_list, img_metas_list)
  File "/home/ldg810/git/ViT-Adapter/segmentation/mmseg_custom/core/utils/misc.py", line 21, in multi_apply
    return tuple(map(list, zip(*map_results)))
  File "/home/ldg810/git/ViT-Adapter/segmentation/mmseg_custom/models/decode_heads/mask2former_head.py", line 297, in loss_single
    img_metas)
  File "/home/ldg810/git/ViT-Adapter/segmentation/mmseg_custom/models/decode_heads/mask2former_head.py", line 192, in get_targets
    gt_masks_list, img_metas)
  File "/home/ldg810/git/ViT-Adapter/segmentation/mmseg_custom/core/utils/misc.py", line 21, in multi_apply
    return tuple(map(list, zip(*map_results)))
  File "/home/ldg810/git/ViT-Adapter/segmentation/mmseg_custom/models/decode_heads/mask2former_head.py", line 248, in _get_target_single
    img_metas)
  File "/home/ldg810/git/ViT-Adapter/segmentation/mmseg_custom/models/utils/assigner.py", line 148, in assign
    cost = cost.detach().cpu()
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Traceback (most recent call last):
  File "./train.py", line 215, in <module>
    main()
  File "./train.py", line 211, in main
    meta=meta)
  File "/home/ldg810/anaconda3/envs/vit-adapter/lib/python3.7/site-packages/mmseg/apis/train.py", line 167, in train_segmentor
    runner.run(data_loaders, cfg.workflow)
  File "/home/ldg810/anaconda3/envs/vit-adapter/lib/python3.7/site-packages/mmcv/runner/iter_based_runner.py", line 134, in run
    iter_runner(iter_loaders[i], **kwargs)
  File "/home/ldg810/anaconda3/envs/vit-adapter/lib/python3.7/site-packages/mmcv/runner/iter_based_runner.py", line 61, in train
    outputs = self.model.train_step(data_batch, self.optimizer, **kwargs)
  File "/home/ldg810/anaconda3/envs/vit-adapter/lib/python3.7/site-packages/mmcv/parallel/distributed.py", line 52, in train_step
    output = self.module.train_step(*inputs[0], **kwargs[0])
  File "/home/ldg810/anaconda3/envs/vit-adapter/lib/python3.7/site-packages/mmseg/models/segmentors/base.py", line 138, in train_step
    losses = self(**data_batch)
  File "/home/ldg810/anaconda3/envs/vit-adapter/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ldg810/anaconda3/envs/vit-adapter/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 98, in new_func
    return old_func(*args, **kwargs)
  File "/home/ldg810/anaconda3/envs/vit-adapter/lib/python3.7/site-packages/mmseg/models/segmentors/base.py", line 108, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/home/ldg810/git/ViT-Adapter/segmentation/mmseg_custom/models/segmentors/encoder_decoder_mask2former.py", line 145, in forward_train
    **kwargs)
  File "/home/ldg810/git/ViT-Adapter/segmentation/mmseg_custom/models/segmentors/encoder_decoder_mask2former.py", line 88, in _decode_head_forward_train
    gt_semantic_seg, **kwargs)
  File "/home/ldg810/git/ViT-Adapter/segmentation/mmseg_custom/models/decode_heads/mask2former_head.py", line 553, in forward_train
    img_metas)
  File "/home/ldg810/anaconda3/envs/vit-adapter/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 186, in new_func
    return old_func(*args, **kwargs)
  File "/home/ldg810/git/ViT-Adapter/segmentation/mmseg_custom/models/decode_heads/mask2former_head.py", line 387, in loss
    all_gt_labels_list, all_gt_masks_list, img_metas_list)
  File "/home/ldg810/git/ViT-Adapter/segmentation/mmseg_custom/core/utils/misc.py", line 21, in multi_apply
    return tuple(map(list, zip(*map_results)))
  File "/home/ldg810/git/ViT-Adapter/segmentation/mmseg_custom/models/decode_heads/mask2former_head.py", line 297, in loss_single
    img_metas)
  File "/home/ldg810/git/ViT-Adapter/segmentation/mmseg_custom/models/decode_heads/mask2former_head.py", line 192, in get_targets
    gt_masks_list, img_metas)
  File "/home/ldg810/git/ViT-Adapter/segmentation/mmseg_custom/core/utils/misc.py", line 21, in multi_apply
    return tuple(map(list, zip(*map_results)))
  File "/home/ldg810/git/ViT-Adapter/segmentation/mmseg_custom/models/decode_heads/mask2former_head.py", line 248, in _get_target_single
    img_metas)
  File "/home/ldg810/git/ViT-Adapter/segmentation/mmseg_custom/models/utils/assigner.py", line 142, in assign
    dice_cost = self.dice_cost(mask_pred, gt_masks)
  File "/home/ldg810/git/ViT-Adapter/segmentation/mmseg_custom/models/losses/match_costs.py", line 178, in __call__
    dice_cost = self.binary_mask_dice_loss(mask_preds, gt_masks)
  File "/home/ldg810/git/ViT-Adapter/segmentation/mmseg_custom/models/losses/match_costs.py", line 164, in binary_mask_dice_loss
    loss = 1 - (numerator + self.eps) / (denominator + self.eps)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

ImportError: _ext.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZNK2at6Tensor7is_cudaEv

Hi.Thank you for sharing your work.
But,I followed the steps of usage, but got this error, what should I do?

Do you plan to PR this to mmsegmentation?

I think it would be nice to see this in mmsegmentation official repo

RuntimeError: Default process group has not been initialized, please make sure to call init_process_group

Traceback (most recent call last):
File "/home/software/pycharm-2018.3.1/helpers/pydev/pydevd.py", line 1741, in
main()
File "/home/software/pycharm-2018.3.1/helpers/pydev/pydevd.py", line 1735, in main
globals = debugger.run(setup['file'], None, None, is_module)
File "/home/software/pycharm-2018.3.1/helpers/pydev/pydevd.py", line 1135, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "/home/software/pycharm-2018.3.1/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/home/PycharmProjects/finetune/ViT-Adapter-main/detection/train.py", line 195, in
main()
File "/home/PycharmProjects/finetune/ViT-Adapter-main/detection/train.py", line 191, in main
meta=meta)
File "/home/anaconda/envs/vit_pytorch/lib/python3.7/site-packages/mmdet/apis/train.py", line 208, in train_detector
runner.run(data_loaders, cfg.workflow)
File "/home/anaconda/envs/vit_pytorch/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
epoch_runner(data_loaders[i], **kwargs)
File "/home/anaconda/envs/vit_pytorch/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
self.run_iter(data_batch, train_mode=True, **kwargs)
File "/home/anaconda/envs/vit_pytorch/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 30, in run_iter
**kwargs)
File "/home/anaconda/envs/vit_pytorch/lib/python3.7/site-packages/mmcv/parallel/data_parallel.py", line 75, in train_step
return self.module.train_step(*inputs[0], **kwargs[0])
File "/home/anaconda/envs/vit_pytorch/lib/python3.7/site-packages/mmdet/models/detectors/base.py", line 248, in train_step
losses = self(**data)
File "/home/anaconda/envs/vit_pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/anaconda/envs/vit_pytorch/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 128, in new_func
output = old_func(*new_args, **new_kwargs)
File "/home/anaconda/envs/vit_pytorch/lib/python3.7/site-packages/mmdet/models/detectors/base.py", line 172, in forward
return self.forward_train(img, img_metas, **kwargs)
File "/home/anaconda/envs/vit_pytorch/lib/python3.7/site-packages/mmdet/models/detectors/two_stage.py", line 127, in forward_train
x = self.extract_feat(img)
File "/home/anaconda/envs/vit_pytorch/lib/python3.7/site-packages/mmdet/models/detectors/two_stage.py", line 67, in extract_feat
x = self.backbone(img)
File "/home/anaconda/envs/vit_pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/PycharmProjects/finetune/ViT-Adapter-main/detection/mmdet_custom/models/backbones/vit_adapter.py", line 94, in forward
c1, c2, c3, c4 = self.spm(x)
File "/home/anaconda/envs/vit_pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/PycharmProjects/finetune/ViT-Adapter-main/detection/mmdet_custom/models/backbones/adapter_modules.py", line 207, in forward
c1 = self.stem(x)
File "/home/anaconda/envs/vit_pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/anaconda/envs/vit_pytorch/lib/python3.7/site-packages/torch/nn/modules/container.py", line 139, in forward
input = module(input)
File "/home/anaconda/envs/vit_pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/anaconda/envs/vit_pytorch/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 731, in forward
world_size = torch.distributed.get_world_size(process_group)
File "/home/anaconda/envs/vit_pytorch/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 748, in get_world_size
return _get_group_size(group)
File "/home/anaconda/envs/vit_pytorch/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 274, in _get_group_size
default_pg = _get_default_group()
File "/home/anaconda/envs/vit_pytorch/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 358, in _get_default_group
raise RuntimeError("Default process group has not been initialized, "
RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.

About the FPS metric

Thanks for the great work!

In addition to the #Params and FLOPs, could you please provide the FPS (images/s) metric of ViT-Adapter-Ti/S/B?

Sincerely.

How to use model to inference with image and video?

I have just downloaded model htc++_beit_adapter_large_fpn_3x_coco.pth and config from this github. But I cannot load model use this command:

from mmdet.apis import init_detector
configFile = 'configs/htc++/htc++_beit_adapter_large_fpn_3x_coco.py'
checkpointFile = 'checkpoint/htc++_beit_adapter_large_fpn_3x_coco.pth'
model = init_detector(config_file, checkpoint_file, device='cuda:0')

img = 'demo.jpg'
result = inference_detector(model, img)

please help me

Pretraining code

Hi,
How are the models pretrained? I notice that custom architectures like injector and extractors require rewriting the model, so I'm assuming you pretrained the model yourself?
If that's correct, could you provide a link to the code for pretraining these models?

The vit adapter exhibits very low performance on the potsdam dataset

The miou and acc are very low but the loss is high

About error 'EncoderDecoderMask2Former is not in the models registry'

Hello, when I run model_VitAdapter = init_segmentor(config_file_VitAdapter , checkpoint_file_VitAdapter, device='cuda:0'), it says this error, how to solve it~~ Thank you very much~~!

mask2former_beit_adapter_large_896_80k_cocostuff164k.pth.tar is not a checkpoint file

I want to use these weights as a pretrained model for use with a smaller subset of cocostuff data.
When I change

pretrained = 'pretrained/beit_large_patch16_224_pt22k_ft22k.pth'

pretrained = 'pretrained/mask2former_beit_adapter_large_896_80k_cocostuff164k.pth.tar'

I get this error message.
How can I use your pretrained weights in a new model of the same structure with different data?

I just want to use semantic segmentation, do I need to install the mmdet module?

ModuleNotFoundError: No module named 'ops.modules'

dist_test.sh: 8: dist_test.sh: Bad substitution

here is my command:
(vit_adapter) lidexuan@aa-SYS-4029GP-TRT:/data/lidexuan/ViT-Adapter/detection$ sh dist_test.sh configs/htc++/htc++_beit_adapter_large_fpn_3x_coco.py htc++_beit_adapter_large_fpn_3x_coco.pth.tar 8 --eval bbox segm
dist_test.sh: 8: dist_test.sh: Bad substitution

any idea?

请问cityscapes语义分割中mapillary预训练模型对应的config文件是哪一个？

请问如果想用图中的mapillary预训练模型（mask2former_beit_adapter_large_896_80k_mapillary.zip）做语义分割的话？config文件该用哪一个呢？是“mask2former_beit_adapter_large_896_80k_mapillary_ss.py”吗？

How can I follow this great work if I do NOT have 16 Nvidia A100 (80GB)?

HI, thanks for your insightful work. I notice that, you use 8 or even 16 80GB A100 graphics cards to conduct the experiments (e.g. ViT-L-Mask2Former@ADE20K and ViT-L-Mask2Former@CityScape ). However, I do not have such powerful computing resources, the most powerful graphics card I can access is 2080Ti, can you kindly give some suggestions?

Many thanks!

Example Colab notebook

I'm trying for a couple of days now to make use of this repo, but I can't make it work...
Here is what I've tried.
Can you please supply me with a simple Collab notebook example that actually works, or point out why isn't my trial working?

Edit:
Currently the error I get there is:

File "test.py", line 186, in main
checkpoint = load_checkpoint(model, args.checkpoint, map_location='cpu')
File "/usr/local/lib/python3.7/dist-packages/mmcv/runner/checkpoint.py", line 581, in load_checkpoint
checkpoint = _load_checkpoint(filename, map_location, logger)
File "/usr/local/lib/python3.7/dist-packages/mmcv/runner/checkpoint.py", line 520, in _load_checkpoint
return CheckpointLoader.load_checkpoint(filename, map_location, logger)
File "/usr/local/lib/python3.7/dist-packages/mmcv/runner/checkpoint.py", line 285, in load_checkpoint
return checkpoint_loader(filename, map_location)
File "/usr/local/lib/python3.7/dist-packages/mmcv/runner/checkpoint.py", line 302, in load_from_local
checkpoint = torch.load(filename, map_location=map_location)
File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 713, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 920, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '\x0a'.

Can I do instance segmentation with ground-truth bounding box?

Thanks for sharing the great repo!
I have a few GT bounding box of several overlapped objects. I am wondering if it is possible to use these GT bbox to do insance segmentation? If so, how? Thanks!

请问我需要调整输入图片尺寸来得到更好的语义分割效果吗？

您好，由于我水平有限，在阅读论文和代码后仍然无法解决下面这个问题：
我把在Cityscapes test set上训练的模型用于我自己的数据集做语义分割，对应config是“mask2former_beit_adapter_large_896_80k_cityscapes_ss.py”，数据集图片尺寸为19201080，我没有进行任何操作直接用“image_demo.py”进行语义分割，运行没有报错，得到19201080的语义分割图片。请问这个模型的输入图片尺寸是任意的吗？我是否需要将图片尺寸调整为某个尺寸（例如cityscapes的图片尺寸）以获得更好的语义分割效果？

how to get pth.tar file after trainning own dataset?

hi, i trained on my own dataset, and get pth files, but image_demo.py need a pth.tar file as weight, so how to get the pth.tar file by pth file? can i just run "tar -cvf iter_xxxx.pth.tar iter_xxxx.pth"?

分割预测时，输入图像的宽大于2048时，ms_deform_attn AssertionEror

您好，十分感谢算法和预训练模型的开源，效果非常棒！

但在单图cityscapes格式数据预测时，发现当图像的宽 > 2048 时，运行报错，(input_spatial_shapes[:, 0] * input_spatial_shapes[:, 1]).sum() != Len_in。请问是什么原因呢？如何解决呢？

以下报错信息

CUDA_VISIBLE_DEVICES=0 python3 image_demo.py configs/cityscapes/mask2former_beit_adapter_large_896_80k_cityscapes_ss.py released/mask2former_beit_adapter_large_896_80k_mapillary.pth.tar data/6.jpg

/opt/tiger/algo/ViT-Adapter-main/segmentation/mmseg_custom/models/losses/cross_entropy_loss.py:231: UserWarning: Default avg_non_ignore is False, if you would like to ignore the certain label and average loss over non-ignore labels, which is the same with PyTorch official cross_entropy, set avg_non_ignore=True.
'Default avg_non_ignore is False, if you would like to '
load checkpoint from local path: released/mask2former_beit_adapter_large_896_80k_mapillary.pth.tar
The model and loaded state dict do not match exactly

missing keys in source state_dict: backbone.blocks.0.attn.relative_position_index, backbone.blocks.1.attn.relative_position_index, backbone.blocks.2.attn.relative_position_index, backbone.blocks.3.attn.relative_position_index, backbone.blocks.4.attn.relative_position_index, backbone.blocks.5.attn.relative_position_index, backbone.blocks.6.attn.relative_position_index, backbone.blocks.7.attn.relative_position_index, backbone.blocks.8.attn.relative_position_index, backbone.blocks.9.attn.relative_position_index, backbone.blocks.10.attn.relative_position_index, backbone.blocks.11.attn.relative_position_index, backbone.blocks.12.attn.relative_position_index, backbone.blocks.13.attn.relative_position_index, backbone.blocks.14.attn.relative_position_index, backbone.blocks.15.attn.relative_position_index, backbone.blocks.16.attn.relative_position_index, backbone.blocks.17.attn.relative_position_index, backbone.blocks.18.attn.relative_position_index, backbone.blocks.19.attn.relative_position_index, backbone.blocks.20.attn.relative_position_index, backbone.blocks.21.attn.relative_position_index, backbone.blocks.22.attn.relative_position_index, backbone.blocks.23.attn.relative_position_index

test_cfg mode: slide
/opt/tiger/user_envs/vit-adapter/lib/python3.7/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at /pytorch/c10/core/TensorImpl.h:1156.)
return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
input_spatial_shapes sum: tensor(11452, device='cuda:0')
Len_in: 11648
Traceback (most recent call last):
File "image_demo.py", line 59, in
main()
File "image_demo.py", line 45, in main
result = inference_segmentor(model, args.img)
File "/opt/tiger/user_envs/vit-adapter/lib/python3.7/site-packages/mmseg/apis/inference.py", line 98, in inference_segmentor
result = model(return_loss=False, rescale=True, **data)
File "/opt/tiger/user_envs/vit-adapter/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/tiger/user_envs/vit-adapter/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 98, in new_func
return old_func(*args, **kwargs)
File "/opt/tiger/user_envs/vit-adapter/lib/python3.7/site-packages/mmseg/models/segmentors/base.py", line 110, in forward
return self.forward_test(img, img_metas, **kwargs)
File "/opt/tiger/user_envs/vit-adapter/lib/python3.7/site-packages/mmseg/models/segmentors/base.py", line 92, in forward_test
return self.simple_test(imgs[0], img_metas[0], **kwargs)
File "/opt/tiger/algo/ViT-Adapter-main/segmentation/mmseg_custom/models/segmentors/encoder_decoder_mask2former.py", line 258, in simple_test
seg_logit = self.inference(img, img_meta, rescale)
File "/opt/tiger/algo/ViT-Adapter-main/segmentation/mmseg_custom/models/segmentors/encoder_decoder_mask2former.py", line 241, in inference
seg_logit = self.slide_inference(img, img_meta, rescale)
File "/opt/tiger/algo/ViT-Adapter-main/segmentation/mmseg_custom/models/segmentors/encoder_decoder_mask2former.py", line 180, in slide_inference
crop_seg_logit = self.encode_decode(crop_img, img_meta)
File "/opt/tiger/algo/ViT-Adapter-main/segmentation/mmseg_custom/models/segmentors/encoder_decoder_mask2former.py", line 73, in encode_decode
x = self.extract_feat(img)
File "/opt/tiger/algo/ViT-Adapter-main/segmentation/mmseg_custom/models/segmentors/encoder_decoder_mask2former.py", line 65, in extract_feat
x = self.backbone(img)
File "/opt/tiger/user_envs/vit-adapter/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/tiger/algo/ViT-Adapter-main/segmentation/mmseg_custom/models/backbones/beit_adapter.py", line 116, in forward
deform_inputs1, deform_inputs2, H, W)
File "/opt/tiger/user_envs/vit-adapter/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/tiger/algo/ViT-Adapter-main/segmentation/mmseg_custom/models/backbones/adapter_modules.py", line 219, in forward
level_start_index=deform_inputs1[2])
File "/opt/tiger/user_envs/vit-adapter/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/tiger/algo/ViT-Adapter-main/segmentation/mmseg_custom/models/backbones/adapter_modules.py", line 150, in forward
query = _inner_forward(query, feat)
File "/opt/tiger/algo/ViT-Adapter-main/segmentation/mmseg_custom/models/backbones/adapter_modules.py", line 144, in _inner_forward
level_start_index, None)
File "/opt/tiger/user_envs/vit-adapter/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/tiger/algo/ViT-Adapter-main/segmentation/ops/modules/ms_deform_attn.py", line 103, in forward
input_spatial_shapes[:, 1]).sum() == Len_in
AssertionError

How do i train custom data?

Traceback (most recent call last):
File "train.py", line 207, in
main()
File "train.py", line 203, in main
meta=meta)
File "/opt/conda/envs/vit_37/lib/python3.7/site-packages/mmdet/apis/train.py", line 208, in train_detector
runner.run(data_loaders, cfg.workflow)
File "/opt/conda/envs/vit_37/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
epoch_runner(data_loaders[i], **kwargs)
File "/opt/conda/envs/vit_37/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 45, in train
self.call_hook('before_train_epoch')
File "/opt/conda/envs/vit_37/lib/python3.7/site-packages/mmcv/runner/base_runner.py", line 309, in call_hook
getattr(hook, fn_name)(self)
File "/opt/conda/envs/vit_37/lib/python3.7/site-packages/mmdet/datasets/utils.py", line 158, in before_train_epoch
self._check_head(runner)
File "/opt/conda/envs/vit_37/lib/python3.7/site-packages/mmdet/datasets/utils.py", line 145, in _check_head
(f'The num_classes ({module.num_classes}) in '
AssertionError: The num_classes (80) in BEiTAdapter of MMDataParallel does not matches the length of CLASSES 5) in CocoDataset

this is my log.

I have 5 class in my custom data, but it seems to need to have 80 classes to use pretrained weight.
Can you give solution?

Model for real time Segmetation

Hello @czczup, @whai362, @duanduanduanyuchen,

I would like to use ViT-Adapter for real time driving scene sematic segmentation task. Is there any model that can be used for this task and still get high performance?

I have tried Mask R-CNN with ViT-Adapter-T backbone on DeiT-T pretrain model but I got this error.

Here is my note book Link.

2022-08-19 07:04:35,726 - mmdet - WARNING - unexpected key in source state_dict: cls_token, norm.weight, norm.bias, head.weight, head.bias

missing keys in source state_dict: blocks.7.gamma1, blocks.2.gamma2, blocks.7.gamma2, blocks.2.gamma1, blocks.6.gamma1, blocks.8.gamma1, blocks.9.gamma2, blocks.9.gamma1, blocks.3.gamma2, blocks.10.gamma2, blocks.4.gamma2, blocks.6.gamma2, blocks.1.gamma2, blocks.4.gamma1, blocks.3.gamma1, blocks.0.gamma1, blocks.11.gamma1, blocks.8.gamma2, blocks.0.gamma2, blocks.10.gamma1, blocks.5.gamma2, blocks.5.gamma1, blocks.11.gamma2, blocks.1.gamma1

load checkpoint from local path: checkpoint/mask_rcnn_deit_adapter_tiny_fpn_3x_coco.pth.tar
/usr/local/lib/python3.7/dist-packages/mmdet/apis/inference.py:50: UserWarning: Class names are not saved in the checkpoint's meta data, use COCO classes by default.
  warnings.warn('Class names are not saved in the checkpoint\'s '
/usr/local/lib/python3.7/dist-packages/mmdet/datasets/utils.py:70: UserWarning: "ImageToTensor" pipeline is replaced by "DefaultFormatBundle" for batch inference. It is recommended to manually replace it in the test data pipeline in your config file.
  'data pipeline in your config file.', UserWarning)
/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  /pytorch/c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:3658: UserWarning: The default behavior for interpolate/upsample with float scale_factor changed in 1.6.0 to align with other frameworks/libraries, and now uses scale_factor directly, instead of relying on the computed output size. If you wish to restore the old behavior, please set recompute_scale_factor=True. See the documentation of nn.Upsample for details. 
  "The default behavior for interpolate/upsample with float scale_factor changed "  ```

Is the ViT backbone frozen?

In some adapter literature, the backbone is frozen, I just want to double check if the ViT backbone here is frozen. It seems from the figure that ViT backbone is in grey color, which might indicate freezing? But I didn't find corresponding description in the paper, thank you.

ValueError: Unrecognized dataset: coco_stuff

!CUDA_VISIBLE_DEVICES=0 python image_demo.py
configs/coco_stuff164k/mask2former_beit_adapter_large_896_80k_cocostuff164k_ss.py
pretrained/mask2former_beit_adapter_large_896_80k_cocostuff164k.pth.tar
/data/sample1.jpg
--palette coco_stuff

Traceback (most recent call last):
File "image_demo.py", line 58, in
main()
File "image_demo.py", line 42, in main
model.CLASSES = get_classes(args.palette)
File "/opt/conda/lib/python3.7/site-packages/mmseg/core/evaluation/class_names.py", line 133, in get_classes
raise ValueError(f'Unrecognized dataset: {dataset}')
ValueError: Unrecognized dataset: coco_stuff

In get_classes these are the only datasets hard coded: cityscapes, ade and voc.
Do I have to embed the classes inside this script?
Surely there is a better way...

image_demo Potsdam datasets

首先很感谢作者您对于图像分割的贡献以及开源代码，但是我在预测Potsdam数据集的时候存在一些小问题，运行代码如下：
python segmentation/image_demo.py segmentation/configs/potsdam/mask2former_beit_adapter_large_512_80k_potsdam_ss.py D:\PyCharm_Projects\ViT-Adapter-main\segmentation\pretrained_model\beit_large_patch16_224_pt22k_ft22k.pth D:/downloads/Potsdam/Potsdam/myOutputs/images/2_10_0_0.png

但是这里运行时报错显示：
unexpected key in source state_dict: model
missing keys in source state_dict: backbone.cls_token, backbone.level_embed, backbone.patch_embed.proj.weight, backbone.patch_embed.proj.bias, backbone.blocks.0.gamma_1, backbone.blocks.0.gamma_2, ...
等一系列backbone,decode_head开头的一系列权重载入

其次还存在这个问题：

Traceback (most recent call last):
File "D:\PyCharm_Projects\ViT-Adapter-main\segmentation\image_demo.py", line 58, in
main()
File "D:\PyCharm_Projects\ViT-Adapter-main\segmentation\image_demo.py", line 45, in main
result = inference_segmentor(model, args.img)
File "d:\downloads\mmsegmentation-master\mmsegmentation-master\mmseg\apis\inference.py", line 102, in inference_segmentor
result = model(return_loss=False, rescale=True, **data)
File "E:\conda\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "E:\conda\lib\site-packages\mmcv\runner\fp16_utils.py", line 116, in new_func
return old_func(*args, **kwargs)
File "d:\downloads\mmsegmentation-master\mmsegmentation-master\mmseg\models\segmentors\base.py", line 110, in forward
return self.forward_test(img, img_metas, **kwargs)
File "d:\downloads\mmsegmentation-master\mmsegmentation-master\mmseg\models\segmentors\base.py", line 94, in forward_test
return self.aug_test(imgs, img_metas, **kwargs)
File "D:\PyCharm_Projects\ViT-Adapter-main\segmentation\mmseg_custom\models\segmentors\encoder_decoder_mask2former.py", line 276, in aug_test
seg_logit = self.inference(imgs[0], img_metas[0], rescale)
File "D:\PyCharm_Projects\ViT-Adapter-main\segmentation\mmseg_custom\models\segmentors\encoder_decoder_mask2former.py", line 240, in inference
seg_logit = self.slide_inference(img, img_meta, rescale)
File "D:\PyCharm_Projects\ViT-Adapter-main\segmentation\mmseg_custom\models\segmentors\encoder_decoder_mask2former.py", line 180, in slide_inference
crop_seg_logit = self.encode_decode(crop_img, img_meta)
File "D:\PyCharm_Projects\ViT-Adapter-main\segmentation\mmseg_custom\models\segmentors\encoder_decoder_mask2former.py", line 73, in encode_decode
x = self.extract_feat(img)
File "D:\PyCharm_Projects\ViT-Adapter-main\segmentation\mmseg_custom\models\segmentors\encoder_decoder_mask2former.py", line 65, in extract_feat
x = self.backbone(img)
File "E:\conda\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "D:\PyCharm_Projects\ViT-Adapter-main\segmentation\mmseg_custom\models\backbones\beit_adapter.py", line 115, in forward
x, c, cls = layer(x, c, cls, self.blocks[indexes[0]:indexes[-1] + 1],
File "E:\conda\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "D:\PyCharm_Projects\ViT-Adapter-main\segmentation\mmseg_custom\models\backbones\adapter_modules.py", line 222, in forward
x = blk(x, H, W)
File "E:\conda\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "D:\PyCharm_Projects\ViT-Adapter-main\segmentation\mmseg_custom\models\backbones\base\beit.py", line 186, in forward
x = _inner_forward(x)
File "D:\PyCharm_Projects\ViT-Adapter-main\segmentation\mmseg_custom\models\backbones\base\beit.py", line 179, in _inner_forward
x = x + self.drop_path(self.gamma_1 * self.attn(self.norm1(x), rel_pos_bias=rel_pos_bias))
File "E:\conda\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "D:\PyCharm_Projects\ViT-Adapter-main\segmentation\mmseg_custom\models\backbones\base\beit.py", line 136, in forward
attn = attn + relative_position_bias.unsqueeze(0)
RuntimeError: The size of tensor a (257) must match the size of tensor b (1025) at non-singleton dimension 3

如果您有时间可以给我一些指点，我将非常感谢

关于更新pytorch1.12后遇到的问题

您好，由于另个项目需要，我把pytorch升级为最新版本，在这之后我再运行vit-adapter部分代码会出现一下问题：
File "/data0/ops/modules/ms_deform_attn.py", line 127, in forward
output = MSDeformAttnFunction.apply(value, input_spatial_shapes, input_level_start_index,
File "/home/.conda/envs/lib/python3.9/site-packages/torch/cuda/amp/autocast_mode.py", line 116, in decorate_fwd
return fwd(*_cast(args, cast_inputs), _cast(kwargs, cast_inputs))
File "/data0/ops/functions/ms_deform_attn_func.py", line 25, in forward
output = MSDA.ms_deform_attn_forward(value, value_spatial_shapes,
RuntimeError: Unrecognized tensor type ID: PythonTLSSnapshot
Exception raised from dispatchKeyToBackend at /home/.conda/envs/lib/python3.9/site-packages/torch/include/c10/core/Backend.h:110 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x3e (0x7fde793da1ee in /home/.conda/envs/lib/python3.9/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const, char const, unsigned int, std::string const&) + 0x5c (0x7fde793b55e8 in /home/.conda/envs/lib/python3.9/site-packages/torch/lib/libc10.so)
frame #2: + 0x3cd14 (0x7fe29fbacd14 in /home/.conda/envs/lib/python3.9/site-packages/MultiScaleDeformableAttention-1.0-py3.9-linux-x86_64.egg/MultiScaleDeformableAttention.cpython-39-x86_64-linux-gnu.so)
frame #3: ms_deform_attn_forward(at::Tensor const&, at::Tensor const&, at::Tensor const&, at::Tensor const&, at::Tensor const&, int) + 0x82 (0x7fe29fbad0b2 in /home/.conda/envs/lib/python3.9/site-packages/MultiScaleDeformableAttention-1.0-py3.9-linux-x86_64.egg/MultiScaleDeformableAttention.cpython-39-x86_64-linux-gnu.so)
frame #4: + 0x4950a (0x7fe29fbb950a in /home/.conda/envs/lib/python3.9/site-packages/MultiScaleDeformableAttention-1.0-py3.9-linux-x86_64.egg/MultiScaleDeformableAttention.cpython-39-x86_64-linux-gnu.so)
frame #5: + 0x46676 (0x7fe29fbb6676 in /home/.conda/envs/lib/python3.9/site-packages/MultiScaleDeformableAttention-1.0-py3.9-linux-x86_64.egg/MultiScaleDeformableAttention.cpython-39-x86_64-linux-gnu.so)

当然，这只是一个使用后的反馈，毕竟安装独立环境就能解决这个问题

here is an problem with MS deform

when I run the following sentence:
output = MSDA.ms_deform_attn_forward(value, value_spatial_shapes, value_level_start_index, sampling_locations, attention_weights, ctx.im2col_step)

The error occurs:
type object 'MultiScaleDeformableAttention' has no attribute 'ms_deform_attn_forward'

pls help, Thanks

你好，请问”assert (count_mat == 0).sum() == 0“，在文件中的作用是什么？

ViT-Adapter-main/segmentation/mmseg_custom/models/segmentors/encoder_decoder_mask2former.py的195行，存在句子”assert (count_mat == 0).sum() == 0“，每次运行都会因为这个异常而跳出来。我将其注释掉，程序可以正常运行，但每次得出来的结果都是一样的，不知道和这里是否有关？想请问，count_mat代表什么意思？

下面是那一段代码：
def slide_inference(self, img, img_meta, rescale):
"""Inference by sliding-window with overlap.

    If h_crop > h_img or w_crop > w_img, the small patch will be used to
    decode without padding.
    """
    print("img.shape",img.shape)
    # print("img",img)
    h_stride, w_stride = self.test_cfg.stride
    h_crop, w_crop = self.test_cfg.crop_size
    batch_size, _, h_img, w_img = img.size()

    num_classes = self.num_classes
    # print("num_class",num_classes)

    h_grids = max(h_img - h_crop + h_stride - 1, 0) // h_stride + 1
    w_grids = max(w_img - w_crop + w_stride - 1, 0) // w_stride + 1
    preds = img.new_zeros((batch_size, num_classes, h_img, w_img))
    # print("preds",preds)
    count_mat = img.new_zeros((batch_size, 1, h_img, w_img))
    # print("count_mat",count_mat)
    for h_idx in range(h_grids):
        for w_idx in range(w_grids):
            y1 = h_idx * h_stride
            x1 = w_idx * w_stride
            y2 = min(y1 + h_crop, h_img)
            x2 = min(x1 + w_crop, w_img)
            y1 = max(y2 - h_crop, 0)
            x1 = max(x2 - w_crop, 0)
            crop_img = img[:, :, y1:y2, x1:x2]
            crop_seg_logit = self.encode_decode(crop_img, img_meta)
            preds += F.pad(crop_seg_logit,
                           (int(x1), int(preds.shape[3] - x2), int(y1),
                            int(preds.shape[2] - y2)))

            count_mat[:, :, y1:y2, x1:x2] += 1
            # print(count_mat[:, :, y1:y2, x1:x2])


    # assert (count_mat == 0).sum() == 0
    if torch.onnx.is_in_onnx_export():
        # cast count_mat to constant while exporting to ONNX
        count_mat = torch.from_numpy(
            count_mat.cpu().detach().numpy()).to(device=img.device)
    preds = preds / count_mat
    # preds = preds
    if rescale:
        preds = resize(
            preds,
            size=img_meta[0]['ori_shape'][:2],
            mode='bilinear',
            align_corners=self.align_corners,
            warning=False)
    return preds

打扰了~~~

Broken Download Links

Hello, when attempting to download SOTA pre-trained models COCO-Stuff-164k on ADE20k or Mapillary on Cityscapes, it redirects to the GitHub broken link page.

The dimensions do not broadcast

Hi, great work! Trying to test it out. Maybe a bug:

Settings
config: mask2former_beit_adapter_base_512_40k_cocostuff10k_ss.py
checkpoint: mask2former_beit_adapter_base_512_40k_cocostuff10k.pth.tar

Error
In this line in beit.py:

            attn = attn + relative_position_bias.unsqueeze(0)

The dimensions do not broadcast.
If one takes an input image 1x3x128x128, then the dimensions are:

# attn.shape
# ([1, 12, 65, 65])

# relative_position_bias.unsqueeze(0).shape
# ([1, 12, 1025, 1025])

分割部分，请问背景算在类别里面嘛？

num_things_classes = 8
num_stuff_classes = 11
num_classes = num_things_classes + num_stuff_classes
【城市景观的config】
这里面包括背景嘛？

Augmentations used for training

Hi Authors,
Have you used any kind of image augmentations to train this model?

ade20k 640x640

The mmseg official code uses the image scale of (2560, 640) while this repo uses (2048, 640).

针对非固定尺寸的图像，似乎无法到达理想的状态

最近跑了好几个数据集，其中包括了三个图片尺寸固定的数据集和两个非固定尺寸的图片的数据集。其中在固定尺寸数据集上，均能到达SOTA或者接近SOTA的级别，但是在非固定尺寸的数据集上，会比当前SOTA低接近10-20个点。我怀疑是我配置文件弄错了，但我不太清楚哪里弄得不对。
以下是我的思考：

我觉得可能是img_scale这个参数设置的问题，针对固定数据集，我可以该数值设置的就是固定尺寸的值。但是针对非固定尺寸（长宽比不一致）数据集，我不知道这个值该设置成什么？目前设置的是最大的图片的高宽。
img_scale=（x,y），其中这个x代表的是高，y代表的是宽吗？

其余的情况我暂时没有考虑到。
以下是我的config：

_base_ = [
    '../_base_/models/upernet_nofcn_beit.py',
    '../_base_/datasets/isic2018.py',
    '../_base_/default_runtime.py',
    '../_base_/schedules/schedule_160k.py'
]
img_scale = (6748, 6748) #此处不知道该填什么，已知该数据集图片最宽为6748，图片最高为4499。【不是同一张图片】
t_size=256
crop_size = (t_size, t_size)
# pretrained = 'https://conversationhub.blob.core.windows.net/beit-share-public/beit/beit_large_patch16_224_pt22k_ft22k.pth'
# pretrained = '/data/cgh/project/ViT-Adapter-main/segmentation/pretrained/beit_large_patch16_224_pt22k_ft22k.pth'
model = dict(
    # pretrained=pretrained,
    backbone=dict(
        type='BEiTAdapter',
        img_size=t_size,
        patch_size=16,
        embed_dim=768,
        depth=12,
        num_heads=12,
        mlp_ratio=4,
        qkv_bias=True,
        use_abs_pos_emb=False,
        use_rel_pos_bias=True,
        init_values=1e-6,
        drop_path_rate=0.2,
        conv_inplane=64,
        n_points=4,
        deform_num_heads=12,
        cffn_ratio=0.25,
        deform_ratio=0.5,
        interaction_indexes=[[0, 2], [3, 5], [6, 8], [9, 11]],
    ),
    decode_head=dict(
        in_channels=[768, 768, 768, 768],
        num_classes=2,
        channels=768,
    ),
    # auxiliary_head=dict(
    #     in_channels=768,
    #     num_classes=2
    # ),
    # test_cfg = dict(mode='whole')
    test_cfg = dict(mode='slide', crop_size=crop_size, stride=(85, 85))
)
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'), # 载入图片
    dict(type='LoadAnnotations'), # 载入mask
    dict(type='Resize', img_scale=img_scale, ratio_range=(0.5, 2.0)), # 开始Resize，
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),# 随机裁剪，从img_scale中裁剪出crop_size
    dict(type='RandomFlip', prob=0.5), # 随机翻转（是否是垂直？还是水平翻转？）
    dict(type='PhotoMetricDistortion'),# 光度失真，可以调整图像的亮度，色度，对比度，饱和度，以及加入噪点；
    dict(type='Normalize', **img_norm_cfg), # 归一化
    # dict(type='RandomRotate',prob=0.5,degree=30,pad_val=255, seg_pad_val=0),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=1), # padding到crop_size的大小，其中图像padding加像素值0，mask加上像素值1
    dict(type='DefaultFormatBundle'), # 流程里收集数据的默认格式捆
    dict(type='Collect', keys=['img', 'gt_semantic_seg']) # 决定数据里哪些键被传递到分割器里的流程。
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug', # 封装测试时数据增广
        img_scale=img_scale,
        # img_ratios=[0.5, 0.75, 1.0, 1.25, 1.5, 1.75], # 多尺度测试，后面测试的时候可以开起来提分数
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True), # keep_ratio是否保持宽和高的比例
            dict(type='ResizeToMultiple', size_divisor=32), # 根据给定的大小或比例因子调整图像大小，然后将调整大小或重新缩放的图像大小四舍五入到可以除以除数的最接近的值。
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
optimizer = dict(_delete_=True, type='AdamW', lr=1e-4, betas=(0.9, 0.999), weight_decay=0.05,
                 constructor='LayerDecayOptimizerConstructor',
                 paramwise_cfg=dict(num_layers=24, layer_decay_rate=0.90))
lr_config = dict(_delete_=True, policy='poly',
                 warmup='linear',
                 warmup_iters=1500,
                 warmup_ratio=1e-6,
                 power=1.0, min_lr=0.0, by_epoch=False)
data=dict(samples_per_gpu=4,# bs=16的时候，显存用8g左右
            workers_per_gpu=2,
          train=dict(pipeline=train_pipeline),
          val=dict(pipeline=test_pipeline),
          test=dict(pipeline=test_pipeline))
runner = dict(type='IterBasedRunner', max_iters=250000)
checkpoint_config = dict(by_epoch=False, interval=1000, max_keep_ckpts=1)
evaluation = dict(interval=5000, metric='mDice', save_best='mDice')

ONNX conversion?

Question about masking the input when using MAE weights with ViT-Adapter.

Hi, thanks for your great work. I was looking at your config file for mask_rcnn_mae_adapter_base_lsj_fpn_25ep_coco.py which uses the MAE pretrained weights with the ViTAdapter backbone. I could not find any specification for the masking ratio for the input.

When looking at the vit.py and vit_adpater.py files, I could not find any method that performs random masking on the input which is the usual norm with the MAE based works. Am I missing something here? How do you ensure the masking, or do you input without any masking? Thanks

pretrained backbones

Thanks for the great work! Did you try to use the imagenet-1k 384x384 finetuned backbones?

more results on semantic segmentation

Could you release the results of BEiT-B+UperNet on cocostuff-10k and pascal context?

Test script for single image or webcam

Hi. Thank you for sharing your work.
Is there any chance to release a script for testing a single image, or maybe allowing for webcam usage?

When will you share your code? Thanks.

Appreciate your work.

No module named 'MultiScaleDeformableAttention'

Hello, when I import mmseg_custom, it gave this error... Could you help me to fix it, Thank you!!!~~~

ModuleNotFoundError Traceback (most recent call last)
/tmp/ipykernel_208157/151112096.py in
1 get_ipython().run_line_magic('cd', '/media/preethamam/Utilities-SSD-1/Xtreme_Programming/Jinshan/ViT-Adapter/segmentation')
----> 2 import mmseg_custom

/media/preethamam/Utilities-SSD-1/Xtreme_Programming/Jinshan/ViT-Adapter/segmentation/mmseg_custom/init.py in
1 from .core import * # noqa: F401,F403
2 from .datasets import * # noqa: F401,F403
----> 3 from .models import * # noqa: F401,F403

/media/preethamam/Utilities-SSD-1/Xtreme_Programming/Jinshan/ViT-Adapter/segmentation/mmseg_custom/models/init.py in
1 # Copyright (c) OpenMMLab. All rights reserved.
----> 2 from .backbones import * # noqa: F401,F403
3 from .builder import (MASK_ASSIGNERS, MATCH_COST, TRANSFORMER, build_assigner,
4 build_match_cost)
5 from .decode_heads import * # noqa: F401,F403

/media/preethamam/Utilities-SSD-1/Xtreme_Programming/Jinshan/ViT-Adapter/segmentation/mmseg_custom/models/backbones/init.py in
1 # Copyright (c) Shanghai AI Lab. All rights reserved.
----> 2 from .beit_adapter import BEiTAdapter
3 from .beit_baseline import BEiTBaseline
4 from .vit_adapter import ViTAdapter
5 from .vit_baseline import ViTBaseline

/media/preethamam/Utilities-SSD-1/Xtreme_Programming/Jinshan/ViT-Adapter/segmentation/mmseg_custom/models/backbones/beit_adapter.py in
8 import torch.nn.functional as F
9 from mmseg.models.builder import BACKBONES
---> 10 from ops.modules import MSDeformAttn
11 from timm.models.layers import DropPath, trunc_normal_
12 from torch.nn.init import normal_

/media/preethamam/Utilities-SSD-1/Xtreme_Programming/Jinshan/ViT-Adapter/segmentation/ops/modules/init.py in
7 # ------------------------------------------------------------------------------------------------
8
----> 9 from .ms_deform_attn import MSDeformAttn
10
11 all = ['MSDeformAttn']

/media/preethamam/Utilities-SSD-1/Xtreme_Programming/Jinshan/ViT-Adapter/segmentation/ops/modules/ms_deform_attn.py in
17 from torch.nn.init import constant_, xavier_uniform_
18
---> 19 from ..functions import MSDeformAttnFunction
20
21

7 # ------------------------------------------------------------------------------------------------ 8 ----> 9 from .ms_deform_attn_func import MSDeformAttnFunction 10 11 __all__ = ['MSDeformAttnFunction']

/media/preethamam/Utilities-SSD-1/Xtreme_Programming/Jinshan/ViT-Adapter/segmentation/ops/functions/ms_deform_attn_func.py in
9 from future import absolute_import, division, print_function
10
---> 11 import MultiScaleDeformableAttention as MSDA
12 import torch
13 import torch.nn.functional as F

ModuleNotFoundError: No module named 'MultiScaleDeformableAttention'

Error in grid_sampler

Hi. I am using a custom dataset and receiving this error. Used ADE20k dataset and received the same error.

File "/home/workspace/training/ViT-Adapter/segmentation/mmseg_custom/core/utils/misc.py", line 21, in multi_apply
  return tuple(map(list, zip(*map_results)))
File "/home/workspace/training/ViT-Adapter/segmentation/mmseg_custom/models/decode_heads/mask2former_head.py", line 242, in _get_target_single
  1)).squeeze(1)
File "/home/ubuntu/.local/lib/python3.6/site-packages/mmcv/ops/point_sample.py", line 280, in point_sample
  input, denormalize(points), align_corners=align_corners, **kwargs)
File "/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/functional.py", line 4011, in grid_sample
  return torch.grid_sampler(input, grid, mode_enum, padding_mode_enum, align_corners)
RuntimeError: grid_sampler(): expected 4D or 5D input and grid with same number of dimensions, but got input with sizes [20, 1, 512, 512, 3] and grid with sizes [20, 12544, 1, 2]

ImportError

Traceback (most recent call last):
File "/home/ccq/Data/liu/ViT-Adapter-main/segmentation/train.py", line 11, in
import mmseg_custom # noqa: F401,F403
File "/home/ccq/Data/liu/ViT-Adapter-main/segmentation/mmseg_custom/init.py", line 3, in
from .models import * # noqa: F401,F403
File "/home/ccq/Data/liu/ViT-Adapter-main/segmentation/mmseg_custom/models/init.py", line 2, in
from .backbones import * # noqa: F401,F403
File "/home/ccq/Data/liu/ViT-Adapter-main/segmentation/mmseg_custom/models/backbones/init.py", line 2, in
from .beit_adapter import BEiTAdapter
File "/home/ccq/Data/liu/ViT-Adapter-main/segmentation/mmseg_custom/models/backbones/beit_adapter.py", line 9, in
from detection.ops.modules import MSDeformAttn
File "/home/ccq/Data/liu/ViT-Adapter-main/detection/ops/modules/init.py", line 9, in
from .ms_deform_attn import MSDeformAttn
File "/home/ccq/Data/liu/ViT-Adapter-main/detection/ops/modules/ms_deform_attn.py", line 19, in
from ..functions import MSDeformAttnFunction
File "/home/ccq/Data/liu/ViT-Adapter-main/detection/ops/functions/init.py", line 9, in
from .ms_deform_attn_func import MSDeformAttnFunction
File "/home/ccq/Data/liu/ViT-Adapter-main/detection/ops/functions/ms_deform_attn_func.py", line 11, in
import MultiScaleDeformableAttention as MSDA
ImportError: /home/ccq/anaconda3/envs/ViT-Adapter-main/lib/python3.8/site-packages/MultiScaleDeformableAttention-1.0-py3.8-linux-x86_64.egg/MultiScaleDeformableAttention.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZNK2at10TensorBase8data_ptrIdEEPT_v

Hello, what is the reason for this error? I want a solution, thanks.