yihongxu / transcenter

This is the official implementation of TransCenter (TPAMI). The code and pretrained models are now available here: https://gitlab.inria.fr/yixu/TransCenter_official.

Home Page: https://team.inria.fr/robotlearn/transcenter-transformers-with-dense-queriesfor-multiple-object-tracking/

License: Other

computer-vision deep-learning pytorch multiple-object-tracking transformers

transcenter's Introduction

TransCenter: Transformers with Dense Representations for Multiple-Object Tracking

This work has been accepted to TPAMI 2022.

An update towards a more efficient and powerful TransCenter: TransCenter-Lite!

The code for TransCenter and TransCenter-Lite is now available; you can find the code and pretrained models at https://gitlab.inria.fr/robotlearn/TransCenter_official.

TransCenter: Transformers with Dense Representations for Multiple-Object Tracking
Yihong Xu, Yutong Ban, Guillaume Delorme, Chuang Gan, Daniela Rus, Xavier Alameda-Pineda
[Paper] [Project]



MOT20 example:

Bibtex

If you find this code useful, please star the project and consider citing:

@misc{xu2021transcenter,
      title={TransCenter: Transformers with Dense Representations for Multiple-Object Tracking}, 
      author={Yihong Xu and Yutong Ban and Guillaume Delorme and Chuang Gan and Daniela Rus and Xavier Alameda-Pineda},
      year={2021},
      eprint={2103.15145},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

MOTChallenge Results

For TransCenter:

MOT17 public detections:

Pretrained | MOTA  | MOTP  | IDF1  | FP     | FN      | IDS
-----------|-------|-------|-------|--------|---------|------
CoCo       | 71.9% | 80.5% | 64.1% | 27,356 | 126,860 | 4,118
CH         | 75.9% | 81.2% | 65.9% | 30,190 | 100,999 | 4,626

MOT20 public detections:

Pretrained | MOTA  | MOTP  | IDF1  | FP     | FN      | IDS
-----------|-------|-------|-------|--------|---------|------
CoCo       | 67.7% | 79.8% | 58.9% | 54,967 | 108,376 | 3,707
CH         | 72.8% | 81.0% | 57.6% | 28,026 | 110,312 | 2,621

MOT17 private detections:

Pretrained | MOTA  | MOTP  | IDF1  | FP     | FN      | IDS
-----------|-------|-------|-------|--------|---------|------
CoCo       | 72.7% | 80.3% | 64.0% | 33,807 | 115,542 | 4,719
CH         | 76.2% | 81.1% | 65.5% | 40,101 | 88,827  | 5,394

MOT20 private detections:

Pretrained | MOTA  | MOTP  | IDF1  | FP     | FN      | IDS
-----------|-------|-------|-------|--------|---------|------
CoCo       | 67.7% | 79.8% | 58.7% | 56,435 | 107,163 | 3,759
CH         | 72.9% | 81.0% | 57.7% | 28,596 | 108,982 | 2,625

Note:

  • The results can be slightly different depending on the running environment.
  • We might keep updating the results in the near future.
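
For reference, the MOTA numbers above follow the standard CLEAR MOT definition, which combines the FP, FN and IDS counts listed in the tables. A minimal sketch, not code from this repository; num_gt (the total number of ground-truth boxes) is not reported in the tables:

# Minimal sketch, assuming the standard CLEAR MOT definition of MOTA.
# num_gt is the total number of ground-truth boxes over the benchmark.
def mota(fp: int, fn: int, ids: int, num_gt: int) -> float:
    return 1.0 - (fp + fn + ids) / num_gt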

Acknowledgement

The code for TransCenterV2 and TransCenter-Lite is modified from, and the network pre-trained weights are obtained from, the following repositories:

  1. The PVTv2 backbone pretrained models are from PVTv2.
  2. The data format conversion code is modified from CenterTrack.
  3. The code is also modified from CenterTrack, Deformable-DETR, and Tracktor.

@article{zhou2020tracking,
  title={Tracking Objects as Points},
  author={Zhou, Xingyi and Koltun, Vladlen and Kr{\"a}henb{\"u}hl, Philipp},
  journal={ECCV},
  year={2020}
}

@InProceedings{tracktor_2019_ICCV,
author = {Bergmann, Philipp and Meinhardt, Tim and Leal{-}Taix{\'{e}}, Laura},
title = {Tracking Without Bells and Whistles},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {October},
year = {2019}}

@article{zhu2020deformable,
  title={Deformable DETR: Deformable Transformers for End-to-End Object Detection},
  author={Zhu, Xizhou and Su, Weijie and Lu, Lewei and Li, Bin and Wang, Xiaogang and Dai, Jifeng},
  journal={arXiv preprint arXiv:2010.04159},
  year={2020}
}

@article{zhang2021bytetrack,
  title={ByteTrack: Multi-Object Tracking by Associating Every Detection Box},
  author={Zhang, Yifu and Sun, Peize and Jiang, Yi and Yu, Dongdong and Yuan, Zehuan and Luo, Ping and Liu, Wenyu and Wang, Xinggang},
  journal={arXiv preprint arXiv:2110.06864},
  year={2021}
}

@article{wang2021pvtv2,
  title={Pvtv2: Improved baselines with pyramid vision transformer},
  author={Wang, Wenhai and Xie, Enze and Li, Xiang and Fan, Deng-Ping and Song, Kaitao and Liang, Ding and Lu, Tong and Luo, Ping and Shao, Ling},
  journal={Computational Visual Media},
  volume={8},
  number={3},
  pages={1--10},
  year={2022},
  publisher={Springer}
}

Several modules are from:

MOT Metrics in Python: py-motmetrics

Soft-NMS: Soft-NMS

DETR: DETR

DCNv2: DCNv2

PVTv2: PVTv2

ByteTrack: ByteTrack

transcenter's People

Contributors: banyutong, ragondyn, yihongxu

transcenter's Issues

TransCenterv2 demo code please

Dear author:
Thanks for sharing this great work. Since many people are interested in your algorithm, the quickest way to get to know it is to git clone the repository and try demo.py. Would you kindly share it here? Thank you.

On reducing GPU memory usage

Thanks for open-sourcing the code.
Have the authors tried feeding lower-resolution input images, or other ideas for reducing GPU memory usage, and how did those experiments turn out?
(The current model still needs more than 12 GB of GPU memory even with batch_size set to 1.)

environment error

There are packages installed from PyPI, so when I use

conda create --name <env_name> --file requirements.txt

I get the following error:
PackagesNotFoundError: The following packages are not available from current channels:

Could you use the following commands to upload your environment instead?

conda env export > environment.yaml
conda env create -n <env_name> -f environment.yaml

ModuleNotFoundError: No module named 'motmetrics'

Hello, I followed option 2 to create an environment and converted the MOT20 dataset to COCO format.
Then I wanted to run tracking on the MOT20 dataset:
python tracking/transcenter/mot20_pub.py --data_dir=./MOT20
I got an error (screenshot omitted) and do not know how to deal with it.
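
A quick, hedged way to check whether this is simply a missing PyPI dependency (the py-motmetrics package, imported as motmetrics) rather than a path problem:

# Minimal check, not from the repository: see whether 'motmetrics' is importable
# in the active environment. If it is missing, installing py-motmetrics from PyPI
# (pip install motmetrics) should resolve the ModuleNotFoundError.
import importlib.util

spec = importlib.util.find_spec("motmetrics")
if spec is None:
    print("motmetrics is not installed in this environment")
else:
    print("motmetrics found at:", spec.origin)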

Version 1

Hi, thanks for sharing the code. Could you please release the previous version (V1)?

Thanks a lot!

The loss of the tracking decoder

Hello, thanks for your excellent work. I would like to know why the loss of the tracking decoder is hard to decrease when the model is applied to my custom small-object dataset.

Train it with multi-class?

I found that the model is trained only on the person category. Can we train it with multiple classes, such as person and car?

Question about motion model

Hi,
I noticed that the config file (detracker_reidV3.yaml) contains a parameter about a motion model. I see that it is imported into the parameters of each new track (line 132 in track.py) but I don't see where this parameter is actually used, for example when predicting the new position of a track before the linear matching.
I also noticed that in the tracks_dets_matching_tracking() function there is an update to the current track position, but the line "t.prev_pos = t.pos" is commented out (line 184 in tracker.py).

Are you using the velocity/motion information in some other part of the code or did you decide not to use this information at all?

Question about the pretrained models

Hi,
Are the models trained on the full train set or only on 75% (while the last 25% remained for validation)?
I'm asking because the class "GenericDataset_val" has a default value of train_ratio=0.5 which leads to a 75%-25% partition in the evaluation, but it is not mentioned in the pretrained models section of the readme.

Does any of the provided images work with V100 GPUs?

I am getting the following error in the deformable convolution package.

error in modulated_deformable_im2col_cuda: no kernel image is available for execution on the device
error in modulated_deformable_im2col_cuda: no kernel image is available for execution on the device
error in modulated_deformable_im2col_cuda: no kernel image is available for execution on the device
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/pdb.py", line 1699, in main
pdb._runscript(mainpyfile)
File "/opt/conda/lib/python3.7/pdb.py", line 1568, in _runscript
self.run(statement)
File "/opt/conda/lib/python3.7/bdb.py", line 578, in run
exec(cmd, globals, locals)
File "", line 1, in
File "/net/coxfs01/srv/export/coxfs01/pfister_lab2/share_root/Lab/abanerjee/VideoProjects/TransCenter_official/training/transcenter/main_mot17_tracking.py", line 32, in
import os
File "/net/coxfs01/srv/export/coxfs01/pfister_lab2/share_root/Lab/abanerjee/VideoProjects/TransCenter_official/training/transcenter/main_mot17_tracking.py", line 412, in main
model, criterion, data_loader_train, optimizer, device, epoch, args.clip_max_norm, adaptive_clip=args.adaptive_clip)
File "/net/coxfs01/srv/export/coxfs01/pfister_lab2/share_root/Lab/abanerjee/VideoProjects/TransCenter_official/training/transcenter/engine.py", line 65, in train_one_epoch
outputs = model(samples, pre_samples=pre_samples, pre_hm=pre_hm)
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "/net/coxfs01/srv/export/coxfs01/pfister_lab2/share_root/Lab/abanerjee/VideoProjects/TransCenter_official/training/transcenter/models/deformable_detr.py", line 287, in forward
hs[layer_lvl] = self.ida_up[0](hs[layer_lvl], 0, len(hs[layer_lvl]))[-1]
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "/net/coxfs01/srv/export/coxfs01/pfister_lab2/share_root/Lab/abanerjee/VideoProjects/TransCenter_official/training/transcenter/models/dla.py", line 98, in forward
layers[startp] = node(layers[startp])
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "/net/coxfs01/srv/export/coxfs01/pfister_lab2/share_root/Lab/abanerjee/VideoProjects/TransCenter_official/training/transcenter/models/dla.py", line 48, in forward
x = self.conv(x)
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "/opt/DCNv2/dcn_v2.py", line 128, in forward
self.deformable_groups)
File "/opt/DCNv2/dcn_v2.py", line 31, in forward
ctx.deformable_groups)
RuntimeError: cublas runtime error : the GPU program failed to execute at /opt/conda/conda-bld/pytorch_1591914880026/work/aten/src/THC/THCBlas.cu:335

I am using the pytorch1-5cuda10-1_RTX.sif image with a V100 GPU. Is there a way to make it work?

Thanks
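
A hedged diagnostic, not from the repository: "no kernel image is available" usually means the custom CUDA extension (DCNv2 here) was built without the GPU's compute capability (sm_70 for V100, whereas an image labelled RTX presumably targets sm_75). With a reasonably recent PyTorch (torch.cuda.get_arch_list() does not exist in 1.5), the mismatch for the core library can be checked as below; rebuilding DCNv2 on the V100 node, or with TORCH_CUDA_ARCH_LIST including 7.0, is a plausible fix.

import torch

# Hedged diagnostic sketch: compare the GPU's compute capability with the
# architectures the installed PyTorch binary was compiled for.
print("GPU:", torch.cuda.get_device_name(0))
print("Compute capability:", torch.cuda.get_device_capability(0))  # V100 reports (7, 0)
print("Compiled architectures:", torch.cuda.get_arch_list())       # e.g. ['sm_60', 'sm_70', 'sm_75']
# If 'sm_70' is absent, or if the separately compiled DCNv2 extension was built
# on a different GPU, rebuilding it with TORCH_CUDA_ARCH_LIST="7.0" is one option.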

RuntimeError: expected scalar type Float but found Half

Traceback (most recent call last):
File "/home/valca509/Desktop/TransCenter_official-main/training/main_mot20.py", line 511, in
main(args)
File "/home/valca509/Desktop/TransCenter_official-main/training/main_mot20.py", line 366, in main
model, criterion, postprocessors, data_loader_val, base_ds, device, args.output_dir, args.half
File "/home/valca509/anaconda3/envs/transcenter3/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
return func(*args, **kwargs)
File "/home/valca509/Desktop/TransCenter_official-main/training/engine.py", line 172, in evaluate
outputs = model(samples, pre_samples=pre_samples, pre_hm=targets['pre_cts'].clone())
File "/home/valca509/anaconda3/envs/transcenter3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/valca509/Desktop/TransCenter_official-main/training/deformable_detr.py", line 215, in forward
hs[layer_lvl] = self.ida_up(hs[layer_lvl], 0, len(hs[layer_lvl]))[-1]
File "/home/valca509/anaconda3/envs/transcenter3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/valca509/Desktop/TransCenter_official-main/models/dla.py", line 96, in forward
layers[i-1] = node(layers[i] + layers[i - 1])
File "/home/valca509/anaconda3/envs/transcenter3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/valca509/anaconda3/envs/transcenter3/lib/python3.7/site-packages/torch/cuda/amp/autocast_mode.py", line 141, in decorate_autocast
return func(*args, **kwargs)
File "/home/valca509/Desktop/TransCenter_official-main/models/dla.py", line 47, in forward
x_out = self.conv(x)
File "/home/valca509/anaconda3/envs/transcenter3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/valca509/Desktop/TransCenter_official-main/to_install/DCNv2_1.9/dcn_v2.py", line 172, in forward
self.deformable_groups,
File "/home/valca509/Desktop/TransCenter_official-main/to_install/DCNv2_1.9/dcn_v2.py", line 39, in forward
ctx.deformable_groups,
RuntimeError: expected scalar type Float but found Half
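
A hedged note, not the repository's official fix: custom CUDA ops such as DCNv2 typically do not provide half-precision kernels, so when mixed-precision evaluation (args.half in the traceback, i.e. autocast) feeds them Half tensors, this error appears. One common workaround is to run the deformable-convolution call outside autocast and cast its inputs back to float32; the names below are illustrative, not the repository's actual module layout. Alternatively, simply running without the half-precision option avoids the issue.

import torch

# Hedged workaround sketch: force a DCNv2 call to run in float32 even when the
# surrounding model runs under torch.cuda.amp.autocast.
def dcn_forward_fp32(dcn_module, x):
    with torch.cuda.amp.autocast(enabled=False):
        return dcn_module(x.float())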
