yihongxu / transcenter

This is the official implementation of TransCenter (TPAMI). The code and pretrained models are now available here: https://gitlab.inria.fr/yixu/TransCenter_official.

Home Page: https://team.inria.fr/robotlearn/transcenter-transformers-with-dense-queriesfor-multiple-object-tracking/

License: Other

computer-vision deep-learning pytorch multiple-object-tracking transformers

transcenter's Introduction

TransCenter: Transformers with Dense Representations for Multiple-Object Tracking

This work has been accepted to TPAMI 2022.

An update towards a more efficient and powerful TransCenter: TransCenter-Lite!

The code for TransCenter and TransCenter-Lite is now available; you can find the code and pretrained models at https://gitlab.inria.fr/robotlearn/TransCenter_official.

TransCenter: Transformers with Dense Representations for Multiple-Object Tracking
Yihong Xu, Yutong Ban, Guillaume Delorme, Chuang Gan, Daniela Rus, Xavier Alameda-Pineda
[Paper] [Project]



MOT20 example:

Bibtex

If you find this code useful, please star the project and consider citing:

@misc{xu2021transcenter,
      title={TransCenter: Transformers with Dense Representations for Multiple-Object Tracking}, 
      author={Yihong Xu and Yutong Ban and Guillaume Delorme and Chuang Gan and Daniela Rus and Xavier Alameda-Pineda},
      year={2021},
      eprint={2103.15145},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

MOTChallenge Results

For TransCenter:

MOT17 public detections:

Pretrained | MOTA  | MOTP  | IDF1  | FP     | FN      | IDS
-----------|-------|-------|-------|--------|---------|------
CoCo       | 71.9% | 80.5% | 64.1% | 27,356 | 126,860 | 4,118
CH         | 75.9% | 81.2% | 65.9% | 30,190 | 100,999 | 4,626

MOT20 public detections:

Pretrained | MOTA  | MOTP  | IDF1  | FP     | FN      | IDS
-----------|-------|-------|-------|--------|---------|------
CoCo       | 67.7% | 79.8% | 58.9% | 54,967 | 108,376 | 3,707
CH         | 72.8% | 81.0% | 57.6% | 28,026 | 110,312 | 2,621

MOT17 private detections:

Pretrained | MOTA  | MOTP  | IDF1  | FP     | FN      | IDS
-----------|-------|-------|-------|--------|---------|------
CoCo       | 72.7% | 80.3% | 64.0% | 33,807 | 115,542 | 4,719
CH         | 76.2% | 81.1% | 65.5% | 40,101 | 88,827  | 5,394

MOT20 private detections:

Pretrained | MOTA  | MOTP  | IDF1  | FP     | FN      | IDS
-----------|-------|-------|-------|--------|---------|------
CoCo       | 67.7% | 79.8% | 58.7% | 56,435 | 107,163 | 3,759
CH         | 72.9% | 81.0% | 57.7% | 28,596 | 108,982 | 2,625

Note:

  • The results can be slightly different depending on the running environment.
  • We might keep updating the results in the near future.
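
For reference, the MOTA numbers above follow the standard CLEAR MOT definition, which combines the FP, FN and IDS counts listed in the tables. A minimal sketch, not code from this repository; num_gt (the total number of ground-truth boxes) is not reported in the tables:

# Minimal sketch, assuming the standard CLEAR MOT definition of MOTA.
# num_gt is the total number of ground-truth boxes over the benchmark.
def mota(fp: int, fn: int, ids: int, num_gt: int) -> float:
    return 1.0 - (fp + fn + ids) / num_gt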

Acknowledgement

The code for TransCenterV2 and TransCenter-Lite is modified from, and the network pre-trained weights are obtained from, the following repositories:

  1. The PVTv2 backbone pretrained models are from PVTv2.
  2. The data format conversion code is modified from CenterTrack.
  3. The code is also modified from CenterTrack, Deformable-DETR, and Tracktor.

@article{zhou2020tracking,
  title={Tracking Objects as Points},
  author={Zhou, Xingyi and Koltun, Vladlen and Kr{\"a}henb{\"u}hl, Philipp},
  journal={ECCV},
  year={2020}
}

@InProceedings{tracktor_2019_ICCV,
author = {Bergmann, Philipp and Meinhardt, Tim and Leal{-}Taix{\'{e}}, Laura},
title = {Tracking Without Bells and Whistles},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {October},
year = {2019}}

@article{zhu2020deformable,
  title={Deformable DETR: Deformable Transformers for End-to-End Object Detection},
  author={Zhu, Xizhou and Su, Weijie and Lu, Lewei and Li, Bin and Wang, Xiaogang and Dai, Jifeng},
  journal={arXiv preprint arXiv:2010.04159},
  year={2020}
}

@article{zhang2021bytetrack,
  title={ByteTrack: Multi-Object Tracking by Associating Every Detection Box},
  author={Zhang, Yifu and Sun, Peize and Jiang, Yi and Yu, Dongdong and Yuan, Zehuan and Luo, Ping and Liu, Wenyu and Wang, Xinggang},
  journal={arXiv preprint arXiv:2110.06864},
  year={2021}
}

@article{wang2021pvtv2,
  title={Pvtv2: Improved baselines with pyramid vision transformer},
  author={Wang, Wenhai and Xie, Enze and Li, Xiang and Fan, Deng-Ping and Song, Kaitao and Liang, Ding and Lu, Tong and Luo, Ping and Shao, Ling},
  journal={Computational Visual Media},
  volume={8},
  number={3},
  pages={1--10},
  year={2022},
  publisher={Springer}
}

Several modules are from:

MOT Metrics in Python: py-motmetrics

Soft-NMS: Soft-NMS

DETR: DETR

DCNv2: DCNv2

PVTv2: PVTv2

ByteTrack: ByteTrack

transcenter's People

Contributors: banyutong, ragondyn, yihongxu

transcenter's Issues

TransCenterv2 demo code please

Dear author:
Thanks for sharing this great work. Since many people are interested in your algorithm, the quickest way to get to know it is to git clone the repository and try demo.py. Would you kindly share it here? Thank you.

On reducing GPU memory usage

Thanks for open-sourcing the code.
Have the authors tried feeding lower-resolution input images, or other ideas for reducing GPU memory usage, and how did those experiments turn out?
(The current model still needs more than 12 GB of GPU memory even with batch_size set to 1.)

environment error

There are packages installed from PyPI, so when I use

conda create --name <env_name> --file requirements.txt

I get the following error:
PackagesNotFoundError: The following packages are not available from current channels:

Could you use the following commands to upload your environment instead?

conda env export > environment.yaml
conda env create -n <env_name> -f environment.yaml

ModuleNotFoundError: No module named 'motmetrics'

Hello, I followed option 2 to create an environment and converted the MOT20 dataset to COCO format.
Then I wanted to run tracking on the MOT20 dataset:
python tracking/transcenter/mot20_pub.py --data_dir=./MOT20
I got an error (screenshot omitted) and do not know how to deal with it.
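
A quick, hedged way to check whether this is simply a missing PyPI dependency (the py-motmetrics package, imported as motmetrics) rather than a path problem:

# Minimal check, not from the repository: see whether 'motmetrics' is importable
# in the active environment. If it is missing, installing py-motmetrics from PyPI
# (pip install motmetrics) should resolve the ModuleNotFoundError.
import importlib.util

spec = importlib.util.find_spec("motmetrics")
if spec is None:
    print("motmetrics is not installed in this environment")
else:
    print("motmetrics found at:", spec.origin)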

Version 1

Hi, thanks for sharing the code. Could you please release the previous version (V1)?

Thanks a lot!

The loss of the tracking decoder

Hello, thanks for your excellent work. I would like to know why the loss of the tracking decoder is hard to decrease when the model is applied to my custom small-object dataset.

Train it with multi-class?

I found that the model is trained only on the person category. Can we train it with multiple classes, such as person and car?

Question about motion model

Hi,
I noticed that the config file (detracker_reidV3.yaml) contains a parameter about a motion model. I see that it is imported into the parameters of each new track (line 132 in track.py) but I don't see where this parameter is actually used, for example when predicting the new position of a track before the linear matching.
I also noticed that in the tracks_dets_matching_tracking() function there is an update to the current track position, but the line "t.prev_pos = t.pos" is commented out (line 184 in tracker.py).

Are you using the velocity/motion information in some other part of the code or did you decide not to use this information at all?

Question about the pretrained models

Hi,
Are the models trained on the full train set or only on 75% (while the last 25% remained for validation)?
I'm asking because the class "GenericDataset_val" has a default value of train_ratio=0.5 which leads to a 75%-25% partition in the evaluation, but it is not mentioned in the pretrained models section of the readme.

Does any of the provided images work with V100 GPUs?

I am getting the following error in the deformable convolution package.

error in modulated_deformable_im2col_cuda: no kernel image is available for execution on the device
error in modulated_deformable_im2col_cuda: no kernel image is available for execution on the device
error in modulated_deformable_im2col_cuda: no kernel image is available for execution on the device
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/pdb.py", line 1699, in main
pdb._runscript(mainpyfile)
File "/opt/conda/lib/python3.7/pdb.py", line 1568, in _runscript
self.run(statement)
File "/opt/conda/lib/python3.7/bdb.py", line 578, in run
exec(cmd, globals, locals)
File "", line 1, in
File "/net/coxfs01/srv/export/coxfs01/pfister_lab2/share_root/Lab/abanerjee/VideoProjects/TransCenter_official/training/transcenter/main_mot17_tracking.py", line 32, in
import os
File "/net/coxfs01/srv/export/coxfs01/pfister_lab2/share_root/Lab/abanerjee/VideoProjects/TransCenter_official/training/transcenter/main_mot17_tracking.py", line 412, in main
model, criterion, data_loader_train, optimizer, device, epoch, args.clip_max_norm, adaptive_clip=args.adaptive_clip)
File "/net/coxfs01/srv/export/coxfs01/pfister_lab2/share_root/Lab/abanerjee/VideoProjects/TransCenter_official/training/transcenter/engine.py", line 65, in train_one_epoch
outputs = model(samples, pre_samples=pre_samples, pre_hm=pre_hm)
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "/net/coxfs01/srv/export/coxfs01/pfister_lab2/share_root/Lab/abanerjee/VideoProjects/TransCenter_official/training/transcenter/models/deformable_detr.py", line 287, in forward
hs[layer_lvl] = self.ida_up[0](hs[layer_lvl], 0, len(hs[layer_lvl]))[-1]
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "/net/coxfs01/srv/export/coxfs01/pfister_lab2/share_root/Lab/abanerjee/VideoProjects/TransCenter_official/training/transcenter/models/dla.py", line 98, in forward
layers[startp] = node(layers[startp])
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "/net/coxfs01/srv/export/coxfs01/pfister_lab2/share_root/Lab/abanerjee/VideoProjects/TransCenter_official/training/transcenter/models/dla.py", line 48, in forward
x = self.conv(x)
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "/opt/DCNv2/dcn_v2.py", line 128, in forward
self.deformable_groups)
File "/opt/DCNv2/dcn_v2.py", line 31, in forward
ctx.deformable_groups)
RuntimeError: cublas runtime error : the GPU program failed to execute at /opt/conda/conda-bld/pytorch_1591914880026/work/aten/src/THC/THCBlas.cu:335

I am using the pytorch1-5cuda10-1_RTX.sif image with a V100 GPU. Is there a way to make it work?

Thanks
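
A hedged diagnostic, not from the repository: "no kernel image is available" usually means the custom CUDA extension (DCNv2 here) was built without the GPU's compute capability (sm_70 for V100, whereas an image labelled RTX presumably targets sm_75). With a reasonably recent PyTorch (torch.cuda.get_arch_list() does not exist in 1.5), the mismatch for the core library can be checked as below; rebuilding DCNv2 on the V100 node, or with TORCH_CUDA_ARCH_LIST including 7.0, is a plausible fix.

import torch

# Hedged diagnostic sketch: compare the GPU's compute capability with the
# architectures the installed PyTorch binary was compiled for.
print("GPU:", torch.cuda.get_device_name(0))
print("Compute capability:", torch.cuda.get_device_capability(0))  # V100 reports (7, 0)
print("Compiled architectures:", torch.cuda.get_arch_list())       # e.g. ['sm_60', 'sm_70', 'sm_75']
# If 'sm_70' is absent, or if the separately compiled DCNv2 extension was built
# on a different GPU, rebuilding it with TORCH_CUDA_ARCH_LIST="7.0" is one option.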

RuntimeError: expected scalar type Float but found Half

Traceback (most recent call last):
File "/home/valca509/Desktop/TransCenter_official-main/training/main_mot20.py", line 511, in
main(args)
File "/home/valca509/Desktop/TransCenter_official-main/training/main_mot20.py", line 366, in main
model, criterion, postprocessors, data_loader_val, base_ds, device, args.output_dir, args.half
File "/home/valca509/anaconda3/envs/transcenter3/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
return func(*args, **kwargs)
File "/home/valca509/Desktop/TransCenter_official-main/training/engine.py", line 172, in evaluate
outputs = model(samples, pre_samples=pre_samples, pre_hm=targets['pre_cts'].clone())
File "/home/valca509/anaconda3/envs/transcenter3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/valca509/Desktop/TransCenter_official-main/training/deformable_detr.py", line 215, in forward
hs[layer_lvl] = self.ida_up(hs[layer_lvl], 0, len(hs[layer_lvl]))[-1]
File "/home/valca509/anaconda3/envs/transcenter3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/valca509/Desktop/TransCenter_official-main/models/dla.py", line 96, in forward
layers[i-1] = node(layers[i] + layers[i - 1])
File "/home/valca509/anaconda3/envs/transcenter3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/valca509/anaconda3/envs/transcenter3/lib/python3.7/site-packages/torch/cuda/amp/autocast_mode.py", line 141, in decorate_autocast
return func(*args, **kwargs)
File "/home/valca509/Desktop/TransCenter_official-main/models/dla.py", line 47, in forward
x_out = self.conv(x)
File "/home/valca509/anaconda3/envs/transcenter3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/valca509/Desktop/TransCenter_official-main/to_install/DCNv2_1.9/dcn_v2.py", line 172, in forward
self.deformable_groups,
File "/home/valca509/Desktop/TransCenter_official-main/to_install/DCNv2_1.9/dcn_v2.py", line 39, in forward
ctx.deformable_groups,
RuntimeError: expected scalar type Float but found Half
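
A hedged note, not the repository's official fix: custom CUDA ops such as DCNv2 typically do not provide half-precision kernels, so when mixed-precision evaluation (args.half in the traceback, i.e. autocast) feeds them Half tensors, this error appears. One common workaround is to run the deformable-convolution call outside autocast and cast its inputs back to float32; the names below are illustrative, not the repository's actual module layout. Alternatively, simply running without the half-precision option avoids the issue.

import torch

# Hedged workaround sketch: force a DCNv2 call to run in float32 even when the
# surrounding model runs under torch.cuda.amp.autocast.
def dcn_forward_fp32(dcn_module, x):
    with torch.cuda.amp.autocast(enabled=False):
        return dcn_module(x.float())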
