
CenterNet's Introduction

Objects as Points

Object detection, 3D detection, and pose estimation using center point detection:

Objects as Points,
Xingyi Zhou, Dequan Wang, Philipp Krähenbühl,
arXiv technical report (arXiv 1904.07850)

Contact: [email protected]. Any questions or discussions are welcome!

Updates

  • (June, 2020) We released a state-of-the-art Lidar-based 3D detection and tracking framework CenterPoint.
  • (April, 2020) We released a state-of-the-art (multi-category-/ pose-/ 3d-) tracking extension CenterTrack.

Abstract

Detection identifies objects as axis-aligned boxes in an image. Most successful object detectors enumerate a nearly exhaustive list of potential object locations and classify each. This is wasteful, inefficient, and requires additional post-processing. In this paper, we take a different approach. We model an object as a single point -- the center point of its bounding box. Our detector uses keypoint estimation to find center points and regresses to all other object properties, such as size, 3D location, orientation, and even pose. Our center point based approach, CenterNet, is end-to-end differentiable, simpler, faster, and more accurate than corresponding bounding box based detectors. CenterNet achieves the best speed-accuracy trade-off on the MS COCO dataset, with 28.1% AP at 142 FPS, 37.4% AP at 52 FPS, and 45.1% AP with multi-scale testing at 1.4 FPS. We use the same approach to estimate 3D bounding box in the KITTI benchmark and human pose on the COCO keypoint dataset. Our method performs competitively with sophisticated multi-stage methods and runs in real-time.

Highlights

  • Simple: One-sentence method summary: use keypoint detection to find the bounding box center point and regress to all other object properties, such as bounding box size, 3D information, and pose.

  • Versatile: The same framework works for object detection, 3d bounding box estimation, and multi-person pose estimation with minor modification.

  • Fast: The whole pipeline runs in a single network feedforward pass. No NMS post-processing is needed. Our DLA-34 model runs at 52 FPS with 37.4 COCO AP.

  • Strong: Our best single model achieves 45.1 AP on COCO test-dev.

  • Easy to use: We provide a user-friendly testing API and webcam demos.

Main results

Object Detection on COCO validation

Backbone      | AP / FPS   | Flip AP / FPS | Multi-scale AP / FPS
Hourglass-104 | 40.3 / 14  | 42.2 / 7.8    | 45.1 / 1.4
DLA-34        | 37.4 / 52  | 39.2 / 28     | 41.7 / 4
ResNet-101    | 34.6 / 45  | 36.2 / 25     | 39.3 / 4
ResNet-18     | 28.1 / 142 | 30.0 / 71     | 33.2 / 12

Keypoint detection on COCO validation

Backbone      | AP   | FPS
Hourglass-104 | 64.0 | 6.6
DLA-34        | 58.9 | 23

3D bounding box detection on KITTI validation

Backbone | FPS | AP-E | AP-M | AP-H | AOS-E | AOS-M | AOS-H | BEV-E | BEV-M | BEV-H
DLA-34   | 32  | 96.9 | 87.8 | 79.2 | 93.9  | 84.3  | 75.7  | 34.0  | 30.5  | 26.8

All models and details are available in our Model zoo.

Installation

Please refer to INSTALL.md for installation instructions.

Use CenterNet

We support demos for images, image folders, videos, and webcam.

First, download the models (By default, ctdet_coco_dla_2x for detection and multi_pose_dla_3x for human pose estimation) from the Model zoo and put them in CenterNet_ROOT/models/.

For object detection on images/ video, run:

python demo.py ctdet --demo /path/to/image/or/folder/or/video --load_model ../models/ctdet_coco_dla_2x.pth

We provide example images in CenterNet_ROOT/images/ (from Detectron). If set up correctly, the output should look like

For webcam demo, run

python demo.py ctdet --demo webcam --load_model ../models/ctdet_coco_dla_2x.pth

Similarly, for human pose estimation, run:

python demo.py multi_pose --demo /path/to/image/or/folder/or/video/or/webcam --load_model ../models/multi_pose_dla_3x.pth

The result for the example images should look like:

You can add --debug 2 to visualize the heatmap outputs. You can add --flip_test for flip test.

To use this CenterNet in your own project, you can

import sys
CENTERNET_PATH = '/path/to/CenterNet/src/lib/'
sys.path.insert(0, CENTERNET_PATH)

from detectors.detector_factory import detector_factory
from opts import opts

MODEL_PATH = '/path/to/model'
TASK = 'ctdet'  # or 'multi_pose' for human pose estimation
opt = opts().init('{} --load_model {}'.format(TASK, MODEL_PATH).split(' '))
detector = detector_factory[opt.task](opt)

img = '/path/to/your/image'  # an image array or a path to an image file
ret = detector.run(img)['results']

ret will be a python dict: {category_id : [[x1, y1, x2, y2, score], ...], }
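
As a usage note, here is a minimal sketch of one way to filter these results for the ctdet task; the 0.3 score threshold simply mirrors the default --vis_thresh seen elsewhere on this page, and the dict layout is the {category_id : [[x1, y1, x2, y2, score], ...]} format described above:

# Hedged sketch: keep only confident ctdet detections from `ret`.
VIS_THRESH = 0.3  # same value as the default --vis_thresh

detections = []
for category_id, boxes in ret.items():
    for x1, y1, x2, y2, score in boxes:
        if score >= VIS_THRESH:
            detections.append({'category_id': category_id,
                               'bbox': [x1, y1, x2, y2],
                               'score': score})
print('kept {} detections'.format(len(detections)))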

Benchmark Evaluation and Training

After installation, follow the instructions in DATA.md to setup the datasets. Then check GETTING_STARTED.md to reproduce the results in the paper. We provide scripts for all the experiments in the experiments folder.

Develop

If you are interested in training CenterNet in a new dataset, use CenterNet in a new task, or use a new network architecture for CenterNet, please refer to DEVELOP.md. Also feel free to send us emails for discussions or suggestions.

Third-party resources

License

CenterNet itself is released under the MIT License (refer to the LICENSE file for details). Portions of the code are borrowed from human-pose-estimation.pytorch (image transform, resnet), CornerNet (hourglassnet, loss functions), dla (DLA network), DCNv2(deformable convolutions), tf-faster-rcnn(Pascal VOC evaluation) and kitti_eval (KITTI dataset evaluation). Please refer to the original License of these projects (See NOTICE).

Citation

If you find this project useful for your research, please use the following BibTeX entry.

@inproceedings{zhou2019objects,
  title={Objects as Points},
  author={Zhou, Xingyi and Wang, Dequan and Kr{\"a}henb{\"u}hl, Philipp},
  booktitle={arXiv preprint arXiv:1904.07850},
  year={2019}
}

CenterNet's People

Contributors

cclauss, chengzhengxin, dannykuo25, dequanwang, gcaya, jscsmk, lbin, pjunhyuk, ridhwanluthra, sirykd, sundrops, xingyizhou, xuannianz

CenterNet's Issues

Undefined name 'dla' in dlav0.py

flake8 testing of https://github.com/xingyizhou/CenterNet on Python 3.7.1

$ flake8 . --count --select=E9,F63,F72,F82 --show-source --statistics

./src/lib/models/networks/dlav0.py:417:5: F821 undefined name 'dla'
    dla.BatchNorm = bn
    ^
1     F821 undefined name 'dla'
1

E901,E999,F821,F822,F823 are the "showstopper" flake8 issues that can halt the runtime with a SyntaxError, NameError, etc. These 5 are different from most other flake8 issues, which are merely "style violations" -- useful for readability, but they do not affect runtime safety.

  • F821: undefined name name
  • F822: undefined name name in __all__
  • F823: local variable name referenced before assignment
  • E901: SyntaxError or IndentationError
  • E999: SyntaxError -- failed to compile a file into an Abstract Syntax Tree

Possible BUG when evaluating PASCAL format mAP?

Following the experiments script, I ran test.py for the pretrained model pascal_resdcn18_384; the evaluation result is shown in the figure:

image

python3 test.py ctdet --exp_id pascal_resdcn18_384 --arch resdcn_18 --dataset pascal --load_model ~/data/models/pytorch/CenterNet/ctdet_pascal_resdcn18_384.pth

where "ctdet_pascal_resdcn18_384.pth" is provided in ModelZoo.

Does anyone else have the same issue? Is this a bug, or is my configuration incorrect (following DATA.md for PASCAL VOC)?

failed in inference on non-default gpu

I also opened this issue on the DCNv2 project:
CharlesShang/DCNv2#16

The CenterNet project uses this DCNv2 (PyTorch 0.4 version).
I compiled the PyTorch 1.0 version of DCNv2 and used it in the CenterNet project; it also works fine.
Things go well as long as I only run it on GPU 0,
but a strange bug happens when I try to run model inference on a different GPU.
Say I wrote code like this:
...
torch.cuda.synchronize()
model = centernet_model.to(1)  # not on gpu 0
x = x.to(1)
y = model(x)
torch.cuda.synchronize()
...
First, there is an error saying "illegal memory access" at torch.cuda.synchronize().
If I remove all the synchronize calls, the code runs as usual for about 10 images,
then it suddenly hits a CUDA runtime error:
"GPU program failed to execute at /pytorch/aten/src/THC/THCBlas.cu:416"

Since the author of CenterNet developed the project in a PyTorch 0.4 environment, I switched to another machine, set up a full PyTorch 0.4 environment, and compiled the PyTorch 0.4 version of DCNv2.
Still, everything works fine except inference on a non-default GPU,
but this time the error message is "argument not on same gpu",
raised from dcn_v2_cuda.c:20.

This problem has really driven me mad, and I don't know where the bug is. Is it here (DCNv2), or in the CenterNet project?
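
A workaround that is often suggested for DCNv2 on a non-default GPU (an assumption, not an official fix from this repo) is to make that GPU the current CUDA device before building the model, or to hide the other GPUs with CUDA_VISIBLE_DEVICES:

import torch

torch.cuda.set_device(1)             # make GPU 1 the current CUDA device before anything else
device = torch.device('cuda', 1)

model = centernet_model.to(device)   # `centernet_model` and `x` as in the snippet above
x = x.to(device)
with torch.no_grad():
    y = model(x)

Alternatively, launching with CUDA_VISIBLE_DEVICES=1 makes the desired card appear as device 0 to the process.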

Very poor performance using provided pretrained model(KITTI, 3DOP)

Thanks for your nice work. I download your pretrained model for 3DOP split and follow the instruction in GETTING_STARTED.md. Here is the kitti evaluation result:

car_detection AP: 96.919403 87.846443 79.191116
car_orientation AP: 93.916435 84.285149 75.674400
pedestrian_detection AP: 69.930481 60.852219 52.161320
pedestrian_orientation AP: 52.551113 45.630672 39.154442
cyclist_detection AP: 73.306274 48.693352 41.659622
cyclist_orientation AP: 63.731075 41.754368 36.320240
car_detection_ground AP: 3.543346 4.392704 3.567544
pedestrian_detection_ground AP: 5.368695 5.038685 4.873090
cyclist_detection_ground AP: 4.042099 3.308401 3.359062
car_detection_3d AP: 0.890253 1.148476 1.380574
pedestrian_detection_3d AP: 4.984490 4.746258 3.636364
cyclist_detection_3d AP: 3.657130 3.113063 3.147585

Compared with your paper, the 2D detection and AOS result of car category is fairly normal, while the result of 3D detection and BEV detection is extremely low. I wonder if I made any mistake.

How to generate the image dir in kitti?

Thanks for your work! I am a bit confused about how to use convert_kitti_to_coco.py to generate the images dir in kitti.

Here are my dirs in kitti.

heyuan@lambda-quad:~/Research/CenterNet/data/kitti$ tree -L 2
.
├── ImageSets_3dop
│   ├── test.txt
│   ├── train.txt
│   ├── trainval.txt
│   └── val.txt
├── ImageSets_subcnn
│   ├── test.txt
│   ├── train.txt
│   ├── trainval.txt
│   └── val.txt
└── training
    ├── calib
    ├── image_2
    └── label_2

When I tried to run python convert_kitti_to_coco.py. It returned an error.

(CenterNet) heyuan@lambda-quad:~/Research/CenterNet/src/tools$ python convert_kitti_to_coco.py 
# images:  3712
# annotations:  25099
Traceback (most recent call last):
  File "convert_kitti_to_coco.py", line 151, in <module>
    json.dump(ret, open(out_path, 'w'))
FileNotFoundError: [Errno 2] No such file or directory: '../../data/kitti//annotations/kitti_3dop_train.json'

Then, I created annotations manually, and run the command again.

heyuan@lambda-quad:~/Research/CenterNet/data/kitti$ tree -L 2
.
├── annotations
│   ├── kitti_3dop_train.json
│   ├── kitti_3dop_val.json
│   ├── kitti_subcnn_train.json
│   └── kitti_subcnn_val.json
├── ImageSets_3dop
│   ├── test.txt
│   ├── train.txt
│   ├── trainval.txt
│   └── val.txt
├── ImageSets_subcnn
│   ├── test.txt
│   ├── train.txt
│   ├── trainval.txt
│   └── val.txt
└── training
    ├── calib
    ├── image_2
    └── label_2

Now the files in annotations are generated, but I still cannot find the images dir described in DATA.md.

By the way, when I run the command with DEBUG = True, it returns an error because there is no image in the image path.

(CenterNet) heyuan@lambda-quad:~/Research/CenterNet/src/tools$ python convert_kitti_to_coco.py 
pt_3d [[1.8127272]
 [1.4814587]
 [8.405019 ]]
location [1.84, 1.47, 8.41]
Traceback (most recent call last):
  File "convert_kitti_to_coco.py", line 143, in <module>
    cv2.imshow('image', image)
cv2.error: OpenCV(4.1.0) /io/opencv/modules/highgui/src/window.cpp:352: error: (-215:Assertion failed) size.width>0 && size.height>0 in function 'imshow'

top K

Why do you use top-K instead of a fixed threshold for the output? Wouldn't it cause large errors if there were only a few, or very many, targets in the image?
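
For context, here is a rough sketch of what top-K selection over a center heatmap looks like (this illustrates the general idea rather than the exact decoder in this repo); a fixed score threshold can still be applied to the K candidates afterwards, so the two are not mutually exclusive:

import torch

def topk_centers(heatmap, K=100):
    # heatmap: [C, H, W] class-wise center scores after the sigmoid
    C, H, W = heatmap.shape
    scores, flat_inds = torch.topk(heatmap.view(-1), K)  # global top-K over classes and positions
    classes = flat_inds // (H * W)
    ys = (flat_inds % (H * W)) // W
    xs = flat_inds % W
    return scores, classes, ys, xs

scores, classes, ys, xs = topk_centers(torch.rand(80, 128, 128), K=100)
keep = scores > 0.3          # an optional threshold on top of the K candidates
print(int(keep.sum()), 'candidates kept')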

error when run demo.py

When I run demo.py, I get this error:
RuntimeError: cublas runtime error : the GPU program failed to execute at /pytorch/aten/src/THC/THCBlas.cu:416

How can I solve it, thank you!

How can I view the network structure?

This is not a problem with the project itself; it is something I ran into while studying it. I don't know whether others have encountered it: a model I saved with torch.save does not show a structured view when opened in Netron, and ctdet_coco_dla_1x.pth itself only contains weights, so its structure cannot be inspected either. Is there a good way to view the network structure?

run demo error: Cannot move to target thread

QObject::moveToThread: Current thread (0x6ac6ba0) is not the object's thread (0x6c66600).
Cannot move to target thread (0x6ac6ba0)

All steps are the same as in INSTALL.md; the error occurs when executing ret = detector.run(image_name) in demo.py.
Thanks!

bug? missing argument in opts.py

The call to opts.init() using the README example fails with an error message describing the argument parser's arguments. It seems that if we change line 357 of opts.py to
opt = self.parse(args)
then the model is created successfully.
Not sure why, but this is what worked for me.

pytorch0.41 + cuda9.0 + cudnn7.3 + win10: failed to compile DCNv2

pytorch0.41+cuda9.0+cudnn7.3+win10,failed complied DCNv2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "build_double.py", line 43, in <module>
    ffi.build()
  File "D:\Anaconda3\envs\CenterNet\lib\site-packages\torch\utils\ffi\__init__.py", line 189, in build
    _build_extension(ffi, cffi_wrapper_name, target_dir, verbose)
  File "D:\Anaconda3\envs\CenterNet\lib\site-packages\torch\utils\ffi\__init__.py", line 111, in _build_extension
    outfile = ffi.compile(tmpdir=tmpdir, verbose=verbose, target=libname)
  File "D:\Anaconda3\envs\CenterNet\lib\site-packages\cffi\api.py", line 723, in compile
    compiler_verbose=verbose, debug=debug, **kwds)
  File "D:\Anaconda3\envs\CenterNet\lib\site-packages\cffi\recompiler.py", line 1526, in recompile
    compiler_verbose, debug)
  File "D:\Anaconda3\envs\CenterNet\lib\site-packages\cffi\ffiplatform.py", line 22, in compile
    outputfilename = _build(tmpdir, ext, compiler_verbose, debug)
  File "D:\Anaconda3\envs\CenterNet\lib\site-packages\cffi\ffiplatform.py", line 58, in _build
    raise VerificationError('%s: %s' % (e.__class__.__name__, e))
cffi.VerificationError: LinkError: command 'D:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\link.exe' failed with exit status 1120

c:\program files\nvidia gpu computing toolkit\cuda\v9.0\include\crt\math_functions.h: warning C4819: The file contains a character that cannot be represented in the current code page (936). Save the file in Unicode format to prevent data loss

Training on our data

Hi,
I did not understand from the documentation: does your code read the data from txt files or from json? Can I put my data (images and txt files) in the directory of the COCO data and run the training?
My data has, for each object, (class_id xc yc w h) normalized, as darknet uses for COCO. How do I convert it to your format?

Best.
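
The training code reads COCO-style JSON annotations (see DATA.md), so darknet-style txt labels have to be converted first. Below is a rough, hedged sketch of such a conversion; the directory layout, the +1 category-id offset, and the helper name are illustrative assumptions, not part of this repo:

import json, os
from PIL import Image

def darknet_to_coco(img_dir, label_dir, class_names, out_json):
    # Convert darknet labels (class_id xc yc w h, all normalized) into a COCO-style json.
    images, annotations = [], []
    ann_id = 1
    for img_id, fname in enumerate(sorted(os.listdir(img_dir)), 1):
        if not fname.lower().endswith(('.jpg', '.png')):
            continue
        W, H = Image.open(os.path.join(img_dir, fname)).size
        images.append({'id': img_id, 'file_name': fname, 'width': W, 'height': H})
        txt = os.path.join(label_dir, os.path.splitext(fname)[0] + '.txt')
        if not os.path.exists(txt):
            continue
        for line in open(txt):
            cls, xc, yc, w, h = line.split()
            w, h = float(w) * W, float(h) * H
            x, y = float(xc) * W - w / 2, float(yc) * H - h / 2  # COCO bbox is [x, y, w, h], top-left based
            annotations.append({'id': ann_id, 'image_id': img_id, 'category_id': int(cls) + 1,
                                'bbox': [x, y, w, h], 'area': w * h, 'iscrowd': 0})
            ann_id += 1
    categories = [{'id': i + 1, 'name': n} for i, n in enumerate(class_names)]
    json.dump({'images': images, 'annotations': annotations, 'categories': categories}, open(out_json, 'w'))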

Could you show your training log?

I tried the experiment CTDET-DLA-PASCAL; here is my training loss log:
image

The validation loss (computed every five epochs) does not converge as the epochs go on:
image

I did not change the BN to CPU.
Can you show your training log?
Thanks a lot.
@xingyizhou

Is the get_dataset function complete?

In dataset_factory.py, I found that get_dataset looks like this:

def get_dataset(dataset, task):
  class Dataset(dataset_factory[dataset], _sample_factory[task]):
    pass
  return Dataset

Is this right? When I run the code, I get this error:

Traceback (most recent call last):
  File "main.py", line 103, in <module>
    main(opt)
  File "main.py", line 23, in main
    Dataset = get_dataset(opt.dataset, opt.task)
TypeError: get_dataset() missing 1 required positional argument: 'dataset_dir'

undefined symbol: _ZN2at19UndefinedTensorImpl10_singletonEv

Traceback (most recent call last):
  File "demo.py", line 11, in <module>
    from detectors.detector_factory import detector_factory
  File "/app/ljn_work/CenterNet/src/lib/detectors/detector_factory.py", line 5, in <module>
    from .exdet import ExdetDetector
  File "/app/ljn_work/CenterNet/src/lib/detectors/exdet.py", line 23, in <module>
    from .base_detector import BaseDetector
  File "/app/ljn_work/CenterNet/src/lib/detectors/base_detector.py", line 11, in <module>
    from models.model import create_model, load_model
  File "/app/ljn_work/CenterNet/src/lib/models/model.py", line 12, in <module>
    from .networks.pose_dla_dcn import get_pose_net as get_dla_dcn
  File "/app/ljn_work/CenterNet/src/lib/models/networks/pose_dla_dcn.py", line 16, in <module>
    from .DCNv2.dcn_v2 import DCN
  File "/app/ljn_work/CenterNet/src/lib/models/networks/DCNv2/dcn_v2.py", line 13, in <module>
    import _ext as _backend
ImportError: /app/ljn_work/CenterNet/src/lib/models/networks/DCNv2/_ext.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN2at19UndefinedTensorImpl10_singletonEv

Extracting meaningful pose information from model output

Is there a straightforward way to parse the output from the pose model? For example, if I wanted just the coordinates of all of the detected foot keypoints, how would I obtain those from looking at 'ret' with the following line:

ret = centernet_detector.run(img)['results']

Where centernet_detector was initialized with the following model:

MODEL_PATH = "../models/multi_pose_dla_3x.pth"
TASK = 'multi_pose' 
opt = opts().init('{} --load_model {}'.format(TASK, MODEL_PATH).split(' '))
centernet_detector = detector_factory[opt.task](opt)

Not really an issue - just trying to understand if there's an easy way to parse the output that I've overlooked because at the moment, ret is returned as a massive dictionary with 1 key value that looks like this (and this is just a small chunk of it):

{1: [[431.6806335449219, 374.5654296875, 798.3137817382812, 806.46533203125, 0.9136448502540588, 596.3617553710938, 451.5331115722656, 574.0827026367188, 441.81072998046875, 598.2064819335938, 444.61395263671875, 521.5050048828125, 435.496337890625, 592.1182250976562, 452.6651306152344, 465.6389465332031, 520.5789184570312, 612.9014892578125, 522.172607421875, 444.38726806640625, 601.4559936523438, 692.7471923828125, 614.8954467773438, 520.7488403320312, 474.4514465332031, 700.9055786132812, 577.1103515625, 532.8385009765625, 736.5150146484375, 634.1228637695312, 736.1271362304688, 623.9586181640625, 757.8681640625, 748.8753051757812, 693.3152465820312, 627.692626953125, 819.211669921875, 763.8153686523438, 810.7884521484375], [1150.5811767578125, 243.87657165527344, 1309.4493408203125, 445.14276123046875, 0.8061871528625488, 1273.1363525390625, 298.8131408691406, 1272.9134521484375, 288.4426574707031, 1268.0367431640625, 295.4898376464844, 1242.2972412109375, 281.63641357421875, 1233.7552490234375, 296.6488952636719, 1197.0325927734375, 321.2576599121094, 1218.05224609375, 341.2820739746094, 1256.329833984375, 355.447998046875, 1230.5333251953125, 421.34320068359375, 1274.132080078125, 307.4358825683594, 1275.4371337890625, 431.8289489746094, 1185.95556640625, 444.4655456542969, 1200.3651123046875, 460.19976806640625, 1249.952880859375, 467.85028076171875, 1283.1112060546875, 476.4889831542969, 1236.3939208984375, 513.936767578125, 1261.282958984375, 516.7928466796875], ... ]]}

In addition to this, where is the code that outputs the skeleton drawing for the demo? I presume the answer to my initial question may lie there, but I can't find that either.

Thanks!
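
For reference, a minimal sketch of one way such a multi_pose result might be parsed. It assumes each row is [x1, y1, x2, y2, score] followed by 17 (x, y) keypoints in the standard COCO order (so indices 15 and 16 would be the left and right ankles); that layout is inferred from the 39-value rows above rather than from documented behavior, so please verify it against the drawing code:

import numpy as np

VIS_THRESH = 0.3                     # assumed score threshold, matching the default --vis_thresh
LEFT_ANKLE, RIGHT_ANKLE = 15, 16     # standard COCO keypoint indices (assumption)

def extract_ankles(ret):
    # Returns [(left_xy, right_xy), ...] for every confident person detection.
    ankles = []
    for det in ret[1]:               # category 1 == person for multi_pose
        det = np.asarray(det)
        score = det[4]
        if score < VIS_THRESH:
            continue
        kps = det[5:39].reshape(17, 2)
        ankles.append((tuple(kps[LEFT_ANKLE]), tuple(kps[RIGHT_ANKLE])))
    return ankles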

Error when trying to run demo with custom model

I was able to train a new pose model by following the instructions in DEVELOP.md. The command I used to train was the following (which I modeled off of experiments/multi_pose_hg_3x.sh):

# train
python main.py multi_pose --exp_id hg_3x --dataset mod_coco_hp --arch hourglass --batch_size 12 --master_batch 2 --lr 2.5e-4 --load_model /home/shared/Projects/CenterNet/models/ctdet_coco_hg.pth --gpus 0,1 --num_epochs 150 --lr_step 130

The training finished successfully (on 2 v100s). However, when I try to load my new model (on a 1080ti) with the webcam demo script, I get a series of errors and the model doesn't output anything. The windows showing the webcam feed appear, but there is no pose detection.

For reference, here is what I use to run the demo using a pre-trained pose model from the model-zoo successfully on a webcam:

python demo.py multi_pose --demo webcam --load_model ../models/multi_pose_dla_3x.pth

Here is the command I use to run the demo script for my custom model:

python demo.py multi_pose --demo webcam --load_model ../models/model_130.pth

Here is some relevant error output (it appears the problem is with pytorch):

training chunk_sizes: [32]
The output will be saved to  /home/shared/Projects/CenterNet/src/lib/../../exp/multi_pose/default
heads {'hm': 1, 'wh': 2, 'hps': 34, 'reg': 2, 'hm_hp': 17, 'hp_offset': 2}
Creating model...
loaded ../models/model_130.pth, epoch 130
Drop parameter pre.0.conv.weight.
Drop parameter pre.0.bn.weight.
Drop parameter pre.0.bn.bias.
Drop parameter pre.0.bn.running_mean.
Drop parameter pre.0.bn.running_var.
...
No param base.base_layer.0.weight.
No param base.base_layer.1.weight.
No param base.base_layer.1.bias.
No param base.base_layer.1.running_mean.
No param base.base_layer.1.running_var.
No param base.base_layer.1.num_batches_tracked.
No param base.level0.0.weight.
No param base.level0.1.weight.
No param base.level0.1.bias.
No param base.level0.1.running_mean.
No param base.level0.1.running_var.
No param base.level0.1.num_batches_tracked.
No param base.level1.0.weight.
No param base.level1.1.weight.
...

There are about ~1700 lines that say "Drop parameter..." or "No param ..." that get printed to the terminal before I see the video feeds from the webcam. The multi-pose window then doesn't draw any of the poses on the people in the frame like I would expect it to.

If there is something I might have done wrong in the training step or in running the demo, please let me know. I was fairly optimistic in this working and have hit a wall for the time being.

System settings (Let me know if any other info is needed): CUDA 10.1, torch 1.0.1.post2, Ubuntu 18.04.2
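
One thing worth checking (an assumption based on the "Drop parameter"/"No param" messages, which look like an architecture mismatch between the hourglass weights that were trained and the default DLA-34 network that demo.py builds): pass the same --arch used at training time when running the demo, e.g.

python demo.py multi_pose --arch hourglass --demo webcam --load_model ../models/model_130.pth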

run demo error

CenterNet runs on ubuntu16.04 + pytorch0.4.1 + python3.5 + cuda8.0.
I have compiled with ./make.sh.
When I run demo.py with
python demo.py ctdet --demo ../images/ --load_model ../models/ctdet_coco_dla_2x.pth
I get the following error:
Fix size testing.
training chunk_sizes: [32]
The output will be saved to /home/zkl/Project/Python/rev2019/CenterNet/src/lib/../../exp/ctdet/default
heads {'hm': 80, 'reg': 2, 'wh': 2}
Creating model...
Traceback (most recent call last):
  File "/home/zkl/Project/Python/rev2019/CenterNet/src/demo.py", line 56, in <module>
    demo(opt)
  File "/home/zkl/Project/Python/rev2019/CenterNet/src/demo.py", line 21, in demo
    detector = Detector(opt)
  File "/home/zkl/Project/Python/rev2019/CenterNet/src/lib/detectors/ctdet.py", line 22, in __init__
    super(CtdetDetector, self).__init__(opt)
  File "/home/zkl/Project/Python/rev2019/CenterNet/src/lib/detectors/base_detector.py", line 25, in __init__
    self.model = load_model(self.model, opt.load_model)
  File "/home/zkl/Project/Python/rev2019/CenterNet/src/lib/models/model.py", line 34, in load_model
    checkpoint = torch.load(model_path, map_location=lambda storage, loc: storage)
  File "/home/zkl/venv/venv_python3.5_pytorch/lib/python3.5/site-packages/torch/serialization.py", line 358, in load
    return _load(f, map_location, pickle_module)
  File "/home/zkl/venv/venv_python3.5_pytorch/lib/python3.5/site-packages/torch/serialization.py", line 532, in _load
    magic_number = pickle_module.load(f)
EOFError: Ran out of input

Process finished with exit code 1

How can I solve this problem?

Experiments folder

Hi,

Thanks for the excellent paper and the code.

It seems you omit to upload the experiments folder indicated in readme. Could you upload it?

As a minor correction, the multi_hp argument in this section of the readme should be corrected to multi_pose.

Best,

Focal loss in the paper

Hello, I'm not sure whether this is the correct place to ask but I was trying to implement your paper in Julia and I'm stuck at the focal loss function you use for the detection of objects.

If I compare the focal loss function in your paper to the one in the CornerNet paper, yours lacks a minus sign; is this intentional?

Say this is a target heatmap, the black pixels are 0 and white is 1.
image
In my code it's often the case that no pixel has a value of exactly 1, since the center of an object often falls between pixels. Does this implementation cope with that somehow?

Also, when training my implementation, the predicted heatmap quickly starts returning only zeros, and that way the loss reaches 0:
image

I'm sure I'm missing something and I'd love to hear back from you.
Thanks
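
For reference, the paper's penalty-reduced focal loss can be sketched as below (a sketch assuming the usual alpha = 2, beta = 4; please double-check against the released code). The leading minus sign is applied to the whole sum, the reference implementation quantizes each center to an integer pixel so every object produces at least one target pixel that is exactly 1, and pixels near a center are down-weighted by (1 - Y)^beta rather than treated as ordinary negatives:

import torch

def center_focal_loss(pred, target, alpha=2, beta=4, eps=1e-6):
    # pred:   predicted heatmap in (0, 1), shape [B, C, H, W]
    # target: Gaussian ground-truth heatmap in [0, 1], same shape; peaks are exactly 1
    pred = pred.clamp(eps, 1 - eps)
    pos = target.eq(1).float()                 # pixels exactly at an object center
    neg = 1 - pos

    pos_loss = pos * (1 - pred) ** alpha * torch.log(pred)
    neg_loss = neg * (1 - target) ** beta * pred ** alpha * torch.log(1 - pred)

    num_pos = pos.sum().clamp(min=1)           # N = number of center points
    return -(pos_loss.sum() + neg_loss.sum()) / num_pos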

Training from nuScenes data advise

Hi, I have managed to train CenterNet 3D on KITTI; it is pretty nice at predicting 3D boxes using KITTI-format labels. I want to train it on nuScenes now and would like some advice:

  1. nuScenes has more classes than KITTI; for example, pedestrians are split into more detailed classes (people in wheelchairs are labeled, and trash cans and traffic cones are elaborately labeled too). Should I merge them into single classes when training? Can so many classes be trained on CenterNet? (In other words, how do I choose reasonable classes to get good performance?)

  2. About the training input and output data: the input is images of course, but how should the output data be prepared? (In KITTI it is the object x, y, z, hwl, and r_y in 3D in the camera frame, with the y axis along the height direction, but nuScenes does not seem to be labeled like this. Will that affect the final result?)

Thanks in advance.

use model 'ctdet_coco_resdcn18.pth' error

Thanks for your work.

I ran python demo.py ctdet --load_model ../models/ctdet_coco_dla_2x.pth --demo ../images successfully, but when I change the pre-trained model to ctdet_coco_resdcn18.pth, I get messages such as No param reg_r.0.bias and no correct result is shown in the picture.

How should I solve this? Thanks~
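
A likely cause (an assumption, based on the --arch flag used elsewhere on this page): demo.py builds the default DLA-34 network unless told otherwise, so the ResNet-18-DCN weights do not match and get dropped. Passing the architecture explicitly may help:

python demo.py ctdet --arch resdcn_18 --load_model ../models/ctdet_coco_resdcn18.pth --demo ../images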

how to specify the training params lr and lr_step

I modified the code to train on my own detection data (280 classes, 80,000 images) and ran the command below.

['main.py', 'ctdet', '--exp_id', 'pascal_dla_512', '--dataset', 'pascal', '--input_res', '512', '--num_epochs', '230', '--batch_size', '47', '--master_batch', '7', '--lr', '5e-4', '--lr_step', '180,210', '--gpus', '0,1,2,3,4,5', '--num_workers', '12']

But I cannot find good values for the params lr and lr_step.
The program has now run for 7 days (K80, slow):

81117734 May  3 23:07 model_best.pth
243113237 May  6 10:44 model_last.pth

The program runs under nohup, so I cannot see the loss and progress.
Only by loading model_best can I see that the best model is from the 75th epoch;
it has not updated for two days, and it does not seem to get much further than about 50 more epochs.

  1. Is there a way to see the training progress and the loss?
  2. How should I specify the params lr and lr_step? Is there a way to make them adapt automatically during training, for example decreasing the lr when the loss increases three times in a row? (See the sketch below.)
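
On point 1, the "The output will be saved to .../exp/..." lines elsewhere on this page suggest that logs land under the experiment directory, which can be inspected while training runs under nohup. On point 2, one generic PyTorch option (a hedged sketch, not something wired into this repo's trainer) is to let the learning rate drop automatically when the monitored loss stops improving:

import torch

model = torch.nn.Linear(10, 2)                       # stand-in for the detection model
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode='min', factor=0.1, patience=3)   # drop lr 10x after 3 stagnant epochs

for epoch in range(230):
    val_loss = 1.0 / (epoch + 1)                     # placeholder for the real validation loss
    scheduler.step(val_loss)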

DCNv2 cannot be built on pytorch 1.0

    from ._ext import dcn_v2 as _backend
ModuleNotFoundError: No module named 'models.networks.DCNv2._ext'

the DCNv2 can not be built on pytorch 1.0.

Traceback (most recent call last):
  File "build.py", line 3, in <module>
    from torch.utils.ffi import create_extension
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/ffi/__init__.py", line 1, in <module>
    raise ImportError("torch.utils.ffi is deprecated. Please use cpp extensions instead.")
ImportError: torch.utils.ffi is deprecated. Please use cpp extensions instead.

Any fix on this?

Bug in training?

Training the network sometimes reports a strange issue. However, if I run the code again, everything is fine and the bug does not appear. It seems to happen randomly.

Traceback (most recent call last):
  File "main.py", line 102, in <module>
    main(opt)
  File "main.py", line 70, in main
    log_dict_train, _ = trainer.train(epoch, train_loader)
  File "/media/data2/zhouhj/centernet/src/lib/trains/base_trainer.py", line 119, in train
    return self.run_epoch('train', epoch, data_loader)
  File "/media/data2/zhouhj/centernet/src/lib/trains/base_trainer.py", line 74, in run_epoch
    self.optimizer.step()
  File "/home/zhouhj/.local/lib/python3.5/site-packages/torch/optim/adam.py", line 92, in step
    exp_avg.mul_(beta1).add_(1 - beta1, grad)
RuntimeError: The expanded size of the tensor (2) must match the existing size (20) at non-singleton dimension 0

test error

Hi, thanks for your great work. When testing on the COCO dataset with python test.py ctdet --exp_id coco_dla --keep_res --load_model ../models/ctdet_coco_dla_2x.pth, this error happened:

Keep resolution testing.
training chunk_sizes: [32]
The output will be saved to /home3/wangjx/CenterNet/CenterNet-master/src/lib/../../exp/ctdet/coco_dla
heads {'hm': 80, 'wh': 2, 'reg': 2}
Namespace(K=100, aggr_weight=0.0, agnostic_ex=False, arch='dla_34', aug_ddd=0.5, aug_rot=0, batch_size=32, cat_spec_wh=False, center_thresh=0.1, chunk_sizes=[32], data_dir='/home3/wangjx/CenterNet/CenterNet-master/src/lib/../../data', dataset='coco', debug=0, debug_dir='/home3/wangjx/CenterNet/CenterNet-master/src/lib/../../exp/ctdet/coco_dla/debug', debugger_theme='white', demo='', dense_hp=False, dense_wh=False, dep_weight=1, dim_weight=1, down_ratio=4, eval_oracle_dep=False, eval_oracle_hm=False, eval_oracle_hmhp=False, eval_oracle_hp_offset=False, eval_oracle_kps=False, eval_oracle_offset=False, eval_oracle_wh=False, exp_dir='/home3/wangjx/CenterNet/CenterNet-master/src/lib/../../exp/ctdet', exp_id='coco_dla', fix_res=False, flip=0.5, flip_test=False, gpus=[0], gpus_str='3', head_conv=256, heads={'hm': 80, 'wh': 2, 'reg': 2}, hide_data_time=False, hm_hp=True, hm_hp_weight=1, hm_weight=1, hp_weight=1, input_h=512, input_res=512, input_w=512, keep_res='True', kitti_split='3dop', load_model='../models/ctdet_coco_dla_2x.pth', lr=0.000125, lr_step=[90, 120], master_batch_size=32, mean=array([[[0.40789655, 0.44719303, 0.47026116]]], dtype=float32), metric='loss', mse_loss=False, nms=False, no_color_aug=False, norm_wh=False, not_cuda_benchmark=False, not_hm_hp=False, not_prefetch_test=False, not_rand_crop=False, not_reg_bbox=False, not_reg_hp_offset=False, not_reg_offset=False, num_classes=80, num_epochs=140, num_iters=-1, num_stacks=1, num_workers=4, off_weight=1, output_h=128, output_res=128, output_w=128, pad=31, peak_thresh=0.2, print_iter=0, rect_mask=False, reg_bbox=True, reg_hp_offset=True, reg_loss='l1', reg_offset=True, resume=False, root_dir='/home3/wangjx/CenterNet/CenterNet-master/src/lib/../..', rot_weight=1, rotate=0, save_all=False, save_dir='/home3/wangjx/CenterNet/CenterNet-master/src/lib/../../exp/ctdet/coco_dla', scale=0.4, scores_thresh=0.1, seed=317, shift=0.1, std=array([[[0.2886383 , 0.27408165, 0.27809834]]], dtype=float32), task='ctdet', test=False, test_scales=[1.0], trainval=False, val_intervals=5, vis_thresh=0.3, wh_weight=0.1)
==> initializing coco 2017 val data.
loading annotations into memory...
Done (t=0.62s)
creating index...
index created!
Loaded val 5000 samples
Creating model...
loaded ../models/ctdet_coco_dla_2x.pth, epoch 230
coco_dla
Traceback (most recent call last):
  File "test.py", line 126, in <module>
    prefetch_test(opt)
  File "test.py", line 69, in prefetch_test
    for ind, (img_id, pre_processed_images) in enumerate(data_loader):
  File "/home/wangjx/anaconda3/envs/CenterNet/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 336, in __next__
    return self._process_next_batch(batch)
  File "/home/wangjx/anaconda3/envs/CenterNet/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 357, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
AttributeError: Traceback (most recent call last):
  File "/home/wangjx/anaconda3/envs/CenterNet/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 106, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/wangjx/anaconda3/envs/CenterNet/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 106, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "test.py", line 41, in __getitem__
    images[scale], meta[scale] = self.pre_process_func(image, scale)
  File "/home3/wangjx/CenterNet/CenterNet-master/src/lib/detectors/base_detector.py", line 38, in pre_process
    height, width = image.shape[0:2]
AttributeError: 'NoneType' object has no attribute 'shape'

can you help me?

Undefined names

flake8 testing of https://github.com/xingyizhou/CenterNet on Python 3.7.1

$ flake8 . --count --select=E9,F63,F72,F82 --show-source --statistics

./src/lib/detectors/exdet.py:49:70: F821 undefined name 'opt'
        dets = self.decode(t_heat, l_heat, b_heat, r_heat, c_heat, K=opt.K,
                                                                     ^
./src/lib/trains/ddd.py:80:14: F821 undefined name 'ctadd_decode'
      dets = ctadd_decode(output['hm'], output['rot'], output['dep'],
             ^
./src/lib/trains/ddd.py:89:19: F821 undefined name 'ctadd_post_process'
      dets_pred = ctadd_post_process(
                  ^
./src/lib/trains/ddd.py:92:17: F821 undefined name 'ctadd_post_process'
      dets_gt = ctadd_post_process(
                ^
./src/lib/trains/ddd.py:141:12: F821 undefined name 'ctadd_decode'
    dets = ctadd_decode(output['hm'], output['rot'], output['dep'],
           ^
./src/lib/trains/ddd.py:148:17: F821 undefined name 'ctadd_post_process'
    dets_pred = ctadd_post_process(
                ^
./src/lib/models/networks/dlav0.py:417:5: F821 undefined name 'dla'
    dla.BatchNorm = bn
    ^
./src/lib/utils/debugger.py:225:11: F821 undefined name 'plt'
      fig=plt.figure(figsize=(nImgs * 10,10))
          ^
./src/lib/utils/debugger.py:231:11: F821 undefined name 'plt'
          plt.imshow(cv2.cvtColor(v, cv2.COLOR_BGR2RGB))
          ^
./src/lib/utils/debugger.py:233:11: F821 undefined name 'plt'
          plt.imshow(v)
          ^
./src/lib/utils/debugger.py:234:7: F821 undefined name 'plt'
      plt.show()
      ^
./src/tools/voc_eval_lib/model/test.py:132:14: F821 undefined name 'nms'
      keep = nms(dets, thresh)
             ^
./src/tools/voc_eval_lib/model/test.py:168:14: F821 undefined name 'nms'
      keep = nms(cls_dets, cfg.TEST.NMS)
             ^
13    F821 undefined name 'opt'
13

E901,E999,F821,F822,F823 are the "showstopper" flake8 issues that can halt the runtime with a SyntaxError, NameError, etc. These 5 are different from most other flake8 issues, which are merely "style violations" -- useful for readability, but they do not affect runtime safety.

  • F821: undefined name name
  • F822: undefined name name in __all__
  • F823: local variable name referenced before assignment
  • E901: SyntaxError or IndentationError
  • E999: SyntaxError -- failed to compile a file into an Abstract Syntax Tree

to() got an unexpected keyword argument 'non_blocking'

When I try to run main.py, it seems to have a problem:

Starting training...
Traceback (most recent call last):
  File "/home/jin/Documents/fairy/CenterNet/src/main.py", line 103, in <module>
    main(opt)
  File "/home/jin/Documents/fairy/CenterNet/src/main.py", line 70, in main
    log_dict_train, _ = trainer.train(epoch, train_loader)
  File "/home/jin/Documents/fairy/CenterNet/src/lib/trains/base_trainer.py", line 119, in train
    return self.run_epoch('train', epoch, data_loader)
  File "/home/jin/Documents/fairy/CenterNet/src/lib/trains/base_trainer.py", line 68, in run_epoch
    batch[k] = batch[k].to(device=opt.device, non_blocking=True)
TypeError: to() got an unexpected keyword argument 'non_blocking'

Fail to run camera demo on pytorch1.0+ + CUDA10 + Ubuntu 16.04 + cudnn7.4

$ python demo.py ctdet --demo webcam --load_model ../models/ctdet_coco_dla_2x.pth

Fix size testing.
training chunk_sizes: [32]
heads {'hm': 80, 'reg': 2, 'wh': 2}
Creating model...
loaded ../models/ctdet_coco_dla_2x.pth, epoch 230
Segmentation fault (core dumped)

Is anyone getting the same issue? I have followed the steps of changing dcn_v2_cuda and also recompiled DCNv2 for PyTorch 1.0.

Not use Corner Pooling

In the paper, center pooling module and cascade corner pooling module consist of corner pooling layers. But in the code, I don't find a corner pooling layer.

CenterNet works ok on Pytorch 1.1 + Cuda10.1 + Win10

First, thanks to the authors for their great work.

This is not an issue. I just want to say that CenterNet works OK on Pytorch 1.1 + Cuda10.1 + Win10.

Just clone CenterNet, compile the nms and DCNv2, download the models, and run the demo.


1. build nms

cd CenterNet\src\lib\external
#python setup.py install
python setup.py build_ext --inplace

Just comment out the extra_compile_args parameter in setup.py when building the 'nms' extension, to solve invalid numeric argument '/Wno-cpp':

#extra_compile_args=["-Wno-cpp", "-Wno-unused-function"]

2. clone and build original DCN2

You may fail to compile DCNv2 when using Pytorch 1.x, because torch.utils.ffi is deprecated. In that case, replace DCNv2 with the original repo and fix dcn_v2_cuda.obj : error LNK2001: unresolved external symbol state (caused by extern THCState *state;) by modifying the line DCNv2/blob/master/src/cuda/dcn_v2_cuda.cu#L11:

//extern THCState *state;
THCState *state = at::globalContext().lazyInitCUDA();   // Modified

cd CenterNet\src\lib\models\networks
rm -rf DCNv2
git clone https://github.com/CharlesShang/DCNv2
cd DCNv2

vim src/cuda/dcn_v2_cuda.cu
"""
# extern THCState *state;
THCState *state = at::globalContext().lazyInitCUDA();
"""

python setup.py build develop

3. test

cd CenterNet/src
python demo.py ctdet --demo ../images/17790319373_bd19b24cfc_k.jpg --load_model ../models/ctdet_coco_dla_2x.pth --debug 2
python demo.py multi_pose --demo ../images/17790319373_bd19b24cfc_k.jpg --load_model ../models/multi_pose_dla_3x.pth --debug 2

image

image

down ratio

If I want to change the down ratio from 4 to 1, where should I change the code?

kitti test: Couldn't read: 006042.txt of ground truth.

When I run test.py on kitti, I encountered this error.

d3dop |################################| [3768/3769]|Tot: 0:02:40 |ETA: 0:00:01 |tot 0.038s (0.042s) |load 0.000s (0.000s) |pre 0.001s (0.001s) |net 0.033s (0.033s) |dec 0.001s (0.002s) |post 0.004s (0.006s) |merge 0.000s (0.000s) 
Thank you for participating in our evaluation!
Loading detections...
number of files for evaluation: 3769
ERROR: Couldn't read: 006042.txt of ground truth. Please write me an email!
An error occured while processing your results.

I know that in prclibo's test code this error can occur when the evaluation cannot read the original label files, usually because of a wrong label file path.
However, in this case we use convert_kitti_to_coco.py to convert the KITTI-format labels into COCO-format labels and should not need to access the original KITTI label files, so I don't know what went wrong.
Do you know the reason? Thanks!

how to show the model structure?

Hi, I have run into some problems when looking at the model structure. I tried tensorboardX, but it seems to need torch >= 1.0, while this repository needs 0.4. I also tried pytorchviz, but I cannot find a suitable place to insert it.

It's a little difficult for me to make it work. Could you please give me more instructions and help me out? I only want to look at the different model structures more carefully. Thanks a lot.
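
One simple option that works on PyTorch 0.4 is to build the network in Python and print it. This is only a sketch, under the assumption that create_model(arch, heads, head_conv) and load_model(model, path) in src/lib/models/model.py behave as they are used elsewhere in this repo; the heads and head_conv values mirror the ctdet COCO DLA-34 configuration printed in the logs above:

import sys
sys.path.insert(0, '/path/to/CenterNet/src/lib/')

from models.model import create_model, load_model

heads = {'hm': 80, 'wh': 2, 'reg': 2}
model = create_model('dla_34', heads, head_conv=256)
model = load_model(model, '../models/ctdet_coco_dla_2x.pth')

print(model)                                  # layer-by-layer textual structure
for name, param in model.named_parameters():
    print(name, tuple(param.shape))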

No module named 'external.nms'

When running demo.py, I got the above error from "from external.nms import soft_nms". So, how do I build/use nms.pyx in the external folder? Thanks!
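
The soft_nms import comes from a Cython extension that has to be compiled first; the build step quoted in the "CenterNet works ok on Pytorch 1.1 + Cuda10.1 + Win10" issue above should also apply here:

cd CenterNet/src/lib/external
python setup.py build_ext --inplace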

Confusion in equation 2

@xingyizhou Thanks for sharing the code and the paper.
I just want to check equation 2. The offset map O = (p/R - p_tilde), with dimensions (W/R) x (H/R) x 2, contains the offset corresponding to each center in the image. Example: if we have a (512, 512) image containing 1 object whose center is p = (130, 130), then p_tilde = floor((130/4, 130/4)) = (32, 32), so p/R - p_tilde = (0.5, 0.5). So the positions (32, 32, 0) and (32, 32, 1) (indices starting at 0, as in Python) of the map O contain (0.5, 0.5), and the rest of the positions are set to 0. Is that right?

Another question: why do all centers share the same tensor? I am not sure, but maybe it would be difficult to map the offsets back to the centers, particularly when two or more centers are very close to each other!?
Another small question: in equation 1, what is the subscript k in L_k?
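
For reference, a reconstruction of the offset objective as I read it (hedged; please check against the paper): each ground-truth center p is supervised only at its quantized location \tilde{p}, i.e.

\tilde{p} = \left\lfloor \frac{p}{R} \right\rfloor, \qquad L_{off} = \frac{1}{N} \sum_{p} \left| \hat{O}_{\tilde{p}} - \left( \frac{p}{R} - \tilde{p} \right) \right|

so the remaining positions of \hat{O} are simply not penalized rather than being forced to 0, and the same two offset channels are shared across all classes.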

About the BatchNorm

Hi, for the hourglass network, does the BN layers in the backbone need to be fixed during training?

how to get the test time

import sys
CENTERNET_PATH = '/path/to/CenterNet/src/lib/'
sys.path.insert(0, CENTERNET_PATH)

from detectors.detector_factory import detector_factory
from opts import opts

MODEL_PATH = '/path/to/model'
TASK = 'ctdet'  # or 'multi_pose' for human pose estimation
opt = opts().init('{} --load_model {}'.format(TASK, MODEL_PATH).split(' '))
detector = detector_factory[opt.task](opt)

img_dir = "/home/wangdawei/github/image/CenterNet/images"
import os
import time
t1 = time.time()
for file in os.listdir(img_dir):
    print(file)
    if file.endswith(".jpg"):
        # run on the current file rather than a stale `img` variable
        ret = detector.run(os.path.join(img_dir, file))['results']
        #print(ret)
print(time.time() - t1)

my env:
ubuntu 18.04
gtx 1080 ti
cuda 10.0
pytorch 1.1

I used the recommended code with the 10 jpgs in the images dir,
but it took 0.9 seconds, i.e. 90 ms per image, which differs from the reported test time of 19 ms.

How do I measure the test time?
Is there something wrong with my test?
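
For what it's worth, detector.run() appears to return per-stage timings alongside 'results' (the tot/load/pre/net/dec/post/merge fields that the test.py progress bars elsewhere on this page print), which makes it easier to separate the network forward pass from pre/post-processing and from the first-image CUDA warm-up. A hedged sketch, assuming those keys:

import os

times = []
processed = 0
for fname in sorted(os.listdir(img_dir)):
    if not fname.endswith('.jpg'):
        continue
    out = detector.run(os.path.join(img_dir, fname))
    processed += 1
    if processed > 1:                # skip the first image: warm-up dominates it
        times.append(out['tot'])     # 'net' would isolate the forward pass (assumed keys)
print(sum(times) / max(len(times), 1), 'seconds per image, excluding warm-up')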

Testing error on pascal voc 2007 dataset

When I use the following command to test on the PASCAL VOC 2007 dataset:

python test.py ctdet --exp_id dla --dataset pascal --load_model ../models/ctdet_pascal_dla_384.pth --flip_test

I encountered an error when getting the mAP after testing on each image:
......
dla |############################### | [4947/4952]|Tot: 0:01:28 |ETA: 0:00:01 |tot 0.017s (0.017s) |load 0.000s (0.000s) |pre 0.000s (0.000s) |net 0.014s (0.014
dla |############################### | [4948/4952]|Tot: 0:01:28 |ETA: 0:00:01 |tot 0.018s (0.017s) |load 0.000s (0.000s) |pre 0.000s (0.000s) |net 0.015s (0.014
dla |############################### | [4949/4952]|Tot: 0:01:28 |ETA: 0:00:01 |tot 0.017s (0.017s) |load 0.000s (0.000s) |pre 0.000s (0.000s) |net 0.014s (0.014
dla |############################### | [4950/4952]|Tot: 0:01:28 |ETA: 0:00:01 |tot 0.016s (0.017s) |load 0.000s (0.000s) |pre 0.000s (0.000s) |net 0.014s (0.014
dla |################################| [4951/4952]|Tot: 0:01:28 |ETA: 0:00:01 |tot 0.016s (0.017s) |load 0.000s (0.000s) |pre 0.000s (0.000s) |net 0.014s (0.014s) |dec 0.001s (0.001s) |post 0.001s (0.001s) |merge 0.000s (0.000s)
sh: 1: Syntax error: "(" unexpected

How to solve it? Thanks a lot.

Training: stuck

When I try to train on the COCO dataset, it gets stuck as shown below:

Screenshot from 2019-04-25 14-29-22

My environment is : CUDA9.0 pytorch0.4.1, python3.6
