v2ai / det3d
World's first general purpose 3D object detection codebase.
Home Page: https://arxiv.org/abs/1908.09492
License: Apache License 2.0
Kindly help: all values are NaN
2020-01-07 17:22:53,040 - INFO - task : ['car'], loss: nan, cls_pos_loss: nan, cls_neg_loss: nan, dir_loss_reduced: nan, cls_loss_reduced: nan, loc_loss_reduced: nan, loc_loss_elem: ['nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan'], num_pos: 26.6600, num_neg: 31688.8400
2020-01-07 17:22:53,040 - INFO - task : ['truck', 'construction_vehicle'], loss: nan, cls_pos_loss: nan, cls_neg_loss: nan, dir_loss_reduced: nan, cls_loss_reduced: nan, loc_loss_reduced: nan, loc_loss_elem: ['nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan'], num_pos: 40.4800, num_neg: 63408.1400
2020-01-07 17:22:53,040 - INFO - task : ['bus', 'trailer'], loss: nan, cls_pos_loss: nan, cls_neg_loss: nan, dir_loss_reduced: nan, cls_loss_reduced: nan, loc_loss_reduced: nan, loc_loss_elem: ['nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan'], num_pos: 58.1800, num_neg: 63362.3000
2020-01-07 17:22:53,040 - INFO - task : ['barrier'], loss: nan, cls_pos_loss: nan, cls_neg_loss: nan, dir_loss_reduced: nan, cls_loss_reduced: nan, loc_loss_reduced: nan, loc_loss_elem: ['nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan'], num_pos: 7.8600, num_neg: 31742.0200
2020-01-07 17:22:53,040 - INFO - task : ['motorcycle', 'bicycle'], loss: nan, cls_pos_loss: nan, cls_neg_loss: nan, dir_loss_reduced: nan, cls_loss_reduced: nan, loc_loss_reduced: nan, loc_loss_elem: ['nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan'], num_pos: 11.8800, num_neg: 63486.6800
2020-01-07 17:22:53,040 - INFO - task : ['pedestrian', 'traffic_cone'], loss: nan, cls_pos_loss: nan, cls_neg_loss: nan, dir_loss_reduced: nan, cls_loss_reduced: nan, loc_loss_reduced: nan, loc_loss_elem: ['nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan'], num_pos: 13.6200, num_neg: 63489.2200
Compared to the current master branch, I made two changes in order to fix the NaN training loss.
The first change is described in #46.
The second change is to add the following before line 193 in losses.py:
# FIX NaN TARGETS
target_tensor = torch.where(
    torch.isnan(target_tensor), prediction_tensor, target_tensor
)
Besides, I set norm_cfg = dict(type='SyncBN', eps=1e-3, momentum=0.01) in examples/cbgs/configs/nusc_all_vfev3_spmiddleresnetfhd_rpn2_mghead_syncbn.py and torch.backends.cudnn.benchmark = True in tools/train.py.
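To illustrate what the target replacement above does, here is a minimal pure-Python sketch. It is not the repository's code: mask_nan_targets and l1_loss are hypothetical names, and a plain sum of absolute residuals stands in for the real localization loss. Substituting the prediction for a NaN target makes that element's residual zero, so it no longer propagates NaN into the loss.

```python
import math


def mask_nan_targets(predictions, targets):
    """Replace each NaN target with the matching prediction (element-wise)."""
    return [p if math.isnan(t) else t for p, t in zip(predictions, targets)]


def l1_loss(predictions, targets):
    """Sum of absolute residuals; a stand-in for the localization loss."""
    return sum(abs(p - t) for p, t in zip(predictions, targets))


predictions = [0.5, 1.0, -2.0]
targets = [0.4, float("nan"), -1.5]  # e.g. a velocity target that was never filled in

raw_loss = l1_loss(predictions, targets)  # NaN propagates through the sum
fixed_loss = l1_loss(predictions, mask_nan_targets(predictions, targets))

print("raw loss:", raw_loss)
print("loss with NaN targets masked:", fixed_loss)
```

Note that this only hides NaN targets from the loss; it does not repair whatever produced them.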
Here are my results on the validation set after training 20 epochs:
car Nusc dist AP@0.5, 1.0, 2.0, 4.0
59.25, 71.87, 77.22, 79.63 mean AP: 0.7199402759604012
truck Nusc dist AP@0.5, 1.0, 2.0, 4.0
17.96, 35.01, 43.00, 47.15 mean AP: 0.357782470829584
construction_vehicle Nusc dist AP@0.5, 1.0, 2.0, 4.0
0.00, 1.28, 6.75, 13.37 mean AP: 0.05348830261303094
bus Nusc dist AP@0.5, 1.0, 2.0, 4.0
23.87, 48.49, 62.98, 66.32 mean AP: 0.5041451034213309
trailer Nusc dist AP@0.5, 1.0, 2.0, 4.0
1.94, 14.27, 30.88, 42.11 mean AP: 0.22300031478924093
barrier Nusc dist AP@0.5, 1.0, 2.0, 4.0
28.06, 48.97, 57.80, 60.27 mean AP: 0.4877375663669212
motorcycle Nusc dist AP@0.5, 1.0, 2.0, 4.0
24.97, 29.29, 30.38, 30.99 mean AP: 0.28906646690838084
bicycle Nusc dist AP@0.5, 1.0, 2.0, 4.0
6.20, 7.36, 7.98, 8.53 mean AP: 0.07516058303100348
pedestrian Nusc dist AP@0.5, 1.0, 2.0, 4.0
62.82, 64.73, 66.83, 69.03 mean AP: 0.658543997130018
traffic_cone Nusc dist AP@0.5, 1.0, 2.0, 4.0
42.10, 44.31, 46.23, 50.65 mean AP: 0.4582346501948114
Overall the mean AP is 38.2, which is much lower than what's reported.
Can someone point me to what I might have missed? Thanks!
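The overall figure can be cross-checked by averaging the ten per-class mean AP values listed above. The short sketch below does only that arithmetic; the dictionary is copied from the log, and the variable names are illustrative. The average comes out at roughly 0.383, in line with the number quoted.

```python
# Per-class mean AP values copied from the evaluation output above.
per_class_ap = {
    "car": 0.7199402759604012,
    "truck": 0.357782470829584,
    "construction_vehicle": 0.05348830261303094,
    "bus": 0.5041451034213309,
    "trailer": 0.22300031478924093,
    "barrier": 0.4877375663669212,
    "motorcycle": 0.28906646690838084,
    "bicycle": 0.07516058303100348,
    "pedestrian": 0.658543997130018,
    "traffic_cone": 0.4582346501948114,
}

# Unweighted mean over classes.
overall_map = sum(per_class_ap.values()) / len(per_class_ap)
print(f"overall mAP: {overall_map:.4f}")
```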
Hi,
Is there a PointPillars model or model config for multi-class detection on the KITTI dataset?
In your code, you use https://github.com/poodarchu/Det3D/blob/1a674e9c80eb8b6213b2b24d0de15c64fb395b04/det3d/datasets/lyft/lyft_common.py#L58. In the latest version, "rotate method is deprecated. Use rotate_around_origin and rotate_around_box_center instead".
Thanks so much for sharing your codebase. It really helps accelerate research.
I noticed there is a circular dependency between det3d/core/__init__.py and det3d/datasets/kitti/kitti.py (they import each other).
To replicate:
python -c 'import det3d.core'
>>
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/root/det3d/det3d/core/__init__.py", line 2, in <module>
from .evaluation import *
File "/root/det3d/det3d/core/evaluation/__init__.py", line 10, in <module>
from .eval_hooks import KittiDistEvalmAPHook, KittiEvalmAPHookV2
File "/root/det3d/det3d/core/evaluation/eval_hooks.py", line 8, in <module>
from det3d import datasets, torchie
File "/root/det3d/det3d/datasets/__init__.py", line 4, in <module>
from .kitti import KittiDataset
File "/root/det3d/det3d/datasets/kitti/__init__.py", line 1, in <module>
from .kitti import KittiDataset
File "/root/det3d/det3d/datasets/kitti/kitti.py", line 7, in <module>
from det3d.core import box_np_ops
ImportError: cannot import name 'box_np_ops'
A workaround is to make sure to import det3d.datasets before importing det3d.core. A preferable solution would be to remove the circular dependency.
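One common way to break such a cycle is to defer one of the imports into the function that needs it. The sketch below shows the idea on two throwaway modules written to a temporary directory; core_demo and datasets_demo are hypothetical stand-ins, not Det3D modules, and whether this fits Det3D's layout is an assumption.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Two throwaway modules standing in for det3d.core and det3d.datasets.
tmp_dir = Path(tempfile.mkdtemp())
(tmp_dir / "core_demo.py").write_text(textwrap.dedent("""
    import datasets_demo  # core pulls in datasets at import time

    def box_area(width, length):
        return width * length
"""))
(tmp_dir / "datasets_demo.py").write_text(textwrap.dedent("""
    def describe(width, length):
        # Deferred import: resolved on first call, after core_demo is loaded.
        from core_demo import box_area
        return f"area={box_area(width, length)}"
"""))

sys.path.insert(0, str(tmp_dir))
core_demo = importlib.import_module("core_demo")  # no ImportError
result = sys.modules["datasets_demo"].describe(2, 3)
print(result)
```

Because datasets_demo no longer imports core_demo at module level, either module can be imported first.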
I have read through your paper and I'd like to know more about the network structure and the number of parameters in each module. I searched the code but haven't found anything.
Could you indicate the size of the 3D feature extractor, RPN and multi-group head in terms of layers and number of neurons in each layer? Or where can I find it in this repo?
Thank you
Great work!
Are VoteNet, Mesh R-CNN and C3DPO planned?
Thank you for your wonderful work.
"the nuScenes 3D Detection Challenge requires to detect 10 categories at the same time."
Does the model's Multi-group Head have 10 groups?
Thanks.
python setup.py develop raised an error.
After adding in setup.py, I got the error below.
det3d/ops/nms/nms_kernel.cu.cc:48:61: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated)
det3d/ops/nms/nms_kernel.cu.cc:50:61: error: there are no arguments to ‘min’ that depend on a template parameter, so a declaration of ‘min’ must be available [-fpermissive]
min(n_boxes - col_start * BLOCK_THREADS, BLOCK_THREADS);
^
det3d/ops/nms/nms_kernel.cu.cc:53:7: error: ‘threadIdx’ was not declared in this scope
if (threadIdx.x < col_size)
^
det3d/ops/nms/nms_kernel.cu.cc:62:17: error: there are no arguments to ‘__syncthreads’ that depend on a template parameter, so a declaration of ‘__syncthreads’ must be available [-fpermissive]
__syncthreads();
^
det3d/ops/nms/nms_kernel.cu.cc:64:7: error: ‘threadIdx’ was not declared in this scope
if (threadIdx.x < row_size)
^
det3d/ops/nms/nms_kernel.cu.cc: In function ‘int _nms_gpu(int*, const DType*, int, int, DType, int)’:
det3d/ops/nms/nms_kernel.cu.cc:123:37: error: expected primary-expression before ‘<’ token
nms_kernel<DType, BLOCK_THREADS><<<blocks, threads>>>(boxes_num,
^
det3d/ops/nms/nms_kernel.cu.cc:123:55: error: expected primary-expression before ‘>’ token
nms_kernel<DType, BLOCK_THREADS><<<blocks, threads>>>(boxes_num,
^
det3d/ops/nms/nms_kernel.cu.cc: In instantiation of ‘int _nms_gpu(int*, const DType*, int, int, DType, int) [with DType = float; int BLOCK_THREADS = 64]’:
det3d/ops/nms/nms_kernel.cu.cc:162:65: required from here
det3d/ops/nms/nms_kernel.cu.cc:123:66: warning: left operand of comma operator has no effect [-Wunused-value]
nms_kernel<DType, BLOCK_THREADS><<<blocks, threads>>>(boxes_num,
^
det3d/ops/nms/nms_kernel.cu.cc:124:53: warning: right operand of comma operator has no effect [-Wunused-value]
nms_overlap_thresh,
^
det3d/ops/nms/nms_kernel.cu.cc:125:44: warning: right operand of comma operator has no effect [-Wunused-value]
boxes_dev,
^
det3d/ops/nms/nms_kernel.cu.cc: In instantiation of ‘int _nms_gpu(int*, const DType*, int, int, DType, int) [with DType = double; int BLOCK_THREADS = 64]’:
det3d/ops/nms/nms_kernel.cu.cc:165:66: required from here
det3d/ops/nms/nms_kernel.cu.cc:123:66: warning: left operand of comma operator has no effect [-Wunused-value]
nms_kernel<DType, BLOCK_THREADS><<<blocks, threads>>>(boxes_num,
^
det3d/ops/nms/nms_kernel.cu.cc:124:53: warning: right operand of comma operator has no effect [-Wunused-value]
nms_overlap_thresh,
^
det3d/ops/nms/nms_kernel.cu.cc:125:44: warning: right operand of comma operator has no effect [-Wunused-value]
boxes_dev,
^
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
@chowkamlee81
When I try to run python create_data.py, it gives the error below.
Kindly help.
File "/home/ubuntu/Nuscenes_Top/Det3D-master/det3d/ops/nms/nms_gpu.py", line 10, in <module>
from det3d.ops.nms.nms import non_max_suppression
ModuleNotFoundError: No module named 'det3d.ops.nms.nms'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "tools/create_data.py", line 7, in <module>
from det3d.datasets.kitti import kitti_common as kitti_ds
File "/home/ubuntu/Nuscenes_Top/Det3D-master/det3d/datasets/__init__.py", line 4, in <module>
from .kitti import KittiDataset
File "/home/ubuntu/Nuscenes_Top/Det3D-master/det3d/datasets/kitti/__init__.py", line 1, in <module>
from .kitti import KittiDataset
File "/home/ubuntu/Nuscenes_Top/Det3D-master/det3d/datasets/kitti/kitti.py", line 7, in <module>
from det3d.core import box_np_ops
File "/home/ubuntu/Nuscenes_Top/Det3D-master/det3d/core/__init__.py", line 4, in <module>
from .anchor import *
File "/home/ubuntu/Nuscenes_Top/Det3D-master/det3d/core/anchor/__init__.py", line 1, in <module>
from .anchor_generator import (
File "/home/ubuntu/Nuscenes_Top/Det3D-master/det3d/core/anchor/anchor_generator.py", line 2, in <module>
from det3d.core.bbox import box_np_ops
File "/home/ubuntu/Nuscenes_Top/Det3D-master/det3d/core/bbox/__init__.py", line 42, in <module>
from . import box_coders, box_np_ops, box_torch_ops, geometry, region_similarity
File "/home/ubuntu/Nuscenes_Top/Det3D-master/det3d/core/bbox/box_coders.py", line 5, in <module>
from . import box_np_ops, box_torch_ops
File "/home/ubuntu/Nuscenes_Top/Det3D-master/det3d/core/bbox/box_torch_ops.py", line 6, in <module>
from det3d.ops.nms.nms_cpu import rotate_nms_cc
File "/home/ubuntu/Nuscenes_Top/Det3D-master/det3d/ops/nms/__init__.py", line 1, in <module>
from det3d.ops.nms.nms_cpu import nms_jit, soft_nms_jit
File "/home/ubuntu/Nuscenes_Top/Det3D-master/det3d/ops/nms/nms_cpu.py", line 7, in <module>
from det3d.ops.nms.nms_gpu import rotate_iou_gpu
File "/home/ubuntu/Nuscenes_Top/Det3D-master/det3d/ops/nms/nms_gpu.py", line 17, in <module>
cuda=True,
File "/home/ubuntu/Nuscenes_Top/Det3D-master/det3d/utils/buildtools/pybind11_build.py", line 109, in load_pb11
cmds.append(Nvcc(s, out(s), arch))
File "/home/ubuntu/Nuscenes_Top/Det3D-master/det3d/utils/buildtools/command.py", line 128, in __init__
raise ValueError("you must specify arch if use cuda.")
ValueError: you must specify arch if use cuda.
First of all, thank you for your work!
When I was creating the dataset, the following error occurred:
python tools/create_data.py nuscenes_data_prep --root_path=/media/hz3014/DataLinux/v1.0-trainval_blobs --version="v1.0-trainval" --nsweeps=10
Traceback (most recent call last):
File "tools/create_data.py", line 7, in <module>
from det3d.datasets.kitti import kitti_common as kitti_ds
File "/home/hz3014/Det3D/det3d/datasets/__init__.py", line 1, in <module>
from .builder import build_dataset
File "/home/hz3014/Det3D/det3d/datasets/builder.py", line 3, in <module>
from det3d.utils import build_from_cfg
File "/home/hz3014/Det3D/det3d/utils/__init__.py", line 2, in <module>
from .registry import Registry, build_from_cfg
File "/home/hz3014/Det3D/det3d/utils/registry.py", line 3, in <module>
from det3d import torchie
File "/home/hz3014/Det3D/det3d/torchie/__init__.py", line 2, in <module>
from .cnn import *
File "/home/hz3014/Det3D/det3d/torchie/cnn/__init__.py", line 1, in <module>
from .alexnet import AlexNet
ModuleNotFoundError: No module named 'det3d.torchie.cnn.alexnet'
It seems like in det3d/torchie/cnn, there is no related module provided.
Or just provide the estimated plane files. Thanks!
@poodarchu @a157801 thanks for this wonderful code base.
When training PointPillars on one GPU (3 samples per GPU), it consumes 8417 MB of GPU memory.
However, it consumes 13849/13237 MB when trained on 2 GPUs in one machine with DDP, still with 3 samples per GPU.
I wonder if this is normal?
Besides CBGS, I am trying to train the original PointPillars on nuScenes with this repo.
I found a problem in the loss computation that leads to a gradient explosion.
Here is the Head1 box_conv weight in the first epoch:
box conv weight: Parameter containing:
tensor([[[[-0.0235]],
[[-0.0223]],
[[ 0.0100]],
...,
[[ 0.0126]],
[[-0.0176]],
[[ 0.0154]]],
[[[-0.0487]],
[[ 0.0367]],
[[ 0.0096]],
...,
[[ 0.0182]],
[[ 0.0200]],
[[-0.0325]]],
[[[ 0.0089]],
[[-0.0121]],
[[-0.0017]],
...,
[[-0.0492]],
[[-0.0505]],
[[-0.0137]]],
...,
[[[-0.0302]],
[[-0.0257]],
[[-0.0246]],
...,
[[ 0.0090]],
[[-0.0497]],
[[ 0.0128]]],
[[[ 0.0449]],
[[ 0.0291]],
[[ 0.0460]],
...,
[[ 0.0024]],
[[-0.0081]],
[[-0.0162]]],
[[[ 0.0178]],
[[-0.0133]],
[[ 0.0189]],
...,
[[ 0.0100]],
[[-0.0445]],
[[-0.0162]]]], device='cuda:0', requires_grad=True)
here is the loss output (only compute head1 loss):
OrderedDict([('loss', [203.5531005859375]), ('cls_pos_loss', [0.04986190423369408]), ('cls_neg_loss', [201.2117919921875]), ('dir_loss_reduced', [0.6615481376647949]), ('cls_loss_reduced', [201.26165771484375]), ('loc_loss_reduced', [2.1591315269470215]), ('loc_loss_elem', [[0.05492932349443436, 0.041640881448984146, 0.67469322681427, 0.035490743815898895, 0.05674883723258972, 0.05906621366739273, 0.0, 0.0, 0.15699654817581177]]), ('num_pos', [86]), ('num_neg', [126794])])
In the second epoch, the Head1 conv_box weight has changed and contains some NaN values:
box conv weight: Parameter containing:
tensor([[[[-0.0235]],
[[-0.0223]],
[[ 0.0100]],
...,
[[ 0.0126]],
[[-0.0176]],
[[ 0.0154]]],
[[[-0.0487]],
[[ 0.0367]],
[[ 0.0096]],
...,
[[ 0.0182]],
[[ 0.0200]],
[[-0.0325]]],
[[[ 0.0089]],
[[-0.0121]],
[[-0.0017]],
...,
[[-0.0492]],
[[-0.0505]],
[[-0.0137]]],
...,
[[[ nan]],
[[ nan]],
[[ nan]],
...,
[[ nan]],
[[ nan]],
[[ nan]]],
[[[ nan]],
[[ nan]],
[[ nan]],
...,
[[ nan]],
[[ nan]],
[[ nan]]],
[[[ 0.0178]],
[[-0.0133]],
[[ 0.0189]],
...,
[[ 0.0100]],
[[-0.0445]],
[[-0.0162]]]], device='cuda:0', requires_grad=True)
So the last layer's weights contain NaN values, and back-propagation then turns the other layers' weights into NaN as well. Gradient clipping is set to:
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
As another experiment, I fixed the loss to a constant value (300). That produced no NaN values in any layer's weights and the loss values looked normal, which suggests the problem is in the loss computation rather than in the network layers.
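A short sketch of why gradient clipping does not help once a NaN has appeared. It follows the usual clip-by-global-norm recipe (scale all gradients by max_norm / total_norm when the norm is too large); that the grad_clip config above maps to such a routine is an assumption, and clip_by_global_norm is a hypothetical pure-Python helper rather than the library call.

```python
import math


def clip_by_global_norm(grads, max_norm=35.0, norm_type=2.0):
    """Scale gradients so that their global p-norm does not exceed max_norm."""
    total_norm = sum(abs(g) ** norm_type for g in grads) ** (1.0 / norm_type)
    scale = max_norm / (total_norm + 1e-6)
    if scale < 1.0:  # any comparison with NaN is False, so NaN grads pass through
        grads = [g * scale for g in grads]
    return grads, total_norm


# A large but finite gradient is scaled down to the limit.
clipped, norm_before = clip_by_global_norm([300.0, 400.0])
norm_after = math.hypot(*clipped)
print(norm_before, norm_after)

# A gradient that already contains NaN stays NaN after clipping.
still_nan, nan_norm = clip_by_global_norm([0.1, float("nan")])
print(still_nan, nan_norm)
```

Clipping bounds large finite gradients, but a NaN produced inside the loss has to be removed at its source.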
Hi @poodarchu, thanks for your great code! I trained PointPillars with the default config, and the performance is as follows, which is similar to the results posted by @s-ryosky in #18.
For the car category, the published moderate-level mAP on the KITTI 3D test set is 74.99, while my trained model reaches only 75.66 on the KITTI val set. It seems it cannot exceed the published one.
As far as I know, other researchers achieve about 77 on val with PointPillars. I wonder if there is a problem in the configs? Could you publish your results? Thanks a lot!
# convert velo from global to lidar
for i in range(len(ref_boxes)):
    velo = np.array([*velocity[i], 0.0])
    velo = velo @ np.linalg.inv(e2g_r_mat).T @ np.linalg.inv(
        l2e_r_mat).T
    velocity[i] = velo[:2]
velocity = velocity.reshape(-1,2)
python3 -m torch.distributed.launch --nproc_per_node=8 ./tools/train.py examples/cbgs/configs/nusc_all_vfev3_spmiddleresnetfhd_rpn2_mghead_syncbn.py --work_dir=/home/ubuntu/Documents/Det3D/trained_model
mAP: 0.3719
mATE: 0.3724
mASE: 0.2661
mAOE: 0.9296
mAVE: 1.3655
mAAE: 0.2684
NDS: 0.4023
Eval time: 140.1s
Per-class results:
Object Class            AP      ATE     ASE     AOE     AVE     AAE
car                     0.721   0.219   0.158   0.841   1.116   0.230
truck                   0.371   0.426   0.198   0.640   1.155   0.307
bus                     0.500   0.439   0.174   1.223   2.171   0.431
trailer                 0.213   0.687   0.219   0.670   1.371   0.184
construction_vehicle    0.058   0.798   0.481   1.370   0.157   0.372
pedestrian              0.653   0.165   0.287   1.350   0.869   0.439
motorcycle              0.242   0.223   0.243   1.107   3.192   0.153
bicycle                 0.043   0.199   0.264   1.111   0.894   0.031
traffic_cone            0.449   0.170   0.348   nan     nan     nan
barrier                 0.470   0.398   0.289   0.056   nan     nan
Evaluation nusc: Nusc v1.0-trainval Evaluation
car Nusc dist AP@0.5, 1.0, 2.0, 4.0
59.48, 71.97, 77.40, 79.65 mean AP: 0.7212472062431424
truck Nusc dist AP@0.5, 1.0, 2.0, 4.0
18.48, 36.29, 44.83, 48.91 mean AP: 0.3712787143771077
construction_vehicle Nusc dist AP@0.5, 1.0, 2.0, 4.0
0.00, 2.09, 8.06, 12.96 mean AP: 0.05777510817362395
bus Nusc dist AP@0.5, 1.0, 2.0, 4.0
23.60, 47.30, 62.94, 66.32 mean AP: 0.5003920838518946
trailer Nusc dist AP@0.5, 1.0, 2.0, 4.0
1.66, 13.24, 29.62, 40.49 mean AP: 0.21251682647224052
barrier Nusc dist AP@0.5, 1.0, 2.0, 4.0
26.14, 46.78, 55.95, 59.13 mean AP: 0.4700045657239055
motorcycle Nusc dist AP@0.5, 1.0, 2.0, 4.0
20.39, 24.74, 25.62, 26.05 mean AP: 0.24202605811125658
bicycle Nusc dist AP@0.5, 1.0, 2.0, 4.0
3.93, 4.27, 4.35, 4.58 mean AP: 0.04280152387228541
pedestrian Nusc dist AP@0.5, 1.0, 2.0, 4.0
62.12, 64.40, 66.13, 68.36 mean AP: 0.6525328516852104
traffic_cone Nusc dist AP@0.5, 1.0, 2.0, 4.0
40.81, 43.40, 45.49, 49.80 mean AP: 0.44874109465427564
Unable to reproduce the results in the model zoo.
The NDS score doesn't reach the number in the released paper, and the AVE number is abnormally large compared to the others. This reproduced result is even worse than PointPillars. Is there a problem in the loss computation function that leads to this result?
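To see how much the velocity error costs, the NDS can be recomputed from the summary metrics above. The sketch uses the nuScenes detection score definition, NDS = (5 * mAP + sum of (1 - min(1, error)) over the five true-positive errors) / 10; the function name is illustrative.

```python
def nuscenes_detection_score(mean_ap, tp_errors):
    """NDS: mAP weighted by 5, plus one bounded score per true-positive error."""
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors.values()]
    return (5.0 * mean_ap + sum(tp_scores)) / 10.0


# Summary metrics copied from the evaluation output above.
tp_errors = {
    "mATE": 0.3724,
    "mASE": 0.2661,
    "mAOE": 0.9296,
    "mAVE": 1.3655,
    "mAAE": 0.2684,
}
nds = nuscenes_detection_score(0.3719, tp_errors)
print(f"NDS: {nds:.4f}")

# Contribution of each error term, out of a possible 0.1 each.
for name, err in tp_errors.items():
    print(name, round((1.0 - min(1.0, err)) / 10.0, 4))
```

This reproduces the reported 0.4023. With mAVE above 1 the velocity term contributes nothing, and the orientation term (mAOE 0.9296) contributes very little, so those two errors account for most of the missing score.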
Hi, thanks for your great work!
In the paper and on the nuScenes leaderboard, I saw remarkable improvements. How about the performance on the KITTI dataset? Did you compare the multi-task results with SECOND1.6's single-class results? I think PointPillars multi-task is not quite good because it directly uses SECOND1.0 code, which failed to choose the headers smartly (even SECOND1.6 is not good).
Since the KITTI dataset has existed longer, I believe better results on it would be much more persuasive.
Thanks!
I believe this is related to #6 #42 #43 and #19 .
I followed INSTALL.md and installed nuscenes from https://github.com/poodarchu/nuscenes.git. I have also run create_data.py accordingly. From what I have seen, ground truth velocities that are cached in infos_train_10sweeps_withvelo.pkl are all NaN. I believe this is at least one of the issues that results in NaN losses.
I think line 516 in nusc_common.py
velocity = np.array([b.velocity for b in ref_boxes]).reshape(-1, 3)
should be:
velocity = np.array([
    nusc.box_velocity(token) for token in sample['anns']
]).reshape((-1, 3))
Otherwise the function (box_velocity) that computes velocity will never be called and b.velocity will stay uninitialized as NaNs.
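A quick sanity check along these lines is to count how many cached velocity vectors contain NaN. The sketch below works on a plain list of (vx, vy, vz) tuples; count_nan_velocities is a hypothetical helper, and loading the vectors from the .pkl file is left out because its layout is not described here.

```python
import math


def count_nan_velocities(velocities):
    """Return how many velocity vectors contain at least one NaN component."""
    return sum(1 for v in velocities if any(math.isnan(c) for c in v))


# Illustrative values only: one valid, one uninitialized, one stationary box.
velocities = [
    (1.2, -0.3, 0.0),
    (float("nan"), float("nan"), float("nan")),
    (0.0, 0.0, 0.0),
]
nan_rows = count_nan_velocities(velocities)
print(f"{nan_rows} of {len(velocities)} velocity vectors contain NaN")
```

If every row is reported as NaN, the velocities were most likely never computed. A few NaN rows can remain even after the fix, since a velocity cannot always be estimated for every annotation.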
Kindly help, all values are NaN. I am using a single GPU.
2020-01-07 17:22:53,040 - INFO - task : ['car'], loss: nan, cls_pos_loss: nan, cls_neg_loss: nan, dir_loss_reduced: nan, cls_loss_reduced: nan, loc_loss_reduced: nan, loc_loss_elem: ['nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan'], num_pos: 26.6600, num_neg: 31688.8400
2020-01-07 17:22:53,040 - INFO - task : ['truck', 'construction_vehicle'], loss: nan, cls_pos_loss: nan, cls_neg_loss: nan, dir_loss_reduced: nan, cls_loss_reduced: nan, loc_loss_reduced: nan, loc_loss_elem: ['nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan'], num_pos: 40.4800, num_neg: 63408.1400
2020-01-07 17:22:53,040 - INFO - task : ['bus', 'trailer'], loss: nan, cls_pos_loss: nan, cls_neg_loss: nan, dir_loss_reduced: nan, cls_loss_reduced: nan, loc_loss_reduced: nan, loc_loss_elem: ['nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan'], num_pos: 58.1800, num_neg: 63362.3000
2020-01-07 17:22:53,040 - INFO - task : ['barrier'], loss: nan, cls_pos_loss: nan, cls_neg_loss: nan, dir_loss_reduced: nan, cls_loss_reduced: nan, loc_loss_reduced: nan, loc_loss_elem: ['nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan'], num_pos: 7.8600, num_neg: 31742.0200
2020-01-07 17:22:53,040 - INFO - task : ['motorcycle', 'bicycle'], loss: nan, cls_pos_loss: nan, cls_neg_loss: nan, dir_loss_reduced: nan, cls_loss_reduced: nan, loc_loss_reduced: nan, loc_loss_elem: ['nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan'], num_pos: 11.8800, num_neg: 63486.6800
2020-01-07 17:22:53,040 - INFO - task : ['pedestrian', 'traffic_cone'], loss: nan, cls_pos_loss: nan, cls_neg_loss: nan, dir_loss_reduced: nan, cls_loss_reduced: nan, loc_loss_reduced: nan, loc_loss_elem: ['nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan'], num_pos: 13.6200, num_neg: 63489.2200
First of all, thanks a lot for this comprehensive 3d detection library!
Considering that the Waymo Open Dataset has 10x more annotations than nuScenes, do you have any intention of adding Waymo Open Dataset support?
Hi,
Thanks for sharing your great work! I am wondering if it is easy to train with multiple GPUs. I tried calling tools/train.py with --gpus=4 but it does not seem to do the trick.
Thanks,
Peiyun
I am trying to train CBGS on 8 GPUs (2080 Ti) using the newest repo code, started with the following command:
python3 -m torch.distributed.launch --nproc_per_node=8 ./tools/train.py examples/cbgs/configs/nusc_all_vfev3_spmiddleresnetfhd_rpn2_mghead_syncbn.py --work_dir=/home/ubuntu/Documents/Det3D/trained_model
The error looks like it happened in SyncBN:
return SyncBatchnormFunction.apply(input, z, self.weight, self.bias, self.running_mean, self.running_var, self.eps, self.training or not self.track_running_stats, exponential_average_factor, self.process_group, self.channel_last, self.fuse_relu)
File "/home/ubuntu/.local/lib/python3.6/site-packages/apex/parallel/optimized_sync_batchnorm_kernel.py", line 26, in forward
mean, var_biased = syncbn.welford_mean_var(input)
RuntimeError: Dimension out of range (expected to be in range of [-2, 1], but got 2) (maybe_wrap_dim at /pytorch/c10/core/WrapDimMinimal.h:20)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x33 (0x7f92265a1813 in /home/ubuntu/.local/lib/python3.6/site-packages/torch/lib/libc10.so)
By the way, the server environment uses PyTorch 1.3.1 + CUDA 10.1 + Python 3.6.
@poodarchu
In SECOND's repo, the Adam optimizer with fixed weight decay is supported with all lr schedulers.
However, in this repo fixed weight decay is supported only with the "one cycle" lr scheduler.
Why?
from functools import partial

import torch

# OptimWrapper and get_layer_groups are helpers from the Det3D code base.
def build_one_cycle_optimizer(model, optimizer_config):
    if optimizer_config.fixed_wd:
        optimizer_func = partial(
            torch.optim.Adam, betas=(0.9, 0.99), amsgrad=optimizer_config.amsgrad
        )
    else:
        # Without fixed weight decay, Adam keeps its default betas.
        optimizer_func = partial(torch.optim.Adam, amsgrad=optimizer_config.amsgrad)
    optimizer = OptimWrapper.create(
        optimizer_func,
        3e-3,  # learning rate
        get_layer_groups(model),
        wd=optimizer_config.wd,
        true_wd=optimizer_config.fixed_wd,
        bn_wd=True,
    )
    return optimizer
Hi, thanks for this great code base.
I wonder where the code for dataset sampling (DS Sampling) is, which gives a +5 mAP gain according to your paper "Class-balanced Grouping and Sampling for Point Cloud 3D Object Detection". I could only find db_sampler of type "GT_AUG", but I think that's different from DS Sampling. Am I understanding this correctly?
@poodarchu @s-ryosky do we have inference code or eval.py for all the architectures ?
Questions like:
Example: How to visualize detection result with Det3D?
NOTE:
If you met any unexpected issue when using Det3D and wish to know why,
please use the "Unexpected Problems / Bugs" issue template.
We do not answer general machine learning / computer vision questions that are not specific to
Det3D, such as how a model works, how to improve your training/make it converge, or what algorithm/methods can be used to achieve X.
Here is a simple experiment on KITTI dataset.
By adding RGB features into points, the 3d AP increases, but the bev AP drops a lot.
Benchmark
car AP@0.70, 0.70, 0.70:
bbox AP:90.70, 88.95, 87.33
bev AP:89.65, 84.71, 81.73
3d AP:85.85, 76.36, 69.63
aos AP:90.61, 88.30, 86.31
with RGB feature
car AP@0.70, 0.70, 0.70:
bbox AP:90.63, 88.86, 87.35
bev AP:89.75, 86.15, 83.00
3d AP:85.75, 75.68, 68.93
aos AP:90.48, 88.36, 86.58
Based on Painted PointPillars result with segmentation feature instead of RGB feature
BEV on test set:
mAP (Mod.) | Car AP (Easy) | Car AP (Mod.) | Car AP (Hard)
73.84 | 90.21 | 87.75 | 84.92
76.46 | 90.01 | 87.65 | 85.26
+2.62 | -0.2 | -0.1 | +0.34
I attribute this to overfitting and will test it.
Has anybody observed a similar result?
How about using the nuScenes dataset?
How about adding augmentation on RGB?
Hoping for a large 3D AP gain on Pedestrian and Cyclist.
During training, I meet the problem below
at epoch 63 with --nproc_per_node=2, samples_per_gpu=6 and workers_per_gpu=6
at epoch 81 with --nproc_per_node=2, samples_per_gpu=4 and workers_per_gpu=4
cudahash: Completely failed to build
Cuda error in file '/root/spconv/src/cuhash/hash_table.cpp' in line 194 : an illegal memory access was encountered.
It looks like some compiled .so files still remain under ./det3d/ops/. These files make the repo hard to clone and download; please delete them.
@poodarchu
I tried to install Det3D with "python setup.py build develop" according to install.md, but got an error when installing vtk...
I want to know the correspondence between result_val.json and Nuscenes sample_annotation.json in trainval set. It will help a lot when I apply your result in other tasks.
Thanks for attention!
I want to run your sample code, but I don't know where I can download this dataset. Can you provide the dataset download link?
When will the code be released? Awaiting it.
cxt@ubuntu4-X299X-AORUS-MASTER:~/codetest/det3d$ python tools/create_data.py kitti_data_prep --root_path=/home/cxt/Kitti/object/
/home/cxt/anaconda3/envs/second/lib/python3.6/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.
For more information about alternatives visit: ('http://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))
/home/cxt/anaconda3/envs/second/lib/python3.6/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice.
For more information about alternatives visit: ('http://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))
/home/cxt/anaconda3/envs/second/lib/python3.6/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_CUDA_DRIVER=/usr/lib/x86_64-linux-gnu/libcuda.so.
For more information about alternatives visit: ('http://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))
Traceback (most recent call last):
File "tools/create_data.py", line 7, in <module>
from det3d.datasets.kitti import kitti_common as kitti_ds
File "/home/cxt/codetest/det3d/det3d/datasets/__init__.py", line 4, in <module>
from .kitti import KittiDataset
File "/home/cxt/codetest/det3d/det3d/datasets/kitti/__init__.py", line 1, in <module>
from .kitti import KittiDataset
File "/home/cxt/codetest/det3d/det3d/datasets/kitti/kitti.py", line 8, in <module>
from det3d.datasets.custom import PointCloudDataset
File "/home/cxt/codetest/det3d/det3d/datasets/custom.py", line 8, in <module>
from .pipelines import Compose
File "/home/cxt/codetest/det3d/det3d/datasets/pipelines/__init__.py", line 18, in <module>
from .preprocess import Preprocess, Voxelization, AssignTarget
File "/home/cxt/codetest/det3d/det3d/datasets/pipelines/preprocess.py", line 8, in <module>
from det3d.builder import (
File "/home/cxt/codetest/det3d/det3d/builder.py", line 18, in <module>
from det3d.models.losses import GHMCLoss, GHMRLoss, losses
File "/home/cxt/codetest/det3d/det3d/models/__init__.py", line 2, in <module>
from .backbones import * # noqa: F401,F403
File "/home/cxt/codetest/det3d/det3d/models/backbones/__init__.py", line 1, in <module>
from .scn import RCNNSpMiddleFHD, SpMiddleFHD
File "/home/cxt/codetest/det3d/det3d/models/backbones/scn.py", line 6, in <module>
from det3d.models.utils import Empty, change_default_args
File "/home/cxt/codetest/det3d/det3d/models/utils/__init__.py", line 1, in <module>
from .conv_module import ConvModule, build_conv_layer
File "/home/cxt/codetest/det3d/det3d/models/utils/conv_module.py", line 7, in <module>
from .norm import build_norm_layer
File "/home/cxt/codetest/det3d/det3d/models/utils/norm.py", line 4, in <module>
from det3d.ops.syncbn import DistributedSyncBN
File "/home/cxt/codetest/det3d/det3d/ops/syncbn/__init__.py", line 1, in <module>
from .syncbn import DistributedSyncBN
File "/home/cxt/codetest/det3d/det3d/ops/syncbn/syncbn.py", line 12, in <module>
from . import syncbn_gpu
ImportError: cannot import name 'syncbn_gpu'
Sometimes my program is terminated by "CUDA error: an illegal memory access was encountered" during training. I used the official code and the default config settings, only changing data_root and work_dir. The bug occurred during training in both the single-GPU and the distributed multi-GPU cases. The picture below shows the error information:
Sometimes training on a single GPU is also terminated as below:
While this problem seems to be ignorable in multi-GPU training:
The environment of my server includes:
- OS: Ubuntu 16.04
- Python: 3.7.3
- CUDA: 10.1
- CUDNN: 7.4.1
- pytorch: 1.3.1
- gcc: 5.5.0
- cmake: 3.16.0
- nvidia driver version: 418.40.04
- gpu: 8 TITAN Xp
Really weird! How can I solve these problems, as they occur frequently? Could anyone provide some information on these problems? Thanks a lot!
I get an "invalid syntax" error in this section of nuscenes_commons.py:
len(info["sweeps"]) == nsweeps - 1), f"sweep {curr_sd_rec['token']} only has {len(info['sweeps'])} sweeps, you should duplicate to sweep num {nsweeps-1}"
After modifying some configs and compiling the nms_gpu module successfully,
I am trying to train the CBGS network on my local computer with the nuScenes dataset.
I am not using train.sh, but directly running
python3 train.py /home/muzi2045/Documents/Det3D/examples/cbgs/configs/nusc_all_vfev3_spmiddleresnetfhd_rpn2_mghead_syncbn.py --gpus=1
It runs, but the values in the log file are NaN:
2019-12-21 14:56:28,351 - INFO - Start running, host: muzi2045@muzi2045-MS-7B48, work_dir: /home/muzi2045/Documents/Det3D/trained_model
2019-12-21 14:56:28,351 - INFO - workflow: [('train', 1), ('val', 1)], max: 20 epochs
2019-12-21 14:56:57,005 - INFO - Epoch [1/20][50/64050] lr: 0.00010, eta: 8 days, 11:53:41, time: 0.573, data_time: 0.178, transfer_time: 0.012, forward_time: 0.112, loss_parse_time: 0.000 memory: 1689,
2019-12-21 14:56:57,005 - INFO - task : ['car'], loss: nan, cls_pos_loss: nan, cls_neg_loss: nan, dir_loss_reduced: nan, cls_loss_reduced: nan, loc_loss_reduced: nan, loc_loss_elem: ['nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan'], num_pos: 26.4600, num_neg: 31687.8400
2019-12-21 14:56:57,005 - INFO - task : ['truck', 'construction_vehicle'], loss: nan, cls_pos_loss: nan, cls_neg_loss: nan, dir_loss_reduced: nan, cls_loss_reduced: nan, loc_loss_reduced: nan, loc_loss_elem: ['nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan'], num_pos: 36.3400, num_neg: 63408.7600
2019-12-21 14:56:57,005 - INFO - task : ['bus', 'trailer'], loss: nan, cls_pos_loss: nan, cls_neg_loss: nan, dir_loss_reduced: nan, cls_loss_reduced: nan, loc_loss_reduced: nan, loc_loss_elem: ['nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan'], num_pos: 54.0200, num_neg: 63379.1400
2019-12-21 14:56:57,005 - INFO - task : ['barrier'], loss: nan, cls_pos_loss: nan, cls_neg_loss: nan, dir_loss_reduced: nan, cls_loss_reduced: nan, loc_loss_reduced: nan, loc_loss_elem: ['nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan'], num_pos: 7.6200, num_neg: 31742.6000
2019-12-21 14:56:57,005 - INFO - task : ['motorcycle', 'bicycle'], loss: nan, cls_pos_loss: nan, cls_neg_loss: nan, dir_loss_reduced: nan, cls_loss_reduced: nan, loc_loss_reduced: nan, loc_loss_elem: ['nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan'], num_pos: 11.4400, num_neg: 63487.4600
2019-12-21 14:56:57,005 - INFO - task : ['pedestrian', 'traffic_cone'], loss: nan, cls_pos_loss: nan, cls_neg_loss: nan, dir_loss_reduced: nan, cls_loss_reduced: nan, loc_loss_reduced: nan, loc_loss_elem: ['nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan'], num_pos: 13.4600, num_neg: 63489.3400
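In the log above, every loss term is NaN while num_pos and num_neg stay finite. A minimal sketch of a guard that reports which entries of a loss dictionary are non-finite is shown below. The function name `non_finite_keys` and the dictionary layout are hypothetical, chosen to mirror the log fields; it does not identify the cause of the NaNs, it only helps to stop at the first bad iteration.

```python
import math


def non_finite_keys(losses):
    """Return the keys whose value (or any list element) is NaN or infinite.

    Values may be numbers, numeric strings such as 'nan', or lists of either,
    mirroring the fields printed in the training log.
    """
    bad = []
    for key, value in losses.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        if any(not math.isfinite(float(v)) for v in values):
            bad.append(key)
    return bad


# Hypothetical loss dictionary shaped like one log line.
example = {
    "loss": float("nan"),
    "loc_loss_elem": ["nan", "nan"],
    "num_pos": 26.46,
    "num_neg": 31687.84,
}
print(non_finite_keys(example))
```

Calling such a check right after the loss is computed makes it possible to dump the offending batch instead of continuing to train on NaNs.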
Also, how can I turn off the log output of the gt_database file paths?
Any advice would be appreciated!
@poodarchu
I tried to train CBGS on a single GPU; after modifying some parameters, this error occurs:
It looks like, when training with the nuScenes dataset with the range set to [-50.4, -50.4, 50.4, 50.4] and a voxel size of [0.1, 0.1], a [1008, 1008] grid is generated, giving 1008 * 1008 * 2 = 2032128 anchors per class, but the box_preds output is [1, 126, 126, 18] per class, i.e. 126 * 126 * 2 = 31752 anchors.
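The arithmetic in the report can be reproduced with a short sketch. The helper names are hypothetical, and the factor of 2 is taken from the post (it is assumed here to be the number of anchors per grid cell). The ratio 1008 / 126 = 8 suggests, as an inference rather than a confirmed fact, that the anchors were generated at the voxel-grid resolution instead of the resolution of the downsampled feature map.

```python
def grid_cells(range_min, range_max, voxel_size):
    """Number of voxels along one axis; round() absorbs float error."""
    return round((range_max - range_min) / voxel_size)


def anchors_per_class(cells_x, cells_y, anchors_per_cell=2):
    """Total anchors for one class on a cells_x by cells_y grid."""
    return cells_x * cells_y * anchors_per_cell


cells = grid_cells(-50.4, 50.4, 0.1)           # 1008
generated = anchors_per_class(cells, cells)    # 2032128
predicted = anchors_per_class(126, 126)        # 31752
stride = cells // 126                          # 8

print(cells, generated, predicted, stride)
```

If that inference holds, the grid size passed to the anchor generator would need to be divided by the same factor so that both counts agree.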
import torch

def add_sin_difference(boxes1, boxes2):
    # Re-encode the last channel (yaw) of both box tensors so that their
    # difference in that channel equals sin(yaw1 - yaw2).
    rad_pred_encoding = torch.sin(boxes1[..., -1:]) * torch.cos(boxes2[..., -1:])
    rad_tg_encoding = torch.cos(boxes1[..., -1:]) * torch.sin(boxes2[..., -1:])
    boxes1 = torch.cat([boxes1[..., :-1], rad_pred_encoding], dim=-1)
    boxes2 = torch.cat([boxes2[..., :-1], rad_tg_encoding], dim=-1)
    return boxes1, boxes2
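For readers unfamiliar with this encoding: by the identity sin(a - b) = sin(a)cos(b) - cos(a)sin(b), the difference between the two encoded yaw channels is the sine of the yaw error. The sketch below checks this with plain Python floats instead of tensors; the function name is hypothetical, and it is an assumption that the downstream loss is applied to the difference of the two encodings.

```python
import math


def sin_difference_residual(pred_yaw, target_yaw):
    """Difference of the two encoded yaw values used by add_sin_difference."""
    pred_encoding = math.sin(pred_yaw) * math.cos(target_yaw)
    target_encoding = math.cos(pred_yaw) * math.sin(target_yaw)
    return pred_encoding - target_encoding


# The residual matches sin(pred - target) up to floating-point error.
residual = sin_difference_residual(0.7, 0.2)
print(abs(residual - math.sin(0.7 - 0.2)) < 1e-12)
```

One consequence is that yaw values differing by pi give a residual near zero, so the heading direction has to be resolved separately; the dir_loss_reduced term in the logs is presumably there for that purpose.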
Any advice would be appreciated.
@a157801 @poodarchu
Dear @poodarchu,
Thanks for your great work; with your open code, most researchers can save a lot of time. Many open-source codebases cannot reproduce the results reported in their papers, which causes much confusion for followers. Could you share detailed results for your reproduced models (like PointRCNN) and provide more config files for various models? Thanks.
@poodarchu @a157801 Thanks for the wonderful codebase. I had a few queries.
I get an error when installing the nuscenes-devkit.
The detailed error is:
error in nuscenes-zbj setup command: "values of 'package_data' dict" must be a list of strings (got '*.json')
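The message indicates that a bare string ('*.json') was passed where setuptools expects a list of glob patterns for each package key. A minimal sketch of normalizing such a mapping before it is handed to setup() follows. The helper name is hypothetical, and it is an assumption that the devkit's setup.py passes the pattern as a plain string; the actual file may differ.

```python
def normalize_package_data(package_data):
    """Wrap bare string patterns in a list so each value is a list of strings."""
    return {
        package: [patterns] if isinstance(patterns, str) else list(patterns)
        for package, patterns in package_data.items()
    }


# Hypothetical input resembling the failing configuration.
broken = {"": "*.json"}
print(normalize_package_data(broken))
```

Equivalently, the value can be edited by hand in setup.py so that the pattern is written as a one-element list.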
I have tried running
python3 setup.py build develop
It looks like nms.so cannot be generated in this repo, while the other ops compile normally.
How to determine in which direction a certain annotation was taken by the camera?
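Hello.
I am an undergraduate student; could we discuss nuScenes?
For a given sample, there are JPG images from cameras facing 6 directions.
We want to determine the annotations belonging to the image from one particular direction.
However, the official API only returns all annotations of a sample, covering the images from all six directions, so we cannot tell which annotations correspond to the image from a given direction.
Any help would be much appreciated.
In other words: how can we determine which camera direction captured a given annotation?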
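One generic way to approach this question is to project each annotation's box center into every camera and keep the cameras whose image contains the projected point. The sketch below shows only the pinhole projection step, assuming the point has already been transformed into the camera frame (x right, y down, z forward). The function names and the intrinsic values in the example are hypothetical, not taken from the dataset.

```python
def project_to_image(point_cam, fx, fy, cx, cy):
    """Project a 3D point in the camera frame to pixel coordinates.

    Returns None when the point lies behind the camera (z <= 0).
    """
    x, y, z = point_cam
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def is_visible(point_cam, fx, fy, cx, cy, width, height):
    """True if the projected point falls inside a width x height image."""
    pixel = project_to_image(point_cam, fx, fy, cx, cy)
    if pixel is None:
        return False
    u, v = pixel
    return 0 <= u < width and 0 <= v < height


# Hypothetical intrinsics for a 1600 x 900 image.
print(is_visible((0.0, 0.0, 10.0), 1000.0, 1000.0, 800.0, 450.0, 1600, 900))
```

An annotation is then assigned to each camera for which the check succeeds; a box near an image border may be visible in two adjacent cameras.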
Do you have any number for FPS of the proposed method?
PointRCNN
PointRCNN as mentioned in README TODO list and #35 (comment)
Hi @poodarchu, could you please give an estimated release time for PointRCNN? Thanks a lot; looking forward to it.
I encounter this error when executing
python setup.py build develop
what changes you made (git diff) or what code you wrote
After the commit b567905
what exact command you run:
Det3D/tools/train.py
what you observed (including the full logs):
AttributeError: 'Tensor' object has no attribute 'bool'
occurs at the following line:
https://github.com/poodarchu/Det3D/blob/56402d4761a5b73acd23080f537599b0888cce07/det3d/models/bbox_heads/mg_head.py#L1038
The README recommends PyTorch 1.1 or higher, but PyTorch 1.1 does not support the 'bool' attribute.
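Besides upgrading PyTorch, one possible workaround is a small compatibility helper that calls .bool() when the tensor has it and falls back to .byte() (a uint8 mask) otherwise. This is a sketch, not the repository's fix: the helper name is hypothetical, it is duck-typed so it does not import torch itself, and whether a uint8 mask is acceptable at the failing line in mg_head.py is an assumption.

```python
def to_mask(tensor):
    """Return a mask tensor, preferring .bool() and falling back to .byte().

    Intended for tensor objects that may or may not provide a .bool() method,
    as described in the report above.
    """
    if hasattr(tensor, "bool"):
        return tensor.bool()
    return tensor.byte()
```

At the call site, an expression such as mask.bool() would become to_mask(mask).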