I was debugging the evaluation part of the code and found that `total_dt_num` and `total_gt_num` are interchanged while fetching the returns from `calculate_iou_partly()`.
`calculate_iou_partly()` returns the following:
However, while using these in L474, we interchange the positions of `total_dt_num` and `total_gt_num`:
OpenPCDet/pcdet/datasets/kitti/kitti_object_eval_python/eval.py
Lines 473 to 474 in c3bcae9
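A minimal sketch of the bug, using a toy stand-in for `calculate_iou_partly()` (its internals are omitted; the return order `(overlaps, parted_overlaps, total_gt_num, total_dt_num)` is the assumption this sketch illustrates):

```python
# Toy stand-in for calculate_iou_partly(); the real IoU computation is
# omitted, only the return-order contract matters for this sketch.
def calculate_iou_partly(gt_annos, dt_annos):
    total_gt_num = [len(g) for g in gt_annos]
    total_dt_num = [len(d) for d in dt_annos]
    overlaps, parted_overlaps = [], []  # IoU matrices omitted in the sketch
    return overlaps, parted_overlaps, total_gt_num, total_dt_num

gt_annos = [[1, 2, 3]]        # one frame with 3 GT boxes
dt_annos = [[1, 2, 3, 4, 5]]  # the same frame with 5 detections

# Correct unpacking keeps the function's return order:
_, _, total_gt_num, total_dt_num = calculate_iou_partly(gt_annos, dt_annos)
assert (total_gt_num, total_dt_num) == ([3], [5])

# Swapping the two names at the call site silently mislabels the counts:
_, _, total_dt_bad, total_gt_bad = calculate_iou_partly(gt_annos, dt_annos)
assert total_gt_bad == [5]  # this is really the detection count
```

Because both returns are plain lists of ints, the swap raises no error and only corrupts downstream statistics silently.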
An error occurs at the end of each epoch (or at the beginning of the next), but training continues without any problem. This is probably related to our new torchmetrics stats.
train: 100%|██████████| 37/37 [00:42<00:00, 1.18s/it, total_it=36]
epochs: 0%| | 0/60 [00:42<?, ?it/s, loss=3.25, lr=0.00104, d_time=0.00(0.01), f_time=0.82(0.67), b_time=1.07(1.00)]
epochs: 2%|▏ | 1/60 [00:42<41:49, 42.54s/it, loss=3.25, lr=0.00104, d_time=0.00(0.01), f_time=0.82(0.67), b_time=1.07(1.00)]Exception ignored in: <function _MultiProcessingDataLoaderIter.__del__ at 0x7f9f104e6280>
Traceback (most recent call last):
File "/home/farzad/anaconda3/envs/pcdet/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1328, in __del__
self._shutdown_workers()
File "/home/farzad/anaconda3/envs/pcdet/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1320, in _shutdown_workers
if w.is_alive():
File "/home/farzad/anaconda3/envs/pcdet/lib/python3.8/multiprocessing/process.py", line 160, in is_alive
assert self._parent_pid == os.getpid(), 'can only test a child process'
AssertionError: can only test a child process
train: 0%| | 0/37 [00:00<?, ?it/s]Exception ignored in: <function _MultiProcessingDataLoaderIter.__del__ at 0x7f9f104e6280>
Traceback (most recent call last):
File "/home/farzad/anaconda3/envs/pcdet/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1328, in __del__
self._shutdown_workers()
File "/home/farzad/anaconda3/envs/pcdet/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1320, in _shutdown_workers
if w.is_alive():
File "/home/farzad/anaconda3/envs/pcdet/lib/python3.8/multiprocessing/process.py", line 160, in is_alive
assert self._parent_pid == os.getpid(), 'can only test a child process'
AssertionError: can only test a child process
train: 3%|▎ | 1/37 [00:01<01:08, 1.91s/it]
epochs: 2%|▏ | 1/60 [00:44<41:49, 42.54s/it, loss=7.24, lr=0.00104, d_time=0.83(0.83), f_time=0.82(0.82), b_time=1.91(1.91)]
train: 5%|▌ | 2/37 [00:03<01:00, 1.73s/it, total_it=38]
Number of GT: 0 Number of Pred: 20
OpenPCDet/pcdet/utils/stats_utils.py
Lines 128 to 130 in 3e03450
Fix to add:
Relevant Issue/Discussions:
Lightning-AI/torchmetrics#881.
Lightning-AI/torchmetrics#1097.
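A hypothetical sketch of the guard the fix would add, using a plain-Python stand-in for the torchmetrics-based stats in `stats_utils.py` (class and attribute names here are invented for illustration):

```python
class PredQualityStats:
    """Plain-Python stand-in for a torchmetrics-style metric, showing a
    guard for frames with zero ground-truth boxes."""
    def __init__(self):
        self.num_gt = 0
        self.num_pred = 0

    def update(self, pred_boxes, gt_boxes):
        if len(gt_boxes) == 0:
            # Skip empty frames entirely; matching 20 predictions against
            # an empty GT tensor is what triggers the error above.
            return
        self.num_pred += len(pred_boxes)
        self.num_gt += len(gt_boxes)

stats = PredQualityStats()
stats.update(pred_boxes=list(range(20)), gt_boxes=[])  # "Number of GT: 0"
stats.update(pred_boxes=[1, 2, 3, 4, 5], gt_boxes=[1, 2, 3])
assert (stats.num_gt, stats.num_pred) == (3, 5)
```

The linked torchmetrics issues discuss exactly this failure mode: `update()` being called with empty tensors and the state reduction then operating on empty inputs.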
The following per-batch states are being updated with mean values in `update()`:
OpenPCDet/pcdet/utils/stats_utils.py
Lines 133 to 139 in 9890d68
When `compute()` is called, these states are already scalar tensors, so the mean computation can be avoided here.
OpenPCDet/pcdet/utils/stats_utils.py
Lines 141 to 145 in 9890d68
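A plain-Python sketch of the `update()`/`compute()` contract being described (names are hypothetical; real code uses torchmetrics states): each batch is already reduced to a mean inside `update()`, so `compute()` only divides by the batch count, and an extra `.mean()` over an already-scalar state is redundant.

```python
class MeanStat:
    def __init__(self):
        self.total = 0.0  # accumulated per-batch means (scalar state)
        self.count = 0    # number of batches seen

    def update(self, values):
        self.total += sum(values) / len(values)  # reduce batch to its mean
        self.count += 1

    def compute(self):
        # self.total is already a scalar here; no further mean needed.
        return self.total / self.count

m = MeanStat()
m.update([1.0, 3.0])  # batch mean 2.0
m.update([5.0])       # batch mean 5.0
assert m.compute() == 3.5
```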
This should be something like:
smpl_idxs = torch.from_numpy(batch_dict_ema['frame_id'].astype(np.int32)).to(labeled_inds.device)[labeled_inds].int()
Currently we have `max_overlap` of 100, i.e. 100 proposals coming from the teacher. However, while subsampling based on the FG-BG ratio, we have `self.roi_sampler_cfg.ROI_PER_IMAGE=128`, due to which we sample some repetitive BGs.
Example of sampled indices for BGs:
Can we avoid this repetition by keeping `self.roi_sampler_cfg.ROI_PER_IMAGE=100` and setting the remaining ROIs and their corresponding `reg_valid_mask` and `rcnn_cls_labels` as invalid (as we were doing before in `_override_unlabeled_target`)?
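A sketch of the proposed padding scheme (hypothetical; the real masks are torch tensors built inside the ROI sampler): keep the 100 real teacher ROIs and mark the 28 padded slots invalid so the RCNN losses ignore them, instead of re-sampling background indices with repetition.

```python
ROI_PER_IMAGE = 128          # sampler target from the config
NUM_TEACHER_PROPOSALS = 100  # ROIs actually coming from the teacher

# Real ROIs contribute to the losses; the padded tail is masked out
# (mask 0, label -1) so no repeated-BG sampling is needed.
pad = ROI_PER_IMAGE - NUM_TEACHER_PROPOSALS
reg_valid_mask = [1] * NUM_TEACHER_PROPOSALS + [0] * pad
rcnn_cls_labels = [0.0] * NUM_TEACHER_PROPOSALS + [-1.0] * pad

assert sum(reg_valid_mask) == 100
assert rcnn_cls_labels[NUM_TEACHER_PROPOSALS:] == [-1.0] * 28
```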
`precision`, `recall`, and `detailed_stats` are initialized with 41 points, but most of the time they are not filled completely. Therefore mAP, for example, is calculated wrongly by averaging over all 41 points, most of which are zeros.
precision = np.zeros([num_class, num_minoverlap, N_SAMPLE_PTS])
recall = np.zeros([num_class, num_minoverlap, N_SAMPLE_PTS])
detailed_stats = np.zeros([num_class, num_minoverlap, N_SAMPLE_PTS, 5])
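A numeric illustration of the problem (hypothetical fix; the real code would need to track how many of the `N_SAMPLE_PTS` points were actually written):

```python
N_SAMPLE_PTS = 41

# Toy precision curve: only the first 11 of 41 sample points were filled;
# the rest stayed at their zero initialization.
precision = [0.9] * 11 + [0.0] * 30

# Averaging over all 41 points drags the score down with unfilled zeros:
ap_wrong = sum(precision) / N_SAMPLE_PTS          # ~0.241

# Averaging only over the points that were actually filled:
num_filled = 11
ap_better = sum(precision[:num_filled]) / num_filled  # 0.9

assert ap_wrong < ap_better
```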
`ignored_det` and `ignored_gt` are assigned either 0 or -1, based on the valid class index, in `clean_data`:
OpenPCDet/pcdet/utils/stats_utils.py
Lines 206 to 228 in d274fe0
However, in `compute_statistics_jit`, we check for some cases where `ignored_det==1` and `ignored_gt==1`:
OpenPCDet/pcdet/utils/stats_utils.py
Lines 284 to 308 in d274fe0
Is there any case in which these two values (`ignored_det` and `ignored_gt`) are assigned 1?
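A sketch of why the question matters: if `clean_data` only ever assigns 0 (valid) or -1 (invalid class), then any branch testing `== 1` is dead code. (For comparison, the upstream KITTI eval uses 1 to mark boxes that are ignored but not invalid, e.g. a neighboring class; this sketch uses a hypothetical simplified flag assignment.)

```python
def clean_data_flags(gt_classes, current_class):
    """Simplified flag assignment: 0 for the evaluated class, -1 otherwise."""
    return [0 if c == current_class else -1 for c in gt_classes]

ignored_gt = clean_data_flags(['Car', 'Pedestrian', 'Car'], 'Car')
assert ignored_gt == [0, -1, 0]
assert all(flag != 1 for flag in ignored_gt)  # the == 1 branches never fire
```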
While iterating over the different classes, the max/mean `recall` values in `statistics` are overridden with new values.
OpenPCDet/pcdet/models/detectors/pv_rcnn_ssl.py
Lines 370 to 373 in fe56e56
Maybe we can do something like we do for storing the `precision`, i.e. adding the values for each class to `class_metrics_all` and storing this dictionary in `statistics`:
OpenPCDet/pcdet/models/detectors/pv_rcnn_ssl.py
Lines 362 to 368 in fe56e56
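A sketch of the suggested fix (the dict shape and class names here are hypothetical): accumulate recall per class in a dict keyed by class name, mirroring how `precision` is already collected in `class_metrics_all`, instead of overwriting a single scalar on each loop iteration.

```python
statistics = {}
class_metrics_all = {}

# Per-class loop: store into the dict rather than overwrite one scalar.
for cls_name, max_recall in [('Car', 0.91), ('Pedestrian', 0.62), ('Cyclist', 0.74)]:
    class_metrics_all[cls_name] = {'max_recall': max_recall}

statistics['recall'] = class_metrics_all  # all classes survive, not just the last

assert set(statistics['recall']) == {'Car', 'Pedestrian', 'Cyclist'}
assert statistics['recall']['Car']['max_recall'] == 0.91
```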
When dsnorm is true, some keys are missing from the state dict, so loading cannot be done properly:
File "../pcdet/models/detectors/detector3d_template.py", line 380, in load_params_from_file state_dict, update_model_state = self._load_state_dict(model_state_disk, strict=strict) File "../pcdet/models/detectors/detector3d_template.py", line 356, in _load_state_dict self.load_state_dict(update_model_state) File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1070, in load_state_dict raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format( RuntimeError: Error(s) in loading state_dict for PVRCNN: Missing key(s) in state_dict: "backbone_3d.conv_input.1.running_mean", "backbone_3d.conv_input.1.running_mean", "backbone_3d.conv_input.1.running_var", "backbone_3d.conv_input.1.running_var", "backbone_3d.conv1.0.1.running_mean", "backbone_3d.conv1.0.1.running_mean", "backbone_3d.conv1.0.1.running_var", "backbone_3d.conv1.0.1.running_var", "backbone_3d.conv2.0.1.running_mean", "backbone_3d.conv2.0.1.running_mean", "backbone_3d.conv2.0.1.running_var", "backbone_3d.conv2.0.1.running_var", "backbone_3d.conv2.1.1.running_mean", "backbone_3d.conv2.1.1.running_mean", "backbone_3d.conv2.1.1.running_var", "backbone_3d.conv2.1.1.running_var", "backbone_3d.conv2.2.1.running_mean", "backbone_3d.conv2.2.1.running_mean", "backbone_3d.conv2.2.1.running_var", "backbone_3d.conv2.2.1.running_var", "backbone_3d.conv3.0.1.running_mean", "backbone_3d.conv3.0.1.running_mean", "backbone_3d.conv3.0.1.running_var", "backbone_3d.conv3.0.1.running_var", "backbone_3d.conv3.1.1.running_mean", "backbone_3d.conv3.1.1.running_mean", "backbone_3d.conv3.1.1.running_var", "backbone_3d.conv3.1.1.running_var", "backbone_3d.conv3.2.1.running_mean", "backbone_3d.conv3.2.1.running_mean", "backbone_3d.conv3.2.1.running_var", "backbone_3d.conv3.2.1.running_var", "backbone_3d.conv4.0.1.running_mean", "backbone_3d.conv4.0.1.running_mean", "backbone_3d.conv4.0.1.running_var", "backbone_3d.conv4.0.1.running_var", "backbone_3d.conv4.1.1.running_mean", 
"backbone_3d.conv4.1.1.running_mean", "backbone_3d.conv4.1.1.running_var", "backbone_3d.conv4.1.1.running_var", "backbone_3d.conv4.2.1.running_mean", "backbone_3d.conv4.2.1.running_mean", "backbone_3d.conv4.2.1.running_var", "backbone_3d.conv4.2.1.running_var", "backbone_3d.conv_out.1.running_mean", "backbone_3d.conv_out.1.running_mean", "backbone_3d.conv_out.1.running_var", "backbone_3d.conv_out.1.running_var", "pfe.SA_layers.0.mlps.0.1.running_mean", "pfe.SA_layers.0.mlps.0.1.running_mean", "pfe.SA_layers.0.mlps.0.1.running_var", "pfe.SA_layers.0.mlps.0.1.running_var", "pfe.SA_layers.0.mlps.0.4.running_mean", "pfe.SA_layers.0.mlps.0.4.running_mean", "pfe.SA_layers.0.mlps.0.4.running_var", "pfe.SA_layers.0.mlps.0.4.running_var", "pfe.SA_layers.0.mlps.1.1.running_mean", "pfe.SA_layers.0.mlps.1.1.running_mean", "pfe.SA_layers.0.mlps.1.1.running_var", "pfe.SA_layers.0.mlps.1.1.running_var", "pfe.SA_layers.0.mlps.1.4.running_mean", "pfe.SA_layers.0.mlps.1.4.running_mean", "pfe.SA_layers.0.mlps.1.4.running_var", "pfe.SA_layers.0.mlps.1.4.running_var", "pfe.SA_layers.1.mlps.0.1.running_mean", "pfe.SA_layers.1.mlps.0.1.running_mean", "pfe.SA_layers.1.mlps.0.1.running_var", "pfe.SA_layers.1.mlps.0.1.running_var", "pfe.SA_layers.1.mlps.0.4.running_mean", "pfe.SA_layers.1.mlps.0.4.running_mean", "pfe.SA_layers.1.mlps.0.4.running_var", "pfe.SA_layers.1.mlps.0.4.running_var", "pfe.SA_layers.1.mlps.1.1.running_mean", "pfe.SA_layers.1.mlps.1.1.running_mean", "pfe.SA_layers.1.mlps.1.1.running_var", "pfe.SA_layers.1.mlps.1.1.running_var", "pfe.SA_layers.1.mlps.1.4.running_mean", "pfe.SA_layers.1.mlps.1.4.running_mean", "pfe.SA_layers.1.mlps.1.4.running_var", "pfe.SA_layers.1.mlps.1.4.running_var", "pfe.SA_rawpoints.mlps.0.1.running_mean", "pfe.SA_rawpoints.mlps.0.1.running_mean", "pfe.SA_rawpoints.mlps.0.1.running_var", "pfe.SA_rawpoints.mlps.0.1.running_var", "pfe.SA_rawpoints.mlps.0.4.running_mean", "pfe.SA_rawpoints.mlps.0.4.running_mean", 
"pfe.SA_rawpoints.mlps.0.4.running_var", "pfe.SA_rawpoints.mlps.0.4.running_var", "pfe.SA_rawpoints.mlps.1.1.running_mean", "pfe.SA_rawpoints.mlps.1.1.running_mean", "pfe.SA_rawpoints.mlps.1.1.running_var", "pfe.SA_rawpoints.mlps.1.1.running_var", "pfe.SA_rawpoints.mlps.1.4.running_mean", "pfe.SA_rawpoints.mlps.1.4.running_mean", "pfe.SA_rawpoints.mlps.1.4.running_var", "pfe.SA_rawpoints.mlps.1.4.running_var", "pfe.vsa_point_feature_fusion.1.running_mean", "pfe.vsa_point_feature_fusion.1.running_mean", "pfe.vsa_point_feature_fusion.1.running_var", "pfe.vsa_point_feature_fusion.1.running_var", "backbone_2d.blocks.0.2.running_mean", "backbone_2d.blocks.0.2.running_mean", "backbone_2d.blocks.0.2.running_var", "backbone_2d.blocks.0.2.running_var", "backbone_2d.blocks.0.5.running_mean", "backbone_2d.blocks.0.5.running_mean", "backbone_2d.blocks.0.5.running_var", "backbone_2d.blocks.0.5.running_var", "backbone_2d.blocks.0.8.running_mean", "backbone_2d.blocks.0.8.running_mean", "backbone_2d.blocks.0.8.running_var", "backbone_2d.blocks.0.8.running_var", "backbone_2d.blocks.0.11.running_mean", "backbone_2d.blocks.0.11.running_mean", "backbone_2d.blocks.0.11.running_var", "backbone_2d.blocks.0.11.running_var", "backbone_2d.blocks.0.14.running_mean", "backbone_2d.blocks.0.14.running_mean", "backbone_2d.blocks.0.14.running_var", "backbone_2d.blocks.0.14.running_var", "backbone_2d.blocks.0.17.running_mean", "backbone_2d.blocks.0.17.running_mean", "backbone_2d.blocks.0.17.running_var", "backbone_2d.blocks.0.17.running_var", "backbone_2d.blocks.1.2.running_mean", "backbone_2d.blocks.1.2.running_mean", "backbone_2d.blocks.1.2.running_var", "backbone_2d.blocks.1.2.running_var", "backbone_2d.blocks.1.5.running_mean", "backbone_2d.blocks.1.5.running_mean", "backbone_2d.blocks.1.5.running_var", "backbone_2d.blocks.1.5.running_var", "backbone_2d.blocks.1.8.running_mean", "backbone_2d.blocks.1.8.running_mean", "backbone_2d.blocks.1.8.running_var", "backbone_2d.blocks.1.8.running_var", 
"backbone_2d.blocks.1.11.running_mean", "backbone_2d.blocks.1.11.running_mean", "backbone_2d.blocks.1.11.running_var", "backbone_2d.blocks.1.11.running_var", "backbone_2d.blocks.1.14.running_mean", "backbone_2d.blocks.1.14.running_mean", "backbone_2d.blocks.1.14.running_var", "backbone_2d.blocks.1.14.running_var", "backbone_2d.blocks.1.17.running_mean", "backbone_2d.blocks.1.17.running_mean", "backbone_2d.blocks.1.17.running_var", "backbone_2d.blocks.1.17.running_var", "backbone_2d.deblocks.0.1.running_mean", "backbone_2d.deblocks.0.1.running_mean", "backbone_2d.deblocks.0.1.running_var", "backbone_2d.deblocks.0.1.running_var", "backbone_2d.deblocks.1.1.running_mean", "backbone_2d.deblocks.1.1.running_mean", "backbone_2d.deblocks.1.1.running_var", "backbone_2d.deblocks.1.1.running_var", "point_head.cls_layers.1.running_mean", "point_head.cls_layers.1.running_mean", "point_head.cls_layers.1.running_var", "point_head.cls_layers.1.running_var", "point_head.cls_layers.4.running_mean", "point_head.cls_layers.4.running_mean", "point_head.cls_layers.4.running_var", "point_head.cls_layers.4.running_var", "roi_head.roi_grid_pool_layer.mlps.0.1.running_mean", "roi_head.roi_grid_pool_layer.mlps.0.1.running_mean", "roi_head.roi_grid_pool_layer.mlps.0.1.running_var", "roi_head.roi_grid_pool_layer.mlps.0.1.running_var", "roi_head.roi_grid_pool_layer.mlps.0.4.running_mean", "roi_head.roi_grid_pool_layer.mlps.0.4.running_mean", "roi_head.roi_grid_pool_layer.mlps.0.4.running_var", "roi_head.roi_grid_pool_layer.mlps.0.4.running_var", "roi_head.roi_grid_pool_layer.mlps.1.1.running_mean", "roi_head.roi_grid_pool_layer.mlps.1.1.running_mean", "roi_head.roi_grid_pool_layer.mlps.1.1.running_var", "roi_head.roi_grid_pool_layer.mlps.1.1.running_var", "roi_head.roi_grid_pool_layer.mlps.1.4.running_mean", "roi_head.roi_grid_pool_layer.mlps.1.4.running_mean", "roi_head.roi_grid_pool_layer.mlps.1.4.running_var", "roi_head.roi_grid_pool_layer.mlps.1.4.running_var", 
"roi_head.shared_fc_layer.1.running_mean", "roi_head.shared_fc_layer.1.running_mean", "roi_head.shared_fc_layer.1.running_var", "roi_head.shared_fc_layer.1.running_var", "roi_head.shared_fc_layer.5.running_mean", "roi_head.shared_fc_layer.5.running_mean", "roi_head.shared_fc_layer.5.running_var", "roi_head.shared_fc_layer.5.running_var", "roi_head.cls_layers.1.running_mean", "roi_head.cls_layers.1.running_mean", "roi_head.cls_layers.1.running_var", "roi_head.cls_layers.1.running_var", "roi_head.cls_layers.5.running_mean", "roi_head.cls_layers.5.running_mean", "roi_head.cls_layers.5.running_var", "roi_head.cls_layers.5.running_var", "roi_head.reg_layers.1.running_mean", "roi_head.reg_layers.1.running_mean", "roi_head.reg_layers.1.running_var", "roi_head.reg_layers.1.running_var", "roi_head.reg_layers.5.running_mean", "roi_head.reg_layers.5.running_mean", "roi_head.reg_layers.5.running_var", "roi_head.reg_layers.5.running_var". srun: error: kyoto: task 0: Exited with exit code 1 srun: Terminating job step 358824.0 srun: error: serv-9222: task 1: Terminated srun: Force Terminated job step 358824.0
An exception is raised when we don't have any valid predictions: in that case `valid_pred_boxes` is empty and the cat operation cannot be performed, because `pred_scores[i]` was filled with zeros in the previous stage (that filling should instead use -1, so we can filter `pred_scores[i]` here).
OpenPCDet/pcdet/utils/stats_utils.py
Lines 63 to 72 in c5782fe
Filling part:
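A sketch of the proposed -1 sentinel (hypothetical, with plain lists in place of tensors): pad missing scores with -1 instead of 0, so genuinely empty frames can be filtered out before any cat-style aggregation.

```python
# Two frames, padded to a fixed width of 4 scores; -1 marks padding.
pred_scores = [
    [0.9, 0.7, -1.0, -1.0],    # frame with 2 valid predictions
    [-1.0, -1.0, -1.0, -1.0],  # frame with no valid predictions
]

# Filter the sentinel, then skip frames left empty instead of
# concatenating empty tensors (the failing cat in the snippet above).
valid_per_frame = [[s for s in frame if s >= 0] for frame in pred_scores]
non_empty = [frame for frame in valid_per_frame if frame]

assert non_empty == [[0.9, 0.7]]
```

With a 0 fill, padded entries are indistinguishable from real zero-score predictions, which is why the empty-frame case slips through to the cat operation.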
@shashankag14 The following line has been changed in the main consistency branch. Please fix it in the DA-20-roi-aug branch and make sure this has not influenced your analysis.
`scores` in L263 are already normalized, since they come as output from the post-processing stage. Do we still need to normalize them in L269?
OpenPCDet/pcdet/models/detectors/pv_rcnn_ssl.py
Lines 248 to 269 in b3cefa5
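A small numeric illustration of why the double normalization matters, assuming the normalization in question is a sigmoid: sigmoid is not idempotent, so applying it to already-normalized scores squashes them toward 0.5 and distorts the score margins.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

raw_logit = 2.0
score = sigmoid(raw_logit)  # ~0.881, already a probability in [0, 1]
double = sigmoid(score)     # ~0.707, wrongly re-squashed toward 0.5

assert 0.88 < score < 0.89
assert 0.70 < double < 0.71
```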
The current code imports `mayavi` in `pv_rcnn_ssl.py` and `roi_head_template.py` to visualize the boxes at different stages.
OpenPCDet/pcdet/models/detectors/pv_rcnn_ssl.py
Lines 17 to 18 in 9173644
Running the same code on the cluster fails, since we do not have display support on that server. The following is a workaround that imports `mayavi` only when a display is available at runtime, avoiding the failure.
import os

if os.name == 'posix' and "DISPLAY" not in os.environ:
    headless_server = True
else:
    headless_server = False

if not headless_server:
    import mayavi.mlab as mlab
    from visual_utils import visualize_utils as V