pointscoder / votr
Voxel Transformer for 3D object detection
Hi,
I noticed that the paper says VoTr-SSD and VoTr-TSD are trained for 60 and 80 epochs respectively on the Waymo Open Dataset, but the provided config files say the opposite. Which setting is correct?
Hi, I ported the code to OpenPCDet v0.52.0 but got a RuntimeError. Could you please help me?
Error:
Traceback (most recent call last): | 0/1856 [00:00<?, ?it/s]
File "train.py", line 202, in <module>
main()
File "train.py", line 171, in main
merge_all_iters_to_one_epoch=args.merge_all_iters_to_one_epoch
File "/home/featurize/OpenPCDet/tools/train_utils/train_utils.py", line 118, in train_model
dataloader_iter=dataloader_iter
File "/home/featurize/OpenPCDet/tools/train_utils/train_utils.py", line 52, in train_one_epoch
loss.backward()
File "/environment/miniconda3/lib/python3.7/site-packages/torch/_tensor.py", line 307, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/environment/miniconda3/lib/python3.7/site-packages/torch/autograd/__init__.py", line 156, in backward
allow_unreachable=True, accumulate_grad=True) # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [4611, 64]], which is output 0 of ReluBackward0, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
My environment:
ubuntu 20.04
cuda 11.3
python 3.7.10
torch 1.10.0+cu113
spconv-cu113 2.1.21
Thank you for your amazing article.
However, I cannot find the detailed design of the attention mechanisms in the released paper.
The article says: "We would like readers to refer to supplementary materials for the detailed design of attention mechanisms." But I cannot find a "supplementary materials" section in the article. Could you release the supplementary materials? Thank you very much.
Hi @PointsCoder ,
Thanks for your great work.
May I ask why you used Batch Norm in your paper instead of Layer Norm? Did you compare their results? I couldn't find a comparison in the paper.
Thank you so much!
Nice work!
What is the latency (inference speed) of this backbone? I can't find that in the paper.
Thank you for your contribution to the field.
Can the techniques presented in this paper be applied to voxels generated using the SimpleBEV technique?
Best regards,
I tried to run "python setup.py develop", but it failed with:
"fatal error: build_mapping_gpu.h: No such file or directory".
Looking at setup.py, I found that this repo lacks four files: 'build_mapping.cpp', 'build_mapping_gpu.cu', 'build_attention_indices.cpp' and 'build_attention_indices_gpu.cu'.
How can I get these files?
Thank you!
I don’t quite understand the implementation of Dilated Attention and the setting of RANGE_SPEC.
If I want to reproduce the result of Fig. 3 (the 2D example) in the paper, how should I set the parameters?
Your work is inspiring! I was wondering if you could share more tips about training from scratch on a custom dataset. Thanks.
My setting is torch: 1.10 cuda: 11.3
error:
Traceback (most recent call last): | 0/1856 [00:00<?, ?it/s]
File "train.py", line 211, in <module>
main()
File "train.py", line 165, in main
train_model(
File "/home/sfy/PythonProject/VOTR/tools/train_utils/train_utils.py", line 91, in train_model
accumulated_iter = train_one_epoch(
File "/home/sfy/PythonProject/VOTR/tools/train_utils/train_utils.py", line 46, in train_one_epoch
loss.backward()
File "/home/sfy/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/_tensor.py", line 307, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/home/sfy/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/autograd/__init__.py", line 154, in backward
Variable._execution_engine.run_backward(
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [7944, 64]], which is output 0 of ReluBackward0, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
After adding “torch.autograd.set_detect_anomaly(True)”, the error is:
How did you solve this problem?
Thank you!
My GPU is an RTX 30 series card.
I can't use CUDA 10, so I can't use Spconv 1.2.
I tried to port the code to OpenPCDet, which supports Spconv 2, but there is still a problem I can't fix:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [2962, 64]], which is output 0 of ReluBackward0, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
Hi PointsCoder, thanks for open-sourcing the project. As you say, the main differences between your design and the original Transformer are the batch normalization and the linear projection layer. How much improvement does your Transformer give compared with the original one? I didn't see an ablation study on that.
Hello, do you think VOTR can be migrated to other tasks, such as scene completion?
Would this be difficult to achieve? Thanks.
Hello, I have just started studying 3D object detection models.
I would like to know how you calculated the number of parameters in the VoTr and SECOND models.
Please help me.
Thank you.
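A common convention for counting parameters, shown here as a minimal sketch and not necessarily how the paper's figures were obtained, is to sum the element counts of every trainable tensor that the model exposes. The helper name `count_parameters` and its `trainable_only` flag are hypothetical; the sketch only assumes the object offers the `parameters()`, `numel()` and `requires_grad` interface of a PyTorch `torch.nn.Module`.

```python
def count_parameters(model, trainable_only=True):
    """Sum element counts over model.parameters().

    Assumes the torch.nn.Module interface: parameters() yields tensors
    that expose numel() and a requires_grad attribute.
    """
    return sum(
        p.numel()
        for p in model.parameters()
        if p.requires_grad or not trainable_only
    )
```

With a PyTorch detector already built as `model`, `count_parameters(model) / 1e6` would give the size in millions. Counts can differ between reports depending on whether frozen parameters or detection heads are included.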
Hi @PointsCoder
Could you please provide the pre-trained model that you used for producing metrics in the paper?
I used your code to train models. However, the results are slightly worse than those you report.
Thank you!
Hello,
I have been using the SECOND method based on the sparse 3D CNN. With around 16 million parameters in my whole model, I do not get a "CUDA out of memory" error. However, when I replace the sparse 3D CNN backbone with your VoTr, I get the "CUDA out of memory" error even though my model has only around 10 million parameters.
I have also tried to make VoTr even simpler than it is, but it still gives the "CUDA out of memory" error.
I would really appreciate your help.
After reading the paper twice, I am still confused about the model details, especially the Voxel Transformer Module (for context, I am familiar with the Transformer).
For example,
i) How are the different VoTr building blocks connected?
ii) Why is the positional encoding computed as (p_i - p_j)W_pos and then added to K_j and V_j, but not to the query? In the original Transformer, the position embedding is first added to the token embedding, which is then converted to Q, K, V with different linear projections.
iii) Why is it necessary to extract features on empty voxels? The ablation studies give no relevant evidence.
iv) The highest score on the KITTI 3D object detection benchmark is ~85%, while VoTr achieves 89+%.
v) ...
Besides, the 'scripts' folder described in the Readme is absent.
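To make question ii) concrete, the scheme it describes can be written out as below. This is only a transcription of the question's wording into standard scaled dot-product attention, not a formula copied from the paper; the symbols f_i (voxel feature), W_q, W_k, W_v (projections) and d (feature dimension) are assumed notation.

```latex
\begin{aligned}
E_{ij} &= (p_i - p_j)\, W_{pos} \\
Q_i &= f_i W_q, \qquad K_j = f_j W_k + E_{ij}, \qquad V_j = f_j W_v + E_{ij} \\
f_i^{out} &= \sum_{j} \operatorname{softmax}_j\!\left(\frac{Q_i K_j^{\top}}{\sqrt{d}}\right) V_j
\end{aligned}
```

Written this way, E_{ij} depends on the pair (i, j) rather than on a single token, which is one possible reason it is attached to the attended side instead of being added to the token embedding before projection.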
I am following your environment using 1080TI.
Python 3.6
PyTorch 1.5
CUDA 10.1
OpenPCDet v0.3.0
spconv v1.2.1
I did not hit any fatal errors when compiling spconv and OpenPCDet.
However, at runtime I get
'CUDA Kernel failed: invalid device function' Segmentation fault (core dumped)
Could you give me any help? Thanks.
Thanks for your great work and paper. I learned a lot from it, but I have some questions.
I don’t quite understand the implementation of Sparse Local Attention. The code in the function sparse_local_attention_with_tensor_kernel is:
for (int sz_idx = z_idx * z_stride - attend_range; sz_idx <= z_idx * z_stride + (z_stride - 1) + attend_range; ++sz_idx)
Here z_idx denotes the z index of a non-empty voxel. What do sz_idx and z_stride mean? I look forward to your reply.
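One way to read the loop bounds above, offered as an interpretation and not as the authors' explanation: if `z_idx` indexes a query voxel on a grid downsampled by `z_stride`, then `z_idx * z_stride` through `z_idx * z_stride + (z_stride - 1)` are the finer-grid cells it covers, and `attend_range` pads that span on both sides, so `sz_idx` would be a candidate index on the finer source grid. The Python sketch below only mirrors the loop arithmetic; the function name is hypothetical.

```python
def candidate_source_indices(z_idx, z_stride, attend_range):
    """Return every sz_idx the C loop would visit (both bounds inclusive)."""
    start = z_idx * z_stride - attend_range
    end = z_idx * z_stride + (z_stride - 1) + attend_range
    return list(range(start, end + 1))


# Query voxel 2 on a grid downsampled by 2, attending 1 cell beyond its footprint.
print(candidate_source_indices(2, 2, 1))  # [3, 4, 5, 6]
```

With `z_stride = 1` the footprint is a single cell and the loop reduces to a plain window of `2 * attend_range + 1` cells around `z_idx`.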
In the introduction of the paper, you write that with 'the voxel size as (0.05m,0.05m,0.1m) on the KITTI dataset, the maximum receptive field in the last layer is only (3.65m,3.65m,7.3m)'. Can you tell me why the receptive field is 73 times the voxel size?
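The factor can at least be checked arithmetically: 3.65 / 0.05 = 73 and 7.3 / 0.1 = 73, so the quoted field is 73 voxels along each axis. The sketch below applies the standard receptive-field recurrence for stacked convolutions. The layer stack in it is an assumption chosen because it happens to give 73 (one stride-1 3x3x3 convolution, then three stages of one stride-2 and two stride-1 convolutions); the paper does not state which layers were counted.

```python
def receptive_field(layers):
    """Receptive field, in input voxels along one axis, of stacked convolutions.

    layers: list of (kernel_size, stride) tuples, applied in order.
    """
    field, jump = 1, 1  # jump: spacing of output cells measured in input voxels
    for kernel, stride in layers:
        field += (kernel - 1) * jump
        jump *= stride
    return field


# Hypothetical stack (an assumption, not taken from the paper).
stage = [(3, 2), (3, 1), (3, 1)]
layers = [(3, 1)] + stage * 3

voxels = receptive_field(layers)
print(voxels)                   # 73
print(round(voxels * 0.05, 2))  # 3.65 (metres along x and y)
print(round(voxels * 0.1, 2))   # 7.3 (metres along z)
```

Adding or removing a single stride-1 layer at full resolution changes the result by 2 voxels, so the exact figure depends on which layers are included in the count.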
I am trying to train the model with fp16 input. Could you please guide me to work around votr.ops that requires fp32?
Thanks!
Nice work for reproducing VOTR!
Since many details are missing from the paper, how close is the reproduced version to the original?
May I ask how the reproduced version performs? Can it achieve the performance reported in the paper?
Many thanks!
Have you tried to train a model with all Waymo classes (Vehicles, Pedestrians and Cyclists)? Would you recommend that I try it, or would there be a potential drop in mAP?
Do you have a plan to add a simple inference code of your model that takes a point cloud and returns the detected objects?
Thanks!
Excuse me, I am a beginner in this field and I encountered a problem during training:
File "/home/VOTR/pcdet/datasets/augmentor/database_sampler.py", line 123, in add_sampled_boxes_to_scene
sampled_gt_boxes, data_dict['road_plane'], data_dict['calib']
KeyError: 'road_plane'
Why does this happen? Thank you!
I can't find the code that computes the empty voxel features. Could you please tell me where it is?
Hey,
I am training the model from scratch on my __ with 12G of memory. I have reduced the batch size and the attention SIZE parameters (as suggested by the author) to the bare minimum, but I still keep hitting this error.
File "train.py", line 211, in <module>
main()
File "train.py", line 182, in main
merge_all_iters_to_one_epoch=args.merge_all_iters_to_one_epoch
File "/automount_home_students/vsandhu/master_project_2/VOTR/tools/train_utils/train_utils.py", line 99, in train_model
dataloader_iter=dataloader_iter
File "/automount_home_students/vsandhu/master_project_2/VOTR/tools/train_utils/train_utils.py", line 19, in train_one_epoch
batch = next(dataloader_iter)
File "/automount_home_students/vsandhu/anaconda3/envs/voxtr/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
data = self._next_data()
File "/automount_home_students/vsandhu/anaconda3/envs/voxtr/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
return self._process_data(data)
File "/automount_home_students/vsandhu/anaconda3/envs/voxtr/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
data.reraise()
File "/automount_home_students/vsandhu/anaconda3/envs/voxtr/lib/python3.6/site-packages/torch/_utils.py", line 425, in reraise
raise self.exc_type(msg)
AssertionError: Caught AssertionError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/automount_home_students/vsandhu/anaconda3/envs/voxtr/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
data = fetcher.fetch(index)
File "/automount_home_students/vsandhu/anaconda3/envs/voxtr/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/automount_home_students/vsandhu/anaconda3/envs/voxtr/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/automount_home_students/vsandhu/master_project_2/VOTR/pcdet/datasets/kitti/kitti_dataset.py", line 433, in __getitem__
data_dict = self.prepare_data(data_dict=input_dict)
File "/automount_home_students/vsandhu/master_project_2/VOTR/pcdet/datasets/dataset.py", line 142, in prepare_data
data_dict=data_dict
File "/automount_home_students/vsandhu/master_project_2/VOTR/pcdet/datasets/processor/data_processor.py", line 127, in forward
data_dict = cur_processor(data_dict=data_dict)
File "/automount_home_students/vsandhu/master_project_2/VOTR/pcdet/datasets/processor/data_processor.py", line 62, in transform_points_to_voxels
voxel_output = voxel_generator.generate(points)
File "/automount_home_students/vsandhu/anaconda3/envs/voxtr/lib/python3.6/site-packages/spconv/utils/__init__.py", line 173, in generate
or self._max_voxels, self._full_mean)
File "/automount_home_students/vsandhu/anaconda3/envs/voxtr/lib/python3.6/site-packages/spconv/utils/__init__.py", line 69, in points_to_voxel
assert block_filtering is False
AssertionError
Thanks in advance for the help
I am trying to train votr_ssd on a single GPU with a batch size of 2, which takes 50 hours. The problem is that after 50 hours of training the result is wrong and the log displays "wait 30 seconds for next check (progress: 3175.0 / 0 minutes): /home/wangtingting/VOTR/output/kitti". Has anyone else had the same problem? I hope you can help me!
Thanks for sharing your work.
I got some errors when I tried to compile votr_ops.
As shown below, the source list includes build_mapping.cpp, build_mapping_gpu.cu, build_attention_indices.cpp and build_attention_indices_gpu.cu. But these files cannot be found in the folder pcdet/ops/votr_ops/.
Lines 56 to 64 in c44a21c
hi @PointsCoder
You have done great work. I have some questions about it.
Can you give me your email address?
Looking forward to your reply
I can't find this pkl file in your code. How can I get it?
Traceback (most recent call last):
File "train.py", line 211, in <module>
main()
File "train.py", line 123, in main
total_epochs=args.epochs
File "/mnt/Disk8T/donght/VOTR/pcdet/datasets/__init__.py", line 48, in build_dataloader
logger=logger,
File "/mnt/Disk8T/donght/VOTR/pcdet/datasets/kitti/kitti_dataset.py", line 23, in __init__
dataset_cfg=dataset_cfg, class_names=class_names, training=training, root_path=root_path, logger=logger
File "/mnt/Disk8T/donght/VOTR/pcdet/datasets/dataset.py", line 32, in __init__
) if self.training else None
File "/mnt/Disk8T/donght/VOTR/pcdet/datasets/augmentor/data_augmentor.py", line 21, in __init__
cur_augmentor = getattr(self, cur_cfg.NAME)(config=cur_cfg)
File "/mnt/Disk8T/donght/VOTR/pcdet/datasets/augmentor/data_augmentor.py", line 29, in gt_sampling
logger=self.logger
File "/mnt/Disk8T/donght/VOTR/pcdet/datasets/augmentor/database_sampler.py", line 19, in __init__
with open(str(db_info_path), 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/mnt/Disk8T/donght/VOTR/data/kitti/kitti_dbinfos_train.pkl'