jabb0 / fastflow3d

Implementation of the FastFlow3D architecture for scene flow estimation from LiDAR point clouds in PyTorch using PyTorch Lightning.

License: MIT License

Python 80.10% C 1.28% C++ 6.37% Cuda 11.68% Shell 0.58%

point-cloud pytorch-lightning pytorch scene-flow

fastflow3d's Issues

About the metric.

Hi,
Thanks for your implementation! I have a question about how the metric is calculated. In your code, the point-wise metric is computed at each step, and PyTorch Lightning then averages the per-step values automatically to get the mean metric for the epoch. As I understand the paper, the point-wise metric is computed over all points of the entire epoch. Could averaging per step introduce a bias in the evaluation?
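A minimal sketch of the concern, using made-up per-point errors and assuming each step's metric enters the epoch average with equal weight (how Lightning actually weights the logged values depends on how they are logged, so this is an illustration, not a description of the repository's behaviour). When steps contain different numbers of points, the mean of per-step means differs from the mean over all points:

```python
# Hypothetical per-point errors for two steps with different point counts.
step_errors = [
    [1.0, 1.0, 1.0, 1.0],  # step with 4 points
    [5.0],                 # step with 1 point
]


def mean(values):
    return sum(values) / len(values)


# Averaging the per-step metric: every step has the same weight.
mean_of_step_means = mean([mean(errors) for errors in step_errors])

# Point-wise over the whole epoch: every point has the same weight.
all_errors = [e for errors in step_errors for e in errors]
epoch_pointwise_mean = mean(all_errors)

# The epoch value can be recovered by weighting each step mean by its point count.
total_points = sum(len(errors) for errors in step_errors)
weighted_mean = sum(mean(errors) * len(errors) for errors in step_errors) / total_points

print(mean_of_step_means)    # 3.0
print(epoch_pointwise_mean)  # 1.8
print(weighted_mean)         # 1.8
```

If every step held the same number of points, the two averages would coincide; the gap only appears with unequal point counts.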

Index out of bounds error in pillarFeatureNetScatter.py:35 (grid.scatter_add_(1, indices, x))

Thanks for providing this fastflow3d implementation. I'm using it with a custom dataset, and some of the data trigger an index out of bounds error. An example trace is pasted at the end.

I think the error happens because the upper limit of the grid (x_max, y_max, z_max) is an exclusive boundary, so LiDAR points that fall exactly on that value are out of bounds. For example, in a 1D grid from x_min=-2 to x_max=2 with a grid_size of 4, the grid cells would contain:

      0           1           2         3
[-2.0, -1.0) [-1.0, 0.0) [0.0, 1.0) [1.0, 2.0)

A point at x=2.0 (x=x_max) would fall into the cell with index 4, which is out of bounds.
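The 1D example can be checked with a short sketch. The index formula below (floor of the offset from x_min divided by the cell size) is an assumption about how a point is mapped to a cell; `cell_index` and `in_bounds_half_open` are hypothetical helpers, not functions from the repository:

```python
import math


def cell_index(x, x_min, x_max, n_cells):
    """Assumed mapping from a coordinate to its grid cell index."""
    cell_size = (x_max - x_min) / n_cells
    return math.floor((x - x_min) / cell_size)


def in_bounds_half_open(x, x_min, x_max):
    """Keep points in [x_min, x_max): the upper bound is excluded."""
    return x_min <= x < x_max


x_min, x_max, n_cells = -2.0, 2.0, 4

print(cell_index(-2.0, x_min, x_max, n_cells))  # 0
print(cell_index(1.5, x_min, x_max, n_cells))   # 3
print(cell_index(2.0, x_min, x_max, n_cells))   # 4, one past the last valid index

# With a half-open filter the boundary point is dropped before indexing.
print(in_bounds_half_open(2.0, x_min, x_max))   # False
```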

The easiest workaround I see is to change remove_out_of_bounds_points in utils/pillars.py so that it excludes the *_max values, i.e. change <= to < for x_max, y_max and z_max. This seems to fix the error for me. Does this make sense?

diff --git a/utils/pillars.py b/utils/pillars.py
index 5714c8d..88f0125 100644
--- a/utils/pillars.py
+++ b/utils/pillars.py
@@ -4,9 +4,9 @@ import numpy as np
 def remove_out_of_bounds_points(pc, y, x_min, x_max, y_min, y_max, z_min, z_max):
     # Calculate the cell id that this entry falls into
     # Store the X, Y indices of the grid cells for each point cloud point
-    mask = (pc[:, 0] >= x_min) & (pc[:, 0] <= x_max) \
-           & (pc[:, 1] >= y_min) & (pc[:, 1] <= y_max) \
-           & (pc[:, 2] >= z_min) & (pc[:, 2] <= z_max)
+    mask = (pc[:, 0] >= x_min) & (pc[:, 0] < x_max) \
+           & (pc[:, 1] >= y_min) & (pc[:, 1] < y_max) \
+           & (pc[:, 2] >= z_min) & (pc[:, 2] < z_max)
     pc_valid = pc[mask]
     y_valid = None
     if y is not None:

[...]
../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:111: operator(): block: [118834,0,0], thread: [124,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.              
../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:111: operator(): block: [118834,0,0], thread: [125,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.              
../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:111: operator(): block: [118834,0,0], thread: [126,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.              
../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:111: operator(): block: [118834,0,0], thread: [127,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.              
Traceback (most recent call last):                                                                                                                                                                         
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 722, in _call_and_handle_interrupt                                                                              
    return self.strategy.launcher.launch(trainer_fn, *args, trainer=self, **kwargs)                                                                                                                        
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 93, in launch                                                                            
    return function(*args, **kwargs)                                                                                                                                                                       
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 812, in _fit_impl                                                                                               
    results = self._run(model, ckpt_path=self.ckpt_path)                                                                                                                                                   
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 1237, in _run                                                                                                   
    results = self._run_stage()                                                                                                                                                                            
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 1324, in _run_stage                                                                                             
    return self._run_train()                                                                                                                                                                               
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 1354, in _run_train                                                                                             
    self.fit_loop.run()                                                                                                                                                                                    
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/base.py", line 204, in run                                                                                                          
    self.advance(*args, **kwargs)                                                                                                                                                                          
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/fit_loop.py", line 269, in advance                                                                                                  
    self._outputs = self.epoch_loop.run(self._data_fetcher)                                                                                                                                                
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/base.py", line 204, in run                                                                                                          
    self.advance(*args, **kwargs)                                                                                                                                                                          
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 208, in advance                                                                                 
    batch_output = self.batch_loop.run(batch, batch_idx)                                                                                                                                                   
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/base.py", line 204, in run                                                                                                          
    self.advance(*args, **kwargs)                                                                                                                                                                          
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 88, in advance                                                                                  
    outputs = self.optimizer_loop.run(split_batch, optimizers, batch_idx)                                                                                                                                  
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/base.py", line 204, in run                                                                                                          
    self.advance(*args, **kwargs)                                                                                                                                                                          
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 203, in advance                                                                               
    result = self._run_optimization(                                                                                                                                                                       
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 256, in _run_optimization                                                                     
    self._optimizer_step(optimizer, opt_idx, batch_idx, closure)                                                                                                                                           
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 369, in _optimizer_step                                                                       
    self.trainer._call_lightning_module_hook(                                                                                                                                                              
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 1596, in _call_lightning_module_hook                                                                            
    output = fn(*args, **kwargs)                                                                                                                                                                           
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/core/lightning.py", line 1625, in optimizer_step                                                                                          
    optimizer.step(closure=optimizer_closure)                                                                                                                                                              
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/core/optimizer.py", line 168, in step                                                                                                     
    step_output = self._strategy.optimizer_step(self._optimizer, self._optimizer_idx, closure, **kwargs)                                                                                                   
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/strategies/ddp.py", line 278, in optimizer_step                                                                                           
    optimizer_output = super().optimizer_step(optimizer, opt_idx, closure, model, **kwargs)                                                                                                                
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/strategies/strategy.py", line 193, in optimizer_step                                                                                      
    return self.precision_plugin.optimizer_step(model, optimizer, opt_idx, closure, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 155, in optimizer_step                                                                       
    return optimizer.step(closure=closure, **kwargs)                                                                                                                                                       
  File "/usr/local/lib/python3.8/dist-packages/torch/optim/optimizer.py", line 88, in wrapper                                                                                                              
    return func(*args, **kwargs)                                                                                                                                                                           
  File "/usr/local/lib/python3.8/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context                                                                                                  
    return func(*args, **kwargs)                                                                                                                                                                           
  File "/usr/local/lib/python3.8/dist-packages/torch/optim/adam.py", line 100, in step                                                                                                                     
    loss = closure()                                                                                                                                                                                       
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 140, in _wrap_closure                                                                        
    closure_result = closure()                                                                                                                                                                             
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 148, in __call__                                                                              
    self._result = self.closure(*args, **kwargs)                                                                                                                                                           
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 134, in closure                                                                               
    step_output = self._step_fn()                                                                                                                                                                          
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 427, in _training_step                                                                        
    training_step_output = self.trainer._call_strategy_hook("training_step", *step_kwargs.values())                                                                                                        
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 1766, in _call_strategy_hook                                                                                    
    output = fn(*args, **kwargs)                                                                                                                                                                           
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/strategies/ddp.py", line 344, in training_step                                                                                            
    return self.model(*args, **kwargs)                                                                                                                                                                     
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1110, in _call_impl                                                                                                       
    return forward_call(*input, **kwargs)                                                                                                                                                                  
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/parallel/distributed.py", line 963, in forward                                                                                                     
    output = self.module(*inputs[0], **kwargs[0])                                                                                                                                                          
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1110, in _call_impl                                                                                                       
    return forward_call(*input, **kwargs)                                                                                                                                                                  
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/overrides/base.py", line 82, in forward                                                                                                   
    output = self.module.training_step(*inputs, **kwargs)                                                                                                                                                  
  File "/workspace/FastFlow3D/models/BaseModel.py", line 167, in training_step                                                                                                                             
    loss, metrics = self.general_step(batch, batch_idx, phase)                                                                                                                                             
  File "/workspace/FastFlow3D/models/BaseModel.py", line 119, in general_step                                                                                                                              
    y_hat = self(x)                               
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1110, in _call_impl                                                                                                       
    return forward_call(*input, **kwargs)                                                            
  File "/workspace/FastFlow3D/models/FastFlow3DModelScatter.py", line 93, in forward                                                                                                                       
    current_pillar_embeddings = self._pillar_feature_net(current_batch_pc_embedding, current_batch_grid)                                                                                                   
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1110, in _call_impl                                                                                                       
    return forward_call(*input, **kwargs)                                                            
  File "/workspace/FastFlow3D/networks/pillarFeatureNetScatter.py", line 35, in forward                                                                                                                    
    grid.scatter_add_(1, indices, x)                                                                 
RuntimeError: CUDA error: device-side assert triggered

[W CUDAGuardImpl.h:113] Warning: CUDA warning: device-side assert triggered (function destroyEvent)                                                                                              
terminate called after throwing an instance of 'c10::CUDAError'                                                                                                                                  
  what():  CUDA error: device-side assert triggered                                                                                                                                              
Exception raised from create_event_internal at ../c10/cuda/CUDACachingAllocator.cpp:1230 (most recent call first):                                                                               
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7f5becd167d2 in /usr/local/lib/python3.8/dist-packages/torch/lib/libc10.so)                                              
frame #1: <unknown function> + 0x2319e (0x7f5becf8319e in /usr/local/lib/python3.8/dist-packages/torch/lib/libc10_cuda.so)                                                                       
frame #2: c10::cuda::CUDACachingAllocator::raw_delete(void*) + 0x22d (0x7f5becf84d3d in /usr/local/lib/python3.8/dist-packages/torch/lib/libc10_cuda.so)                                         
frame #3: <unknown function> + 0x2ffc28 (0x7f5c40051c28 in /usr/local/lib/python3.8/dist-packages/torch/lib/libtorch_python.so)                                                                  
frame #4: c10::TensorImpl::release_resources() + 0x175 (0x7f5beccff005 in /usr/local/lib/python3.8/dist-packages/torch/lib/libc10.so)                                                            
frame #5: std::vector<c10d::Reducer::Bucket, std::allocator<c10d::Reducer::Bucket> >::~vector() + 0x2e9 (0x7f5c2b9018d9 in /usr/local/lib/python3.8/dist-packages/torch/lib/libtorch_cpu.so)     
frame #6: c10d::Reducer::~Reducer() + 0x205 (0x7f5c2b8f4015 in /usr/local/lib/python3.8/dist-packages/torch/lib/libtorch_cpu.so)                                                                 
frame #7: std::_Sp_counted_ptr<c10d::Reducer*, (__gnu_cxx::_Lock_policy)2>::_M_dispose() + 0x12 (0x7f5c4052f8d2 in /usr/local/lib/python3.8/dist-packages/torch/lib/libtorch_python.so)          
frame #8: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 0x46 (0x7f5c3ff3fbc6 in /usr/local/lib/python3.8/dist-packages/torch/lib/libtorch_python.so)                         
frame #9: <unknown function> + 0x7e0eef (0x7f5c40532eef in /usr/local/lib/python3.8/dist-packages/torch/lib/libtorch_python.so)                                                                  
frame #10: <unknown function> + 0x1f51e0 (0x7f5c3ff471e0 in /usr/local/lib/python3.8/dist-packages/torch/lib/libtorch_python.so)                                                                 
frame #11: <unknown function> + 0x1f638e (0x7f5c3ff4838e in /usr/local/lib/python3.8/dist-packages/torch/lib/libtorch_python.so)                                                                 
frame #12: python() [0x5d0147]                  
frame #13: python() [0x5a9e9d]                  
frame #14: python() [0x5d0168]                  
frame #15: python() [0x5a6152]                  
frame #16: python() [0x4ef7f8]                  
<omitting python frames>                        
frame #22: __libc_start_main + 0xf3 (0x7f5c41db70b3 in /usr/lib/x86_64-linux-gnu/libc.so.6)                                                                                                      

Aborted (core dumped)

Do you have a pre-trained model?

Is the model you trained on Waymo available for download, or do we have to download the dataset, preprocess the data, and train for 3 days to reproduce it?

Small bug in readme.md and request for an open-source checkpoint file

Thanks for your excellent work and detailed tutorial! I noticed what may be a small bug in the readme.md: "offset_y" appears twice in Architecture - Scene Encoder - step 4, "Encode each point as 8D (pillarCenter_x, pillarCenter_y, pillarCenter_z, offset_x, offset_y, offset_y, feature_0, feature_1)".
Moreover, would you kindly release the trained checkpoint file for the network? Looking forward to your reply!
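For reference, the 8D layout quoted above could be sketched as follows, assuming the duplicated entry is meant to be offset_z. How the pillar center is derived (cell midpoint in x and y, midpoint of the z range in z) is also an assumption for illustration, and `encode_point` is a hypothetical helper rather than the repository's code:

```python
import math


def encode_point(point, features, x_min, y_min, z_min, z_max, cell_size):
    """Return [center_x, center_y, center_z, offset_x, offset_y, offset_z, f0, f1]."""
    x, y, z = point
    # Cell indices of the pillar that contains the point (assumed mapping).
    ix = math.floor((x - x_min) / cell_size)
    iy = math.floor((y - y_min) / cell_size)
    # Assumed pillar center: middle of the cell in x/y, middle of the z range.
    cx = x_min + (ix + 0.5) * cell_size
    cy = y_min + (iy + 0.5) * cell_size
    cz = (z_min + z_max) / 2.0
    # Offsets are the point's position relative to the pillar center.
    return [cx, cy, cz, x - cx, y - cy, z - cz, features[0], features[1]]


vec = encode_point(
    point=(0.3, -1.2, 0.5),
    features=(0.7, 0.1),
    x_min=-2.0, y_min=-2.0, z_min=-3.0, z_max=3.0,
    cell_size=1.0,
)
print(len(vec))  # 8
```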

Question about the dataset

Thank you again for this outstanding work! I have a question about the version of the Waymo dataset you used. The download path given in the readme.md (https://console.cloud.google.com/storage/browser/waymo_open_dataset_scene_flow) differs from every version offered on the official Waymo dataset website (the latest version is currently 1.4.0). What is the difference between the dataset used in this work and the datasets available for download on the official website? Looking forward to your reply!

Accelerator='ddp' is an invalid accelerator name

Hello again! I get an error when I try to run train.py, which says that accelerator='ddp' is an invalid accelerator name. The error message is shown at the end of the issue.
My environment is:
CUDA 11.3
Python 3.10.8
PyTorch 1.12.1
PyTorch Lightning 1.8.3

I have also tried the following environment and still get the same error:
CUDA 11.3
Python 3.8.13
PyTorch 1.10.0
PyTorch Lightning 1.7.7

Could you kindly offer some suggestions? Thanks a lot, and looking forward to your reply!

~/FastFlow3D-main$ python train.py --accelerator='ddp' --batch_size=16 --gpus=4 --num_workers=16 --learning_rate=0.0001 --disable_ddp_unused_check=True
No weights and biases API key set. Using tensorboard instead!
Disabling unused parameter check for DDP
Traceback (most recent call last):
  File "/home/fjy/FastFlow3D-main/train.py", line 286, in <module>
    cli()
  File "/home/fjy/FastFlow3D-main/train.py", line 263, in cli
    trainer = pl.Trainer.from_argparse_args(args,
  File "/home/fjy/anaconda3/envs/fastflow/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1917, in from_argparse_args
    return from_argparse_args(cls, args, **kwargs)
  File "/home/fjy/anaconda3/envs/fastflow/lib/python3.10/site-packages/pytorch_lightning/utilities/argparse.py", line 66, in from_argparse_args
    return cls(**trainer_kwargs)
  File "/home/fjy/anaconda3/envs/fastflow/lib/python3.10/site-packages/pytorch_lightning/utilities/argparse.py", line 340, in insert_env_defaults
    return fn(self, **kwargs)
  File "/home/fjy/anaconda3/envs/fastflow/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 408, in __init__
    self._accelerator_connector = AcceleratorConnector(
  File "/home/fjy/anaconda3/envs/fastflow/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py", line 192, in __init__
    self._check_config_and_set_final_flags(
  File "/home/fjy/anaconda3/envs/fastflow/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py", line 291, in _check_config_and_set_final_flags
    raise ValueError(
ValueError: You selected an invalid accelerator name: accelerator='ddp'. Available names are: cpu, cuda, hpu, ipu, mps, tpu.
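The error is consistent with newer PyTorch Lightning releases separating the hardware type (accelerator) from the distribution method (strategy), so 'ddp' is no longer accepted as an accelerator name. A minimal sketch of the mapping, assuming the Trainer accepts accelerator="gpu", devices and strategy="ddp" in the installed version; `modern_trainer_kwargs` is a hypothetical helper, and whether train.py forwards a strategy argument has not been verified here:

```python
def modern_trainer_kwargs(legacy_accelerator, gpus):
    """Map an old-style accelerator value to accelerator/devices/strategy."""
    # Values that used to be passed as `accelerator` but name a strategy.
    legacy_strategies = {"ddp", "ddp_spawn", "dp"}
    if legacy_accelerator in legacy_strategies:
        return {
            "accelerator": "gpu",
            "devices": gpus,
            "strategy": legacy_accelerator,
        }
    # Anything else is treated as a real accelerator name and passed through.
    return {"accelerator": legacy_accelerator, "devices": gpus}


kwargs = modern_trainer_kwargs("ddp", 4)
print(kwargs)  # {'accelerator': 'gpu', 'devices': 4, 'strategy': 'ddp'}

# With Lightning installed, these would be passed on as
# pytorch_lightning.Trainer(**kwargs) (not executed in this sketch).
```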

About the model performance

Hello @Jabb0, thanks for your implementation!

I have some questions about the performance reported in the paper. After training your model, do the test results reach the accuracy reported there? And could you share your test results?
