
drprojects / deepviewagg


[CVPR'22 Best Paper Finalist] Official PyTorch implementation of the method presented in "Learning Multi-View Aggregation In the Wild for Large-Scale 3D Semantic Segmentation"

License: Other

Python 93.72% Shell 0.79% Jupyter Notebook 5.49%
cvpr deep-learning image multimodal multimodal-deep-learning point-cloud pytorch semantic-segmentation cvpr2022 multi-view

deepviewagg's Introduction

arXiv | Paper | Supplementary | Project Page | Video | Poster | CV News


Official repository for the Learning Multi-View Aggregation In the Wild for Large-Scale 3D Semantic Segmentation paper 📄, selected for an Oral presentation at CVPR 2022.

We propose to exploit the synergy between images and 3D point clouds by learning to select the most relevant views for each point. Our approach uses the viewing conditions of 3D points to merge features from images taken at arbitrary positions. We reach SOTA results on S3DIS (74.7 mIoU 6-Fold) and KITTI-360 (58.3 mIoU) without requiring point colorization, meshing, or the use of depth cameras: our full pipeline only requires raw 3D scans and a set of images and poses.

Coming soon 🚨 🚧

Change log

  • 2023-01-11 Fixed a bug when using intermediate fusion
  • 2022-04-27 Added pretrained weights and features to help reproduce our results
  • 2022-04-20 Added notebooks and scripts to get started with DeepViewAgg

Requirements 📝

The following must be installed before installing this project.

  • Anaconda3
  • cuda >= 10.1
  • gcc >= 7

All remaining dependencies (PyTorch, PyTorch Geometric, etc.) should be installed using the provided installation script.

The code has been tested in the following environment (a quick version check is given after the list):

  • Ubuntu 18.04.6 LTS
  • Python 3.8.5
  • PyTorch 1.7.1
  • CUDA 10.2, 11.2 and 11.4
  • NVIDIA V100 32G
  • 64G RAM
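To compare your own setup against this list, the snippet below prints the versions PyTorch reports (a minimal sketch that only assumes a working PyTorch install):

    import sys
    import torch

    print("python :", sys.version.split()[0])
    print("torch  :", torch.__version__)
    print("cuda   :", torch.version.cuda)  # CUDA version PyTorch was built against
    print("gpu    :", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "no CUDA device visible")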

Installation 🧱

To install DeepViewAgg, simply run ./install.sh from inside the repository. A quick import check is given after the notes below.

  • You will need sudo rights to install the MinkowskiEngine and TorchSparse dependencies.
  • ⚠️ Do not install Torch-Points3D from the official repository or with pip.
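Once the script has finished, a simple import check confirms that the sparse backends ended up in the environment (a minimal sketch; the version attributes are best-effort and may be missing in some releases):

    import torch
    import torch_geometric
    import MinkowskiEngine as ME
    import torchsparse

    print("torch           :", torch.__version__)
    print("torch_geometric :", torch_geometric.__version__)
    print("MinkowskiEngine :", getattr(ME, "__version__", "unknown"))
    print("torchsparse     :", getattr(torchsparse, "__version__", "unknown"))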

Disclaimer

This is not the official Torch-Points3D framework. This work builds on and modifies a fixed version of the framework and has not been merged with the official repository yet. In particular, this repository introduces numerous features for multimodal learning on large-scale 3D point clouds. In this repository, some TP3D-specific files were removed for simplicity.

Project structure

The project follows the original Torch-Points3D framework structure.

├─ conf                    # All configurations live there
├─ notebooks               # Notebooks to get started with multimodal datasets and models
├─ eval.py                 # Evaluation script
├─ install.sh              # Installation script for DeepViewAgg
├─ scripts                 # Some scripts to help manage the project
├─ torch_points3d
│   ├─ core                # Core components
│   ├─ datasets            # All code related to datasets
│   ├─ metrics             # All metrics and trackers
│   ├─ models              # All models
│   ├─ modules             # Basic modules that can be used in a modular way
│   ├─ utils               # Various utils
│   └─ visualization       # Visualization
└─ train.py                # Main script to launch a training

Several changes were made to extend the original project to multimodal learning on point clouds with images. The most important ones can be found in the following:

  • conf/data/segmentation/multimodal: configs for the 3D+2D datasets.
  • conf/models/segmentation/multimodal: configs for the 3D+2D models.
  • torch_points3d/core/data_transform/multimodal: transforms for 3D+2D data.
  • torch_points3d/core/multimodal: multimodal data and mapping objects.
  • torch_points3d/datasets/segmentation/multimodal: 3D+2D datasets (e.g. S3DIS, ScanNet, KITTI360).
  • torch_points3d/models/segmentation/multimodal: 3D+2D architectures.
  • torch_points3d/modules/multimodal: 3D+2D modules. This is where the DeepViewAgg module can be found.
  • torch_points3d/visualization/multimodal_data.py: tools for interactive visualization of multimodal data.

Getting started 🚀

Notebook to create a synthetic toy dataset and get familiar with the construction of 2D-3D mappings:

  • notebooks/synthetic_multimodal_dataset.ipynb

Notebooks to create each dataset, get familiar with dataset configuration, and produce interactive visualizations. You can also run inference from a checkpoint and visualize predictions:

  • notebooks/kitti360_visualization.ipynb (at least 350G of memory 💾)
  • notebooks/s3dis_visualization.ipynb (at least 400G of memory 💾)
  • notebooks/scannet_visualization.ipynb (at least 1.3T of memory 💾)

Notebooks to create multimodal models, get familiar with model configuration and run forward and backward passes for debugging:

  • notebooks/multimodal_model.ipynb

Notebooks to run full inference on multimodal datasets from a model checkpoint. These should allow you to reproduce our results using the pretrained models listed in the Models section; a minimal evaluation sketch is given after the list:

  • notebooks/kitti360_inference.ipynb
  • notebooks/s3dis_inference.ipynb
  • notebooks/scannet_inference.ipynb
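For reference, the following sketch summarizes what these notebooks do to evaluate a checkpoint, using the hydra_read helper from torch_points3d/utils/config.py and the Trainer class. The override names and config paths are illustrative assumptions; adapt them to conf/eval.yaml, the dataset you target, and the location of the downloaded weights:

    from omegaconf import OmegaConf
    from torch_points3d.utils.config import hydra_read
    from torch_points3d.trainer import Trainer

    # Illustrative overrides -- adjust model, dataset config and paths to your setup
    overrides = [
        "model_name=Res16UNet34-L4-early",
        "checkpoint_dir=/path/to/downloaded/checkpoint",
        "data=segmentation/multimodal/kitti360-sparse",
        "training.cuda=0",
    ]
    cfg = hydra_read(overrides, config_name="eval.yaml")
    OmegaConf.set_struct(cfg, False)

    trainer = Trainer(cfg)
    trainer.eval()  # run evaluation (eval.py is the script entry point; method name assumed)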

Scripts to replicate our paper's best experiments 📈 for each dataset (a programmatic sketch follows the list):

  • scripts/train_kitti360.sh
  • scripts/train_s3dis.sh
  • scripts/train_scannet.sh
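The same experiments can also be launched programmatically, mirroring what train.py does with Hydra. A minimal sketch, reusing override names that appear in the issues further down this page; treat the exact values and the config name as placeholders:

    from omegaconf import OmegaConf
    from torch_points3d.utils.config import hydra_read
    from torch_points3d.trainer import Trainer

    # Overrides mirroring the KITTI-360 example reported below; adapt to your setup
    overrides = [
        "data=segmentation/kitti360-sparse",
        "models=segmentation/sparseconv3d",
        "model_name=Res16UNet34",
        "task=segmentation",
        "training=kitti360_benchmark/sparseconv3d",
        "training.cuda=0",
        "training.batch_size=8",
        "data.dataroot=/path/to/kitti360",
    ]
    cfg = hydra_read(overrides, config_name="config.yaml")  # config name assumed, as used by train.py
    OmegaConf.set_struct(cfg, False)

    trainer = Trainer(cfg)
    trainer.train()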

If you need to go deeper into this project, see the Documentation section.

If you have trouble using these or need to reproduce other results from our paper, create an issue or leave me a message 💬!

Models

Model name | Dataset | mIoU | Size 💾 | Download 👇
Res16UNet34-L4-early | S3DIS 6-Fold | 74.7 | 2.0G | link
Res16UNet34-PointPyramid-early-cityscapes-interpolate | KITTI-360 | 61.7 Val / 58.3 Test | 339M | link
Res16UNet34-L4-early | ScanNet | 71.0 Val | 341M | link
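Before plugging a downloaded file into a notebook, you can inspect it with a plain torch.load (a minimal sketch; the internal layout of Torch-Points3D checkpoints is not detailed here, so simply look at the keys):

    import torch

    ckpt = torch.load("Res16UNet34-L4-early.pt", map_location="cpu")  # path to the downloaded weights
    print(type(ckpt))
    if isinstance(ckpt, dict):
        print(list(ckpt.keys()))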

Documentation 📚

The official PyTorch Geometric and Torch-Points3D documentations are good starting points, since this project largely builds on top of these frameworks. For DeepViewAgg-specific features (i.e. all that concerns multimodal learning), the provided code is commented as much as possible, but hit me up 💬 if some parts need clarification.

Visualization of multimodal data 🔭

We provide code to produce interactive and sharable HTML visualizations of multimodal data and point-image mappings:

Examples of such HTML produced on S3DIS Fold 5 are zipped here and can be opened in your browser.

Known issues

  • Setting use_faiss=True or use_cuda=True to accelerate PCAComputePointwise, MapImages or NeighborhoodBasedMappingFeatures can cause errors. As suggested here, one should stick to the CPU-based computation for now.

Credits 💳

  • This implementation of DeepViewAgg largely relies on the Torch-Points3D framework, although not merged with the official project at this point.
  • For datasets, some code from the official KITTI-360 and ScanNet repositories was used.

Reference

In case you use all or part of the present code, please include a citation to the following paper:

@inproceedings{robert2022dva,
  title={Learning Multi-View Aggregation In the Wild for Large-Scale 3D Semantic Segmentation},
  author={Robert, Damien and Vallet, Bruno and Landrieu, Loic},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2022},
  pages={5575--5584},
  url={https://github.com/drprojects/DeepViewAgg}
}

deepviewagg's People

Contributors

altaykacan, ccinc, chaitjo, daili650, damdamrob, dependabot[bot], drprojects, fengziyue, gabrieleangeletti, guochengqian, harrydobbs, humanpose1, jloveu, loicland, nicolas-chaulet, rancheng, simone-fontana, stakhan, tchaton, tristanheywood, uakh, wundersam, yhijioka, zeliu98


deepviewagg's Issues

Intermediate Fusion Model Config Issues with KITTI and ScanNet

Hello @drprojects ,

Sorry for bothering you with questions during the holiday season but I'm really puzzled about this one.

I was trying to configure one of the provided default models (small_2d_3d) to implement intermediate fusion as described in your nice paper, but I encountered cryptic CUDA errors in the backward pass while training on the KITTI360 dataset. What I was aiming for was to fuse the image features only at the last encoder layer, but any time I change the branching_index to a higher value I get the same error.

Here's a config that works in my environment:

base-intermediate:
    class: sparseconv3d.APIModel
    conv_type: "SPARSE"
    backend: "torchsparse"
    backbone: # backbone offset specific for Sparse conv application builder
        define_constants:
            in_feat: 2
            block: ResBlock # Can be any of the blocks in modules/MinkowskiEngine/api_modules.py
            out_feat_img_0: 128  # out dim of CityscapesResNet18

        down_conv:
            module_name: ResNetDown
            block: block
            conv3d_after_fusion: False # conv->fusion
            N: [ 0, 1, 1, 1, 1 ]
            kernel_size: [ 3, 2, 2, 2, 2 ]
            stride: [ 1, 2, 2, 2, 2 ]

            down_conv_nn:
              [
                  [ FEAT, 4*in_feat ],
                  [ 4*in_feat + out_feat_img_0, in_feat ],
                  [ in_feat , 2*in_feat ],
                  [ 2*in_feat , 4*in_feat ],
                  [ 4*in_feat, 8*in_feat ],
              ]

            image:
                down_conv:
                    module_name: CityscapesResNet18TruncatedLayer0
                atomic_pooling:
                    module_name: BimodalCSRPool
                    mode: max
                view_pooling:
                    module_name: BimodalCSRPool
                    mode: max
                fusion:
                    module_name: BimodalFusion
                    mode: concatenation
                branching_index: 1
                out_channels: 4*in_feat + out_feat_img_0  # This is necessary to support batches with no images
#                 checkpointing: cav

        up_conv:
            block: block
            module_name: ResNetUp
            N: [ 1, 1, 1, 1, 1 ]
            kernel_size: [ 2, 2, 2, 2, 3 ]
            stride: [ 2, 2, 2, 2, 1 ]
            up_conv_nn:
                [
                  [ 8*in_feat, 4*in_feat, 4*in_feat ],
                  [ 4*in_feat, 2*in_feat, 4*in_feat ],
                  [ 4*in_feat, in_feat, 3*in_feat ],
                  [ 3*in_feat, 4*in_feat + out_feat_img_0, 3*in_feat ],
                  [ 3*in_feat, 0, 3*in_feat ],
                ]

And if I change the branching index by even 1 (i.e., the config below), I get the following error. The traceback contains some comments I added while trying to debug; hopefully they aren't distracting!

The error is:

The model will not be able to be used from pretrained weights without the corresponding dataset. Current properties are {'feature_dimension': 1, 'num_classes': 15}
  0%|                                                                                                                                                                                                                            | 0/6000 [00:00<?, ?it/s]/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [395,0,0], thread: [48,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [399,0,0], thread: [0,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [399,0,0], thread: [1,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [399,0,0], thread: [2,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [399,0,0], thread: [3,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [399,0,0], thread: [4,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [399,0,0], thread: [5,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [399,0,0], thread: [6,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [399,0,0], thread: [7,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [399,0,0], thread: [8,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [399,0,0], thread: [9,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [399,0,0], thread: [10,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [399,0,0], thread: [11,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [399,0,0], thread: [12,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [399,0,0], thread: [13,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
  0%|                                                                                                                                                                                                                            | 0/6000 [00:12<?, ?it/s]
Traceback (most recent call last):
  File "/home/rozenberszki/altay/DeepViewAgg_playground/debug_kitti_train.py", line 107, in <module>
    main(cfg)
  File "/home/rozenberszki/anaconda3/envs/dva/lib/python3.7/site-packages/hydra/main.py", line 44, in decorated_main
    return task_function(cfg_passthrough)
  File "/home/rozenberszki/altay/DeepViewAgg_playground/debug_kitti_train.py", line 41, in main
    trainer.train()
  File "/home/rozenberszki/altay/DeepViewAgg_playground/torch_points3d/trainer.py", line 146, in train
    self._train_epoch(epoch)
  File "/home/rozenberszki/altay/DeepViewAgg_playground/torch_points3d/trainer.py", line 202, in _train_epoch
    self._model.optimize_parameters(epoch, self._dataset.batch_size)
  File "/home/rozenberszki/altay/DeepViewAgg_playground/torch_points3d/models/base_model.py", line 249, in optimize_parameters
    self.backward()  # calculate gradients
  File "/home/rozenberszki/altay/DeepViewAgg_playground/torch_points3d/models/segmentation/sparseconv3d.py", line 58, in backward
    self.loss_seg.backward() # TODO: WITH KITTI THIS IS SOMEHOW EMPTY, do loss.item()
  File "/home/rozenberszki/anaconda3/envs/dva/lib/python3.7/site-packages/torch/tensor.py", line 245, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/home/rozenberszki/anaconda3/envs/dva/lib/python3.7/site-packages/torch/autograd/__init__.py", line 147, in backward
    allow_unreachable=True, accumulate_grad=True)  # allow_unreachable flag
RuntimeError: transform: failed to synchronize: cudaErrorAssert: device-side assert triggered

And this is the model I ran it with:

base-intermediate:
    class: sparseconv3d.APIModel
    conv_type: "SPARSE"
    backend: "torchsparse"
    backbone: # backbone offset specific for Sparse conv application builder
        define_constants:
            in_feat: 2
            block: ResBlock # Can be any of the blocks in modules/MinkowskiEngine/api_modules.py
            out_feat_img_0: 128  # out dim of CityscapesResNet18

        down_conv:
            module_name: ResNetDown
            block: block
            conv3d_after_fusion: False # conv->fusion

            N: [ 0, 1, 1, 1, 1 ]
            kernel_size: [ 3, 2, 2, 2, 2 ]
            stride: [ 1, 2, 2, 2, 2 ]

            down_conv_nn:
              [
                  [ FEAT, 4*in_feat ],
                  [ 4*in_feat, in_feat ],
                  [ in_feat + out_feat_img_0 , 2*in_feat ], # <== Changed this
                  [ 2*in_feat , 4*in_feat ],
                  [ 4*in_feat, 8*in_feat ],
              ]

            image:
                down_conv:
                    module_name: CityscapesResNet18TruncatedLayer0
                atomic_pooling:
                    module_name: BimodalCSRPool
                    mode: max
                view_pooling:
                    module_name: BimodalCSRPool
                    mode: max
                fusion:
                    module_name: BimodalFusion
                    mode: concatenation
                branching_index: 2                      # <== Changed this
                out_channels: in_feat + out_feat_img_0  # <== Changed this
#                 checkpointing: cav

        up_conv:
            block: block
            module_name: ResNetUp
            N: [ 1, 1, 1, 1, 1 ]
            kernel_size: [ 2, 2, 2, 2, 3 ]
            stride: [ 2, 2, 2, 2, 1 ]
            up_conv_nn:
                [
                  [ 8*in_feat, 4*in_feat, 4*in_feat ],
                  [ 4*in_feat, 2*in_feat, 4*in_feat ],
                  [ 4*in_feat, in_feat + out_feat_img_0, 3*in_feat ], # <== Changed this
                  [ 3*in_feat, 4*in_feat, 3*in_feat ],
                  [ 3*in_feat, 0, 3*in_feat ],
                ]

What I'd like to be able to do is set the branching index to 5 and concatenate the image features to the 3D features right before the decoder. The reason the branching index is 1 in the first model is that, as far as I understand, n_early_conv is equal to 1 by default. A teammate and I are looking into an alternative fusion strategy which would only make sense in the intermediate fusion case. Hopefully we can share our results with you once we get everything running!

  • Could you please let me know what might be causing this problem or whether something is wrong in the way the config is specified?
  • Also I couldn't find any premade configs for the intermediate fusion case, did I miss them? Could you please point me in the right direction so I could use those as examples?

Thank you in advance for your help and I hope you had a nice holiday season with your loved ones!

EDIT: Added context to the error message

ImageBatch of the S3DIS dataset

Hi, Dr. Robert.
I ran into a problem when customizing my model based on your DeepViewAgg project, again with the S3DIS dataset.
I added a custom classification label for each pixel in the images, i.e., I extended the ImageMapping class with an added values[3] in ImageMapping.values, as follows:
[screenshot]

But when I trained the model with my custom dataset class, the ImageBatch (derived from SameSettingImageBatch?) presents different downscale values such as 1, 0.5, and 0.25. Why does it show different scales? I thought the scales should be the same for all images.

The program runs fine with downscale=1.0 but breaks down with downscale=0.5 or 0.25.
I traced the code to the upscale_images() function in torch_points3d/core/multimodal/image.py and it seems you recalculate the pixel coords and update them in value[1].values[0].
[screenshot]
The pixel coordinate indices exceed the output size of the pretrained image model, which is fixed to [batch, C_dim, 1024, 512].

Hope to get some help from you.
Best regards,

How to test custom data

Hi, congratulations on the great work, both the algorithm and the paper.
I want to ask about evaluating my own collected data for semantic segmentation (S3DIS):
1. Do the point clouds need to have (x, y, z) format only for the code to work, or can they have xyzrgb format?
2. What is the minimum number of images used in one scene at test time, and how do I co-register the images?
3. I did not find the steps for evaluating collected data; could you please explain them to me?

I hope you can help me, thank you.

A problem about IndexError: index 8 is out of bounds for dimension 0 with size 5

Before this training session, I trained a 3D monocular model on the KITTI360 dataset. After successfully completing that training, I switched the training data to the point cloud data I wanted to train with.
However, I encountered the following error message after changing the training data.

Error executing job with overrides: ['data=segmentation/kitti360-sparse', 'models=segmentation/sparseconv3d', 'model_name=Res16UNet34', 'task=segmentation', 'training=kitti360_benchmark/sparseconv3d', 'lr_scheduler=multi_step_kitti360', 'eval_frequency=5', 'data.sample_per_epoch=12000', 'data.dataroot=./directory', 'data.train_is_trainval=False', 'data.mini=False', 'training.cuda=0', 'training.batch_size=8', 'training.epochs=60', 'training.num_workers=4', 'training.optim.base_lr=0.1', 'training.wandb.log=True', 'training.wandb.name=My_awesome_KITTI-360_experiment', 'tracker_options.make_submission=False', 'training.checkpoint_dir=']
Traceback (most recent call last):
  File "train.py", line 13, in main
    trainer = Trainer(cfg)
  File "/home/DeepViewAgg-release/torch_points3d/trainer.py", line 46, in __init__
    self._initialize_trainer()
  File "/home/DeepViewAgg-release/torch_points3d/trainer.py", line 92, in _initialize_trainer
    self._dataset: BaseDataset = instantiate_dataset(self._cfg.data)
  File "/home/DeepViewAgg-release/torch_points3d/datasets/dataset_factory.py", line 47, in instantiate_dataset
    dataset = dataset_cls(dataset_config)
  File "/home/DeepViewAgg-release/torch_points3d/datasets/segmentation/kitti360.py", line 881, in __init__
    transform=self.train_transform)
  File "/home/DeepViewAgg-release/torch_points3d/datasets/segmentation/kitti360.py", line 264, in __init__
    super().__init__(root, transform, pre_transform, pre_filter)
  File "/opt/conda/lib/python3.7/site-packages/torch_geometric/data/in_memory_dataset.py", line 55, in __init__
    pre_filter)
  File "/opt/conda/lib/python3.7/site-packages/torch_geometric/data/dataset.py", line 92, in __init__
    self._process()
  File "/opt/conda/lib/python3.7/site-packages/torch_geometric/data/dataset.py", line 165, in _process
    self.process()
  File "/home/DeepViewAgg-release/torch_points3d/datasets/segmentation/kitti360.py", line 532, in process
    self._process_3d(*path_tuple)
  File "/home/DeepViewAgg-release/torch_points3d/datasets/segmentation/kitti360.py", line 567, in _process_3d
    raw_window_path, instance=self._keep_instance, remap=True)
  File "/home/DeepViewAgg-release/torch_points3d/datasets/segmentation/kitti360.py", line 46, in read_kitti360_window
    data.y = torch.from_numpy(ID2TRAINID)[y] if remap else y
IndexError: index 8 is out of bounds for dimension 0 with size 5

I changed the labels in the kitti360_config.py file to:

     Label(  'Never classified '     ,  0 ,        -1 ,        0 , 'void'            , 0       , False        , True         , True          , (  150,150,150) ),
     Label(  'Unclassified'            ,  1 ,        -1 ,        1 , 'void'            , 1       , True         , False        , False         , (  217,217,217) ),
     Label(  'pipeline'                  ,  8 ,        -1 ,        2 , 'void'            , 2       , True         , False        , False         , (  241, 2, 2) ),
     Label(  'bracket'                  ,  19 ,       -1 ,        3 , 'void'            , 3       , True         , False        , False         , (   5,115,252) ),
     Label(  'hole'                       ,  20 ,       -1 ,        4 , 'void'            , 4       , True         , False        , False         , ( 105,248, 12) ),
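The traceback above fails on the remapping line data.y = torch.from_numpy(ID2TRAINID)[y], which indexes a lookup table with every raw label id found in the window. A minimal sketch of that mechanism, using the labels listed above (an illustration, not the repository's exact code):

    import numpy as np
    import torch

    # (name, raw_id, train_id) taken from the labels listed above
    labels = [
        ("Never classified", 0, 0),
        ("Unclassified", 1, 1),
        ("pipeline", 8, 2),
        ("bracket", 19, 3),
        ("hole", 20, 4),
    ]

    # A table with one entry per class (shape (5,)) raises the IndexError above as
    # soon as a point carries raw id 8. The table must instead cover every raw id.
    IGNORE = -1
    ID2TRAINID = np.full(max(raw for _, raw, _ in labels) + 1, IGNORE, dtype=np.int64)
    for _, raw_id, train_id in labels:
        ID2TRAINID[raw_id] = train_id

    y = torch.tensor([0, 8, 19, 20])        # raw ids read from a window
    print(torch.from_numpy(ID2TRAINID)[y])  # tensor([0, 2, 3, 4])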

Dynamic-Size Image-Batching

Hi Damien!

Thanks for releasing the code of your great paper! :)

Your Dynamic-Size Image-Batching of Section 3.3 sounds very interesting to me!

However, I could not find the code for it. Could you point me to the right file?

In particular, I'm very interested in the part where you iteratively select images with the probability proportional to the number of pixels vs. number of newly seen points in the cloud, up until a certain budget.

Do you still use the cropping technique for ScanNet even though you operate on full rooms and do not do any sampling of spheres as in S3DIS? In that case, do you still use Dynamic-Size Image-Batching for ScanNet?

Best,
Jonas
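One possible greedy reading of the selection procedure described in this question, as a toy sketch (this is an illustration only, not the paper's or the repository's implementation; the visibility matrix, the gain definition, and the image-count budget are all assumptions):

    import numpy as np

    def select_images(visibility, budget, rng=None):
        """Greedily sample up to `budget` images, with probability proportional
        to the number of points each image would newly cover (assumed criterion)."""
        rng = np.random.default_rng(0) if rng is None else rng
        n_images, n_points = visibility.shape
        covered = np.zeros(n_points, dtype=bool)
        selected = []
        while len(selected) < budget:
            gains = np.array(
                [0 if i in selected else int((visibility[i] & ~covered).sum())
                 for i in range(n_images)],
                dtype=float,
            )
            if gains.sum() == 0:  # every visible point is already covered
                break
            i = int(rng.choice(n_images, p=gains / gains.sum()))
            selected.append(i)
            covered |= visibility[i]
        return selected

    # Toy usage: 5 images, 100 points, keep at most 3 images
    vis = np.random.default_rng(1).random((5, 100)) > 0.7
    print(select_images(vis, budget=3))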

Warning -> Error when running kitti360_inference.ipynb

@drprojects I am currently getting this warning that results in the cell being exited.

WARNING: the KITTI-360 terms and conditions require that you register and manually download KITTI-360 data from: http://www.cvlibs.net/datasets/kitti-360/download.php
Files must be organized in the following structure:

    /home/akrishna/Research/sem_seg/minkowski_kitti/kitti/kitti360mm/
        └── raw/
            ├── data_3d_semantics/
            |   └── 2013_05_28_drive_{seq:0>4}_sync/
            |       └── static/
            |           └── {start_frame:0>10}_{end_frame:0>10}.ply
            ├── data_2d_raw/
            |   └── 2013_05_28_drive_{seq:0>4}_sync/
            |       ├── image_{00|01}/
            |       |   └── data_rect/
            |       |       └── {frame:0>10}.png
            |       └── image_{02|03}/
            |           └── data_rgb/
            |               └── {frame:0>10}.png
            ├── data_poses/
            |   └── 2013_05_28_drive_{seq:0>4}_sync/
            |       ├── poses.txt
            |       └── cam0_to_world.txt
            └── calibration/
                ├── calib_cam_to_pose.txt
                ├── calib_cam_to_velo.txt
                ├── calib_sick_to_velo.txt
                ├── perspective.txt
                └── image_{02|03}.yaml

I have all the files in the corresponding directories. Is there something I need to change to get this code to work? I found that this warning happens when the Trainer is created in the last cell of the kitti360_inference.ipynb notebook.

RuntimeError: CUDA error: device-side assert triggered

Hi, thanks for your great work!
When I try to train on the kitti360 dataset, I get the following error:

Traceback (most recent call last):
  File ".../anaconda3/envs/py37/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3524, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-8-06bbdd2e2640>", line 1, in <module>
    self.__class__([im.select_points(idx, mode=mode) for im in self])
  File "<ipython-input-8-06bbdd2e2640>", line 1, in <listcomp>
    self.__class__([im.select_points(idx, mode=mode) for im in self])
  File ".../DeepViewAgg/torch_points3d/core/multimodal/image.py", line 856, in select_points
    return self.clone()
  File ".../DeepViewAgg/torch_points3d/core/multimodal/image.py", line 1170, in clone
    out._x = self.x.clone() if self.x is not None \
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.

The training parameters are the same as yours, except the batch_size is 2.

problem in creating no3d model

Hello
when I change the model to "Res16UNet21-15_Res16Image21_late_max" I face this issue during the creation of the model:
##############################################################################


AssertionError Traceback (most recent call last)
in
----> 1 model: BaseModel = instantiate_model(copy.deepcopy(cfg), dataset)

~/notebook/DeepViewAgg/torch_points3d/models/model_factory.py in instantiate_model(config, dataset)
42 % (model_module, class_name)
43 )
---> 44 model = model_cls(model_config, "dummy", dataset, modellib)
45 return model

~/notebook/DeepViewAgg/torch_points3d/models/segmentation/multimodal/sparseconv3d.py in init(self, option, model_type, dataset, modules)
33 # No3D backbone init
34 self.backbone_no3d = No3DEncoder(
---> 35 option.backbone_no3d, model_type, dataset, modules)
36
37 # Set modalities based on the No3D backbone

~/notebook/DeepViewAgg/torch_points3d/applications/multimodal/no3d.py in init(self, model_config, model_type, dataset, modules, *args, **kwargs)
22 # UnwrappedUnetBasedModel init
23 super(No3DEncoder, self).init(
---> 24 model_config, model_type, dataset, modules)
25
26 # Make sure the model is multimodal and has no 3D. Note that

~/notebook/DeepViewAgg/torch_points3d/models/base_architectures/backbone.py in init(self, opt, model_type, dataset, modules_lib)
53 else:
54 self._init_from_compact_format(
---> 55 opt, model_type, dataset, modules_lib)
56
57 def _init_from_compact_format(self, opt, model_type, dataset, modules_lib):

~/notebook/DeepViewAgg/torch_points3d/models/base_architectures/backbone.py in _init_from_compact_format(self, opt, model_type, dataset, modules_lib)
151 # length
152 assert idx < n_mm_blocks,
--> 153 f"Cannot build multimodal model: branching index "
154 f"'{idx}' of modality '{m}' is too large for the "
155 f"'{n_mm_blocks}' multimodal blocks."

AssertionError: Cannot build multimodal model: branching index '1' of modality 'image' is too large for the '1' multimodal blocks.
######################################################################

Could you please let me know what I can do?

how to load and play with a small portion of the KITTI-360 dataset

Hi @drprojects, thanks for your wonderful work.
Now I'm running the notebook kitti360_visualization.ipynb and want to load and play with a small portion of the KITTI-360 dataset. I set 'mini' to True but it doesn't work. After loading a small portion of the KITTI-360 dataset, it starts to download all of the KITTI-360 dataset.

My training is unable to execute the final Epoch.

While conducting the training, the code runs smoothly until the last Epoch, where it suddenly terminates without any error message. Subsequently, it prints out the training results.

Here is the translation of the provided output.log:

100%|█| 2000/2000 [1:16:40<00:00, 2.30s/it, data_loading=0.087, iteration=1.525, train_acc=93.86, train_loss_cross_entropy=0
0%| | 0/2000 [00:00<?, ?it/s]
[2023-07-31 06:48:07,126][torch_points3d.trainer][INFO] - Learning rate = 0.000800

100%|█| 2000/2000 [1:16:08<00:00, 2.28s/it, data_loading=0.071, iteration=1.029, train_acc=93.61, train_loss_cross_entropy=0
[2023-07-31 08:04:27,512][torch_points3d.trainer][INFO] - Learning rate = 0.000800
[2023-07-31 08:04:27,513][torch_points3d.trainer][INFO] - EPOCH 58 / 60

100%|█| 2000/2000 [1:16:17<00:00, 2.29s/it, data_loading=0.141, iteration=1.928, train_acc=93.56, train_loss_cross_entropy=0.208, train_loss_seg=0.208, train_m
[2023-07-31 09:20:57,583][torch_points3d.trainer][INFO] - Learning rate = 0.000800
[2023-07-31 09:20:57,583][torch_points3d.trainer][INFO] - EPOCH 59 / 60

100%|█| 2000/2000 [1:16:15<00:00, 2.29s/it, data_loading=0.128, iteration=2.148, train_acc=93.69, train_loss_cross_entropy=0.198, train_loss_seg=0.198, train_m
[2023-07-31 10:37:25,583][torch_points3d.trainer][INFO] - Learning rate = 0.000800

Error when running last cell in kitti360_inference.ipynb notebook...

@drprojects I was able to import all the necessary modules in the kitti360_inference.ipynb file. However, when I am trying to run the last cell, I get the following error when it calls the hydra_read function:

ValueError                                Traceback (most recent call last)
<ipython-input-4-fc785dddeef7> in <module>
     19 
     20 # Parse the arguments with Hydra and OmegaConf
---> 21 cfg = hydra_read(overrides, config_name='eval.yaml')
     22 OmegaConf.set_struct(cfg, False)
     23 

~/Research/sem_seg/minkowski_kitti/DeepViewAgg/torch_points3d/utils/config.py in hydra_read(overrides, config_path, config_name)
    163     with warnings.catch_warnings():
    164         warnings.simplefilter("ignore")
--> 165         with initialize(config_path=config_path):
    166             cfg = compose(config_name=config_name, overrides=overrides)
    167         OmegaConf.resolve(cfg)

~/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/hydra/initialize.py in __init__(self, config_path, job_name, caller_stack_depth)
     87             calling_module=calling_module,
     88             config_path=config_path,
---> 89             job_name=job_name,
     90         )
     91 

~/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/hydra/_internal/hydra.py in create_main_hydra_file_or_module(cls, calling_file, calling_module, config_path, job_name)
     50         )
     51 
---> 52         return Hydra.create_main_hydra2(job_name, config_search_path)
     53 
     54     @classmethod

~/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/hydra/_internal/hydra.py in create_main_hydra2(cls, task_name, config_search_path)
     59     ) -> "Hydra":
     60         config_loader: ConfigLoader = ConfigLoaderImpl(
---> 61             config_search_path=config_search_path
     62         )
     63 

~/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/hydra/_internal/config_loader_impl.py in __init__(self, config_search_path)
     53     ) -> None:
     54         self.config_search_path = config_search_path
---> 55         self.repository = ConfigRepository(config_search_path=config_search_path)
     56 
     57     @staticmethod

~/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/hydra/_internal/config_repository.py in __init__(self, config_search_path)
     63 
     64     def __init__(self, config_search_path: ConfigSearchPath) -> None:
---> 65         self.initialize_sources(config_search_path)
     66 
     67     def initialize_sources(self, config_search_path: ConfigSearchPath) -> None:

~/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/hydra/_internal/config_repository.py in initialize_sources(self, config_search_path)
     71             assert search_path.provider is not None
     72             scheme = self._get_scheme(search_path.path)
---> 73             source_type = SourcesRegistry.instance().resolve(scheme)
     74             source = source_type(search_path.provider, search_path.path)
     75             self.sources.append(source)

~/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/hydra/_internal/sources_registry.py in resolve(self, scheme)
     28             supported = ", ".join(sorted(self.types.keys()))
     29             raise ValueError(
---> 30                 f"No config source registered for schema {scheme}, supported types : [{supported}]"
     31             )
     32         return self.types[scheme]

ValueError: No config source registered for schema pkg, supported types : [file, structured]

Do you know how I can fix this error?

Thank you in advance, and sorry for the trouble.

ModuleNotFoundError: No module named 'libKeOpstorch20877e0caa'

Compiling libKeOpstorch3001cb3e02 in /home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02:
formula: ArgKMin_Reduction(Sum(Square((Var(0,3,0) - Var(1,3,1)))),50,0)
aliases: Var(0,3,0); Var(1,3,1);
dtype : float32
... /home/kilox/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/pykeops/keops/core/autodiff/BinaryOp.h(46): error: too many arguments for template template parameter "OP"
detected during:
instantiation of class "keops::BinaryOp_base<OP, FA, FB, PARAMS...> [with OP=keops::Subtract_Impl, FA=keops::Var<0, 3, 0>, FB=keops::Var<1, 3, 1>, PARAMS=<>]"
(113): here
instantiation of class "keops::BinaryOp<OP, keops::Var<NA, DIMA, CATA>, keops::Var<NB, DIMB, CATB>, PARAMS...> [with OP=keops::Subtract_Impl, NA=0, DIMA=3, CATA=0, NB=1, DIMB=3, CATB=1, PARAMS=<>]"
/home/kilox/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/pykeops/keops/core/formulas/maths/Subtract.h(31): here
instantiation of class "keops::Subtract_Impl<FA, FB> [with FA=keops::Var<0, 3, 0>, FB=keops::Var<1, 3, 1>]"
/home/kilox/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/pykeops/keops/core/pre_headers.h(40): here
instantiation of class "keops::KeopsNS [with F=keops::CondType<keops::Subtract_Impl<keops::Var<0, 3, 0>, keops::Var<1, 3, 1>>, keops::CondType<keops::Add_Impl_Broadcast<keops::Minus<keops::Var<1, 3, 1>>, keops::Var<0, 3, 0>>, keops::CondType<keops::Subtract_Impl_Broadcast<keops::Var<0, 3, 0>, keops::Var<1, 3, 1>>, keops::Subtract_Impl<keops::Var<0, 3, 0>, keops::Var<1, 3, 1>>, false>, false>, true>]"
/home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/libKeOpstorch3001cb3e02.h(27): here

/home/kilox/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/pykeops/keops/core/autodiff/UnaryOp.h(50): error: too many arguments for template template parameter "OP"
detected during:
instantiation of class "keops::UnaryOp_base<OP, F, NS...> [with OP=keops::Square, F=keops::CondType<keops::Subtract_Impl<keops::Var<0, 3, 0>, keops::Var<1, 3, 1>>, keops::CondType<keops::Add_Impl_Broadcast<keops::Minus<keops::Var<1, 3, 1>>, keops::Var<0, 3, 0>>, keops::CondType<keops::Subtract_Impl_Broadcast<keops::Var<0, 3, 0>, keops::Var<1, 3, 1>>, keops::Subtract_Impl<keops::Var<0, 3, 0>, keops::Var<1, 3, 1>>, false>, false>, true>, NS=<>]"
(61): here
instantiation of class "keops::UnaryOp<OP, F, NS...> [with OP=keops::Square, F=keops::CondType<keops::Subtract_Impl<keops::Var<0, 3, 0>, keops::Var<1, 3, 1>>, keops::CondType<keops::Add_Impl_Broadcast<keops::Minus<keops::Var<1, 3, 1>>, keops::Var<0, 3, 0>>, keops::CondType<keops::Subtract_Impl_Broadcast<keops::Var<0, 3, 0>, keops::Var<1, 3, 1>>, keops::Subtract_Impl<keops::Var<0, 3, 0>, keops::Var<1, 3, 1>>, false>, false>, true>, NS=<>]"
/home/kilox/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/pykeops/keops/core/formulas/maths/Square.h(23): here
instantiation of class "keops::Square [with F=keops::CondType<keops::Subtract_Impl<keops::Var<0, 3, 0>, keops::Var<1, 3, 1>>, keops::CondType<keops::Add_Impl_Broadcast<keops::Minus<keops::Var<1, 3, 1>>, keops::Var<0, 3, 0>>, keops::CondType<keops::Subtract_Impl_Broadcast<keops::Var<0, 3, 0>, keops::Var<1, 3, 1>>, keops::Subtract_Impl<keops::Var<0, 3, 0>, keops::Var<1, 3, 1>>, false>, false>, true>]"
/home/kilox/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/pykeops/keops/core/pre_headers.h(40): here
instantiation of class "keops::KeopsNS [with F=keops::Square<keops::CondType<keops::Subtract_Impl<keops::Var<0, 3, 0>, keops::Var<1, 3, 1>>, keops::CondType<keops::Add_Impl_Broadcast<keops::Minus<keops::Var<1, 3, 1>>, keops::Var<0, 3, 0>>, keops::CondType<keops::Subtract_Impl_Broadcast<keops::Var<0, 3, 0>, keops::Var<1, 3, 1>>, keops::Subtract_Impl<keops::Var<0, 3, 0>, keops::Var<1, 3, 1>>, false>, false>, true>>]"
/home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/libKeOpstorch3001cb3e02.h(27): here

2 errors detected in the compilation of "/home/kilox/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/pykeops/keops/core/link_autodiff.cu".
CMake Error at keopslibKeOpstorch3001cb3e02_generated_link_autodiff.cu.o.Release.cmake:280 (message):
Error generating file
/home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core/./keopslibKeOpstorch3001cb3e02_generated_link_autodiff.cu.o

make[3]: *** [CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/build.make:65: CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core/keopslibKeOpstorch3001cb3e02_generated_link_autodiff.cu.o] Error 1
make[2]: *** [CMakeFiles/Makefile2:298: CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/all] Error 2
make[1]: *** [CMakeFiles/Makefile2:252: CMakeFiles/libKeOpstorch3001cb3e02.dir/rule] Error 2
make: *** [Makefile:183: libKeOpstorch3001cb3e02] Error 2

--------------------- MAKE DEBUG -----------------
Command '['cmake', '--build', '.', '--target', 'libKeOpstorch3001cb3e02', '--', 'VERBOSE=1']' returned non-zero exit status 2.
/usr/bin/cmake -S/home/kilox/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/pykeops -B/home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02 --check-build-system CMakeFiles/Makefile.cmake 0
/usr/bin/make -f CMakeFiles/Makefile2 libKeOpstorch3001cb3e02
make[1]: Entering directory '/home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02'
/usr/bin/cmake -S/home/kilox/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/pykeops -B/home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02 --check-build-system CMakeFiles/Makefile.cmake 0
/usr/bin/cmake -E cmake_progress_start /home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/CMakeFiles 4
/usr/bin/make -f CMakeFiles/Makefile2 CMakeFiles/libKeOpstorch3001cb3e02.dir/all
make[2]: Entering directory '/home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02'
/usr/bin/make -f CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/build.make CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/depend
make[3]: Entering directory '/home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02'
[ 25%] Building NVCC (Device) object CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core/keopslibKeOpstorch3001cb3e02_generated_link_autodiff.cu.o
cd /home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core && /usr/bin/cmake -E make_directory /home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core/.
cd /home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core && /usr/bin/cmake -D verbose:BOOL=1 -D build_configuration:STRING=Release -D generated_file:STRING=/home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core/./keopslibKeOpstorch3001cb3e02_generated_link_autodiff.cu.o -D generated_cubin_file:STRING=/home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core/./keopslibKeOpstorch3001cb3e02_generated_link_autodiff.cu.o.cubin.txt -P /home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core/keopslibKeOpstorch3001cb3e02_generated_link_autodiff.cu.o.Release.cmake
-- Removing /home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core/./keopslibKeOpstorch3001cb3e02_generated_link_autodiff.cu.o
/usr/bin/cmake -E remove /home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core/./keopslibKeOpstorch3001cb3e02_generated_link_autodiff.cu.o
-- Generating dependency file: /home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core/keopslibKeOpstorch3001cb3e02_generated_link_autodiff.cu.o.NVCC-depend
/usr/local/cuda-11.4/bin/nvcc -M -D__CUDACC__ /home/kilox/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/pykeops/keops/core/link_autodiff.cu -o /home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core/keopslibKeOpstorch3001cb3e02_generated_link_autodiff.cu.o.NVCC-depend -m64 -DkeopslibKeOpstorch3001cb3e02_EXPORTS -DMAXIDGPU=0 -DMAXTHREADSPERBLOCK0=1024 -DSHAREDMEMPERBLOCK0=49152 -D_FORCE_INLINES -DCUDA_BLOCK_SIZE=192 -DUSE_CUDA=1 -D__TYPE__=float -DC_CONTIGUOUS=1 -DMODULE_NAME=libKeOpstorch3001cb3e02 -DUSE_DOUBLE=0 -DKERNEL_GEOM_TYPE=0 -DKERNEL_SIG_TYPE=0 -DKERNEL_SPHERE_TYPE=0 -DMODULE_NAME_FSHAPE_SCP=fshape_scp_gaussiangaussiangaussian_unoriented_float -Xcompiler ,"-Wall","-Wno-unknown-pragmas","-fmax-errors=2","-fPIC","-O3","-DNDEBUG","-O3" -gencode arch=compute_75,code=sm_75 --use_fast_math --compiler-options=-fPIC -ccbin /usr/bin/c++ --pre-include=libKeOpstorch3001cb3e02.h -DNVCC -I/usr/local/cuda-11.4/include -I/home/kilox/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/pykeops -I/home/kilox/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/pykeops/keops -I/home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02 -I/home/kilox/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/torch/include -I/home/kilox/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/torch/include/torch/csrc/api/include
-- Generating temporary cmake readable file: /home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core/keopslibKeOpstorch3001cb3e02_generated_link_autodiff.cu.o.depend.tmp
/usr/bin/cmake -D input_file:FILEPATH=/home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core/keopslibKeOpstorch3001cb3e02_generated_link_autodiff.cu.o.NVCC-depend -D output_file:FILEPATH=/home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core/keopslibKeOpstorch3001cb3e02_generated_link_autodiff.cu.o.depend.tmp -D verbose=1 -P /usr/share/cmake-3.16/Modules/FindCUDA/make2cmake.cmake
-- Copy if different /home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core/keopslibKeOpstorch3001cb3e02_generated_link_autodiff.cu.o.depend.tmp to /home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core/keopslibKeOpstorch3001cb3e02_generated_link_autodiff.cu.o.depend
/usr/bin/cmake -E copy_if_different /home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core/keopslibKeOpstorch3001cb3e02_generated_link_autodiff.cu.o.depend.tmp /home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core/keopslibKeOpstorch3001cb3e02_generated_link_autodiff.cu.o.depend
-- Removing /home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core/keopslibKeOpstorch3001cb3e02_generated_link_autodiff.cu.o.depend.tmp and /home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core/keopslibKeOpstorch3001cb3e02_generated_link_autodiff.cu.o.NVCC-depend
/usr/bin/cmake -E remove /home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core/keopslibKeOpstorch3001cb3e02_generated_link_autodiff.cu.o.depend.tmp /home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core/keopslibKeOpstorch3001cb3e02_generated_link_autodiff.cu.o.NVCC-depend
-- Generating /home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core/./keopslibKeOpstorch3001cb3e02_generated_link_autodiff.cu.o
/usr/local/cuda-11.4/bin/nvcc /home/kilox/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/pykeops/keops/core/link_autodiff.cu -c -o /home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core/./keopslibKeOpstorch3001cb3e02_generated_link_autodiff.cu.o -m64 -DkeopslibKeOpstorch3001cb3e02_EXPORTS -DMAXIDGPU=0 -DMAXTHREADSPERBLOCK0=1024 -DSHAREDMEMPERBLOCK0=49152 -D_FORCE_INLINES -DCUDA_BLOCK_SIZE=192 -DUSE_CUDA=1 -D__TYPE__=float -DC_CONTIGUOUS=1 -DMODULE_NAME=libKeOpstorch3001cb3e02 -DUSE_DOUBLE=0 -DKERNEL_GEOM_TYPE=0 -DKERNEL_SIG_TYPE=0 -DKERNEL_SPHERE_TYPE=0 -DMODULE_NAME_FSHAPE_SCP=fshape_scp_gaussiangaussiangaussian_unoriented_float -Xcompiler ,"-Wall","-Wno-unknown-pragmas","-fmax-errors=2","-fPIC","-O3","-DNDEBUG","-O3" -gencode arch=compute_75,code=sm_75 --use_fast_math --compiler-options=-fPIC -ccbin /usr/bin/c++ --pre-include=libKeOpstorch3001cb3e02.h -DNVCC -I/usr/local/cuda-11.4/include -I/home/kilox/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/pykeops -I/home/kilox/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/pykeops/keops -I/home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02 -I/home/kilox/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/torch/include -I/home/kilox/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/torch/include/torch/csrc/api/include
-- Removing /home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core/./keopslibKeOpstorch3001cb3e02_generated_link_autodiff.cu.o
/usr/bin/cmake -E remove /home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02/CMakeFiles/keopslibKeOpstorch3001cb3e02.dir/keops/core/./keopslibKeOpstorch3001cb3e02_generated_link_autodiff.cu.o
make[3]: Leaving directory '/home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02'
make[2]: Leaving directory '/home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02'
make[1]: Leaving directory '/home/kilox/.cache/pykeops-1.3-cpython-37/build-libKeOpstorch3001cb3e02'


Done.

Batch.from_data_list for S3DIS dataset

Hi ROBERT!
I am trying to train on the S3DIS dataset and it broke down because of the Batch class in the torch-geometric library.
Traceback (most recent call last):
  File "/root/.local/conda/envs/py38cu102/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/.local/conda/envs/py38cu102/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/root/.local/share/code-server/extensions/ms-python.python-2021.8.1159798656/pythonFiles/lib/python/debugpy/__main__.py", line 45, in <module>
    cli.main()
  File "/root/.local/share/code-server/extensions/ms-python.python-2021.8.1159798656/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 444, in main
    run()
  File "/root/.local/share/code-server/extensions/ms-python.python-2021.8.1159798656/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 285, in run_file
    runpy.run_path(target_as_str, run_name=compat.force_str("__main__"))
  File "/root/.local/conda/envs/py38cu102/lib/python3.8/runpy.py", line 265, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/root/.local/conda/envs/py38cu102/lib/python3.8/runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/root/.local/conda/envs/py38cu102/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/root/share/code/DeepViewAgg/s3dis_preprocess.py", line 120, in <module>
    initial_trainer()
  File "/root/share/code/DeepViewAgg/s3dis_preprocess.py", line 95, in initial_trainer
    trainer.train()
  File "/root/share/code/DeepViewAgg/torch_points3d/trainer.py", line 147, in train
    self._train_epoch(epoch)
  File "/root/share/code/DeepViewAgg/torch_points3d/trainer.py", line 198, in _train_epoch
    for i, data in enumerate(tq_train_loader):
  File "/root/.local/lib/python3.8/site-packages/tqdm/std.py", line 1195, in __iter__
    for obj in iterable:
  File "/root/.local/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
    data = self._next_data()
  File "/root/.local/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 475, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/root/.local/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/root/share/code/DeepViewAgg/torch_points3d/datasets/base_dataset.py", line 172, in _collate_fn
    return collate_fn(batch)
  File "/root/share/code/DeepViewAgg/torch_points3d/core/multimodal/data.py", line 189, in from_mm_data_list
    data = Batch.from_data_list(
  File "/root/.local/lib/python3.8/site-packages/torch_geometric/data/batch.py", line 50, in from_data_list
    assert 'batch' not in keys and 'ptr' not in keys

pytorch_geometric/3389: it is mentioned on the pytorch_geometric GitHub and is fixed in a newer version. I use torch-geometric 1.7.2 as you recommended.
The MMData does not have the 'batch' keys.
ๆ•่Žท
I wonder if this ever happens to you?

Issue when running the inference script on the test data

Hello @drprojects,

I am getting this error when running the kitti360_inference.ipynb on the test dataset:

ValueError: UnimodalBranch.out_channels has not been set. Please set it to allow inference even when the modality has no data. 

I have narrowed this down to the ImageBatch of the data being as follows:

MMBatch(
    data = DataBatch(scattering=[591], norm=[591, 3], pos=[591, 3], grid_size=[8], origin_id=[591], mapping_index=[591], num_raw_points=[8], planarity=[591], linearity=[591], idx_window=[8], idx_center=[8], x=[591, 1], coords=[591, 3], batch=[591], ptr=[9])
    image = ImageBatch(num_settings=1, num_views=0, num_points=591, device=cpu)
)

I need clarification on why the ImageBatch has no data even though the same transforms are used on the validation dataset, which works properly for running inference. Please point me to a few possibilities as to why I am getting this error.

training time

Hello author, how long does training on the S3DIS dataset take with your device?

errors while optimize_parameters() from your pretrained models

Hi, I downloaded your pretrained model for reproducing the whole training process.
But it encounters this bug as follows:
File "/root/DeepViewAgg/torch_points3d/models/base_model.py", line 259, in optimize_parameters self._grad_scale.step(self._optimizer) # update parameters AttributeError: 'NoneType' object has no attribute โ€œstep"

It seems that in the optimize_parameters() function in torch_points3d/models/base_model.py, self._grad_scale is None.
I thought it might come from the checkpoint.pt file, but apparently it doesn't. It is only initialized in instantiate_optimizers() with the torch.cuda.amp.GradScaler class. I am not sure where things went wrong.
So I hope I can get some clues from you.
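For context, the failing call follows the standard torch.cuda.amp pattern shown below (a generic sketch, not the repository's code); self._grad_scale plays the role of scaler here, so it has to be created, e.g. by instantiate_optimizers(), before optimize_parameters() is called:

    import torch

    model = torch.nn.Linear(8, 2).cuda()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    scaler = torch.cuda.amp.GradScaler()  # what self._grad_scale is expected to hold

    x = torch.randn(4, 8, device="cuda")
    with torch.cuda.amp.autocast():
        loss = model(x).sum()

    scaler.scale(loss).backward()
    scaler.step(optimizer)  # the call that fails above when the scaler is None
    scaler.update()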

TypeError: initialize() got an unexpected keyword argument 'config_path'

with initialize(config_path=config_path):
    cfg = compose(config_name=config_name, overrides=overrides)
OmegaConf.resolve(cfg)

TypeError: initialize() got an unexpected keyword argument 'config_path'

When I run the KITTI inference notebook with pretrained weights, it shows this to me. Do you have any idea about it?

some issuees of LazyTensor in pykeops.torch

Dear Dr Robert:

Thank you for your kindly sharing codes.

I encountered some issues in both PCAComputePointwise function and NeighborhoodBasedMappingFeatures function for KNN search when I run with the scannet dataset.
for example, in PCAComputePointwise:
if xyz_search.shape[0] > 1.6e7:
    xyz_query_keops = LazyTensor(xyz_query[:, None, :].double())
    xyz_search_keops = LazyTensor(xyz_search[None, :, :].double())
else:
    xyz_query_keops = LazyTensor(xyz_query[:, None, :])
    xyz_search_keops = LazyTensor(xyz_search[None, :, :])
d_keops = ((xyz_query_keops - xyz_search_keops) ** 2).sum(dim=2)
neighbors = d_keops.argKmin(self.num_neighbors, dim=1)
The error message is "Arg at position 1 : is not contiguous "

So I revised xyz_query and xyz_search before the KNN search, and it works:
dtype = torch.cuda.FloatTensor if self.use_cuda else torch.FloatTensor
xyz_query = xyz_query.contiguous().type(dtype)
xyz_search = xyz_search.contiguous().type(dtype)

Have you ever met this problem? I don't know whether my revision is the right way to solve it. Is there an alternative solution?
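An alternative sketch (an assumption on my side, not tested against the repository): keep the original precision logic and just force contiguity on the views passed to pykeops, which is what the error message complains about.

import torch
from pykeops.torch import LazyTensor

# Dummy point sets standing in for xyz_query / xyz_search; the only change compared to
# the original snippet is the .contiguous() call on each view.
xyz_query = torch.rand(1000, 3)
xyz_search = torch.rand(5000, 3)
xyz_query_keops = LazyTensor(xyz_query[:, None, :].contiguous())
xyz_search_keops = LazyTensor(xyz_search[None, :, :].contiguous())
d_keops = ((xyz_query_keops - xyz_search_keops) ** 2).sum(dim=2)
neighbors = d_keops.argKmin(50, dim=1)  # 50 nearest neighbours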

I found an issue while testing the project

When I run this project, it appears that the test is combined with the train and validation processes. I want to inquire if there is a separate test in the code that allows me to use my own trained .pt file for testing independently.

Installation Errors with NVIDIA RTX3080 & Newer CUDA Versions

Hey @drprojects, thank you for making the repo available and for the extensive instructions! However, I ran into a lot of problems while trying to set up the environment and would be grateful if you could help out.

I was trying to install based on the install.sh script that is provided and couldn't figure out what to do after days of trying. As far as I can tell the problem is due to the installation of Minkowski Engine and the GPU I'm using.

Here's the system and GPU information I've been working with:

System:
Ubuntu 20.04.5 LTS

GPU & CUDA:
Model: NVIDIA GeForce RTX 3080
nvcc path: /usr/local/cuda-11.4/bin/nvcc
nvcc version:

NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Mon_Oct_11_21:27:02_PDT_2021
Cuda compilation tools, release 11.4, V11.4.152
Build cuda_11.4.r11.4/compiler.30521435_0

I've tried using multiple CUDA versions (11.4 and 11.6) and still couldn't get the installation script to work. As far as I understand MinkowskiEngine 0.4 doesn't work with newer GPUs so in install.sh I removed the version specification for it (i.e. pip install -U MinkowskiEngine --install-option="--blas=openblas" -v --no-deps). It also resulted in similar errors when I used the default version in the script.
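One extra check that might be worth doing first (a sketch; it assumes the limiting factor could also be the installed PyTorch wheel rather than MinkowskiEngine itself):

# Confirm the installed PyTorch build supports the RTX 3080 (compute capability 8.6)
# and which CUDA toolkit it was compiled against, before blaming the MinkowskiEngine build.
import torch
print(torch.__version__, torch.version.cuda)
print(torch.cuda.get_device_capability(0))  # expected (8, 6) for an RTX 3080
print(torch.cuda.get_arch_list())           # should include 'sm_86'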

My main questions would be:

  • How could we adopt the install.sh script to support newer CUDA versions and GPUs?
  • Is there an alternative installation method to create the conda environment?
  • Could you have a look at the error logs below? I'm attaching them as txt files because they are quite long but I hope they make sense and are helpful.

Beginning of the installation & pip versioning error with `urllib`: error_1.txt

Minkowski Engine compilation problems: error_2.txt

Errors with Jupyter; these only happen at the end and are probably due to the earlier errors: error_3.txt

Thank you in advance for your help!

Python 3.8.5

@drprojects What changes would I need to make to the install.sh and/or the deep_view_aggregation.yml to set up the conda environment with Python 3.8.5 instead of 3.7.9?

How to produce KITTI-360 `test` predictions?

My current input is:

I_GPU=0

DATA_ROOT="./directory"                        # set your dataset root directory, where the data was/will be downloaded
EXP_NAME="My_awesome_KITTI-360_experiment"                              # whatever suits your needs
TASK="segmentation"
MODELS_CONFIG="${TASK}/sparseconv3d"                                    # family of 3D-only models using the sparseconv3d backbone
MODEL_NAME="Res16UNet34"                                                # specific model name
DATASET_CONFIG="${TASK}/kitti360-sparse"
TRAINING="kitti360_benchmark/sparseconv3d"                              # training configuration for discriminative learning rate on the model
EPOCHS=60
CYLINDERS_PER_EPOCH=12000                                               # roughly speaking, 40 cylinders per window
TRAINVAL=False                                                          # True to train on Train+Val (eg before submission)
MINI=False                                                              # True to train on mini version of KITTI-360 (eg to debug)
BATCH_SIZE=6                                             # 4 fits in a 32G V100. Can be increased at inference time, of course
WORKERS=0                                                         # adapt to your machine
BASE_LR=0.1                                                             # initial learning rate
LR_SCHEDULER='multi_step_kitti360' # learning rate scheduler for 60 epochs
EVAL_FREQUENCY=5                                                        # frequency at which metrics will be computed on Val. The less the faster the training but the less points on your validation curves
SUBMISSION=False                                                        # True if you want to generate files for a submission to the KITTI-360 3D semantic segmentation benchmark
CHECKPOINT_DIR="/home/Deep"                                                       # optional path to an already-existing checkpoint. If provided, the training will resume where it was left
export SPARSE_BACKEND=torchsparse

The code only ran "train" and "val" in the end. I would like to inquire about how to execute the "test" phase. Is there something missing or incorrect in my input?

data config for training scannet dataset

Dear Dr.Robert:

I coded scannet_training.py according to your scripts/train_scannet. I found that the TRAINING config yaml (line 19) is set to "s3dis_benchmark/sparseconv3d_rgb-pretrained-0", so the program broke down because of a nonexistent "fold" parameter in "conf/training/s3dis_benchmark/sparseconv3d_rgb-pretrained-0.yaml".

Then I changed TRAINING to "scannet_benchmark/minkowski-pretrained-0" instead.

But it still broke down while initializing the Trainer, in the "self._model: BaseModel = instantiate_model(copy.deepcopy(self._cfg), self._dataset)" call. I tracked it down to the "resolve_model" function, and it seems the dataset class does not get the "feature_dimension" attribute from the config file.

def resolve_model(model_config, dataset, tested_task):
    """ Parses the model config and evaluates any expression that may contain constants """
    # placeholders to substitute
    constants = {
        "FEAT": max(dataset.feature_dimension, 0),  # 4
        "TASK": tested_task,
        "N_CLS": dataset.num_classes if hasattr(dataset, "num_classes") else None,
    }

The program entered an infinite loop of finding keywords and then crashed.

The Debug Variables viewer shows the error message:

"/root/share/code/DeepViewAgg/torch_points3d/datasets/segmentation/scannet.py", line 1111, in indices\n print("indices "+str(len(self)))\n File "/root/.local/lib/python3.7/site-packages/torch_geometric/data/dataset.py", line 176, in len\n return len(self.indices())\n File "/root/share/code/DeepViewAgg/torch_points3d/datasets/segmentation/scannet.py", line 1111, in indices\n print("indices "+str(len(self)))\n File "/root/.local/lib/python3.7/site-packages/torch_geometric/data/dataset.py", line 176, in len\n return len(self.indices())\n File "/root/share/code/DeepViewAgg/torch_points3d/datasets/segmentation/scannet.py", line 1108, in indices\n version = pyg.version.split('.')\nRecursionError: maximum recursion depth exceeded while calling a Python object\n'"

I think the "FEAT" should get its value from model config file or the feature_dimension should be added in data config file.

So I hope I can get some help from you.
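A side note on the RecursionError itself: judging from the trace, the debugging print inside indices() calls len(self), and __len__ in torch_geometric's Dataset calls indices() again, which is what blows the recursion limit. A minimal, hypothetical sketch of the cycle (assuming the print was a local debugging addition):

class ToyDataset:
    def indices(self):
        print("indices " + str(len(self)))  # len(self) -> __len__ -> indices() -> ...
        return range(10)

    def __len__(self):
        return len(self.indices())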

Input feature size and kernel size mismatch in torchsparse_cuda function

Dear Dr Robert:

Thanks again for your excellent project.

I met an error in the training stage. When I run scripts/train_scannet.sh with the model name 'Res16UNet34-PointPyramid-early-ade20k-interpolate', I get the following error:
Traceback (most recent call last): File "/root/.local/conda/envs/py38cu102/lib/python3.8/runpy.py", line 194, in _run_module_as_main return _run_code(code, main_globals, None, File "/root/.local/conda/envs/py38cu102/lib/python3.8/runpy.py", line 87, in _run_code exec(code, run_globals) File "/root/.local/share/code-server/extensions/ms-python.python-2021.8.1159798656/pythonFiles/lib/python/debugpy/__main__.py", line 45, in <module> cli.main() File "/root/.local/share/code-server/extensions/ms-python.python-2021.8.1159798656/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 444, in main run() File "/root/.local/share/code-server/extensions/ms-python.python-2021.8.1159798656/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 285, in run_file runpy.run_path(target_as_str, run_name=compat.force_str("__main__")) File "/root/.local/conda/envs/py38cu102/lib/python3.8/runpy.py", line 265, in run_path return _run_module_code(code, init_globals, run_name, File "/root/.local/conda/envs/py38cu102/lib/python3.8/runpy.py", line 97, in _run_module_code _run_code(code, mod_globals, init_globals, File "/root/.local/conda/envs/py38cu102/lib/python3.8/runpy.py", line 87, in _run_code exec(code, run_globals) File "/root/share/code/DeepViewAgg/scannet_preprocess.py", line 163, in <module> initial_trainer() File "/root/share/code/DeepViewAgg/scannet_preprocess.py", line 90, in initial_trainer trainer.train() File "/root/share/code/DeepViewAgg/torch_points3d/trainer.py", line 147, in train self._train_epoch(epoch) File "/root/share/code/DeepViewAgg/torch_points3d/trainer.py", line 202, in _train_epoch self._model.optimize_parameters(epoch, self._dataset.batch_size) File "/root/share/code/DeepViewAgg/torch_points3d/models/base_model.py", line 245, in optimize_parameters self.forward(epoch=epoch) # first call forward to calculate intermediate results File "/root/share/code/DeepViewAgg/torch_points3d/models/segmentation/sparseconv3d.py", line 44, in forward features = self.backbone(self.input).x File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl result = self.forward(*input, **kwargs) File "/root/share/code/DeepViewAgg/torch_points3d/applications/sparseconv3d.py", line 228, in forward data = self.down_modules[i](data) File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl result = self.forward(*input, **kwargs) File "/root/share/code/DeepViewAgg/torch_points3d/modules/multimodal/modules.py", line 84, in forward mm_data_dict = self.forward_3d_block_down( File "/root/share/code/DeepViewAgg/torch_points3d/modules/multimodal/modules.py", line 171, in forward_3d_block_down x_3d = block(x_3d) File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl result = self.forward(*input, **kwargs) File "/root/share/code/DeepViewAgg/torch_points3d/modules/SparseConv3d/modules.py", line 165, in forward out = self.conv_in(x) File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl result = self.forward(*input, **kwargs) File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/container.py", line 117, in forward input = module(input) File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl result = self.forward(*input, **kwargs) File "/root/.local/lib/python3.8/site-packages/torchsparse/nn/modules/conv.py", line 58, in forward return conv3d(inputs, File 
"/root/.local/lib/python3.8/site-packages/torchsparse/nn/functional/sparseconv.py", line 183, in conv3d output_features = sparseconv_op(features, kernel, idx_query[0], File "/root/.local/lib/python3.8/site-packages/torchsparse/nn/functional/sparseconv.py", line 57, in forward torchsparse_cuda.sparseconv_forward(features, out, kernel, ValueError: Input feature size and kernel size mismatch

I checked the feature size of the input x, i.e. torch.Size([53419, 513]). It seems to follow the configuration files. I don't know why this happened.

Problem: AttributeError: 'int' object has no attribute 'feats'

I encountered the following problem while training the 3D point cloud model:

[2023-07-20 08:48:30,841][torch_points3d.datasets.base_dataset][INFO] - Available stage selection datasets:  ['test', 'val'] 
[2023-07-20 08:48:30,842][torch_points3d.datasets.base_dataset][INFO] - The models will be selected using the metrics on following dataset:  val 
[2023-07-20 08:48:34,292][torch_points3d.trainer][INFO] - EPOCH 1 / 60
  0%|                                                                                                                                                 | 0/1500 [00:02<?, ?it/s]
Error executing job with overrides: ['data=segmentation/kitti360-sparse', 'models=segmentation/sparseconv3d', 'model_name=Res16UNet34', 'task=segmentation', 'training=kitti360_benchmark/sparseconv3d', 'lr_scheduler=multi_step_kitti360', 'eval_frequency=5', 'data.sample_per_epoch=12000', 'data.dataroot=./directory', 'data.train_is_trainval=False', 'data.mini=False', 'training.cuda=0', 'training.batch_size=8', 'training.epochs=60', 'training.num_workers=0', 'training.optim.base_lr=0.1', 'training.wandb.log=True', 'training.wandb.name=My_awesome_KITTI-360_experiment', 'tracker_options.make_submission=False', 'training.checkpoint_dir=']
Traceback (most recent call last):
  File "train.py", line 14, in main
    trainer.train()
  File "/home/DeepViewAgg-release/torch_points3d/trainer.py", line 146, in train
    self._train_epoch(epoch)
  File "/home/DeepViewAgg-release/torch_points3d/trainer.py", line 201, in _train_epoch
    self._model.optimize_parameters(epoch, self._dataset.batch_size)
  File "/home/DeepViewAgg-release/torch_points3d/models/base_model.py", line 245, in optimize_parameters
    self.forward(epoch=epoch)  # first call forward to calculate intermediate results
  File "/home/DeepViewAgg-release/torch_points3d/models/segmentation/sparseconv3d.py", line 67, in forward
    features=self.backbone(self.input).x
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/DeepViewAgg-release/torch_points3d/applications/sparseconv3d.py", line 281, in forward
    data = self.up_modules[i](data, skip)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/DeepViewAgg-release/torch_points3d/modules/SparseConv3d/modules.py", line 225, in forward
    x = snn.cat(x,skip)
  File "/home/DeepViewAgg-release/torch_points3d/modules/SparseConv3d/nn/torchsparse.py", line 76, in cat
    return TS.cat(arg)
  File "/opt/conda/lib/python3.8/site-packages/torchsparse/operators.py", line 11, in cat
    feats = torch.cat([input.feats for input in inputs], dim=1)
  File "/opt/conda/lib/python3.8/site-packages/torchsparse/operators.py", line 11, in <listcomp>
    feats = torch.cat([input.feats for input in inputs], dim=1)
AttributeError: 'int' object has no attribute 'feats'

The failing line of code is features = self.backbone(self.input).x.
I tried to work around the problem with the following adjustment: features = self.backbone(self.input.x)

[2023-07-20 08:56:59,190][torch_points3d.datasets.base_dataset][INFO] - Available stage selection datasets:  ['test', 'val'] 
[2023-07-20 08:56:59,191][torch_points3d.datasets.base_dataset][INFO] - The models will be selected using the metrics on following dataset:  val 
[2023-07-20 08:57:02,141][torch_points3d.trainer][INFO] - EPOCH 1 / 60
  0%|                                                                                                                                                 | 0/1500 [00:01<?, ?it/s]
Error executing job with overrides: ['data=segmentation/kitti360-sparse', 'models=segmentation/sparseconv3d', 'model_name=Res16UNet34', 'task=segmentation', 'training=kitti360_benchmark/sparseconv3d', 'lr_scheduler=multi_step_kitti360', 'eval_frequency=5', 'data.sample_per_epoch=12000', 'data.dataroot=./directory', 'data.train_is_trainval=False', 'data.mini=False', 'training.cuda=0', 'training.batch_size=8', 'training.epochs=60', 'training.num_workers=0', 'training.optim.base_lr=0.1', 'training.wandb.log=True', 'training.wandb.name=My_awesome_KITTI-360_experiment', 'tracker_options.make_submission=False', 'training.checkpoint_dir=']
Traceback (most recent call last):
  File "train.py", line 14, in main
    trainer.train()
  File "/home/DeepViewAgg-release/torch_points3d/trainer.py", line 146, in train
    self._train_epoch(epoch)
  File "/home/DeepViewAgg-release/torch_points3d/trainer.py", line 201, in _train_epoch
    self._model.optimize_parameters(epoch, self._dataset.batch_size)
  File "/home/DeepViewAgg-release/torch_points3d/models/base_model.py", line 245, in optimize_parameters
    self.forward(epoch=epoch)  # first call forward to calculate intermediate results
  File "/home/DeepViewAgg-release/torch_points3d/models/segmentation/sparseconv3d.py", line 67, in forward
    features=self.backbone(self.input.x)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/DeepViewAgg-release/torch_points3d/applications/sparseconv3d.py", line 245, in forward
    self._set_input(data)
  File "/home/DeepViewAgg-release/torch_points3d/applications/sparseconv3d.py", line 166, in _set_input
    self.input = sp3d.nn.SparseTensor(data.x, data.coords, data.batch, self.device)
AttributeError: 'Tensor' object has no attribute 'x'

RuntimeError & Machine Type Inquiry

Running pre-collate on 3D data...
Traceback (most recent call last):
  File "s3dis_vis.py", line 100, in <module>
    dataset = S3DISFusedDataset(cfg.data)
  File "/xxx/torch_points3d/datasets/segmentation/multimodal/s3dis.py", line 767, in __init__
    self.train_dataset = S3DISSphereMM(
  File "/xxx/torch_points3d/datasets/segmentation/multimodal/s3dis.py", line 596, in __init__
    super().__init__(root, *args, **kwargs)
  File "/xxx/torch_points3d/datasets/segmentation/multimodal/s3dis.py", line 178, in __init__
    super(S3DISOriginalFusedMM, self).__init__(
  File "/home/xxx/lib/python3.8/site-packages/torch_geometric/data/in_memory_dataset.py", line 56, in __init__
    super().__init__(root, transform, pre_transform, pre_filter)
  File "/home/xxx/lib/python3.8/site-packages/torch_geometric/data/dataset.py", line 87, in __init__
    self._process()
  File "/home/xxx/lib/python3.8/site-packages/torch_geometric/data/dataset.py", line 170, in _process
    self.process()
  File "/xxx/torch_points3d/datasets/segmentation/multimodal/s3dis.py", line 655, in process
    super().process()
  File "/xxx/torch_points3d/datasets/segmentation/multimodal/s3dis.py", line 418, in process
    data_list = self.pre_collate_transform(data_list)
  File "/home/xxx/lib/python3.8/site-packages/torch_geometric/transforms/compose.py", line 19, in __call__
    data = [transform(d) for d in data]
  File "/home/xxx/lib/python3.8/site-packages/torch_geometric/transforms/compose.py", line 19, in <listcomp>
    data = [transform(d) for d in data]
  File "/xxx/torch_points3d/core/data_transform/features.py", line 541, in __call__
    data = self._process(data)
  File "/xxx/torch_points3d/core/data_transform/features.py", line 500, in _process
    neighbors = nn_finder(xyz_search, xyz_query, None, None)
  File "/xxx/torch_points3d/core/spatial_ops/neighbour_finder.py", line 17, in __call__
    return self.find_neighbours(x, y, batch_x, batch_y)
  File "/xxx/torch_points3d/core/spatial_ops/neighbour_finder.py", line 263, in find_neighbours
    return torch.LongTensor(gpu_index_flat.search(y_np, k)[1]).to(x.device)
  File "/xxx/lib/python3.8/site-packages/faiss/__init__.py", line 322, in replacement_search
    self.search_c(n, swig_ptr(x), k, swig_ptr(D), swig_ptr(I))
  File "/xxx/lib/python3.8/site-packages/faiss/swigfaiss_avx2.py", line 9009, in search
    return _swigfaiss_avx2.GpuIndex_search(self, n, x, k, distances, labels)
RuntimeError: Error in virtual void* faiss::gpu::StandardGpuResourcesImpl::allocMemory(const faiss::gpu::AllocRequest&) at
/root/miniconda3/conda-bld/faiss-pkg_1639741185190/work/faiss/gpu/StandardGpuResources.cpp:452:
Error: 'err == cudaSuccess' failed: StandardGpuResources: alloc fail type TemporaryMemoryOverflow dev 0
space Device stream 0x558ecfc66c70 size 22479120128 bytes (cudaMalloc error out of memory [2])

Hi, I ran the s3dis_visualization.ipynb notebook (under notebooks/) on the S3DIS dataset. It seems to need a huge amount of memory on both CPU and GPU, and I got this OOM error, which hints that over 20 GB of GPU memory is required to preprocess the data. 😢
Therefore, I would like to know the machine type you used, as a reference. The preprocessing does not look memory-friendly; is there any way to work around this 20+ GB GPU memory requirement?
Thanks, and looking forward to your help!
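In case it helps, a minimal sketch of one possible workaround, assuming the neighbour finder can be given its own faiss resources object (setTempMemory caps the scratch space faiss pre-allocates on the GPU); the notebooks also expose use_cuda / use_faiss flags that avoid the GPU search entirely:

import faiss

# Cap faiss' temporary GPU scratch allocation so the KNN search uses smaller (slower)
# work buffers instead of asking for ~22 GB at once.
res = faiss.StandardGpuResources()
res.setTempMemory(2 * 1024 ** 3)      # 2 GB of scratch space
index = faiss.GpuIndexFlatL2(res, 3)  # 3-D points, as in the xyz search above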

'SparseTensor' object has no attribute 'coord_maps'

Hello, thanks for your great work!

When I used KITTI-360 and the pretrained model you provided to reproduce your paper, the following error occurred. How can I fix it and get the result? I installed the latest torchsparse using pip.
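A quick check, in case it is a version issue (sketch): the repository's code expects a SparseTensor with a coord_maps attribute, which more recent torchsparse releases no longer seem to provide, so the pip-installed version may simply be newer than what install.sh targets.

import torchsparse
print(torchsparse.__version__)  # compare against the version pinned by install.sh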

Thanks a lot!


ModuleNotFoundError: No module named 'libKeOpstorch4381e0bfc3'

Hi @drprojects, first of all, thanks a lot for the great work!!!

I am currently playing with notebooks/synthetic_multimodal_dataset.ipynb and notebooks/kitti360_visualization.ipynb and encountered the same errors. In notebooks/synthetic_multimodal_dataset.ipynb, the error comes from the first cell under "Project 3D Data onto ImageData to generate mappings"

from torch_points3d.core.data_transform.multimodal.image import *

r_max = 10                        # maximum point-camera distance for the mappings
r_min = 0.2                       # minimum point-camera distance for the mappings 
k_list = [50]                     # number of neighbors used for neighborhood-based mapping features (eg density, occlusion)
exact = True                      # False: points are mapped to their whole z-buffering patch (denser mapping). True: only to the center (more accurate mapping)  
use_cuda = False                  # whether to use cuda to accelerate mapping computation
camera = 's3dis_equirectangular'  # camera model used (keep s3dis_equirectangular for this notebook)

data_list, image_data_list = MapImages(r_min=r_min, r_max=r_max, exact=exact, use_cuda=use_cuda, camera=camera)(data_list, image_data_list)
# image_data_list[0].mappings.features = None  # uncomment to visualize only densities and occlusions
data_list, image_data_list = NeighborhoodBasedMappingFeatures(k=k_list, voxel=voxel, density=True, occlusion=True, use_faiss=False, use_cuda=use_cuda)(data_list, image_data_list)

And the errors are coming from the last line:

Compiling libKeOpstorch4381e0bfc3 in /home/songyou/.cache/pykeops-1.4.2-cpython-37:
       formula: ArgKMin_Reduction(Sum(Square((Var(0,3,0) - Var(1,3,1)))),50,0)
       aliases: Var(0,3,0); Var(1,3,1); 
       dtype  : float32
... 
--------------------- CMAKE DEBUG -----------------
Command '['cmake', '/home/songyou/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/pykeops', "-DCMAKE_BUILD_TYPE='Release'", "-DFORMULA_OBJ='ArgKMin_Reduction(Sum(Square((Var(0,3,0) - Var(1,3,1)))),50,0)'", "-DVAR_ALIASES=''", "-Dshared_obj_name='libKeOpstorch4381e0bfc3'", "-D__TYPE__='float'", "-DPYTHON_LANG='torch'", "-DPYTHON_EXECUTABLE='/home/songyou/anaconda3/envs/deep_view_aggregation/bin/python'", "-DPYBIND11_PYTHON_VERSION='3.7'", '-DC_CONTIGUOUS=1', '-D__TYPEACC__=float', '-DENABLECHUNK=1', '-DPYTORCH_ROOT_DIR=/home/songyou/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/torch', '-D_GLIBCXX_USE_CXX11_ABI=0', "-DcommandLine=cmake /home/songyou/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/pykeops -DCMAKE_BUILD_TYPE='Release' -DFORMULA_OBJ='ArgKMin_Reduction(Sum(Square((Var(0,3,0) - Var(1,3,1)))),50,0)' -DVAR_ALIASES='' -Dshared_obj_name='libKeOpstorch4381e0bfc3' -D__TYPE__='float' -DPYTHON_LANG='torch' -DPYTHON_EXECUTABLE='/home/songyou/anaconda3/envs/deep_view_aggregation/bin/python' -DPYBIND11_PYTHON_VERSION='3.7' -DC_CONTIGUOUS=1 -D__TYPEACC__=float -DENABLECHUNK=1 -DPYTORCH_ROOT_DIR=/home/songyou/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/torch -D_GLIBCXX_USE_CXX11_ABI=0"]' returned non-zero exit status 1.
-- The CUDA Host CXX Compiler: /usr/bin/c++
-- Compute properties automatically set to: -DMAXIDGPU=1;-DMAXTHREADSPERBLOCK0=1024;-DSHAREDMEMPERBLOCK0=49152;-DMAXTHREADSPERBLOCK1=1024;-DSHAREDMEMPERBLOCK1=49152
-- Autodetected CUDA architecture(s):  8.0 8.0
-- Using shared_obj_name: libKeOpstorch4381e0bfc3
-- First i variables detected is 0
-- First j variables detected is 1
-- Compiled formula is ArgKMin_Reduction(Sum(Square((Var(0,3,0) - Var(1,3,1)))),50,0);  where the number of args is 2.
-- pybind11 v2.6.1 
-- Configuring done
-- Generating done
-- Build files have been written to: /home/songyou/.cache/pykeops-1.4.2-cpython-37/build-libKeOpstorch4381e0bfc3

--------------------- ----------- -----------------

--------------------- MAKE DEBUG -----------------
Command '['cmake', '--build', '.', '--target', 'libKeOpstorch4381e0bfc3', '--', 'VERBOSE=1']' returned non-zero exit status 2.
/usr/bin/cmake -S/home/songyou/anaconda3/envs/deep_view_aggregation/lib/python3.7/site-packages/pykeops -B/home/songyou/.cache/pykeops-1.4.2-cpython-37/build-libKeOpstorch4381e0bfc3 --check-build-system CMakeFiles/Makefile.cmake 0
/usr/bin/make -f CMakeFiles/Makefile2 libKeOpstorch4381e0bfc3
...
make[1]: Leaving directory '/home/songyou/.cache/pykeops-1.4.2-cpython-37/build-libKeOpstorch4381e0bfc3'

--------------------- ----------- -----------------
Done.
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-7-1f2d3c55a1c4> in <module>
     10 data_list, image_data_list = MapImages(r_min=r_min, r_max=r_max, exact=exact, use_cuda=use_cuda, camera=camera)(data_list, image_data_list)
     11 # image_data_list[0].mappings.features = None  # uncomment to visualize only densities and occlusions
---> 12 data_list, image_data_list = NeighborhoodBasedMappingFeatures(k=k_list, voxel=voxel, density=True, occlusion=True, use_faiss=False, use_cuda=use_cuda)(data_list, image_data_list)

~/workspace/DeepViewAgg/torch_points3d/core/data_transform/multimodal/image.py in __call__(self, data, images)
     43                 f"List(Data) items and List(SameSettingImageData) must " \
     44                 f"have the same lengths."
---> 45             out = [self.__call__(da, im) for da, im in zip(data, images)]
     46             data_out, images_out = [list(x) for x in zip(*out)]
     47         elif isinstance(images, ImageData) and not self._PROCESS_IMAGE_DATA:

~/workspace/DeepViewAgg/torch_points3d/core/data_transform/multimodal/image.py in <listcomp>(.0)
     43                 f"List(Data) items and List(SameSettingImageData) must " \
     44                 f"have the same lengths."
---> 45             out = [self.__call__(da, im) for da, im in zip(data, images)]
     46             data_out, images_out = [list(x) for x in zip(*out)]
     47         elif isinstance(images, ImageData) and not self._PROCESS_IMAGE_DATA:

~/workspace/DeepViewAgg/torch_points3d/core/data_transform/multimodal/image.py in __call__(self, data, images)
     54                 images = ImageData([images])
     55             # data_out, images_out = self._process(data.clone(), images.clone())
---> 56             data_out, images_out = self._process(data, images)
...
~/anaconda3/envs/deep_view_aggregation/lib/python3.7/importlib/_bootstrap.py in _find_and_load(name, import_)

~/anaconda3/envs/deep_view_aggregation/lib/python3.7/importlib/_bootstrap.py in _find_and_load_unlocked(name, import_)

ModuleNotFoundError: No module named 'libKeOpstorch4381e0bfc3'

Next, the same error also happens in kitti360_visualization.ipynb, in the cell where the dataset is processed:

# Dataset instantiation
start = time()
dataset = KITTI360DatasetMM(cfg.data)
# print(dataset)
print(f"Time = {time() - start:0.1f} sec.")

It seems to me that the error is coming from the data processing step, but I am not sure how to solve this. Could you please provide some advice on how I can solve this issue? Thanks so much in advance!

Best
Songyou
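For anyone hitting the same ModuleNotFoundError, a minimal sketch of one possible fix, assuming the failure comes from a stale or partially built pykeops cache rather than from the DeepViewAgg code itself; if the rebuild fails again, the full CMake/make log should show the underlying compiler error.

# Wipe the compiled-formula cache (here ~/.cache/pykeops-1.4.2-cpython-37) and force a
# clean recompilation; a partial build can leave the 'libKeOpstorchXXXX' module missing.
import pykeops
pykeops.clean_pykeops()        # remove cached builds
pykeops.test_torch_bindings()  # recompile a tiny formula and check the torch bindings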

Accuracy for KITTI-360 inference on Validation dataset

Hello @drprojects, I was able to fix the errors and run the code end to end. However, for the validation dataset I am getting extremely low accuracy (mIoU: 1.79%, OA: 26.91%) when running the kitti360_inference notebook.

Do you know why I am getting an extremely low accuracy even though I made no significant changes to the code?

mm_files folder problem.

Hi, I am trying to run scripts/train_kitti360.py

I met the error shown below under [Error Log].
Actually, I met this error before when trying to run kitti360_inference.ipynb.
I resolved it back then by creating a folder called "mm_folder", since no such folder existed.
But now, trying to run scripts/train_kitti360.py, the issue pops up again even with the folder created.
I don't know where to create this folder for this script, or did I do something wrong?
I've almost finished my task of reviewing your paper with the code, except for this issue.
Please give me any advice to resolve this.
It occurs while executing the following part of the code:

class APIModel(BaseModel):
    def forward(self, *args, **kwargs):
        features = self.backbone(self.input).x  # <==== here

[Error Log]


....
{'mm_time': 0.001059}
{'mm_time': 0.000477}
{'mm_time': 0.000126}
{'mm_time': 0.001301}
Saving mm_files/in_feat_0.pt and mm_files/kernel_16_0.pt
terminate called after throwing an instance of 'c10::Error'
what(): [enforce fail at inline_container.cc:380] . PytorchStreamWriter failed writing file version: file write failed
frame #0: c10::ThrowEnforceNotMet(char const*, int, char const*, std::string const&, void const*) + 0x47 (0x7f03a85196a7 in /home/okssi/anaconda3/envs/deep_view_aggregation_rev2/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: caffe2::serialize::PyTorchStreamWriter::valid(char const*, char const*) + 0xa2 (0x7f039139dc72 in /home/okssi/anaconda3/envs/deep_view_aggregation_rev2/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #2: caffe2::serialize::PyTorchStreamWriter::writeRecord(std::string const&, void const*, unsigned long, bool) + 0xbf (0x7f039139e61f in /home/okssi/anaconda3/envs/deep_view_aggregation_rev2/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #3: caffe2::serialize::PyTorchStreamWriter::writeEndOfFile() + 0xe1 (0x7f039139f141 in /home/okssi/anaconda3/envs/deep_view_aggregation_rev2/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #4: caffe2::serialize::PyTorchStreamWriter::~PyTorchStreamWriter() + 0x115 (0x7f039139f935 in /home/okssi/anaconda3/envs/deep_view_aggregation_rev2/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #5: + 0x3132245 (0x7f0392836245 in /home/okssi/anaconda3/envs/deep_view_aggregation_rev2/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #6: torch::jit::ExportModule(torch::jit::Module const&, std::string const&, std::unordered_map<std::string, std::string, std::hash<std::string>, std::equal_to<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, bool, bool) + 0x374 (0x7f0392835114 in /home/okssi/anaconda3/envs/deep_view_aggregation_rev2/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
.....
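If it helps, a minimal sketch of a defensive fix (assumption: the save path is the relative "mm_files/" shown in the log, resolved against the current working directory, which hydra changes at launch):

import os

# Create the relative output folder before anything tries to write into it; because the
# path is relative, it must exist under the directory the script actually runs from
# (hydra typically switches to an outputs/ working directory at startup).
os.makedirs("mm_files", exist_ok=True)
print("mm_files will be created under:", os.getcwd())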
