
4d-pls's Introduction

4D Panoptic Lidar Segmentation


Project Website with Demo Video.

This repo contains code for the paper 4D Panoptic Lidar Segmentation. The code is based on the PyTorch implementation of KPConv.

Installation

git clone https://github.com/MehmetAygun/4D-PLS
cd 4D-PLS
pip install -r requirements.txt
cd cpp_wrappers
sh compile_wrappers.sh

Data

Create a directory named data in the main directory, and download the SemanticKITTI dataset with labels into it from here.

Also add the semantic-kitti.yaml file to the SemanticKitti folder; you can download the file from here.

Then create the additional center labels using utils/create_center_label.py:

python create_center_label.py

The data folder structure should be as follows:

data/SemanticKitti/
├── semantic-kitti.yaml
└── sequences/
    └── 08/
        ├── poses.txt
        ├── calib.txt
        ├── times.txt
        ├── labels/
        │   ├── 000000.label
        │   └── ...
        └── velodyne/
            ├── 000000.bin
            └── ...

Models

For saving models or using pre-trained models, create a folder named results in the main directory. You can download a pre-trained model from here.

Training

For training, you should modify the config parameters in train_SemanticKitti.py. Most importantly, to get good performance, first train the model with config.pre_train = True for at least 200 epochs, then continue training with config.pre_train = False.

python train_SemanticKitti.py

This will generate a config file and save model checkpoints in the results directory.
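
As a rough sketch, the two stages differ in a single config attribute in train_SemanticKitti.py (see the suggested schedule quoted in the issues further down for learning rates and epoch counts):

    # Stage 1: semantic pre-training, at least ~200 epochs
    config.pre_train = True

    # Stage 2: continue from the stage-1 checkpoint and train the full 4D model
    # config.pre_train = False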

Testing & Tracking

For testing, set the model directory as chosen_log in test_models.py and modify the config parameters as you wish. Then run:

python test_models.py

This will generate semantic and instance predictions for small 4D volumes under test/model_dir. To stitch these small 4D volumes into long tracks, use stitch_tracklets.py:

python stitch_tracklets.py --predictions test/model_dir --n_test_frames 4

This will generate predictions in the SemanticKITTI format under test/model_dir/stitch.
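
For orientation, a minimal sketch of the lines to edit in test_models.py before running the steps above, assuming the chosen_log variable mentioned earlier (the log-folder name below is a placeholder):

    # Point the tester at a trained log directory under results/
    chosen_log = 'results/Log_XXXX-XX-XX_XX-XX-XX'  # placeholder, use your own log folder
    config.n_frames = 4        # frames stacked into each 4D volume
    config.n_test_frames = 4   # must not exceed config.n_frames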

Evaluation

To compute the metrics introduced in the paper, use utils/evaluate_4dpanoptic.py:

python evaluate_4dpanoptic.py --dataset=SemanticKITTI_dir --predictions=output_of_stitch_tracklets_dir --data_cfg=semantic-kitti.yaml

Citing

If you find the code useful in your research, please consider citing:

@InProceedings{aygun20214d,
    author    = {Aygun, Mehmet and Osep, Aljosa and Weber, Mark and Maximov, Maxim and Stachniss, Cyrill and Behley, Jens and Leal-Taixe, Laura},
    title     = {4D Panoptic LiDAR Segmentation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {5527-5537}
}

License

GNU General Public License (http://www.gnu.org/licenses/gpl.html)

Copyright (c) 2021 Mehmet Aygun Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


4d-pls's Issues

evaluation

Something regarding the evaluation code does not make sense to me, and I hope you can help me clarify a few things.

  1. I ran the evaluation code with predictions = GT (i.e., evaluating GT against GT), and I did not get an association score of 1 (100%) as I expected, which is weird:

code output:

/media/nirit/mugiwara/code/4D-PLS/4D-PLS-master/venv/bin/python /media/nirit/mugiwara/code/4D-PLS/4D-PLS-master/utils/evaluate_4dpanoptic.py --dataset=/media/nirit/mugiwara/datasets/SemanticKitti/ --predictions=/media/nirit/mugiwara/datasets/SemanticKitti/ --data_cfg=/media/nirit/mugiwara/datasets/SemanticKitti/semantic-kitti.yaml --split valid --output=/media/nirit/mugiwara/code/4D-PLS/4D-PLS-master/test/Log_2020-10-06_16-51-05_importance_None_str1_bigpug_4/stitch4 
********************************************************************************
INTERFACE:
Data:  /media/nirit/mugiwara/datasets/SemanticKitti/
Predictions:  /media/nirit/mugiwara/datasets/SemanticKitti/
Split:  valid
Config:  /media/nirit/mugiwara/datasets/SemanticKitti/semantic-kitti.yaml
Limit:  None
Min instance points:  50
Output directory /media/nirit/mugiwara/code/4D-PLS/4D-PLS-master/test/Log_2020-10-06_16-51-05_importance_None_str1_bigpug_4/stitch4
********************************************************************************
Ignoring classes:  [0]
[PANOPTIC4D EVAL] IGNORE:  [0]
[PANOPTIC4D EVAL] INCLUDE:  [ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]
Evaluating sequences: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
=== Results ===
LSTQ: 0.828126534143485
S_assoc (LAQ): 0.6857935565525006
Assoc: [0.00 0.90 0.07 0.10 0.31 0.25 0.11 0.18 0.02 0.00 0.00 0.00 0.00 0.00
 0.00 0.00 0.00 0.00 0.00 0.00]
iou: [0.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
 1.00 1.00 1.00 1.00 1.00 1.00]
things_iou: 1.0
stuff_iou: 1.0
S_cls (LSQ): 1.0

Process finished with exit code 0
  2. As I understand it, the instance labels in the data are unique within each class, but not across all classes.
    This means I can have an object with instance id 2 in the class 'car', and within that class no other object will have instance id 2, but I may find another object with instance id 2 in the class 'moving-car', and it will be a different object.

So, in the evaluation code (evaluate_4dpanoptic.py) you first load the instance label like so:

[screenshot of the label-loading code in evaluate_4dpanoptic.py]

(This means that label_inst is not unique across all classes.)
And then later on, when you use it in the code (eval_np.py), combining it with the offset:

[screenshot of the offset combination in eval_np.py]

and here:
[second screenshot from eval_np.py]

We can get the same value (offset * y_inst_in_cl) for two objects of different classes that share the same instance label.
And if the prediction labels match the GT (unique within each class but not across all classes),
then I suspect a problem here, since different objects can receive the same number from the line above.
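
To make the concern concrete, here is a small sketch (plain NumPy, not the repo's evaluation code) of how a per-class instance id can be folded together with the semantic label to obtain a globally unique id; whether eval_np.py needs this depends on whether it already groups instances per class:

    import numpy as np

    sem = np.array([1, 1, 5, 5])    # semantic class per point, e.g. 'car' vs. 'moving-car'
    inst = np.array([2, 2, 2, 2])   # per-class instance id, reused across classes

    # inst alone collides across classes; combining it with the class id does not
    offset = 2 ** 32
    unique_id = sem.astype(np.int64) * offset + inst.astype(np.int64)
    print(np.unique(unique_id))     # two distinct ids, one per (class, instance) pair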

Testing model and time

Hi @MehmetAygun, thank you for your code and contribution to the community. It is really interesting work.

I have run your training and testing code. It seems that the validation process takes quite a long time, maybe one day. Each frame takes more than one minute, while in training each iteration takes no more than 2 seconds. Why is that? Is there no point downsampling for validation, or is there something wrong with my implementation?

Another question is about the trained model. It seems that you only share the pre-trained model but not the fully trained final model; will you share it?

Single-frame panoptic segmentation results on validation set

Hi, I want to ask whether single-frame panoptic segmentation results on the validation set are available. In the paper there are only test-set results. Without validation-set results, there is no reference to compare against, and it is hard to check whether my single-frame model is trained correctly. Thank you!

Training schedule and problem when running test_models.py

Hi @MehmetAygun, I enjoy this paper and work very much. However, I ran into some difficulties when training and testing the model.
Firstly, would you mind telling me your best training schedule? Is it: 200 epochs with pre_train=True and then 800 epochs with pre_train=False, with learning_rate = 1e-4 and batch_size = 1?
Secondly, I ran test_models.py as soon as I had the code and data ready, but it cannot exit the test loop and produces the same output in each iteration (Min potential = 0.0 in each test epoch). In particular, the condition "last_min + 1 < new_min" is never met. Could you please tell me why this happens, or whether I made some mistake?
I would be grateful if you could provide some guidance.

Ground Truth Labels for 4D Volume Formation during Training

Hi @MehmetAygun!
Many thanks for sharing your code. It is really interesting work!

I have a question about the 4D Volume Formation during training. At this point you use the ground truth center and semantic labels to find out which points of the previous frames belong to thing classes and which are close to object center, right?
During testing/validation, on the other hand, you use the predictions of the previous frames for the 4D volume formation (here).
This allows you to use multi-thread data-loaders during training, but not during testing/validation. Do I understand this correctly?
Based on your description of 4D Volume Formation in the paper, I would have thought that we would also use the predicted labels and not the GT labels of the previous frames during training for the 4D Volume Formation. The problem here would be that the training would take much longer, right?

Thank you very much for your help!

Question about testing of the pre-trained model

Hi! May I ask what the expected testing performance of the released pretrained model is? I got LSTQ 62.14 on the validation set using the provided model, which is slightly lower than the LSTQ 62.74 reported in the paper. Is 62.14 also what you got from this released model? Is the small gap with the paper's number expected randomness due to different training runs, or am I not configuring the testing correctly? (I also retrained a model from scratch, which also gives a consistent 62.14 LSTQ.)

My testing setting is:

    config.global_fet = False
    config.validation_size = 200
    config.input_threads = 0
    config.n_frames = 4
    config.n_test_frames = 4 #it should be smaller than config.n_frames
    if config.n_frames < config.n_test_frames:
        config.n_frames = config.n_test_frames
    config.big_gpu = True
    config.dataset_task = '4d_panoptic'
    config.sampling = 'importance'
    config.decay_sampling = 'None'
    config.stride = 1
    config.first_subsampling_dl = 0.061

which I did not change from the provided code. In particular, I want to ask about the first_subsampling_dl term: is there a reason why it is 0.061 instead of 0.06, as in the training configuration? I am testing on a V100 GPU with 32 GB of memory.

Thank you!

4D-PLS on PanopticNuscenes

Hey, I really like your work! Thanks for sharing.
One question. Have you also trained and evaluated 4D-PLS on Panoptic nuScenes? Or only on SemanticKITTI?
If you have also done it on Panoptic nuScenes, would it be possible to share your code for it? I would be very grateful!

Many thanks in advance!

Dataloader structure for other datasets

@MehmetAygun Really enjoyed reading the paper. I am really excited about applying the clustering approach to large point cloud sequences to handle the memory constraint.
I have a LiDAR dataset that has been calibrated, and all point cloud frames are in world coordinates. Up until now, I have been plainly concatenating the point cloud frames and trying to train/test.
I would like to use the clustering method mentioned in the paper and was looking at the SemanticKitti dataloader to understand it. What are the major steps that need to be applied to any dataset to use it with your backbone?
I see the following:

  1. Stacking points and features
  2. grid_subsampling
  3. Randomly drop some points
  4. augmentation_transform
  5. segmentation_inputs

I did not understand how the dataloader's __getitem__ works without using batch_index.
Any direction on how to approach this would be really helpful.

Thank you
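
A minimal sketch of that preprocessing order on a single generic frame, assuming plain NumPy arrays (the helper logic below is a placeholder, not the repo's grid_subsampling or segmentation_inputs implementation):

    import numpy as np

    def prepare_frame(points, features, labels, sub_dl=0.06, drop_ratio=0.1):
        # 1. Stack points and features (several frames would be concatenated here)
        pts = points

        # 2. Grid subsampling placeholder: keep one point per voxel of size sub_dl
        voxels = np.floor(pts / sub_dl).astype(np.int64)
        _, keep = np.unique(voxels, axis=0, return_index=True)

        # 3. Randomly drop some points
        keep = keep[np.random.rand(len(keep)) > drop_ratio]

        # 4. Augmentation placeholder: random rotation around the vertical axis
        theta = np.random.uniform(0.0, 2.0 * np.pi)
        c, s = np.cos(theta), np.sin(theta)
        R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
        pts = pts[keep] @ R.T

        # 5. Segmentation inputs (neighbors, pooling indices, ...) would be built from these
        return pts, features[keep], labels[keep]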

visualization tools

Hi @MehmetAygun @jbehley

Thank you for sharing your excellent open-source code and the visualization video on your homepage.
I would greatly appreciate it if you could provide instructions on how to generate the visualization video for 4D Panoptic Lidar Segmentation predictions or ground truth.

Inquiry on GPU Memory Requirements for 4D Panoptic Lidar Segmentation Training

Dear @MehmetAygun ,

I hope this message finds you well. I am currently studying your work on the 4D Panoptic Lidar Segmentation project, and I find it extremely insightful and promising for advancements in autonomous driving technologies.

I am particularly interested in understanding the hardware requirements for training the models described in your paper. Could you please provide information on the minimum GPU memory specifications needed for effective training? Specifically, I would like to know the minimum GPU memory capacity required to handle the dataset and the computational demands of your approach.

Thank you for your time and assistance. I look forward to your response.

Best regards,
Deli Wang

"RuntimeError: Error(s) in loading state_dict for KPFCNN:" when running test_models.py

Hi, @MehmetAygun ,

Thanks for releasing the package. I got the following error when running test_models.py:

Model Preparation
*****************
Traceback (most recent call last):
  File "test_models.py", line 215, in <module>
    tester = ModelTester(net, chkp_path=chosen_chkp)
  File "/data/code13/4D-PLS/utils/tester.py", line 77, in __init__
    net.load_state_dict(checkpoint['model_state_dict'])
  File "/root2/anaconda3/envs/pytorch1.5_4d_pls/lib/python3.7/site-packages/torch/nn/modules/module.py", line 847, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for KPFCNN:
	size mismatch for head_softmax.mlp.weight: copying a param with shape torch.Size([19, 256]) from checkpoint, the shape in current model is torch.Size([25, 256]).
	size mismatch for head_softmax.batch_norm.bias: copying a param with shape torch.Size([19]) from checkpoint, the shape in current model is torch.Size([25]).

Here is the link to the full log details. Could you give some hints on how to solve this issue?

Thanks~

Questions about training configurations

Hi,

Thanks for open-sourcing this great work. I have had some difficulty reproducing the results by training the model myself; the performance does not match that of the trained model you released. I am training on a Tesla V100, so GPU memory should not be a problem. My evaluation of the released trained model is close to the number reported in the paper, so the evaluation code should be running properly.

My first question is that some configurations used by your pretrained model (found in the parameters.txt file inside the zip of the pretrained weights) are not the same as the defaults specified in train_SemanticKitti.py. The differences I found are listed below:

  • batch_num: 8 (released trained weight) vs. 4 (train_SemanticKitti.py)
  • first_subsampling_dl: 0.06 (released trained weight) vs. 0.12 (train_SemanticKitti.py)
  • in_features_dim: 2 (released trained weight) vs. 3 (train_SemanticKitti.py)
  • n_frames: 4 (released trained weight) vs. 2 (train_SemanticKitti.py)

Which setting shall I use?
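
(For reference, the released-weight values listed above would look like this as train_SemanticKitti.py-style assignments; this simply restates the quoted parameters.txt content, not an official recommendation:)

    config.batch_num = 8
    config.first_subsampling_dl = 0.06
    config.in_features_dim = 2
    config.n_frames = 4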

Second, shall I set reinit_var to True or False after pretraining for the 200 epochs? My understanding is that if you continue training from a pretrained model which was not trained with the variance head, you set reinit_var to True. If we resume training in the middle of the second stage (in which case the variance head has also been trained), we set reinit_var to False. Is that correct?

Third, do you pick the best model in a training session from the intermediate checkpoints, or do you always use the latest model after the 1000 epochs of training? It seems that the intermediate checkpoints are not synchronized with the validation, so we do not have information about the performance of the intermediate models.

Fourth, is there any way to know whether my model is being trained properly? Currently I have to train for all 1000 epochs, evaluate the final model, and compare with the released trained model, which is time-consuming. Do you have any suggestions for getting a rough idea at an earlier stage (e.g., what is a reasonable loss or accuracy during training or validation at a smaller epoch number)?

Thanks for your help!

Questions about merged_points and test_models.py

Hi @MehmetAygun, I have encountered some other questions this time.

  1. While reading the code, I found something confusing here. During training, the condition if f_inc == 0 or (hasattr(self.config, 'stride') and f_inc % self.config.stride == 0): seems to be met only when f_inc = 0, which means the merged points contain data from only one frame during training. Does this mean you don't merge the temporal information during training?
  2. Because the test process is too slow, I changed here to config.input_threads, but I encountered the error IndexError: index 4073 is out of bounds for axis 0 with size 4071. Does this mean I must set num_workers = 0?
  3. What does "epoch" mean during testing? Does it mean we need to sample and average during testing?
  4. Is this whole testing procedure standard in point cloud segmentation?

I'd appreciate some help!

single-frame 4DPLS model

I would suggest this schedule:

  1. Set previous_training_path as empty, config.pre_train as True, and config.learning_rate = 1e-2, and train a model for about 200 epochs, checking the segmentation accuracy on validation.
  2. Then set previous_training_path to whatever was saved from the previous training, set config.learning_rate = 1e-3 and config.pre_train as False, and fine-tune for about 800 epochs, or run validation once in a while with the saved models to decide when to stop training.
  3. After this is done, you can use the model for testing.

The model that I shared is for testing, not for training. It is fully trained; you can take it and do validation/testing without any further training.

Originally posted by @MehmetAygun in #7 (comment)
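
A hedged sketch of that schedule expressed as config edits in train_SemanticKitti.py (attribute names follow the quote above; the log-folder name is a placeholder):

    # Stage 1: ~200 epochs of semantic pre-training
    previous_training_path = ''
    config.pre_train = True
    config.learning_rate = 1e-2

    # Stage 2: ~800 epochs of fine-tuning from the stage-1 checkpoint
    # previous_training_path = 'Log_XXXX-XX-XX_XX-XX-XX'  # placeholder for the saved log folder
    # config.pre_train = False
    # config.learning_rate = 1e-3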

A question about training

Could you please tell me the difference between the first “config.pre_train=True” and the later “config.pre_train=False”? Thank you!
