redal's People

Contributors

tsunghan-wu

redal's Issues

Semantic KITTI training configuration

Hi authors, thanks for the interesting work. I am having trouble running the ReDAL benchmark for Semantic KITTI. I've processed the data according to the steps with no errors, but I'm hitting the following error when training:

/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [31,0,0] Assertion `t >= 0 && t < n_classes` failed.

I ran training with:

CUDA_VISIBLE_DEVICES=0 python3 train_region_active.py -n semantic_kitti -d <dataset location> --active_method ReDAL --ReDAL_config_path active_selection/ReDAL_SemanticKITTI.config --num_classes 19

I tried with num_classes as 20 as well and got the same error.

After some debugging, it appears that the maximum of targets = batch['targets'].F.long().cuda(non_blocking=True) at line 86 of base_agent.py is 255, which exceeds the 19 classes of Semantic KITTI. I see the same range of values, [0, 255], for the labels after adding print statements in region_dataset.py. I am not sure whether this is the intended label range, but the mismatch between the number of classes and the label range should fail when the labels are fed directly into torch.nn.CrossEntropyLoss(ignore_index=self.args.ignore_idx) at line 44 of base_agent.py.
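
The mismatch described above can be checked before training with a small sanity test. The following is a minimal sketch, not part of the ReDAL code: find_invalid_labels is a hypothetical helper, and treating 255 as the ignore index is an assumption based only on the observed maximum label value. It reports label values the loss would reject, that is, any value outside [0, num_classes) that is not the ignore index.

```python
def find_invalid_labels(labels, num_classes, ignore_index):
    """Return sorted label values that are neither valid classes nor the ignore index.

    Hypothetical helper for illustration; `labels` is any iterable of ints.
    """
    invalid = set()
    for value in labels:
        if value == ignore_index:
            continue  # skipped by the loss, so always acceptable
        if not 0 <= value < num_classes:
            invalid.add(value)
    return sorted(invalid)


if __name__ == "__main__":
    # Toy labels only; real values would come from batch['targets'].
    toy_labels = [0, 3, 18, 255, 255, 7]
    # Case 1: 255 is the ignore index (assumed), so it is not reported.
    print(find_invalid_labels(toy_labels, num_classes=19, ignore_index=255))
    # Case 2: a different ignore index, so 255 is out of range for 19 classes.
    print(find_invalid_labels(toy_labels, num_classes=19, ignore_index=-100))
```

If 255 shows up in the result for the value of self.args.ignore_idx actually in use, that would point to an ignore index mismatch rather than a wrong --num_classes setting.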

The mIoU result in S3DIS

Hi, I ran ReDAL on S3DIS, but no matter how many times I try, I cannot reproduce the results in the paper. My best runs reach only 38.2 at 5%, 43.2 at 7%, 45.5 at 9%, and 48.9 at 11%, much lower than the paper reports. The results also seem very unstable: they fluctuate by about 2% to 3% between experiments.

I ran it on two RTX 3090 GPUs (24 GB) with the same experiment settings as the paper (the code's default settings). Others in issue #5 said they could reproduce the results on two 24 GB GPUs (V5000), so I don't doubt the authenticity of the experimental results.

My guess is that the superpoint division is not stable, which leads to different experimental results, but I have not been able to obtain superpoints that achieve the performance presented in the paper. Do you have any other suggestions for reproducing your model's performance?

About config file

Hi, thanks for open-sourcing this brilliant work.

But it seems the ReDAL config file is missing. Could you provide it?

Many thanks~

Question about the S3DIS Validation Set

Thank you for your great work!
In your code I notice that you divide the S3DIS dataset into a train set and a val set (Area 5, as other projects do). But when a new active learning iteration begins, you choose the checkpoint with the best mIoU on the val set to initialize the parameters for that iteration. I'm wondering whether this may cause information leakage and make the model overfit.

Question about memory

Hello, first, thank you for your excellent work. I would like to ask you a few questions.

  1. When I run your program on S3DIS, memory usage increases greatly after each active learning iteration, eventually causing a CUDA out-of-memory error. Do you see the same when training? I train the network on four 1080 Ti GPUs with the batch size set to 4.
  2. Even when the second iteration runs successfully, a single epoch is much slower. Is this normal?

Distributed training lowers performance

Hello, first of all thanks for sharing your codebase!
We've been testing it for a while and it's working well for us.
Unfortunately, we've noticed that turning on distributed training degrades performance significantly on our setup.
Running fully supervised on the S3DIS dataset with spvcnn as the model, we get ~62% validation mIoU.
With the same hyperparameters and distributed_training on 4 GPUs, training is much faster, but we only get ~50%.
After tweaking some hyperparameters and increasing the number of training epochs, the best we got was ~56% (with batch size 2 and lr 0.005).

Now we're wondering whether you used distributed training and noticed similar performance drops.
Or are there other parameters that need to be adjusted when using distributed training?

Thanks in advance!
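
One thing that changes when distributed training is switched on is the effective batch size, which is the per-GPU batch size multiplied by the number of processes. The sketch below only illustrates that arithmetic together with the linear learning-rate scaling heuristic often used in data-parallel training. Whether this explains the drop reported here is an assumption and has not been confirmed by the authors; the function names and the base values in the example are hypothetical.

```python
def effective_batch_size(per_gpu_batch_size, world_size):
    """Total number of samples contributing to one optimizer step."""
    return per_gpu_batch_size * world_size


def linearly_scaled_lr(base_lr, base_batch_size, new_batch_size):
    """Scale the learning rate in proportion to the batch size (a heuristic)."""
    return base_lr * new_batch_size / base_batch_size


if __name__ == "__main__":
    # Hypothetical values for illustration, not taken from the ReDAL configs.
    single = effective_batch_size(per_gpu_batch_size=2, world_size=1)
    multi = effective_batch_size(per_gpu_batch_size=2, world_size=4)
    print(single, multi)
    print(linearly_scaled_lr(base_lr=0.005, base_batch_size=single, new_batch_size=multi))
```

Comparing single-GPU and multi-GPU runs at the same effective batch size is one way to separate a batch-size effect from other differences introduced by the distributed code path.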

Error 404 when cloning repository

Hi, I am getting the following error when running git clone on the repository:

git clone https://github.com/tsunghan-wu/ReDAL
Cloning into 'ReDAL'...
remote: Enumerating objects: 193, done.
remote: Counting objects: 100% (193/193), done.
remote: Compressing objects: 100% (137/137), done.
remote: Total 193 (delta 55), reused 182 (delta 50), pack-reused 0
Receiving objects: 100% (193/193), 29.03 MiB | 20.89 MiB/s, done.
Resolving deltas: 100% (55/55), done.
Checking out files: 100% (153/153), done.
Downloading data_preparation/region_division/region_label_data.json (732 KB)
Error downloading object: data_preparation/region_division/region_label_data.json (0046aba): Smudge error: Error downloading data_preparation/region_division/region_label_data.json (0046aba13aac0d39b798b7df19003f3c52e0aae5d0578b6ddc59f40905d12de1): [0046aba13aac0d39b798b7df19003f3c52e0aae5d0578b6ddc59f40905d12de1] Object does not exist on the server: [404] Object does not exist on the server

Running git lfs pull manually results in:

~/code/ReDAL$ git lfs pull
Git LFS: (0 of 0 files, 4 skipped) 0 B / 0 B, 49.13 MB skipped
[8ab4c6ad5d9833f71c2244ce96772041c668546b5911826fe13a21158222874e] Object does not exist on the server: [404] Object does not exist on the server
[e78af34772365164a11efcd11d758bb4036df0e51c1aabbb24ca20a9326f72f8] Object does not exist on the server: [404] Object does not exist on the server
[0046aba13aac0d39b798b7df19003f3c52e0aae5d0578b6ddc59f40905d12de1] Object does not exist on the server: [404] Object does not exist on the server
[0f831139e7bd89c360b69462b6bbba3722ad56f27ac9a2dd455f329b9d5360b6] Object does not exist on the server: [404] Object does not exist on the server
error: failed to fetch some objects from 'https://github.com/tsunghan-wu/ReDAL.git/info/lfs'
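
The 404 responses indicate that the Git LFS objects are missing on the server, so retrying the clone will not fetch them. Cloning with the GIT_LFS_SKIP_SMUDGE=1 environment variable set should at least check out the rest of the repository, leaving LFS-tracked files as small pointer files. The sketch below is not part of ReDAL: find_lfs_pointers is a hypothetical helper that lists such pointer files, relying on the fact that they begin with the Git LFS spec version line, so it is clear which data still has to be obtained another way.

```python
import os

LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file at `path` looks like a Git LFS pointer file."""
    with open(path, "rb") as handle:
        return handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_lfs_pointers(root):
    """Walk `root` and return relative paths of files that are still LFS pointers."""
    pointers = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Git's internal directory.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            if is_lfs_pointer(full_path):
                pointers.append(os.path.relpath(full_path, root))
    return sorted(pointers)


if __name__ == "__main__":
    # "ReDAL" is the clone directory from the log above.
    if os.path.isdir("ReDAL"):
        for relative_path in find_lfs_pointers("ReDAL"):
            print(relative_path)
```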

data preparation for semantic_kitti

Hi, thanks for your great work!

I'm getting errors when trying to prepare data for SemanticKITTI in Step 3 (Calculate point cloud properties). gen_surface_variation.py only contains operations for S3DIS. When I modify the dataset setting for semantic_kitti, I get the following error:

Traceback (most recent call last):
  File "gen_surface_variation.py", line 58, in <module>
    for fname in os.listdir(coords_dir):
FileNotFoundError: [Errno 2] No such file or directory: '/data/semantic_kitti/dataset/sequences/00/coords'
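
The traceback shows the script listing a coords directory that does not exist under the sequence folder. As a minimal sketch (not the authors' fix), the check below reports which sequence folders lack an expected subdirectory before any processing starts, which helps tell a wrong path apart from a skipped preprocessing step. The helper name missing_subdirs is hypothetical, and the assumption that every sequence should contain a coords folder is taken only from the path in the traceback.

```python
import os


def missing_subdirs(sequences_dir, subdir_name):
    """Return sorted sequence names under `sequences_dir` that lack `subdir_name`."""
    missing = []
    for sequence in sorted(os.listdir(sequences_dir)):
        sequence_path = os.path.join(sequences_dir, sequence)
        if not os.path.isdir(sequence_path):
            continue  # ignore stray files
        if not os.path.isdir(os.path.join(sequence_path, subdir_name)):
            missing.append(sequence)
    return missing


if __name__ == "__main__":
    # Path taken from the traceback above; adjust to the local dataset root.
    sequences_dir = "/data/semantic_kitti/dataset/sequences"
    if os.path.isdir(sequences_dir):
        print(missing_subdirs(sequences_dir, "coords"))
```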

Image segmentation

@tsunghan-mama Hi, thanks for sharing this wonderful code base. I have a few queries:

  1. Does this source code support, or can it be used for, LiDAR point cloud data in automotive driving scenarios?
  2. Does this source code also support 2D image segmentation?

Thanks in advance.
