
occnerf's People

Contributors

junchengyan, linshan-bin, weiyithu


occnerf's Issues

torch and CUDA version

Hi,

Thanks for your great work! I ran into some problems when installing the torch environment.
My machine runs Ubuntu 22.04 with CUDA 11.3 and 4× RTX 4090 (the 40 series needs at least CUDA 11.3).
My training setting: v1.0-mini dataset, contracted_coord = False, auxiliary_frame = False, render_h = 45, render_w = 80, input_channel = 16, depth-only training.

  1. conda install pytorch==1.9.1 torchvision==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
    This command is from your repo, but it installs the CPU build of torch, and torch.cuda.is_available() returns False.
    [screenshot: torch.cuda.is_available() is False]

  2. conda install pytorch==1.10.0 torchvision==0.11.0 torchaudio==0.10.0 cudatoolkit=11.3 -c pytorch -c conda-forge and
    conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
    These two commands are from the official PyTorch website. They install GPU builds, but both fail with the same error:
    [screenshot of the error]

  3. pip install torch==1.9.1+cu111 torchvision==0.10.1+cu111 torchaudio==0.9.1 -f https://download.pytorch.org/whl/torch_stable.html
    This is from the official PyTorch website. Training succeeds with contracted_coord = True (with a few warnings), but with contracted_coord = False there is a CUDA error, possibly because the build is cu111:
    [screenshot of the CUDA error]

  4. pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113 and conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch
    Both commands are from the official PyTorch website. Training seems to succeed, but there are many CUDA-related warnings. Are these torch and CUDA versions suitable?
    [screenshots of the CUDA warnings]

  5. How should I set the config to train the model with semantics on 4090s (24 GB each)? A little accuracy can be sacrificed.

Thanks for your help!
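For anyone hitting the same issue, a quick sanity check like the following (a minimal sketch, nothing repo-specific) distinguishes a CPU-only wheel from a GPU build before launching training:

```python
import torch

# Report which build is installed and whether it actually sees the GPUs.
print(torch.__version__)           # e.g. "1.11.0+cu113"; a "+cpu" suffix means a CPU-only build
print(torch.version.cuda)          # CUDA version the wheel was built against, or None
print(torch.cuda.is_available())   # False for the CPU build installed in attempt 1
if torch.cuda.is_available():
    print(torch.cuda.device_count(), torch.cuda.get_device_name(0))
```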

Some problems of nusc-depth training code

Thanks for your great work! I have run the training code for depth estimation and found the following two problems:

  1. The loss sometimes becomes NaN when using fp16.
    [screenshot: NaN loss]

  2. The loss does not decrease during training.
    [screenshot: loss curve]

Could you please give me some advice on these problems? I run the nusc-depth training code on 4 GPUs with the same settings as the released code (auxiliary_frame=True and use_fp16=True).
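Not from the released code, but a common way to keep fp16 training from propagating NaNs is torch.cuda.amp with a GradScaler plus a skip on non-finite losses; a minimal sketch, where model, optimizer, batch, and compute_loss are hypothetical placeholders:

```python
import torch

scaler = torch.cuda.amp.GradScaler()  # rescales gradients to avoid fp16 overflow/underflow

def train_step(model, optimizer, batch):
    optimizer.zero_grad(set_to_none=True)
    with torch.cuda.amp.autocast():        # run the forward pass in fp16
        loss = compute_loss(model, batch)  # hypothetical stand-in for the depth loss
    if not torch.isfinite(loss):           # skip the step instead of poisoning the weights
        return None
    scaler.scale(loss).backward()
    scaler.step(optimizer)                 # the step is skipped if gradients are inf/NaN
    scaler.update()
    return loss.item()
```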

Lack of memory on the GPU during training

Hi, I'm reproducing the results with the latest version of the code on 8 A30 GPUs (24 GB of memory each), and it reports CUDA out of memory at the beginning of the first epoch. What could be the cause?

[screenshots of the out-of-memory error]
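Not a fix, but logging allocator statistics around the rendering step (standard torch calls, nothing repo-specific) usually shows which stage drives the peak, e.g. rendering resolution versus voxel features:

```python
import torch

def log_cuda_memory(tag):
    # Current and peak allocations on the default CUDA device, in GiB.
    alloc = torch.cuda.memory_allocated() / 2**30
    peak = torch.cuda.max_memory_allocated() / 2**30
    print(f"[{tag}] allocated={alloc:.2f} GiB, peak={peak:.2f} GiB")

# Call e.g. log_cuda_memory("before render") and log_cuda_memory("after render")
# around the suspected step; torch.cuda.reset_peak_memory_stats() restarts the peak counter.
```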

Generalizability

Hi, thank you for the great work.

I have a question: does the model need to be trained for every single scene, or does it generalize across scenes?

depth training problems when set contracted_coord = False

Hello, thanks for your excellent work! I want to train the model with less GPU memory and in less time, so I set contracted_coord = False and adjusted voxels_size = [16, 200, 200]. But then training fails: at validation the gap between the minimum and maximum depth is very small, and after one or two epochs the rendered depth collapses to a single value. Have you encountered this? How should I adjust the model parameters so that it can train? Many thanks for any help!
[screenshot: rendered depth]

depth training problems

Hello, thanks for your excellent work! When training on custom data with your project (only custom-depth.txt), I ran into some problems. The training loss looks normal, but when validating in the first few epochs I found that the gap between the min and max of the output depth is very small, like the screenshot below:
[screenshot: predicted depth min/max]
Have you encountered this at the start of training, or do I need to tune some parameters for a custom dataset?

How to train with v1.0-mini dataset?

Hi, thanks for your great work!

Because of hardware limitations, I can only train with an 8 GB GPU (Nvidia 4060 laptop) and the v1.0-mini dataset. Metrics are not important; I just want to run it on my own computer.

  1. I changed some hyperparameters and generated train.txt and val.txt for v1.0-mini (the first and last frames of each scene are excluded). It seems to train successfully:
    input_channel: 64 -> 4
    con_channel: 16 -> 1
    encoder: 101 -> 50
    render_h: 180 -> 45
    render_w: 320 -> 80
    [screenshot: training log]

  2. Do you have any advice on training with the v1.0-mini dataset?

I generated train.txt, val.txt, and the depth ground truth of the v1.0-mini dataset with tools/export_gt_depth_nusc.py.

Do I need to change the ground-truth occupancy labels in ./data/nuscenes/gts, the 2D semantic labels in ./data/nuscenes/nuscenes_semantic, and the checkpoint in ./ckpts?

  1. There are 700 training scenes and 150 validation scenes in the v1.0-full dataset, so why are there only 20096 sample tokens (about 500 scenes) in your train.txt?
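For reference, the sample tokens of a mini split can be dumped with the nuscenes devkit roughly like this (a sketch under my own assumptions; the exact train.txt format the repo expects is defined by tools/export_gt_depth_nusc.py, which remains the authoritative tool):

```python
from nuscenes.nuscenes import NuScenes

# Assumes the devkit is installed and the mini split lives under ./data/nuscenes.
nusc = NuScenes(version='v1.0-mini', dataroot='./data/nuscenes', verbose=False)

tokens = []
for scene in nusc.scene:
    token = scene['first_sample_token']
    samples = []
    while token:
        sample = nusc.get('sample', token)
        samples.append(sample['token'])
        token = sample['next']  # empty string on the last frame
    tokens.extend(samples[1:-1])  # drop the first and last frame of each scene

with open('train.txt', 'w') as f:
    f.write('\n'.join(tokens))
```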

Why do you flip the xyz here?

ind_norm_sem = ind_norm_sem.flip((-1,))
In my opinion, the xyz has already been converted to the world coordinate frame. So if you flip the xyz, the function will sample the grid along the wrong axes (x along the z-axis, z along the x-axis).
Is there anything wrong with my reasoning?
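For context (my own sketch, not the authors' answer): F.grid_sample reads the last dimension of the grid in (x, y, z) order, where x indexes the last (W) axis of the input volume. If the voxel features are stored with axes ordered X, Y, Z (X in the depth slot, as the ZYX -> XYZ permute elsewhere in the code suggests), then (x, y, z) coordinates must be flipped to (z, y, x) before sampling. A toy check:

```python
import torch
import torch.nn.functional as F

# Toy volume with axes ordered (X, Y, Z) in the (D, H, W) slots.
vol = torch.zeros(1, 1, 4, 4, 4)   # (N, C, D=X, H=Y, W=Z)
vol[0, 0, 3, 0, 0] = 1.0           # a feature at x=3, y=0, z=0

# Normalized point at x=1, y=-1, z=-1 (grid_sample expects coords in [-1, 1]).
xyz = torch.tensor([1.0, -1.0, -1.0])

# grid[..., 0] indexes W, grid[..., 1] indexes H, grid[..., 2] indexes D,
# so for an XYZ-ordered volume the coordinate must be flipped to (z, y, x).
grid = xyz.flip((-1,)).view(1, 1, 1, 1, 3)
print(F.grid_sample(vol, grid, align_corners=True))  # ~1.0 only with the flip
```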

Contact Author

Hello author, I am from Wuhan University of Technology. I would like to discuss some technical details of your algorithm and its applications in other fields with you. Could you please provide an email address you use regularly so that we can communicate? Thank you.

depth training problems

When training on custom data, the output depth values are strange: the difference between min and max is very small.
[screenshot: predicted depth min/max]

why cumsum the prob to render depth?

Hello author, I'm not very familiar with NeRF volume rendering. Could you explain why the probabilities are accumulated with a cumulative sum here to render depth? What is the corresponding mathematical formula, and what is its physical meaning? In my opinion, a cumulative product would make more sense.

```python
def get_density(self, rays_o, rays_d, Voxel_feat, is_train, inputs):
    dtype = torch.float16 if self.opt.use_fp16 else torch.float32
    device = rays_o.device
    rays_o, rays_d, Voxel_feat = rays_o.to(dtype), rays_d.to(dtype), Voxel_feat.to(dtype)

    reg_loss = {}
    eps_time = time.time()
    with torch.no_grad():
        rays_o_i = rays_o[0, ...].flatten(0, 2)  # H x W x 3
        rays_d_i = rays_d[0, ...].flatten(0, 2)  # H x W x 3
        rays_pts, mask_outbbox, z_vals, rays_pts_depth = self.sample_ray(rays_o_i, rays_d_i, is_train=is_train)

    dists = rays_pts_depth[..., 1:] - rays_pts_depth[..., :-1]  # [num pixels, num points - 1]
    dists = torch.cat([dists, 1e4 * torch.ones_like(dists[..., :1])], dim=-1)  # [num pixels, num points]

    sample_ret = self.grid_sampler(rays_pts, Voxel_feat, avail_mask=~mask_outbbox)
    if self.use_semantic:
        if self.opt.semantic_sample_ratio < 1.0:
            geo_feats, mask, semantic, mask_sem, group_num, group_size = sample_ret
        else:
            geo_feats, mask, semantic = sample_ret
    else:
        geo_feats, mask = sample_ret

    if self.opt.render_type == 'prob':
        weights = torch.zeros_like(rays_pts[..., 0])
        weights[:, -1] = 1  # guarantee the ray terminates by the last sample
        geo_feats = torch.sigmoid(geo_feats)  # per-sample occupancy probability
        if self.opt.last_free:
            geo_feats = 1.0 - geo_feats  # the last channel is the probability of being free
        weights[mask] = geo_feats

        # accumulate: the clamped cumulative sum is the probability that the
        # ray has terminated by sample i; diff turns it into per-sample weights
        weights = weights.cumsum(dim=1).clamp(max=1)
        alphainv_fin = weights[..., -1]
        weights = weights.diff(dim=1, prepend=torch.zeros((rays_pts.shape[:1])).unsqueeze(1).to(device=device, dtype=dtype))
        depth = (weights * z_vals).sum(-1)
        rgb_marched = 0
```
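One reading of this (not an authoritative answer): sigmoid(geo_feats) is a per-sample occupancy probability rather than a density, so the clamped cumulative sum plays the role of the CDF of the ray's termination point, and diff recovers per-sample weights; once the running sum reaches 1 the ray is saturated and all later samples get zero weight. Classic NeRF instead builds weights multiplicatively from alphas. A small illustration of the two schemes:

```python
import torch

p = torch.tensor([0.1, 0.3, 0.9, 0.5])  # per-sample occupancy probabilities

# 'prob' rendering as in the snippet above: clamped CDF, then its increments.
cdf = p.cumsum(-1).clamp(max=1)                    # [0.1, 0.4, 1.0, 1.0]
w_prob = cdf.diff(dim=-1, prepend=torch.zeros(1))  # [0.1, 0.3, 0.6, 0.0]

# Classic NeRF compositing: transmittance is a cumulative *product*.
trans = torch.cumprod(torch.cat([torch.ones(1), 1 - p[:-1]]), -1)
w_nerf = p * trans                                 # alpha_i * prod_{j<i}(1 - alpha_j)

print(w_prob, w_prob.sum())  # both weight vectors sum to <= 1
print(w_nerf, w_nerf.sum())
```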

Question about photometric loss.

Thanks for sharing this nice work!
Sorry, I can't run the code directly in my environment. One question I have is what the input to the photometric loss looks like.

In loss = L1_loss(pred, target), are pred and target 3-channel RGB images?
Do the values range from 0 to 255 (original image), or are they normalized (by mean/std)? I don't see a place in the code where images are normalized.
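For reference, a minimal version of the loss being asked about, under the assumption (not confirmed from this repo's code) that images are scaled to [0, 1] rather than mean/std-normalized:

```python
import torch

def photometric_l1(pred, target):
    # pred/target: (B, 3, H, W) RGB tensors, assumed scaled to [0, 1]
    # (raw pixels / 255, with no mean/std normalization).
    return torch.abs(pred - target).mean()
```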

Why do you mask the feat here?

mask_mems = (torch.abs(feat_mems) > 0).float()
feat_mem = basic.reduce_masked_mean(feat_mems, mask_mems, dim=1) # B, C, Z, Y, X
feat_mem = feat_mem.permute(0, 1, 4, 3, 2) # [0, ...].unsqueeze(0) # ZYX -> XYZ
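For context, a masked mean of this form (my sketch, not necessarily the repo's exact basic.reduce_masked_mean) averages only over valid entries, so voxels that no camera projects features into (all-zero slots) do not drag the camera-wise mean toward zero:

```python
import torch

def reduce_masked_mean(x, mask, dim, eps=1e-6):
    # Mean of x along `dim`, counting only entries where mask == 1.
    # With dim = the camera axis, a voxel seen by 2 of 6 cameras is
    # averaged over those 2 cameras instead of all 6.
    return (x * mask).sum(dim) / (mask.sum(dim) + eps)
```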
