
dsnerf's People

Contributors

dunbar12138, junyanz


dsnerf's Issues

Scene with known camera poses

Hello, thank you for this project :)

I ran into a problem while training on a scene.

When I extract the camera poses from COLMAP, this is the output:

kitti_exp_2_spiral_100000_rgb.mp4

The result is not very crisp, since the scene is a bit hard to learn.

But when I use the camera poses provided with the dataset, I get a result like this:

kitti_exp_pose_1_spiral_025000_rgb.mp4

The difference between the two is very large. I converted the ground-truth camera poses to the [right, up, backwards] coordinate system.

What am I doing wrong?

Thanks.

depth loss

Hello! I found that you used a KL-divergence penalty instead of the MSE loss in your newest paper. Are you going to update the code?
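For reference, a rough sketch of the KL-style ray-distribution depth loss described in the paper (a Gaussian centered on each keypoint depth, weighted against the ray termination weights). This only illustrates the formula; it is not necessarily the repository's exact implementation, and the tensor names are assumptions.

    import torch

    def ray_distribution_depth_loss(weights, t_vals, target_depth, target_err, eps=1e-10):
        # weights:      (N_rays, N_samples) ray termination weights h_k
        # t_vals:       (N_rays, N_samples) sample distances along each ray
        # target_depth: (N_rays,) COLMAP keypoint depth per supervised ray
        # target_err:   (N_rays,) depth standard deviation (e.g. from reprojection error)
        dists = t_vals[..., 1:] - t_vals[..., :-1]
        dists = torch.cat([dists, dists[..., -1:]], dim=-1)          # pad the last interval
        gauss = torch.exp(-(t_vals - target_depth[:, None]) ** 2 /
                          (2.0 * target_err[:, None] ** 2))
        # E_P[-log h] ~= sum_k -log(h_k) * N(t_k; d, sigma^2) * dt_k
        loss = -torch.log(weights + eps) * gauss * dists
        return loss.sum(dim=-1).mean()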

Performance aspect of w/o batching for few-shot learning

I think the released code uses batching (use_batching=True) for the depth-supervision setting.
Is there a performance drop when batching is not used (especially for few-shot learning with sparse depth supervision)?

Thank you

Directly use Redwood depth for supervision

Hi, thanks for your excellent work!
I found that one of your sets of experiments was trained using Redwood depth directly as the supervision source instead of COLMAP. I would like to read the code for that part of the work; where can I find it? Thanks!


DSNeRF for inward looking scenes?

Does DS-NeRF also work for inward-looking scenes?

I've tried to make it work by simply setting --spherify=True, but it gives awful results, unlike for forward-facing scenes.

I'm not sure whether I should try changing some parameters or if DS-NeRF is just not meant for this. If so, why?

PS: Thanks very much for your work and code.

Meaning of the bd_factor

Hi,

thank you for open-sourcing the code of this great work! I was wondering if you could provide some insight on why the depth from COLMAP is scaled using the bd_factor and how to appropriately set its value?

Should bd_factor be scene-dependent?

Thank you for your help.
Zan
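For context, a minimal sketch of the LLFF-style rescaling that bd_factor controls, assuming the usual load_llff_data convention (the whole scene is rescaled so the closest COLMAP bound sits near 1/bd_factor, and the COLMAP depths must then be scaled by the same factor to stay consistent). Names here are illustrative, not the repository's exact code.

    import numpy as np

    def rescale_scene(poses, bds, colmap_depths, bd_factor=0.75):
        # poses: (N, 3, 5) camera poses; bds: (N, 2) near/far bounds from COLMAP
        sc = 1.0 if bd_factor is None else 1.0 / (bds.min() * bd_factor)
        poses = poses.copy()
        poses[:, :3, 3] *= sc                       # rescale camera translations
        bds = bds * sc                              # rescale near/far bounds
        depths = [d * sc for d in colmap_depths]    # keep sparse depths in the same units
        return poses, bds, depths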

NDC space and depth loss

I am using NDC space and I have depth values collected from a sensor. Should I normalize these values to [0, 1], or should I convert them into NDC space?
How should the depth loss be calculated when using NDC space, given that depth_col lies between 0 and 1 in NDC space while target_depth is larger than 1?

Thanks.
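For reference, a minimal sketch of the mapping implied by the standard NeRF forward-facing NDC parameterization (rays re-origined at the near plane, appendix C of the NeRF paper): a metric depth z maps to t = 1 - near/z, which lies in [0, 1). This is the general derivation only; double-check it against this repo's ndc_rays() and depth-loss conventions before using it.

    import numpy as np

    def metric_depth_to_ndc(z, near):
        # z: metric depth along the camera axis (z >= near); near: near-plane distance
        # t = 0 at z = near, and t -> 1 as z -> infinity
        z = np.asarray(z, dtype=np.float32)
        return 1.0 - near / np.maximum(z, near)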

Ray distribution loss (sigma_loss) is not used?

Hi, thanks for sharing the great work.

I ran with the default config file and it works well. However, it uses the MSE loss for depth supervision, and the ray distribution loss is not used by default. I tried to enable it by setting sigma_loss = True in the config file, but it raises the error below:

File "run_nerf.py", line 993, in train
    sigma_loss = extras_col['sigma_loss'].mean()
KeyError: 'sigma_loss'

Could you explain why sigma_loss is not used by default and give some instructions for enabling it? Or am I missing something?

Thanks for your help.

Unreasonable results with the pre-trained model

Hi, thanks for your great work and code!

I'm trying to reproduce the demo results with the pre-trained model, but I got unreasonable frame outputs as below:
[screenshot of the corrupted frames]

The checkpoint fern_2v/020000.tar was loaded. The only modification to the code is replacing "from torchsearchsorted import searchsorted" with "from torch import searchsorted", plus the corresponding usage of it.
Do you have any idea what may have gone wrong here?

Thanks in advance!
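For reference, the usual drop-in replacement for the old torchsearchsorted extension is the built-in op (PyTorch >= 1.6), as in the minimal sketch below; whether this substitution alone explains the corrupted frames is a separate question.

    import torch

    def searchsorted_right(cdf, u):
        # replacement for torchsearchsorted.searchsorted(cdf, u, side='right')
        return torch.searchsorted(cdf.contiguous(), u.contiguous(), right=True)

    # tiny usage example
    cdf = torch.tensor([[0.0, 0.2, 0.5, 1.0]])
    u = torch.tensor([[0.1, 0.6]])
    print(searchsorted_right(cdf, u))               # tensor([[1, 3]])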

No depth is generated in load_colmap_depth.py

Hi,
Thanks for your great work! But I've got some issues here:
When loading the ETH3D courtyard scene, the pose-bounds file could be generated; I used images.bin and points3D.bin from the dataset. However, when I start training, I find that there is no depth for this scene and the terminal prints zero for every photo. I haven't tested other scenes. Is there something I've done wrong?
Shengkun Tang
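For reference, a rough sketch of how per-image sparse depths come out of a COLMAP model: a keypoint's depth is the z-component of its 3D point in camera space, so if this yields zero points for every image, the COLMAP model simply has no registered 3D observations for those images. The reader functions below follow the standard COLMAP Python scripts and are assumptions if the repo's colmap_read_model.py differs.

    import numpy as np
    from colmap_read_model import read_images_binary, read_points3d_binary  # assumed reader

    def sparse_depths(sparse_dir):
        images = read_images_binary(f"{sparse_dir}/images.bin")
        points = read_points3d_binary(f"{sparse_dir}/points3D.bin")
        depths = {}
        for im in images.values():
            d = []
            for pid in im.point3D_ids:
                if pid == -1 or pid not in points:
                    continue                            # keypoint without a registered 3D point
                R = im.qvec2rotmat()                    # world-to-camera rotation
                z = (R @ points[pid].xyz + im.tvec)[2]  # camera-space depth of the point
                if z > 0:
                    d.append(z)
            depths[im.name] = np.array(d)
        return depths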

What does the code in lines 50-51 of pose_utils.py mean?

must switch to [-u, r, -t] from [r, -u, t], NOT [r, u, -t]

poses = np.concatenate([poses[:, 1:2, :], poses[:, 0:1, :], -poses[:, 2:3, :], poses[:, 3:4, :], poses[:, 4:5, :]], 1)

The comment is not self-explanatory, so what does this transform mean?
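For reference, an explicit version of the same shuffle applied to a single 3x5 pose (rotation columns | translation | [h, w, f]). The interpretation below is the usual one for COLMAP/LLFF conventions and is stated as an assumption, not a definitive reading of the repo.

    import numpy as np

    def colmap_to_llff_axes(pose_3x5):
        # COLMAP camera-to-world rotation columns: [right, down, forward] = [r, -u, t]
        # LLFF expects:                            [down, right, backwards] = [-u, r, -t]
        r, d, f, trans, hwf = [pose_3x5[:, i:i + 1] for i in range(5)]
        return np.concatenate([d, r, -f, trans, hwf], axis=1)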

How to run COLMAP with ground truth camera poses on DTU?

Hi there, thanks for sharing your work!

In your paper, it is said that the sparse point clouds are generated by COLMAP with ground truth camera poses.
Do the ground-truth camera poses mean the cameras.npz of each scan in DTU?
If so, how do I pass cameras.npz to COLMAP for reconstruction? It looks like the gen_poses function does not accept cameras.npz as an input.

Thanks.

Running python imgs2poses.py ERROR

With only one image in data/f2/images,
I ran python imgs2poses.py data/f2
and got:
Need to run COLMAP
Features extracted
Features matched
ERROR: failed to create sparse model
Traceback (most recent call last):
File "imgs2poses.py", line 18, in
gen_poses(args.scenedir, args.match_type)
File "/data/NeRF_xy/llff/poses/pose_utils.py", line 268, in gen_poses
run_colmap(basedir, match_type)
File "/data/NeRF_xy/llff/poses/colmap_wrapper.py", line 71, in run_colmap
map_output = ( subprocess.check_output(mapper_args, universal_newlines=True) )
File "/opt/conda/lib/python3.6/subprocess.py", line 356, in check_output
**kwargs).stdout
File "/opt/conda/lib/python3.6/subprocess.py", line 438, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/usr/local/bin/colmap', 'mapper', '--database_path', 'data/f2/database.db', '--image_path', 'data/f2/images', '--output_path', 'data/f2/sparse', '--Mapper.num_threads', '16', '--Mapper.init_min_tri_angle', '4', '--Mapper.multiple_models', '0', '--Mapper.extract_colors', '0']' returned non-zero exit status 1.
Do you know how to solve this?
By the way, the sparse/0/project.ini of fern_2v on your website has nearly 200 lines, while the one I downloaded from Google Drive has only about 65 lines. Do you know why?

missing config files

Sorry, I can't find the config files for your results (only the weight files can be found), which makes it difficult to reproduce them. Can you share your config.txt and args.txt for the LLFF dataset?

Shooting angle problem and camera internal parameter adjustment

Hi, thanks for your excellent work, while reproducing your paper, I have the following three questions:

  1. In which file should I modify the camera intrinsics?
  2. How do you choose the camera angles when creating the dataset?
  3. During training, the following four videos are generated (see picture); do they have to be generated? I always run out of memory when generating and rendering video during training, and when I comment out that part of the code, the reproduced results are not good.
    [screenshot of the four generated videos]
    Looking forward to your reply

colmap_depth.npy

Hello, when I use imgs2poses.py it does not produce colmap_depth.npy.
How can I solve this problem?

Which views did you use in the paper exactly?

Take Table 1 for example: which 2, 5, and 10 views did you use to compute those results? Did you select the views randomly, or hand-pick them? Could you please provide a link to download them? Thanks a lot!

Standard rendering demo mentioned in the tutorial outputs corrupted images and video.

No code changes, just followed the README.

I downloaded the pre-trained model to the ./logs folder and ran:

python run_nerf.py --config configs/fern_dsnerf.txt --render_only

I keep seeing a compression warning every iteration as follows:

max: 5.9706993
Lossy conversion from float32 to uint8. Range [0.10505260527133942, 5.970699310302734]. Convert image to uint8 prior to saving to suppress this warning.
 99%| 119/120 [14:05<00:07, 7.12s/it] 119 7.150676250457764
max: 5.921188
Lossy conversion from float32 to uint8. Range [0.11592616140842438, 5.921187877655029]. Convert image to uint8 prior to saving to suppress this warning.
100%| 120/120 [14:12<00:00, 7.10s/it]
Done rendering ./logs/fern_dsnerf_2v/renderonly_path_000000

And the rendered output mentioned above looks like this:
[corrupted rendered frame 000]
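For reference, the "Lossy conversion from float32 to uint8" messages only mean imageio is quantizing a float image (here a disparity/depth-like map with values up to about 5.97); they do not by themselves explain corrupted RGB frames. A minimal sketch of the usual explicit conversion, mirroring the to8b helper in nerf-pytorch-style code (an assumption for this repo):

    import numpy as np
    import imageio

    to8b = lambda x: (255 * np.clip(x, 0.0, 1.0)).astype(np.uint8)

    def save_render(path, img, normalize=False):
        # set normalize=True for depth/disparity maps whose values exceed 1
        if normalize:
            img = img / (img.max() + 1e-8)
        imageio.imwrite(path, to8b(img))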

Confused about load_dtu

Hi, @dunbar12138

Thanks for your interesting work. I read the code of load_dtu.py and want to experiment with the code on the dtu dataset. But I'm confused about the direction of the camera poses:

  • (1) Does the original projection matrix project points from world to camera, or from camera to world?
  • (2) Why is the rotation matrix transposed at line 45?
  • (3) Why is the final pose flipped twice at line 56? (See the sketch below.)
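For reference on the conventions these questions involve, a minimal sketch of the general relationships only (not a claim about the exact steps in load_dtu.py): if the projection maps world to camera as x_cam = R x_world + t, the camera-to-world pose is its inverse, which is where a transpose appears; sign flips usually convert between OpenCV-style [right, down, forward] and NeRF/OpenGL-style [right, up, backwards] camera axes.

    import numpy as np

    def c2w_from_world_to_camera(R, t):
        c2w = np.eye(4)
        c2w[:3, :3] = R.T            # the inverse of a rotation is its transpose
        c2w[:3, 3] = -R.T @ t        # camera center in world coordinates
        return c2w

    def opencv_to_nerf_axes(c2w):
        # negate the camera's y and z axes: [right, down, forward] -> [right, up, backwards]
        return c2w @ np.diag([1.0, -1.0, -1.0, 1.0])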

Training on DTU

Hi, I am a little confused about training and evaluating on DTU. You have mentioned in the paper that

We run COLMAP with the ground truth calibrated
camera poses to get keypoints. Images are down-sampled to
a resolution of 400 × 300 for training and evaluation.

1- Did you make any changes to the ground truth camera poses after you down-sampled images to 400 x 300?
2- imgs2poses.py does not create colmap_depth.npy.
3- Do you have a separate config file for DTU?
4- Where can I get cameras.npz from?

How to download fern_2v?

I can't download fern_2v with download_example_data.sh.
Could you share it again?
Thanks.

The following is the error message:

"Document not found
The requested address (URL) was not found on this server.

You may have used an outdated link or may have typed the address incorrectly."

Performance on widely spaced rotated views

I have been researching many NeRF-related projects recently and have found that a common constraint is that the views used for training cannot be too far apart.
The most common setting seems to be the forward-facing scenario.
Hence, I am curious whether you did any testing to see if DS-NeRF performs better with inward-facing images (e.g. 5 views spaced at 72-degree intervals) or whether it has the same issues.

Thank you for your interesting work.

Configs does not use weighted loss

Hi,

I noticed that the configs you've provided do not enable the weighted loss (weighting by reprojection error), but the paper uses the weighted loss. Can you confirm whether this is a typo in the configs and I should enable the weighted loss, or whether you deliberately disabled it?

Discussion about bad results on scene horns of LLFF dataset

Hi, I tried to run DS-NeRF on the horns scene of the LLFF dataset and got bad results. I am trying to figure out what happened, and I would really appreciate it if someone could discuss it with me.

First, here are the results I got on the horns scene:
[table/screenshot of results]

I ran three experiments, each using all images of horns (62 images: 54 for training and 8 for testing). 'original' uses the original NeRF code and settings. 'no ndc only' changes only no_ndc in the original settings to True. 'no ndc only, depth sup.' uses your code with the fern_2v.txt settings adapted to horns (it seems that no_ndc = True is the default in DS-NeRF).

As we can see, the results of 'no ndc only' are worse than 'original', while the results of 'no ndc only, depth sup.' are even worse. For checking, I would be happy if you could also share your results on horns. I checked the COLMAP output and found nothing weird, so I am confused about why depth supervision gives such bad results (especially in the rendering of the background). As discussed in the original repo about NDC, we should set no_ndc to False for LLFF scenes, since
For unbounded scenes, because the Euclidean coordinates of the sampled 3d points are not bounded, we need to apply NDC trick to map the unbounded coordinates to be bounded.
So it is reasonable that 'no ndc only' is worse than 'original'. Then, I think that depth supervision should help the network learn where the surface is, so the fine MLP can sample more points around the surface, which may alleviate the problem of unbounded scenes. However, the results of 'no ndc only, depth sup.' are even worse. Does anyone have ideas about this?

Thanks!

How to calibrate depth value or camera pose for evaluation

Thank you for sharing your code, but I have questions.

  1. Calibration for evaluation.
    As I understand the released code, for training you used the camera poses obtained by running COLMAP on the training images. Then how did you evaluate on the test data (0, 8, 16, ... for the LLFF data)?
    I think there is no calibrated camera-pose information for the test data.

For example of 2-view training,

  • training: camera pose and sparse depth (both are obtained using 2 images)
  • evaluation: there are only camera poses which are obtained using all images.

How did you solve this gap between the 2-view (few-view) data and the test data?
If possible, could you share the code for calibrating the camera poses or depth values?

  2. Max depth value in run_nerf.py
    I think the max depth at line 832 should be changed:
    max_depth = np.max(rays_depth[:,3,0]) -> max_depth = np.max(rays_depth[:,2,0])
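For reference on the calibration question (item 1), a generic sketch of one standard way to bridge such a gap: align the camera centers of the few-view reconstruction to those of the all-view reconstruction with a similarity (Umeyama/Procrustes) transform, then apply the recovered scale to the depths as well. This is not the authors' evaluation code, just an illustration of the technique.

    import numpy as np

    def umeyama_alignment(src, dst):
        # src, dst: (N, 3) corresponding camera centers
        # returns s, R, t such that dst ~= s * R @ src + t
        mu_s, mu_d = src.mean(0), dst.mean(0)
        src_c, dst_c = src - mu_s, dst - mu_d
        cov = dst_c.T @ src_c / len(src)
        U, S, Vt = np.linalg.svd(cov)
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])   # guard against reflections
        R = U @ D @ Vt
        s = np.trace(np.diag(S) @ D) / src_c.var(0).sum()
        t = mu_d - s * R @ mu_s
        return s, R, t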

CUDA / CPU error~!

Traceback (most recent call last):
File "run_nerf.py", line 1019, in <module>
train()
File "run_nerf.py", line 858, in train
batch = next(raysRGB_iter).to(device)
File "/usr/local/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
data = self._next_data()
File "/usr/local/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 560, in _next_data
index = self._next_index() # may raise StopIteration
File "/usr/local/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 512, in _next_index
return next(self._sampler_iter) # may raise StopIteration
File "/usr/local/lib/python3.7/site-packages/torch/utils/data/sampler.py", line 226, in __iter__
for idx in self.sampler:
File "/usr/local/lib/python3.7/site-packages/torch/utils/data/sampler.py", line 124, in __iter__
yield from torch.randperm(n, generator=generator).tolist()
RuntimeError: Expected a 'cuda' device type for generator but found 'cpu'

Maybe the RGB or depth inputs need to be transferred to the GPU first, right?
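For reference, this error usually appears when the script sets the default tensor type to CUDA (as nerf-pytorch-style code does; an assumption here) while the DataLoader's RandomSampler still draws indices with a CPU generator. A minimal sketch of one common workaround:

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    torch.set_default_tensor_type('torch.cuda.FloatTensor')   # the assumed culprit

    rays = torch.randn(1000, 3, 3)                  # stand-in for the ray batches
    loader = DataLoader(
        TensorDataset(rays),
        batch_size=256,
        shuffle=True,
        generator=torch.Generator(device='cuda'),   # match the default device
    )
    batch = next(iter(loader))[0]                   # already on the GPU here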

RuntimeError: CUDA out of memory.

RuntimeError: CUDA out of memory. Tried to allocate 360.00 MiB (GPU 0; 10.76 GiB total capacity; 6.21 GiB already allocated; 200.56 MiB free; 6.76 GiB reserved in total by PyTorch)


I am using my own data but ran into this problem, and I don't know how to solve it. Please help me.
I am using a 2080 Ti.

Did COLMAP use many images to calculate Sparse 3D Points?

Hi there,
thank you for sharing your great work :)

I wonder whether you used many (more than two, e.g., 10) images to compute the sparse 3D points in your 2-view experiments.

When I computed the sparse 3D points for fern_2v (2-view), the number of detected 3D points was 498, which is quite different from the data obtained directly from download_example_data.sh, where the number of pre-computed 3D points is 1081.


Could you tell me about the experiment setting?

how to get wandb access?

Hello! Thank you for your great work.
I would like to train your network with my custom data (around 100 images).
I set
train_scene = [ 1, 2, 3, 4, 5, 6, 7, 8, 9,
11, 12, 13, 14, 15, 16, 17, 18, 19,
21, 22, 23, 24, 25, 26, 27, 28, 29,
31, 32, 33, 34, 35, 36, 37, 38, 39,
41, 42, 43, 44, 45, 46, 47, 48, 49,
51, 52, 53, 54, 55, 56, 57, 58, 59,
61, 62, 63, 64, 65, 66, 67, 68, 69,
71, 72, 73, 74, 75, 76, 77, 78, 79,
81, 82, 83, 84, 85, 86, 87, 88, 89,
91, 92, 93, 94, 95, 96, 97, 98, 99,
101, 102, 103, 104, 105, 106, 107, 108, 109,
111, 112, 113, 114]
and then ran the training code, but it says I need your approval for access.
I joined wandb and copied my API key, but it doesn't work, as shown below.

wandb.errors.CommError: Permission denied, ask the project owner to grant you access

Please help me~!
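For reference, that error usually means the wandb.init() call inside the training script targets the authors' entity/project, which another API key cannot write to. A minimal sketch of two common workarounds (project/entity names below are placeholders):

    import os
    import wandb

    # (a) skip remote logging entirely -- set this before wandb.init() is called
    os.environ["WANDB_MODE"] = "offline"            # or "disabled"

    # (b) or edit the wandb.init() call to point at your own workspace
    wandb.init(project="dsnerf-custom", entity="your-wandb-username")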

DTU dataset

Wonderful work!
I would like to evaluate this project on the DTU dataset; could you release the dataset or a link to it? Thank you so much!

DefaultCPUAllocator: not enough memory

I decreased N_rand, chunk, and N_iters so that training fits on my computer.
Even though I decreased these to small values, after 4999 iterations I get:
[enforce fail at C:\cb\pytorch_1000000000000\work\c10\core\impl\alloc_cpu.cpp:81] data. DefaultCPUAllocator: not enough memory: you tried to allocate 48771072 bytes.

Using an RTX 3060 on Windows 10.

Confusion about the implementation of weight calculation for depth and RGB rendering

Hi, thanks for your neat and excellent work! However, I am confused about the implementation of the weights for depth and RGB rendering. In the code, the weight is calculated as the product of T(t) and alpha(t):

weights = alpha * torch.cumprod(torch.cat([torch.ones((alpha.shape[0], 1)), 1.-alpha + 1e-10], -1), -1)[:, :-1]

while in the paper it is formulated as the product of T(t) and sigma(t).
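For context, the code follows the standard NeRF quadrature (eq. 3 of the original NeRF paper): with alpha_i defined from sigma_i, the discrete weight w_i = T_i * alpha_i is the numerical counterpart of the continuous T(t) sigma(t) dt, since alpha_i is approximately sigma_i * delta_i for small intervals:

    \alpha_i = 1 - e^{-\sigma_i \delta_i}, \qquad
    T_i = \prod_{j<i} \left(1 - \alpha_j\right), \qquad
    w_i = T_i \,\alpha_i \;\approx\; T(t_i)\,\sigma(t_i)\,\delta_i .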

Depth loss only applied to fine network?

Hi, thanks for your awesome work and code!

I noticed that in your code the depth loss is applied to the fine network only, and the coarse network only has the RGB loss. Is there any reason behind this choice?

SigmaLoss

Hi there~ Thank you for sharing the code.

I'm a little bit confused by the SigmaLoss in this repo. It is not mentioned in the paper, and it looks like it's not enabled by the config args either. Could you please share your thoughts on this loss? What is it for, and how does it affect the final performance?
Thanks.

Scanned depth

In your video you said that you tested DS-NeRF with scanned depth data. How can I train the model with my own depth data, and which format is needed? Thanks in advance.

License

First off, excellent work on this project!

Can you add the MIT License to your codebase? It's the license used by the nerf-pytorch project that this repository was based on. Thanks!

Unable to get sparse point cloud with colmap with 2 images with known pose

Hi,

I'm trying to train DS-NeRF on two images whose intrinsics and extrinsics are known. I followed this link you suggested in Issue-20, but colmap point_triangulator is failing because it triangulated 0 points.

To debug, I used the fern_2view data and ran imgs2poses.py to obtain camera parameters and a point cloud. Then I followed the known-camera-poses procedure (I removed every second line in images.txt), but it fails even for that. In other words, for the exact same camera parameters for which the first approach worked, the second approach didn't. So it can't be a problem with the data; it must be something I'm doing incorrectly.

I'm new to NeRF and COLMAP. Can you please help me out?

More details:
I created a new folder fern_trial and, in its images folder, added the two images renamed to image001.png and image002.png. I also created an empty sparse folder.

This works:

fern_trial$ colmap feature_extractor --database_path database.db --image_path images --ImageReader.single_camera 1
fern_trial$ colmap exhaustive_matcher --database_path database.db
fern_trial$ colmap mapper --database_path database.db --image_path images --output_path sparse --Mapper.num_threads 16 --Mapper.init_min_tri_angle 4 --Mapper.multiple_models 0 --Mapper.extract_colors 0

This does not work:

fern_trial2$ colmap feature_extractor --database_path database.db --image_path images --ImageReader.single_camera 1

I copied cameras.txt and images.txt (with every second line deleted) from fern_trial to this folder.

fern_trial2$ colmap exhaustive_matcher --database_path database.db
fern_trial2$ colmap point_triangulator --database_path database.db --image_path images --input_path sparse/0 --output_path sparse2 --Mapper.num_threads 16 --Mapper.init_min_tri_angle 4 --Mapper.multiple_models 0 --Mapper.extract_colors 0
