dunbar12138 / dsnerf
Code release for DS-NeRF (Depth-supervised Neural Radiance Fields)
Home Page: https://www.cs.cmu.edu/~dsnerf/
License: MIT License
Hello, thank you for this project :)
I got a problem while training on a scene.
When I extracted the camera poses from COLMAP this is the output
The result is not very crisp, as the scene is a little hard to learn.
But when I use the camera poses from the dataset, I get a result like this:
The difference between the two is large. I converted the ground-truth camera poses to the (right, up, backwards) coordinate system.
What am I doing wrong?
Thanks.
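For reference, here is a minimal sketch of the axis conversion mentioned above. It assumes the dataset poses are 4x4 camera-to-world matrices whose camera axes are (right, down, forward), which is the common OpenCV/COLMAP convention; the issue does not say which convention the dataset actually uses, and world-to-camera matrices would have to be inverted first. The function name is hypothetical.

```python
def opencv_to_opengl_c2w(c2w):
    """Flip the y and z camera axes of a 4x4 camera-to-world matrix.

    Assumption: the input camera axes are (right, down, forward).
    The output camera axes are (right, up, backward).
    The camera position (4th column) is left unchanged.
    """
    flip = (1.0, -1.0, -1.0, 1.0)  # sign applied to each column
    return [[value * sign for value, sign in zip(row, flip)] for row in c2w]


# Hypothetical pose: axes aligned with the world, camera at (0.5, 1.0, 2.0).
c2w_cv = [
    [1.0, 0.0, 0.0, 0.5],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 2.0],
    [0.0, 0.0, 0.0, 1.0],
]
c2w_gl = opencv_to_opengl_c2w(c2w_cv)
```

If the poses are already in the target convention, applying this flip again would break them, so it is worth checking one camera by hand before training.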
Hello! I found that you used a KL divergence penalty instead of the MSE loss in your newest paper. Are you going to update your code?
I think the released code relies on batching (use_batching=True) for the depth supervision setting.
Is there a performance drop when not using batching (especially for few-shot learning & sparse depth supervision)?
Thank you
DSNeRF/llff/poses/pose_utils.py
Line 18 in 3379957
Does DS-NeRF work also for inward looking scenes?
I've tried to make it work by simply forcing --spherify=True, but it gives awful results, unlike for forward-looking scenes.
I'm not sure whether I should try changing some parameters or if DS-NeRF is just not meant for this. If so, why?
PS: Thanks very much for your work and code.
Hi,
thank you for open-sourcing the code of this great work! I was wondering if you could provide some insight on why the depth from COLMAP is scaled using the bd_factor and how to appropriately set its value?
Should bd_factor be scene-dependent?
Thank you for your help.
Zan
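As a rough illustration of what bd_factor does, the sketch below assumes the loader follows the LLFF-style convention of the NeRF code this repository is derived from: one scale factor is computed from the smallest near bound and applied to camera positions, bounds, and depths alike, so the depths stay consistent with the rescaled poses. The function name and the sample values are hypothetical.

```python
def scene_scale(near_bounds, bd_factor=0.75):
    """Return the factor applied to camera positions, bounds and depths.

    Assumption: LLFF-style loader. With bd_factor=None nothing is rescaled.
    """
    if bd_factor is None:
        return 1.0
    return 1.0 / (min(near_bounds) * bd_factor)


near_bounds = [4.0, 5.0, 6.5]     # hypothetical per-image near depths
colmap_depths = [4.0, 8.0, 20.0]  # hypothetical sparse-point depths

sc = scene_scale(near_bounds)
scaled_depths = [d * sc for d in colmap_depths]
```

Under this convention the closest bound always lands at 1 / bd_factor after scaling, whatever the original scene units were. In that sense the value is a margin in front of the nearest content rather than a property of a particular scene.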
I am using NDC space and I have depth values collected from a sensor. Should I normalize these values to between 0 and 1, or should I convert them into NDC space?
How should we calculate the depth loss in NDC space, given that the depth_col value lies between 0 and 1 but the target_depth is larger than 1?
Thanks.
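One way to relate the two quantities is sketched below, under two assumptions: the NDC parameterization is the one from the original NeRF code (ray origins shifted to the near plane), and the sensor depth is a z-depth expressed in the same scaled scene units as the poses. In that setting the NDC ray parameter t' and the metric z-depth z satisfy t' = 1 - near / z. The function names are hypothetical.

```python
import math


def depth_to_ndc_t(depth, near=1.0):
    """Map a metric z-depth (>= near) to the NDC ray parameter in [0, 1]."""
    return 1.0 - near / depth


def ndc_t_to_depth(t, near=1.0):
    """Inverse mapping; t approaching 1 corresponds to infinite depth."""
    return near / (1.0 - t)


t_close = depth_to_ndc_t(1.0)      # a point on the near plane
t_mid = depth_to_ndc_t(2.0)        # twice the near distance
t_far = depth_to_ndc_t(math.inf)   # a point at infinity
```

A linear normalization of the sensor values into [0, 1] would not match this mapping, because the mapping is nonlinear in depth. Either the target depth is converted with depth_to_ndc_t, or the rendered value is converted back with ndc_t_to_depth, before the two are compared.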
Hi, thanks for sharing the great work.
I ran with the default config file and it works well. However, it uses the MSE loss for depth supervision, and the ray distribution loss is not used by default. I tried to enable it by setting sigma_loss = True in the config file, but it raises the error below:
File "run_nerf.py", line 993, in train
sigma_loss = extras_col['sigma_loss'].mean()
KeyError: 'sigma_loss'
Could you explain why sigma_loss is not used by default and give some instructions for enabling it, or am I missing something?
Thanks for your help.
e.g. when you use two images to train NeRF, did you also use two images to generate point cloud?
Hi, thanks for your great work and code!
I'm trying to reproduce the demo results with the pre-trained model, but I got unreasonable frame outputs as below:
The checkpoint fern_2v/020000.tar was loaded. The only modification to the code is replacing "from torchsearchsorted import searchsorted" with "from torch import searchsorted", along with the corresponding usage of it.
Do you have any idea what may have gone wrong here?
Thanks in advance!
Hi, thanks for your excellent work!
I have a question: how can I get colmap_depth.npy when I run on my own data?
Hi,
Thanks for your great work! But I've got some issues here:
When I loaded the ETH3D courtyard scene, the pose bound file was generated. I used image.bin and point3D.bin from the dataset. However, when I started training, I found that there is no depth for this scene, and the terminal printed zero for all photos. I didn't test on other scenes. Is there something I've done wrong?
Shengkun Tang
poses = np.concatenate([poses[:, 1:2, :], poses[:, 0:1, :], -poses[:, 2:3, :], poses[:, 3:4, :], poses[:, 4:5, :]], 1)
The comment is not self-explanatory, so what does this transform mean?
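Reading the indexing alone, the line reorders the five columns of each pose: the first two camera-axis columns are swapped, the third is negated, and the last two are kept. The sketch below applies the same reorder to a single 3x5 pose written as plain Python rows. It assumes, as in the LLFF format, that columns 0 to 2 are camera axes, column 3 is the camera position, and column 4 holds (height, width, focal). If the incoming axes are (right, down, forward), the result is (down, right, backward). The function name and sample values are hypothetical.

```python
def permute_pose_columns(pose):
    """Reorder the columns of one 3x5 pose given as a list of rows.

    New columns: [old 1, old 0, -old 2, old 3, old 4].
    """
    return [[row[1], row[0], -row[2], row[3], row[4]] for row in pose]


# Hypothetical pose whose axes (right, down, forward) equal world x, y, z.
pose = [
    [1.0, 0.0, 0.0, 10.0, 378.0],
    [0.0, 1.0, 0.0, 20.0, 504.0],
    [0.0, 0.0, 1.0, 30.0, 400.0],
]
out = permute_pose_columns(pose)

first_axis = [row[0] for row in out]   # world y, i.e. the old "down" axis
second_axis = [row[1] for row in out]  # world x, i.e. the old "right" axis
third_axis = [row[2] for row in out]   # world -z, i.e. "backward"
```

Swapping two axes and negating a third keeps the frame right-handed, so the result is still a valid rotation, only expressed with a different axis naming.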
Hi there, thanks for sharing your work!
In your paper, it is said that the sparse point clouds are generated by COLMAP with ground truth camera poses.
Does "ground truth camera poses" mean the cameras.npz of each scan in DTU?
If so, how do I pass cameras.npz to COLMAP for reconstruction? It looks like the gen_poses function does not accept cameras.npz as an input.
Thanks.
With only one image in data/f2/images, I ran python imgs2poses.py data/f2 and got:
Need to run COLMAP
Features extracted
Features matched
ERROR: failed to create sparse model
Traceback (most recent call last):
File "imgs2poses.py", line 18, in <module>
gen_poses(args.scenedir, args.match_type)
File "/data/NeRF_xy/llff/poses/pose_utils.py", line 268, in gen_poses
run_colmap(basedir, match_type)
File "/data/NeRF_xy/llff/poses/colmap_wrapper.py", line 71, in run_colmap
map_output = ( subprocess.check_output(mapper_args, universal_newlines=True) )
File "/opt/conda/lib/python3.6/subprocess.py", line 356, in check_output
**kwargs).stdout
File "/opt/conda/lib/python3.6/subprocess.py", line 438, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/usr/local/bin/colmap', 'mapper', '--database_path', 'data/f2/database.db', '--image_path', 'data/f2/images', '--output_path', 'data/f2/sparse', '--Mapper.num_threads', '16', '--Mapper.init_min_tri_angle', '4', '--Mapper.multiple_models', '0', '--Mapper.extract_colors', '0']' returned non-zero exit status 1.
Do you know how to solve this?
By the way, the sparse/0/project.ini of fern2view on your website has nearly 200 lines, while the one I downloaded from Google Drive has only about 65 lines. Do you know why?
Sorry, I can't find the config files for your results (only the weight files can be found), which makes it difficult to reproduce them. Can you share your config.txt and args.txt for the LLFF dataset?
Hi, thanks for your excellent work. While reproducing your paper, I have the following three questions:
Hello, I ran imgs2poses.py but it does not produce colmap_depth.npy.
How can I solve this problem?
How to export mesh and point cloud?
Thanks for the awesome work. I am just wondering whether you plan to submit your paper to a top journal.
Line 20 in 3379957
Take Table 1, for example: which 2, 5, and 10 views did you use to compute those results? Did you select those views randomly, or hand-pick them? Could you please provide a link to download them? Thanks a lot!
No code changes, just followed the readme
I downloaded the pretrained model to the ./logs folder and ran:
run_nerf.py --config configs/fern_dsnerf.txt --render_only
I keep seeing a compression warning every iteration as follows:
max: 5.9706993
Lossy conversion from float32 to uint8. Range [0.10505260527133942, 5.970699310302734]. Convert image to uint8 prior to saving to suppress this warning.
99%|██████████▏| 119/120 [14:05<00:07, 7.12s/it]119 7.150676250457764
max: 5.921188
Lossy conversion from float32 to uint8. Range [0.11592616140842438, 5.921187877655029]. Convert image to uint8 prior to saving to suppress this warning.
100%|██████████| 120/120 [14:12<00:00, 7.10s/it]
Done rendering ./logs/fern_dsnerf_2v/renderonly_path_000000
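The warning itself describes the fix: scale the values into [0, 255] and convert them to 8-bit integers before saving. The log does not show which image triggers it; the sketch below assumes it is a single-channel map, such as depth or disparity, whose values exceed 1. It is written in plain Python for illustration, and the helper name is hypothetical. With NumPy arrays the same clip-and-scale would be applied elementwise before casting to uint8.

```python
def to_uint8(values, max_value=None):
    """Scale floats into integers in [0, 255], clipping out-of-range values."""
    if max_value is None:
        max_value = max(values)
    scaled = (min(max(v / max_value, 0.0), 1.0) for v in values)
    return [int(round(255 * s)) for s in scaled]


# The two end values are the range reported in the warning; 2.5 is made up.
row = [0.10505260527133942, 2.5, 5.970699310302734]
pixels = to_uint8(row)

# Values outside [0, max_value] are clipped rather than wrapped around.
clipped = to_uint8([-1.0, 0.2, 2.0], max_value=1.0)
```

Normalizing each frame by its own maximum makes brightness vary between frames; passing one fixed max_value for the whole sequence avoids that.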
Hi, @dunbar12138
Thanks for your interesting work. I read the code of load_dtu.py and want to experiment with the code on the dtu dataset. But I'm confused about the direction of the camera poses:
Hi, I am a little confused about training and evaluating on DTU. You have mentioned in the paper that
We run COLMAP with the ground truth calibrated
camera poses to get keypoints. Images are down-sampled to
a resolution of 400 × 300 for training and evaluation.
1- Did you make any changes to the ground truth camera poses after you down-sampled images to 400 x 300?
2- imgs2poses.py does not create colmap_depth.npy.
3- Do you have a separate config file for DTU?
4- Where can I get cameras.npz from?
I can't download fern_2v with download_example_data.sh. Could you share it again?
Thanks.
This is the message I get:
"Document not found
The requested address (URL) was not found on this server.
You may have used an outdated link or may have typed the address incorrectly."
I have been researching many NeRF-related projects recently and have found that a common constraint is that the views used for training cannot be too far apart.
Most common seems to be the forward-facing view scenario.
Hence, I am curious if you did any testing to see whether DSNeRF performs better with inward-facing images (i.e. 5 views spaced at 72 degree intervals) or if it has the same issues.
Thank you for your interesting work.
Please add a Colab demo.
When running
python run_nerf.py --config configs/fern_dsnerf.txt
I get the following error:
ImportError: cannot import name 'searchsorted' from 'torchsearchsorted' (unknown location)
What am I missing?
Hi,
I noticed that the configs you've provided do not enable the weighted loss (reprojection error), but the paper uses the weighted loss. Can you please confirm whether this is a typo in the configs, so that I have to enable the weighted loss, or whether you deliberately disabled it?
Hi, I tried to run DS-NeRF on scene horns of the LLFF dataset but got bad results. I am trying to figure out what happened, and I would really appreciate it if someone could discuss this with me.
First, I will show the results of what I got on scene horns:
I did three experiments, each using all images of scene horns (62 images: 54 for training and 8 for testing). 'original' uses the original NeRF code and settings. 'no ndc only' changes only no_ndc to True relative to the original settings. 'no ndc only, depth sup.' uses your code with the settings of fern_2v.txt adapted to scene horns (it seems that no_ndc = True is the default setting in DS-NeRF).
As we can see, the results of 'no ndc only' are worse than 'original', while the results of 'no ndc only, depth sup.' are even worse. As a check, I would be happy if you could also share your results on scene horns. I checked the COLMAP results and found nothing weird, so I am confused about why depth supervision gives me such bad results (especially in the rendering of the background). As discussed in the original repo about NDC, we should set no_ndc to False for LLFF scenes, since
For unbounded scenes, because the Euclidean coordinates of the sampled 3d points are not bounded, we need to apply NDC trick to map the unbounded coordinates to be bounded.
So it is reasonable that 'no ndc only' is worse than 'original'. I would then expect depth supervision to help the network learn where the surface is, so that the fine MLP can sample more points around the surface, which may alleviate the problem of unbounded scenes. However, the results of 'no ndc only, depth sup.' become even worse. Does anyone have ideas about this?
Thanks!
Thank you for sharing your code, but I have questions.
For example, in 2-view training, how did you bridge the gap between the 2-view (few-view) data and the test data?
If possible, could you share the code for calibrating the camera poses or depth values?
Traceback (most recent call last):
File "run_nerf.py", line 1019, in <module>
train()
File "run_nerf.py", line 858, in train
batch = next(raysRGB_iter).to(device)
File "/usr/local/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
data = self._next_data()
File "/usr/local/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 560, in _next_data
index = self._next_index() # may raise StopIteration
File "/usr/local/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 512, in _next_index
return next(self._sampler_iter) # may raise StopIteration
File "/usr/local/lib/python3.7/site-packages/torch/utils/data/sampler.py", line 226, in __iter__
for idx in self.sampler:
File "/usr/local/lib/python3.7/site-packages/torch/utils/data/sampler.py", line 124, in __iter__
yield from torch.randperm(n, generator=generator).tolist()
RuntimeError: Expected a 'cuda' device type for generator but found 'cpu'
Maybe the RGB or depth inputs need to be transferred to the GPU first, right?
Hi there,
thank you for sharing your great work :)
I wonder whether you used more than two images (e.g., 10) to compute the sparse 3D points in your 2-view experiments.
When I tried to compute the sparse 3D points of fern_2v (2-view), I got 498 3D points, which is quite different from the 1081 pre-computed points obtained directly from "download_example_data.sh".
Could you tell me about the experiment setting?
Hello! Thank you for your great work.
I would like to train your network with my custom data (around 100 images).
I set
train_scene = [ 1, 2, 3, 4, 5, 6, 7, 8, 9,
11, 12, 13, 14, 15, 16, 17, 18, 19,
21, 22, 23, 24, 25, 26, 27, 28, 29,
31, 32, 33, 34, 35, 36, 37, 38, 39,
41, 42, 43, 44, 45, 46, 47, 48, 49,
51, 52, 53, 54, 55, 56, 57, 58, 59,
61, 62, 63, 64, 65, 66, 67, 68, 69,
71, 72, 73, 74, 75, 76, 77, 78, 79,
81, 82, 83, 84, 85, 86, 87, 88, 89,
91, 92, 93, 94, 95, 96, 97, 98, 99,
101, 102, 103, 104, 105, 106, 107, 108, 109,
111, 112, 113, 114]
and then ran the training code, but it says I need your approval for access.
I joined wandb and copied my API key, but it doesn't work, as shown below.
wandb.errors.CommError: Permission denied, ask the project owner to grant you access
Please help me~!
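As a side note on the configuration itself, the train_scene list above contains every index from 1 to 114 except the multiples of 10, so it can be generated rather than typed out. This is only a compact way of writing the same list and does not address the wandb permission error.

```python
# Every index from 1 to 114, skipping 10, 20, ..., 110.
train_scene = [i for i in range(1, 115) if i % 10 != 0]
```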
Wonderful work!
Recently I want to evaluate this project on DTU dataset, could you release this dataset or its link? Thank you so much!
I decreased N_random, chunk, and N_iters so that training fits on my computer.
Even with these set to small values, after 4999 iterations the following error occurred:
[enforce fail at C:\cb\pytorch_1000000000000\work\c10\core\impl\alloc_cpu.cpp:81] data. DefaultCPUAllocator: not enough memory: you tried to allocate 48771072 bytes.
I am using an RTX 3060 on Windows 10.
Hi, thanks for your neat and excellent work! However, I am confused about the implementation of the weight for depth and RGB rendering. In the code implementation, it seems that the weight is calculated by the product of T(t) and alpha(t) (
Line 374 in 168a13e
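The product mentioned above matches the usual NeRF volume-rendering quadrature, where sample i receives the weight w_i = T_i * alpha_i, with alpha_i = 1 - exp(-sigma_i * delta_i) and T_i the product of (1 - alpha_j) over all earlier samples. The sketch below is a plain-Python illustration of that formula with made-up densities, not the repository's implementation.

```python
import math


def render_weights(sigmas, deltas):
    """Return w_i = T_i * alpha_i for samples ordered from near to far."""
    weights = []
    transmittance = 1.0  # nothing occludes the first sample
    for sigma, delta in zip(sigmas, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha
    return weights


# Hypothetical ray: empty space, thin haze, then a dense surface near z = 3.
z_vals = [2.0, 2.5, 3.0, 3.5]
sigmas = [0.0, 0.1, 50.0, 5.0]
deltas = [0.5, 0.5, 0.5, 0.5]

weights = render_weights(sigmas, deltas)
depth = sum(w * z for w, z in zip(weights, z_vals))
```

The same weights are used for color and for depth; only the quantity being averaged changes, which is why the expected depth lands close to the sample where the density spikes.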
Hi, thanks for your awesome work and code!
I noticed that in your code the depth loss is applied to the fine network only, and the coarse network has only the RGB loss. Is there any reason behind this choice?
Hi there~ Thank you for sharing the code.
I'm a little bit confused by the SigmaLoss in this repo. It is not mentioned in the paper, and it looks like it's not enabled by the config args either. Could you please share your thoughts on this loss? What is it for and how does it affect the final performance?
Thanks.
Hi,
I noticed that you compute depth in input views in https://github.com/dunbar12138/DSNeRF/blob/main/load_llff.py#L371 but the depth given by NeRF model is in canonical view https://github.com/dunbar12138/DSNeRF/blob/main/run_nerf_helpers.py#L377
Shouldn't both the depths be brought to same view before computing depth loss? Please let me know if I missed something
In your video you said that you tested DSNeRF with scanned depth data. How can I train the model with my own depth data? Which format is needed? Thanks in advance.
Hi Authors,
I had a question about the implementation of the depth loss.
The depth calculated in the colmap depth loader seems to be the z value of the point in the camera frame, but the depth computed from the NeRF seems to be the distance of the point along the ray's direction.
What am I reading incorrectly here?
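Whether the two quantities differ depends on how the rays are built. The sketch below assumes the nerf-pytorch-style construction this code base is derived from, where the camera-frame ray direction is left unnormalized with a z component of -1, and the rendered depth is a weighted sum of the ray parameters t. That assumption should be checked against the repository's ray-generation code. Under it, the ray parameter equals the camera-frame z-depth, while the Euclidean distance is larger by the norm of the direction. The function names are hypothetical.

```python
import math


def ray_direction(i, j, width, height, focal):
    """Camera-frame direction through pixel (i, j); its z component is -1."""
    return ((i - width * 0.5) / focal, -(j - height * 0.5) / focal, -1.0)


def point_on_ray(direction, t):
    """Point reached at ray parameter t, with the camera at the origin."""
    return tuple(t * c for c in direction)


d = ray_direction(i=0, j=0, width=400, height=300, focal=300.0)  # corner pixel
t = 2.0
p = point_on_ray(d, t)

z_depth = -p[2]                               # depth along the viewing axis
euclidean = math.sqrt(sum(c * c for c in p))  # distance from the camera
d_norm = math.sqrt(sum(c * c for c in d))
```

If the directions were normalized before sampling, t would instead be the Euclidean distance, and a z-depth target would need to be divided by the cosine of the angle between the ray and the viewing axis before comparison.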
First off, excellent work on this project!
Can you add the MIT License to your codebase? It's the license used by the nerf-pytorch project that this was based on. Thanks!
Hi,
I'm trying to train DS-NeRF on two images whose intrinsics and extrinsics are known. I followed the link you suggested in Issue-20, but colmap point_triangulator is failing because it triangulated 0 points.
To debug, I used the fern_2view data and ran imgs2poses.py to obtain camera params and point cloud. Then I followed the known-camera-poses procedure (I removed every second line in images.txt), but it fails even for that. In other words, for the exact same camera params for which first approach worked, the second approach didn't work. So, it can't be a problem with the data, but instead it should be something that I'm doing incorrectly.
I'm new to NeRF and COLMAP. Can you please help me out?
More details:
I created a new folder fern_trial and, in its images folder, added the two images and renamed them image001.png and image002.png. I also created an empty sparse folder.
fern_trial$ colmap feature_extractor --database_path database.db --image_path images --ImageReader.single_camera 1
fern_trial$ colmap exhaustive_matcher --database_path database.db
fern_trial$ colmap mapper --database_path database.db --image_path images --output_path sparse --Mapper.num_threads 16 --Mapper.init_min_tri_angle 4 --Mapper.multiple_models 0 --Mapper.extract_colors 0
fern_trial2$ colmap feature_extractor --database_path database.db --image_path images --ImageReader.single_camera 1
I copied camera.txt and images.txt (with every second line deleted) from fern_trial to this folder.
fern_trial2$ colmap exhaustive_matcher --database_path database.db
fern_trial2$ colmap point_triangulator --database_path database.db --image_path images --input_path sparse/0 --output_path sparse2 --Mapper.num_threads 16 --Mapper.init_min_tri_angle 4 --Mapper.multiple_models 0 --Mapper.extract_colors 0
Hello,
Thanks for your great work. I am trying to use your work, but when I run
bash download_example_data.sh
it doesn't find fern_2v.zip.
The data link provided in download_example_data.sh seems to be missing.