yhw-yhw / d2hc-rmvsnet

The official repository of the paper "Dense Hybrid Recurrent Multi-view Stereo Net with Dynamic Consistency Checking" (ECCV2020 Spotlight)

License: MIT License

Python 98.12% Shell 1.88%

d2hc-rmvsnet's Issues

How much GPU is needed

Excuse me. In the paper, you train with N = 3, D = 128, epochs = 6, and batch size = 6 on 2 NVIDIA TITAN RTX graphics cards. But in train.sh you set N = 4, D = 196, epochs = 10, and batch size = 1.
When I train the network following your train.sh with batch size = 2 on 2 NVIDIA 3090 (24 GB) cards, training would take 11 days to finish and GPU utilization stays below 46%, which is very slow. With batch size = 3, CUDA runs out of memory. Why?

Implementation details

Hi, I have a question about the implementation details. Your paper says the network for DTU is trained on 2 GPUs with batch size 6, but I don't see this in your script train.sh. Also, is the resolution for DTU inference 480 x 640, as eval_dtu.sh suggests? I could not find it in the paper. Thank you for the clarification!

Pretrained Model on BlendedMVS wanted

Can you share your model pretrained on the BlendedMVS dataset? I want to run your code on outdoor scenes.

License

Thanks so much for open sourcing this amazing work!

What would be the license for this? If possible it would be great to add a LICENSE file.

Thanks!

Dynamic consistency fusion

From table 3 it seems dynamic consistency fusion increases the score by 1.65 which is a lot, so I want to test it on other methods to see if it consistently increases the score for any kind of depth prediction result. Have you tried to apply this module on other methods such as MVSNet? Should be just a drop-in replacement.

In your code however, I cannot find what is written in the paper.
There are two fusion functions, one in eval.py

D2HC-RMVSNet/eval.py

Lines 268 to 283 in 7ebeb16

def check_geometric_consistency(depth_ref, intrinsics_ref, extrinsics_ref, depth_src, intrinsics_src, extrinsics_src):
    width, height = depth_ref.shape[1], depth_ref.shape[0]
    x_ref, y_ref = np.meshgrid(np.arange(0, width), np.arange(0, height))
    depth_reprojected, x2d_reprojected, y2d_reprojected, x2d_src, y2d_src = reproject_with_depth(depth_ref, intrinsics_ref, extrinsics_ref,
                                                                                                 depth_src, intrinsics_src, extrinsics_src)
    # check |p_reproj-p_1| < 1
    dist = np.sqrt((x2d_reprojected - x_ref) ** 2 + (y2d_reprojected - y_ref) ** 2)
    # check |d_reproj-d_1| / d_1 < 0.01
    depth_diff = np.abs(depth_reprojected - depth_ref)
    relative_depth_diff = depth_diff / depth_ref
    mask = np.logical_and(dist < 1, relative_depth_diff < 0.01)
    depth_reprojected[~mask] = 0
    return mask, depth_reprojected, x2d_src, y2d_src

which is the traditional way to fuse. The other is in fusion.py

D2HC-RMVSNet/fusion.py

Lines 192 to 214 in 7ebeb16

def check_geometric_consistency(depth_ref, intrinsics_ref, extrinsics_ref, depth_src, intrinsics_src, extrinsics_src
                                ):
    width, height = depth_ref.shape[1], depth_ref.shape[0]
    x_ref, y_ref = np.meshgrid(np.arange(0, width), np.arange(0, height))
    depth_reprojected, x2d_reprojected, y2d_reprojected, x2d_src, y2d_src = reproject_with_depth(depth_ref,
                                                                                                 intrinsics_ref,
                                                                                                 extrinsics_ref,
                                                                                                 depth_src,
                                                                                                 intrinsics_src,
                                                                                                 extrinsics_src)
    # check |p_reproj-p_1| < 1
    dist = np.sqrt((x2d_reprojected - x_ref) ** 2 + (y2d_reprojected - y_ref) ** 2)
    # check |d_reproj-d_1| / d_1 < 0.01
    depth_diff = np.abs(depth_reprojected - depth_ref)
    relative_depth_diff = depth_diff / depth_ref
    masks=[]
    for i in range(2,11):
        mask = np.logical_and(dist < i/4, relative_depth_diff < i/1300)
        masks.append(mask)
    depth_reprojected[~mask] = 0
    return masks, mask, depth_reprojected, x2d_src, y2d_src

which is different, and something I don't understand.

Anyway, they don't do what's described in the paper, so I wonder what's your final implementation to get the good results, and how is that different from the paper.
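
For anyone else reading the fusion.py loop, a minimal sketch of the threshold pairs it sweeps may help. It only reproduces the `range(2, 11)`, `i/4`, and `i/1300` values from the snippet above; the helper name `threshold_sweep` is hypothetical, and how the resulting `masks` are combined later in the script is not covered here.

```python
def threshold_sweep(start=2, stop=11):
    """Return (pixel_threshold, relative_depth_threshold) pairs for i in [start, stop).

    Mirrors the loop `for i in range(2, 11)` with `dist < i/4` and
    `relative_depth_diff < i/1300` quoted from fusion.py.
    """
    return [(i / 4, i / 1300) for i in range(start, stop)]


pairs = threshold_sweep()
for pix, rel in pairs:
    # One line per mask appended to `masks` in the quoted loop.
    print(f"dist < {pix:.2f} px, relative depth diff < {rel:.5f}")
```

So the loop produces nine masks, from the tightest pair (0.5 px, 2/1300) to the loosest (2.5 px, 10/1300). The single `mask` returned next to `masks` is the one from the last iteration, i = 10. Every relative depth threshold in the sweep is tighter than the fixed 0.01 used in eval.py.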

boolean index did not match indexed array along dimension 1; 479 not 480

File "fusion.py", line 376, in filter_depth
color = ref_img[:, : , :][valid_points] # hardcoded for DTU dataset
IndexError: boolean index did not match indexed array along dimension 1; dimension is 479 but corresponding boolean dimension is 480

Hello, the results in your paper are excellent. I tested a set of photos I took myself, and the fusion.sh stage reports that the boolean index dimension does not match, as shown above. What is going wrong? (My photos were processed with COLMAP and colmap2mvsnet.py, and the outputs look correct.)
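
One possible cause, offered as an assumption rather than a confirmed diagnosis, is that the image and the depth-derived mask differ by one pixel after resizing or cropping. A minimal sketch of a workaround crops both arrays to their common height and width before indexing. `crop_to_common_shape` is a hypothetical helper, not part of the repository, and the array sizes below are toy values chosen to reproduce the error message.

```python
import numpy as np


def crop_to_common_shape(ref_img, valid_points):
    """Crop an H x W x 3 image and an H x W boolean mask to their shared H and W."""
    h = min(ref_img.shape[0], valid_points.shape[0])
    w = min(ref_img.shape[1], valid_points.shape[1])
    return ref_img[:h, :w], valid_points[:h, :w]


# Toy data: the image is one pixel narrower than the mask along dimension 1,
# matching "dimension is 479 but corresponding boolean dimension is 480".
ref_img = np.zeros((640, 479, 3), dtype=np.uint8)
valid_points = np.ones((640, 480), dtype=bool)

img_c, mask_c = crop_to_common_shape(ref_img, valid_points)
color = img_c[mask_c]  # boolean indexing now succeeds
print(color.shape)
```

Cropping only hides the mismatch. If the sizes differ because of how the images were resized, the camera intrinsics may also need to match the final image size.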

Training stage memory consumption

Hi,

I am training D2HC-RMVSNet with D=192, nviews=3, batch=1, and image_scale=0.25.
I am using the BlendedMVS dataset on one RTX 3080 Ti (12 GB memory).
Unfortunately, I run out of memory at d=160 of the regularization stage.
I am not using apex for automatic mixed precision.
My question is: in the training stage, does the recurrent structure save memory? It seems it does not.

Thanks,
Han

How to test Blended_MVS

Hello, I tested BlendedMVS using the model you provided. The graphics card is a 3090, but the results are very poor. How do you set the test parameters to get your results?

release time

This is a cool work! I appreciate it a lot.

When do you expect to release it so that I can have a try?

Are the parameters in eval_tanks.sh inconsistent with those in the paper? Thanks

I tried increasing max h and max w, or reducing the threshold, but I still get only around 0.7 f2score for the reference Ignatius scene, which should be around 0.8 according to the official Tanks website.

about fusion.py

Hello, thanks for sharing. It seems the dynamic consistency checking described in the paper is not the same as the code in fusion.py. Maybe I misunderstood it; could you give some guidance?

How do you compute the memory required?

Hi, you reported memory usage in the paper, and I noticed some memory usage figures in comments in the code. Did you use a package or the command "nvidia-smi" to measure the memory? Thank you!

Which dataset was your pretrained model trained on, DTU or BlendedMVS?

Hi, which dataset was your pretrained model trained on, DTU or BlendedMVS?

CUDA out of memory

1080Ti * 4, batch_size = 4
RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 10.92 GiB total capacity; 10.09 GiB already allocated; 5.00 MiB free; 19.09 MiB cached)

Test invocation questions

In what file is TEST_DATA_FOLDER set?
Similarly, could you please clarify what is intended by "Set MODEL_FOLDER to ckpt and model_ckpt_index to checkpoint_list."? That is, what files should be modified?

BlendedMVS training code & dataset code

Hi,
Thanks for your excellent work. Will you release the BlendedMVS training code and dataset code? I have retrained this model on BlendedMVS with my own implementation, but the model's performance is worse than that of your pretrained model.
Thanks!

To confirm I have run your code with the best performance

Hi, I am working on a similar project and would like to use D2HC-RMVSNet as a reference. I have run it on the Tanks and Temples training set and got the following results, which are very good:

f1 score | Barn | Caterpillar | Church | Courthouse | Ignatius | Meetingroom | Truck | mean
D2HC RMVSNet | 0.6555 | 0.6104 | 0.5421 | 0.594 | 0.8193 | 0.4095 | 0.7392 | 0.62428571

So I just want to make sure it is consistent with your best performance on that data (if you have these results)
(although I think I have followed your code exactly and it should be consistent)
Thank you!!
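
As a quick sanity check on the table, the reported mean can be recomputed from the seven per-scene scores. This sketch uses only the numbers listed above; the variable names are illustrative.

```python
# Per-scene f1 scores copied from the table above.
scores = {
    "Barn": 0.6555,
    "Caterpillar": 0.6104,
    "Church": 0.5421,
    "Courthouse": 0.594,
    "Ignatius": 0.8193,
    "Meetingroom": 0.4095,
    "Truck": 0.7392,
}

# Unweighted mean over the seven scenes.
mean_f = sum(scores.values()) / len(scores)
print(round(mean_f, 8))
```

The scores sum to 4.37, so the unweighted mean is 4.37 / 7, which agrees with the 0.62428571 in the table.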

def check_geometric_consistency(depth_ref, intrinsics_ref, extrinsics_ref, depth_src, intrinsics_src, extrinsics_src):
    width, height = depth_ref.shape[1], depth_ref.shape[0]
    x_ref, y_ref = np.meshgrid(np.arange(0, width), np.arange(0, height))
    depth_reprojected, x2d_reprojected, y2d_reprojected, x2d_src, y2d_src = reproject_with_depth(depth_ref, intrinsics_ref, extrinsics_ref,
                                                                                                 depth_src, intrinsics_src, extrinsics_src)
    # check |p_reproj-p_1| < 1
    dist = np.sqrt((x2d_reprojected - x_ref) ** 2 + (y2d_reprojected - y_ref) ** 2)

    # check |d_reproj-d_1| / d_1 < 0.01
    depth_diff = np.abs(depth_reprojected - depth_ref)
    relative_depth_diff = depth_diff / depth_ref

    mask = np.logical_and(dist < 1, relative_depth_diff < 0.01)
    depth_reprojected[~mask] = 0

    return mask, depth_reprojected, x2d_src, y2d_src