madaoer / s3im-neural-fields

[ICCV 2023] PyTorch implementation of "S3IM: Stochastic Structural SIMilarity and Its Unreasonable Effectiveness for Neural Fields".

Home Page: https://madaoer.github.io/s3im_nerf/

License: MIT License

Python 89.46% C++ 2.69% Cuda 7.62% Shell 0.24%

iccv2023 nerf neural-radiance-fields neus s3im

s3im-neural-fields's Issues

Question about applying S3IM to NeRFs that use random sampling

Hi. I've been reading your paper 'S3IM: Stochastic Structural SIMilarity and Its Unreasonable Effectiveness for Neural Fields'. It's great work! However, I have some questions about S3IM.

I noticed that in Table 10, one setting uses the Truck scene with the DVGO model. When training DVGO, the rays in each batch are drawn by random permutation:

def batch_indices_generator(N, BS):
 # torch.randperm on cuda produce incorrect results in my machine
 idx, top = torch.LongTensor(np.random.permutation(N)), 0
 while True:
  if top + BS > N:
   idx, top = torch.LongTensor(np.random.permutation(N)), 0
  yield idx[top:top+BS]
  top += BS 
...
index_generator = dvgo.batch_indices_generator(len(rgb_tr), cfg_train.N_rand)
batch_index_sampler = lambda: next(index_generator)
...
# random sample rays
if cfg_train.ray_sampler in ['flatten', 'in_maskcache']:
 sel_i = batch_index_sampler()
 target = rgb_tr[sel_i]
 rays_o = rays_o_tr[sel_i]
 rays_d = rays_d_tr[sel_i]
 viewdirs = viewdirs_tr[sel_i]

This is fine when applying the S3IM loss, since S3IM randomly permutes the rays again. But I wonder how you applied the SSIM loss here. Since the sampled rays are drawn at random, how does the SSIM loss penalize them? If it penalizes those rays without any further processing, won't SSIM behave just like S3IM (since the rendered pixels are already shuffled)?

I tried my best but can't figure out how it works. I hope you can shed some light on this; that would be very helpful!
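
As a rough illustration of the concern above (not the authors' answer), the sketch below uses NumPy to compare a contiguous image crop with a batch drawn by random permutation, as batch_indices_generator does. The image size, the patch size, and the helper adjacent_fraction are hypothetical choices made for this example. It measures how many side-by-side patch entries are true horizontal neighbours in the source image.

```python
import numpy as np

def adjacent_fraction(flat_idx, patch_h, patch_w, img_w):
    """Fraction of side-by-side patch entries that are horizontal
    neighbours in the source image (hypothetical helper)."""
    patch = np.asarray(flat_idx).reshape(patch_h, patch_w)
    left, right = patch[:, :-1], patch[:, 1:]
    same_row = (left // img_w) == (right // img_w)
    next_col = (right - left) == 1
    return float(np.mean(same_row & next_col))

# Assumed sizes, chosen only for illustration.
img_h, img_w = 64, 64
patch_h, patch_w = 32, 32

# A real patch: the top-left 32x32 crop, as flat pixel indices.
rows = np.arange(patch_h)[:, None]
cols = np.arange(patch_w)[None, :]
real_patch = (rows * img_w + cols).reshape(-1)

# A virtual patch: the first batch of a random permutation of all pixels.
rng = np.random.default_rng(0)
virtual_patch = rng.permutation(img_h * img_w)[: patch_h * patch_w]

real_frac = adjacent_fraction(real_patch, patch_h, patch_w, img_w)
virtual_frac = adjacent_fraction(virtual_patch, patch_h, patch_w, img_w)
print(real_frac, virtual_frac)
```

Under these assumptions the crop keeps every neighbour pair while the permuted batch keeps almost none, so a windowed SSIM computed on such a batch sees little real spatial structure. How the paper's SSIM baseline handled this is not stated in this thread.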

Scenarios for the use of s3im loss

When I use this loss on the MipNeRF360 dataset, I find that the quality of the rendered images deteriorates. Is it not applicable to unbounded scenes? Can the randomly sampled rays come from different pixels in different training images?

How should I set the weights?

Very nice work! But I do not know how to set the weight.

whether this should be like this

About the use of S3IM.

If I want to directly calculate the S3IM value between two images (3 * 512 * 512), how should I use the S3IM class?
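
A possible starting point, assuming the S3IM forward function takes rays as (N, 3) vectors (as the shapes quoted in other issues on this page suggest): flatten each (3, H, W) image into an (H*W, 3) array first. The sketch uses NumPy so it runs without the repository; image_to_ray_vectors is a hypothetical helper, not part of the S3IM code.

```python
import numpy as np

def image_to_ray_vectors(img_chw):
    """Flatten a (C, H, W) image into (H*W, C) per-pixel vectors."""
    c, h, w = img_chw.shape
    return img_chw.reshape(c, h * w).T

rng = np.random.default_rng(0)
image = rng.random((3, 512, 512))
vectors = image_to_ray_vectors(image)
print(vectors.shape)  # (262144, 3)

# Row k holds the colour of pixel (k // W, k % W).
assert np.array_equal(vectors[513], image[:, 1, 1])
```

With torch tensors the equivalent would be img.reshape(3, -1).permute(1, 0). Presumably patch_height * patch_width would then need to equal H*W (512 * 512 here), but that is an inference from the reshape code quoted on this page, not something the authors confirm in this thread.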

About the virtual patch

Simple but very nice job!

I'm not sure about the definition of a virtual patch. Do you mean it is not a real patch because you just randomly select rays from the dataloader and compose a square patch with the reshape method? I doubt the structural constraint provided by this random virtual patch, because it does not actually provide accurate structural information.

BTW, could you please comment on the dimensions and shapes of src_vec and tar_vec, which are the inputs of the S3IM forward function?

How does batch size come into play in the S3IM formulation

Hi there, I loved your work and am currently trying to add it to some neural field research. I have a question about the batch size of the rays in your formulation:
1/ Suppose the batched rays have shape (batch, num_rays, 3). How would we construct a patch P? I'm currently iterating over the batch dimension, computing the loss on each (num_rays, 3) slice, and averaging the results. However, I noticed that the loss varies with batch size (larger batch -> higher loss). Is this consistent with your formulation?
2/ Have you tested patch sizes other than 64x64? Does a bigger patch size have any effect on performance?

how to use the s3im function on other training processes

very nice job!
I tried to use the s3im loss function on this repo https://github.com/yenchenlin/nerf-pytorch by simply adding the s3im_func:
parameter:
s3im_func = S3IM(kernel_size=4, stride=4, repeat_time=10,
patch_height=32, patch_width=32).cuda()
........
my code:
optimizer.zero_grad()
img_loss = img2mse(rgb, target_s)
img_loss += s3im_func(rgb, target_s)
img_loss.backward()
I compared results on the fern and lego data with and without the s3im loss, and adding the s3im_func loss gave worse performance. In particular, it can't reconstruct the Lego data in the same number of iterations. Am I misunderstanding the s3im_func loss, or can it not be used on data with a white background? Can you give any suggestions on how to use s3im_func in other training processes?

without s3im_func on lego

with s3im_func on lego

with s3im_func on fern

without s3im_func on fern
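
The snippet in this issue adds the S3IM term to the MSE with an implicit weight of 1. One thing that could be tried, as an assumption rather than a fix confirmed in this thread, is to scale the term with an explicit weight. The sketch below uses NumPy and plain floats; combined_loss and s3im_weight are hypothetical names, and the numeric values are placeholders rather than recommended settings.

```python
import numpy as np

def combined_loss(pred, target, s3im_value, s3im_weight):
    """MSE plus a weighted S3IM term (hypothetical helper).

    s3im_value stands in for the scalar that s3im_func(pred, target)
    would return in the training loop.
    """
    mse = float(np.mean((pred - target) ** 2))
    return mse + s3im_weight * s3im_value

pred = np.full((1024, 3), 0.5)
target = np.full((1024, 3), 1.0)   # e.g. white background pixels
placeholder_s3im = 0.2              # placeholder, not a measured value

with_term = combined_loss(pred, target, placeholder_s3im, s3im_weight=1.0)
without_term = combined_loss(pred, target, placeholder_s3im, s3im_weight=0.0)
print(with_term, without_term)
```

Keeping the weight as a separate parameter makes it easy to sweep it, or to set it to zero and confirm that the baseline result is recovered.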

Reproducing results

Hi, thanks for sharing this great work!

I tried the code on the Replica dataset (with the TensoRF model). On scans 1 to 7 it gives PSNR values close to the reported results, but for scan 8 I got a PSNR of 24.42, which is far from the 39.59 reported in the paper.

Is it normal? Does it also happen on your machine?

Thanks in advance!

Issue when trying to run DVGO train_replica

There is a typo in the name of the "scripts" folder: it is named "scrips" instead of "scripts".

Also, in "models/DVGO/run.py", I needed to add the following line before the line with from model_components import S3IM:

sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))))
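
To make the path arithmetic in that line concrete, here is a small sketch of what the three nested dirname calls resolve to. The checkout location is hypothetical, and the idea that model_components sits at the repository root is an inference from the fix, not something stated in the issue.

```python
import os
from pathlib import Path

def repo_root_from(file_path):
    """Go up three directory levels from file_path, as the line above does."""
    return os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(file_path))))

# Hypothetical checkout location, used only to show the path arithmetic.
run_py = os.path.join(os.sep, "work", "s3im-neural-fields", "models", "DVGO", "run.py")
root = repo_root_from(run_py)
print(root)

# pathlib spelling of the same thing: parents[2] is three levels up.
assert Path(root) == Path(os.path.abspath(run_py)).parents[2]
```

So from models/DVGO/run.py the appended directory is the repository root, which is why a top-level module becomes importable after the change.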

Understanding of the upper s3im

Hi, the shape of the rgb (src_vec) I pass in is (1024, 3), and the shape of gt (tar_vec) is also (1024, 3). I used the parameters M=10 (repeat_time=10), patch_height=32, patch_width=32. My understanding is that the rgb predicted by my network is transformed into an image, and gt into another image. There are 10 loop iterations, each with a random ordering, and the shape of both images ends up as (1, 3, 32, 320). S3IM is then calculated from these two images. Is my understanding correct?
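
The shapes described in this question can be traced with a short NumPy sketch. It assumes one fresh permutation per repeat, shared by prediction and ground truth; whether the repository treats the first repeat differently is not shown on this page. stack_virtual_patches is a hypothetical helper, and the SSIM computation itself is left out.

```python
import numpy as np

def stack_virtual_patches(src_vec, tar_vec, repeat_time, patch_height, patch_width, rng):
    """Shape-level sketch: shuffle, concatenate, reshape (not the repo's code)."""
    n = tar_vec.shape[0]
    if n != patch_height * patch_width:
        raise ValueError("expected patch_height * patch_width rays")
    # Assumption: one fresh permutation per repeat, shared by src and tar.
    index = np.concatenate([rng.permutation(n) for _ in range(repeat_time)])
    src_all, tar_all = src_vec[index], tar_vec[index]   # (repeat_time * n, 3)
    shape = (1, 3, patch_height, patch_width * repeat_time)
    # .T plays the role of permute(1, 0) in the torch code.
    return src_all.T.reshape(shape), tar_all.T.reshape(shape)

rng = np.random.default_rng(0)
rgb = rng.random((1024, 3))
gt = rng.random((1024, 3))
src_patch, tar_patch = stack_virtual_patches(rgb, gt, 10, 32, 32, rng)
print(src_patch.shape, tar_patch.shape)  # (1, 3, 32, 320) (1, 3, 32, 320)
```

Under these assumptions both inputs end up as a single (1, 3, 32, 320) image, which matches the shape given in the question.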

TanksandTemples Dataset.

Hi, would you mind sharing your preprocessed T&T datasets? Thanks!

Unable to converge

I want to add the s3im loss to k-planes, but training is unable to converge. Is this because k-planes adds a random background during training?

How can I obtain data? If I use my images, how can I configure them?

Could you please explain what 'office0.ply', '000000_depth.npy', '000000_depth.png', '000000_mask.npy', '000000_normal.npy', '000000_normal.png', and '000000_rgb.png' in the dataset within the 'data' directory are? How can I obtain them?

How to make sure the s3im input matches the patch size

How can I make sure the s3im input matches the patch size? If the best patch size is 64, is it necessary to modify the shape of my input?
Thank you for your answer.

s3im_func(src,tar)

tar_patch = tar_all.permute(1, 0).reshape(1, 3, self.patch_height, self.patch_width * self.repeat_time)
src_patch = src_all.permute(1, 0).reshape(1, 3, self.patch_height, self.patch_width * self.repeat_time)
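
Reading the two reshape lines above: if src_all and tar_all hold N * repeat_time rows (an assumption, since the code that builds them is not quoted here), the reshape to (1, 3, patch_height, patch_width * repeat_time) can only succeed when N equals patch_height * patch_width. A minimal pre-check along those lines, with hypothetical helper names:

```python
def required_rays(patch_height, patch_width):
    """Number of rays per batch that fills one patch exactly."""
    return patch_height * patch_width

def check_batch(num_rays, patch_height, patch_width):
    """Raise early if the batch cannot be reshaped into the patch."""
    expected = required_rays(patch_height, patch_width)
    if num_rays != expected:
        raise ValueError(
            f"got {num_rays} rays, but a {patch_height}x{patch_width} "
            f"patch needs {expected}"
        )
    return True

print(required_rays(64, 64))    # 4096
print(check_batch(1024, 32, 32))  # True
```

Under this reading, a 64x64 patch would call for batches of 4096 rays, so either the ray batch size or the patch dimensions would have to be chosen to match.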