[SIGGRAPH 2022] Single-View View Synthesis in the Wild with Learned Adaptive Multiplane Images
Great work!
I'd like to ask about the transform matrices K_src and K_tar used in the final render: how should they be adjusted to control the synthesized view?
Thank you for your great work!
I noticed that during training of the given demo, we only apply random displacements along the x-axis, but no random rotations or translations along the y- or z-axis. In theory, shouldn't the latter make the network more robust?
Why did you choose to warp only along the x-axis? Is it because the other motions are harder to train?
Hi, congrats on the acceptance at SIGGRAPH 2022. We are having an event on Hugging Face for SIGGRAPH 2022, where you can submit Spaces (web demos), models, and datasets for papers for a chance to win prizes. The Hub offers free hosting and would make your work more accessible to the rest of the community. The Hugging Face Hub works similarly to GitHub: you can push to user profiles or organization accounts, and you can add the models/datasets/Spaces to this organization:
https://huggingface.co/SIGGRAPH2022
after joining the organization using this link: https://huggingface.co/organizations/SIGGRAPH2022/share/lKvMytVEyvqyWBIbfrcZTeBhNQLXklhIHG
let me know if you need any help with the above steps, thanks
Hi,
Thanks for the great work!
I would like to reproduce some of the numbers reported in the paper, but I ran into an issue with handling scale: the depth maps from DPT are normalized, so using poses from the dataset, which have real-world scale, always produces synthesized images that are far off. Could you elaborate on how you handle the scale issue in training and testing?
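One common way to bridge normalized DPT disparity and metric-scale poses (this is the standard MiDaS/DPT evaluation alignment, not necessarily what the authors do) is to fit a per-image scale and shift by least squares before comparing or rendering. A minimal sketch:

```python
import numpy as np

def align_disparity(pred_disp: np.ndarray, gt_disp: np.ndarray) -> np.ndarray:
    """Least-squares fit of scale s and shift t so that
    s * pred_disp + t best matches gt_disp (both 2-D arrays)."""
    A = np.stack([pred_disp.ravel(), np.ones(pred_disp.size)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, gt_disp.ravel(), rcond=None)
    return s * pred_disp + t
```

After alignment, the predicted disparity lives in the same units as the ground truth, so dataset poses with real-world scale become usable.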
Thanks!
Hi,
Per your hint for training in #4 (comment), I used the same photometric losses as MINE to try to train AdaMPI from scratch on the downsampled LLFF dataset that MINE provides. However, I got NaN errors some time into training (not immediately). Have you seen any NaN errors during training? Do you have any suggestions for handling them?
Thanks,
JD
Thank you for your great work!
I noticed that in your paper there is a warp-back synthesis step that uses an NVS network after the inpainting network. But in stage2_dataset.py, there is only inpainting before the output. Has the NVS network not been released yet?
Hi. Thanks for this great work!! I'm trying to generate 3D photo with my own images on Google Colab following Document for AdaMPI. I can successfully generate video output from your demo photos. But when I try to use my own photo, I get this error:
```
Traceback (most recent call last):
  File "/content/AdaMPI/gen_3dphoto.py", line 27, in <module>
    disp = F.interpolate(disp, size=(opt.height, opt.width), mode='bilinear', align_corners=True)
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/functional.py", line 3854, in interpolate
    raise ValueError(
ValueError: Input and output must have the same number of spatial dimensions, but got input with spatial dimensions of [1356, 2040, 3] and output size of (256, 384). Please provide input tensor in (N, C, d1, d2, ...,dK) format and output size in (o1, o2, ...,oK) format.
```
I tried adjusting my input photo to the same dimensions as your demo photos, but it still doesn't work. Do you know what the problem might be? Thanks.
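For reference, the error above means the depth map reached `F.interpolate` as an H x W x 3 image array instead of the (N, C, H, W) tensor it expects. A hypothetical fix (the array name and shapes are illustrative, not from the repo's loading code):

```python
import numpy as np
import torch
import torch.nn.functional as F

# Illustrative stand-in for a depth map loaded as an H x W x 3 image,
# which is exactly the shape reported in the error above.
disp_np = np.random.rand(1356, 2040, 3).astype(np.float32)

# Keep one channel and add batch/channel dims -> (1, 1, H, W).
disp = torch.from_numpy(disp_np[..., 0])[None, None]
disp = F.interpolate(disp, size=(256, 384), mode="bilinear", align_corners=True)
print(tuple(disp.shape))  # (1, 1, 256, 384)
```

If the depth map is saved as a 3-channel image, collapsing it to a single channel before building the tensor avoids the mismatch.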
Hi, what a great job!
When I run only the warp-back demo in warpback/stage1_dataset.py, I get this error:
```
Traceback (most recent call last):
  File "stage1_dataset.py", line 143, in <module>
    loader = DataLoader(
  File "/public/home/hpc70043/.conda/envs/warpback/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 277, in __init__
    sampler = RandomSampler(dataset, generator=generator)  # type: ignore[arg-type]
  File "/public/home/hpc70043/.conda/envs/warpback/lib/python3.8/site-packages/torch/utils/data/sampler.py", line 97, in __init__
    raise ValueError("num_samples should be a positive integer "
ValueError: num_samples should be a positive integer value, but got num_samples=0
```
Do I need to prepare my own data, or can I just use the toydata you offered? I see you have already provided the data, but I still hit this error. @yxuhan
In the repo I see the pretrained network, but how can I train my own network? Please help!
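For what it's worth, that ValueError comes from PyTorch's RandomSampler seeing a dataset with zero samples, which usually means the data-root path doesn't point at where the toydata actually sits. A sketch of a guard that surfaces the real cause early (ToyDataset is a hypothetical stand-in, not the repo's class):

```python
import torch
from torch.utils.data import DataLoader, Dataset

class ToyDataset(Dataset):
    """Hypothetical stand-in for the warpback dataset: it indexes a
    list of file paths discovered under the data root."""
    def __init__(self, paths):
        self.paths = paths
    def __len__(self):
        return len(self.paths)
    def __getitem__(self, idx):
        return torch.zeros(3)  # placeholder sample

def make_loader(dataset, batch_size=1):
    # An empty dataset (e.g. a wrong data root) is what triggers
    # "num_samples should be a positive integer ... got num_samples=0".
    if len(dataset) == 0:
        raise RuntimeError("dataset is empty; check the data root path")
    return DataLoader(dataset, batch_size=batch_size, shuffle=True)
```

Printing `len(dataset)` right after construction is usually enough to confirm whether the data directory was found.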
Hi Han,
Thank you for your great work!
I want to know how you generated your sample depth maps. I ran the DPT model, but the result is different from yours. Looking forward to your reply. Thanks.
Hi, @yxuhan! Very impressive work at SIGGRAPH 2022!
May I know when the code will be released?
Many thanks!
Following MINE's training code, we trained with an RGB loss, loss_ssim_tgt, loss_smooth_tgt, and loss_rank, but the synthesized views show layering artifacts (the RGB image is discontinuous and splits into layers; the synthesized novel-view depth map shows clear layering, while the target depth is continuous).
How do you handle the depth loss?
```python
# Element-wise comparison against the threshold: elements >= the
# threshold become 1, elements below it become 0.
rgb_tgt_valid_mask = torch.ge(tgt_mask_syn, self.config["mpi.valid_mask_threshold"]).to(torch.float32)
# tgt_imgs_syn and self.tgt_imgs both take values in [0, 1]
loss_map = torch.abs(tgt_imgs_syn - self.tgt_imgs) * rgb_tgt_valid_mask
loss_rgb_tgt = loss_map.mean()  # L1 loss

# 2. L1 loss at the tgt depth frame
loss_depth_tgt = torch.mean(torch.abs(
    tgt_disparity_syn - self.tgt_disp))  # F.instance_norm()?
```
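Regarding the F.instance_norm() hint in the comment above: one plausible reading (an assumption on my part, not confirmed by the authors) is to normalize both disparity maps before taking the L1, so the depth loss is invariant to a per-image scale and shift:

```python
import torch
import torch.nn.functional as F

def scale_invariant_disp_loss(disp_syn: torch.Tensor, disp_ref: torch.Tensor) -> torch.Tensor:
    """L1 between instance-normalized (N, C, H, W) disparity maps; a
    global scale/shift of either map leaves the loss (almost) unchanged."""
    return torch.abs(F.instance_norm(disp_syn) - F.instance_norm(disp_ref)).mean()
```

This kind of normalization is a common trick when the predicted disparity has arbitrary scale, and it might also tame the layering you observed, though that is speculation.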
Hello author, I don't quite understand the purpose of pre-blending here. Could you provide an explanation?
Line 70 in 08df946
Will the training code be open-sourced?
Hello,
may I ask which parameters need to be changed, and how, to achieve a simple zoom?
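Not an authoritative answer, but since in this codebase the z-axis is the viewing direction, a simple zoom should just be a camera translation along z. A sketch modeled on the repo's path generators (`gen_zoom_path` is my own name, not a function from the repo):

```python
import torch

def gen_zoom_path(num_frames: int = 16, z_max: float = 0.3) -> torch.Tensor:
    # Each pose is a 4x4 extrinsic matrix; writing into element [2, 3]
    # translates the camera along the z (viewing) axis, which produces
    # a zoom-like dolly motion over the frames.
    poses = torch.eye(4).repeat(num_frames, 1, 1)
    poses[:, 2, 3] = torch.linspace(0.0, z_max, num_frames)
    return poses
```

Passing such a pose list to the renderer in place of the default swing path should move the camera straight toward the scene.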
Hi,
I was checking the warpback functionality, and I am wondering about these mask value ranges:
```
mask range: tensor(-14.9876, device='cuda:0') tensor(6.9739, device='cuda:0')
mask range: tensor(-14.3851, device='cuda:0') tensor(27.4305, device='cuda:0')
mask range: tensor(-4.3844, device='cuda:0') tensor(3.6911, device='cuda:0')
```
Thanks in advance,
```python
def gen_translate_path(num_frames=2, r_x=0.65, r_y=0., r_z=0.):
    poses = torch.eye(4).repeat(num_frames, 1, 1)
    poses[1, 0, 3] = r_x
    poses[1, 1, 3] = r_y
    poses[1, 2, 3] = r_z
    return poses
```
Thanks for releasing your nice work on novel view synthesis from a single input image!
I hope you can help me perform novel view synthesis correctly with your code.
The problem I found is that it synthesizes an awkward set of images (video), even for the provided inputs.
I experimented on the provided images using the command below, which seems to work fine.
```
python gen_3dphoto.py --img_path images/0810.png --save_path ./results/0810.mp4
```
However, the result of the command below does not work and looks very awkward.
```
python gen_3dphoto.py --img_path images/0801.png --save_path ./results/0801.mp4
```
I suspected that the width and height caused the problem, so I set them to 512 x 384 (H x W), which is close to the aspect ratio of the image (0801.png).
```
python gen_3dphoto.py --img_path images/0801.png --save_path ./results/0801.mp4 --width 512 --height 384
```
Unfortunately, this does not seem to fix the problem, as shown below.
I would really appreciate it if you could tell me the reason for the awkward results.
And I have one more question.
When I change the rendering path from the xz-plane to the xy-plane (the z-axis is the viewing direction), the results show slightly awkward images, such as stretched pixels on the boundaries of objects.
```python
# in utils/utils.py
# from
swing_path_list = gen_swing_path()
# change to
swing_path_list = gen_swing_path(r_x=0.3, r_y=0.3, r_z=0.)
```
If you know the reason for the stretched pixels, please let me know.
Thanks!
Best wishes,
Jin.
If I input an image as the left-eye image and generate the right-eye image with your code, and the distance between the eyes is 6.5 cm, how should I modify these parameters?
```python
def gen_translate_path(num_frames=2, r_x=0.065, r_y=0., r_z=0.):
    poses = torch.eye(4).repeat(num_frames, 1, 1)
    poses[1, 0, 3] = r_x
    poses[1, 1, 3] = r_y
    poses[1, 2, 3] = r_z
    return poses
```
Here is the camera path pose I set:
```
camera path pose: tensor([[[ 1.0000,  0.0000,  0.0000,  0.0000],
         [ 0.0000,  1.0000,  0.0000,  0.0000],
         [ 0.0000,  0.0000,  1.0000,  0.0000],
         [ 0.0000,  0.0000,  0.0000,  1.0000]],

        [[ 1.0000,  0.0000,  0.0000,  0.0650],
         [ 0.0000,  1.0000,  0.0000,  0.0000],
         [ 0.0000,  0.0000,  1.0000,  0.0000],
         [ 0.0000,  0.0000,  0.0000,  1.0000]]])
```
The difference between the two images I get is very small, and the left and right views cannot be fused into a stereo pair.
I want to ask: is there a problem with this approach? @yxuhan