
Comments (4)

bluestyle97 commented on June 30, 2024

@cwchenwang Hi, we aimed to strictly follow the camera setting of Zero123++ v1.2 (fov=30) during fine-tuning. We asked the authors of Zero123++ about the object normalization and camera distance in this issue. The original answer was that the object should be normalized into a unit cube (it has since been corrected to unit sphere), which was an unintentional mistake resulting in larger objects in the rendered images.

This will not influence the reconstruction results in most cases. However, if the shape of the object is close to a cube, it will occupy a very large region in the generated image, and the reconstruction result may be cropped since it exceeds the [-1, 1] representation range of the triplane. To alleviate this issue temporarily, you can pass a smaller --scale argument to run.py to decrease the size of the reconstructed object. We plan to fix the object normalization issue and provide a new model in the future.
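To make the difference concrete, here is a minimal sketch (not the authors' preprocessing code; the helper names and the exact box/sphere sizes are illustrative assumptions) contrasting unit-cube and unit-sphere normalization of a vertex array:

import numpy as np

def normalize_unit_cube(verts):
    # Scale so the longest bounding-box edge spans [-0.5, 0.5].
    bbox_min, bbox_max = verts.min(axis=0), verts.max(axis=0)
    center = (bbox_min + bbox_max) / 2
    return (verts - center) / (bbox_max - bbox_min).max()

def normalize_unit_sphere(verts):
    # Scale so every vertex lies inside a sphere of radius 0.5 around the origin.
    bbox_min, bbox_max = verts.min(axis=0), verts.max(axis=0)
    verts = verts - (bbox_min + bbox_max) / 2
    return verts * (0.5 / np.linalg.norm(verts, axis=1).max())

For a roughly cube-shaped object, cube normalization leaves the corners about sqrt(3) ≈ 1.73 times farther from the origin than sphere normalization, which is why such objects look oversized in the rendered views and can exceed the [-1, 1] triplane range.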


msingh27 commented on June 30, 2024

@bluestyle97 thanks a lot for open-sourcing the codebase for fine-tuning the Zero123++ models.
I am also facing some issues while rendering the 6 views for a 3D model in Blender.
It would be great if you could tell us more about the creation of the training dataset for the Zero123++ models,
specifically the camera_distance and the normalization of the 3D model for Blender rendering.
I am modifying the Blender script from here,
but for some objects the views don't look like Zero123++ (scaling issues, camera distance issues).

I am using this for validation views in Blender:

import numpy as np

def set_camera_location_validation(camera, view_i):
    # cam_distance = (0.5 / np.tan(np.radians(30/2))) # not sure if this is correct
    cam_distance = 2.0
    # Zero123++ target views: 6 azimuths paired with alternating elevations
    azimuths = np.deg2rad(np.array([30, 90, 150, 210, 270, 330]))
    elevations = np.deg2rad(np.array([20, -10, 20, -10, 20, -10]))

    # Spherical -> Cartesian camera positions
    x = cam_distance * np.cos(elevations) * np.cos(azimuths)
    y = cam_distance * np.cos(elevations) * np.sin(azimuths)
    z = cam_distance * np.sin(elevations)

    camera.location = x[view_i], y[view_i], z[view_i]

    # Point the camera at the origin
    direction = -camera.location
    rot_quat = direction.to_track_quat('-Z', 'Y')
    camera.rotation_euler = rot_quat.to_euler()
    return camera

# For normalization, using the normalize_scene function
normalize_scene(box_scale=2)  # this is unit-cube normalization; maybe sphere normalization is required

# Camera setup
cam.data.lens = 30  # 24 default for OpenLRM? (note: lens is a focal length in mm, not an FOV in degrees)
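On the commented-out cam_distance line: one option (a sketch, not a confirmed reproduction of the Zero123++/InstantMesh rendering setup; the function name is made up) is to set the Blender camera FOV directly via cam.data.angle and derive the distance from it, rather than setting cam.data.lens, which is a focal length in millimetres rather than an angle in degrees:

import math
import bpy

def setup_validation_camera(cam, fov_deg=30.0, object_radius=0.5):
    # cam.data.angle is the field of view in radians; setting it avoids the
    # mm-vs-degrees confusion of cam.data.lens.
    cam.data.angle = math.radians(fov_deg)
    # Distance at which a sphere of the given radius roughly fills the frame:
    # tan(fov/2) = radius / distance  =>  distance = radius / tan(fov/2)
    return object_radius / math.tan(math.radians(fov_deg) / 2)

cam = bpy.context.scene.camera
cam_distance = setup_validation_camera(cam, fov_deg=30.0)  # ~1.87 for fov=30 and radius 0.5

With fov=30 this gives the same value as the commented-out formula (0.5 / tan(15°) ≈ 1.87), so the open question is really which fov and object radius the training renders actually used.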

Any suggestions would be super helpful
cc: @cwchenwang
Thanks :D


mengxuyiGit commented on June 30, 2024

[@msingh27's comment above, quoted in full]

Hi, have you found a proper scale to reproduce the results shown in InstantMesh? Thanks!


msingh27 commented on June 30, 2024

@mengxuyiGit
I think updating these parameters in the Blender script from OpenLRM can generate images that are consistent with the Zero123++ 6-view images:

import numpy as np

# Camera setup
fov = 49.13
cam.data.lens = 49.13
cam_distance = 0.5 / np.tan(np.radians(fov / 2))

# Cube normalization -> sphere normalization
# scale = box_scale / max(bbox_max - bbox_min)            # old (cube)
scale = box_scale / np.linalg.norm(bbox_max - bbox_min)   # new (sphere)

# Random normalization
normalize_scene(box_scale=0.8)

Not sure if these params were used by the InstantMesh authors for training. cc: @bluestyle97
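As a quick arithmetic check of the two scale formulas above (just a sanity check, not taken from the thread): for a cube-shaped bounding box of edge length 1, the diagonal-norm version shrinks the object by a factor of sqrt(3) ≈ 1.73 relative to the max-edge version, which matches the oversized-cube behaviour described earlier.

import numpy as np

bbox_min, bbox_max = np.zeros(3), np.ones(3)   # cube-shaped bbox, edge length 1
box_scale = 0.8
scale_cube = box_scale / max(bbox_max - bbox_min)               # 0.8
scale_sphere = box_scale / np.linalg.norm(bbox_max - bbox_min)  # 0.8 / sqrt(3) ≈ 0.46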

