
Comments (4)

bluestyle97 commented on June 30, 2024

@cwchenwang Hi, we aimed to strictly follow the camera setting of Zero123++ v1.2 (fov=30) during fine-tuning. We asked the authors of Zero123++ about the object normalization and camera distance in this issue. The original answer was that the object should be normalized into a unit cube (it has since been corrected to unit sphere), which was an unintentional mistake resulting in larger objects in the rendered images.

This will not influence the reconstruction results in most cases. However, if the shape of the object is close to a cube, it will occupy a very large region in the generated image, and the reconstruction result may be cropped since it exceeds the [-1, 1] representation range of the triplane. To alleviate this issue temporarily, you can pass a smaller --scale argument to run.py to decrease the size of the reconstructed object. We plan to fix the object normalization issue and provide a new model in the future.
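To make the difference concrete, here is a minimal sketch (not the authors' preprocessing code; the helper names and the exact box/sphere sizes are illustrative assumptions) contrasting unit-cube and unit-sphere normalization of a vertex array:

import numpy as np

def normalize_unit_cube(verts):
    # Scale so the longest bounding-box edge spans [-0.5, 0.5].
    bbox_min, bbox_max = verts.min(axis=0), verts.max(axis=0)
    center = (bbox_min + bbox_max) / 2
    return (verts - center) / (bbox_max - bbox_min).max()

def normalize_unit_sphere(verts):
    # Scale so every vertex lies inside a sphere of radius 0.5 around the origin.
    bbox_min, bbox_max = verts.min(axis=0), verts.max(axis=0)
    verts = verts - (bbox_min + bbox_max) / 2
    return verts * (0.5 / np.linalg.norm(verts, axis=1).max())

For a roughly cube-shaped object, cube normalization leaves the corners about sqrt(3) ≈ 1.73 times farther from the origin than sphere normalization, which is why such objects look oversized in the rendered views and can exceed the [-1, 1] triplane range.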


msingh27 commented on June 30, 2024

@bluestyle97 thanks a lot for open-sourcing the codebase for fine-tuning the Zero123++ models.
I am also facing some issues while rendering the 6 views for a 3D model in Blender.
It would be great if you could tell us more about the creation of the training dataset for the Zero123++ models,
specifically the camera_distance and the normalization of the 3D model for Blender rendering.
I am modifying the Blender script from here,
but for some objects the views don't look like Zero123++ (scaling issues, camera distance issues).

I am using this for validation views in Blender:

import numpy as np

def set_camera_location_validation(camera, view_i):
    # cam_distance = (0.5 / np.tan(np.radians(30/2))) # not sure if this is correct
    cam_distance = 2.0
    # Zero123++ target views: 6 azimuths paired with alternating elevations
    azimuths = np.deg2rad(np.array([30, 90, 150, 210, 270, 330]))
    elevations = np.deg2rad(np.array([20, -10, 20, -10, 20, -10]))

    # Spherical -> Cartesian camera positions
    x = cam_distance * np.cos(elevations) * np.cos(azimuths)
    y = cam_distance * np.cos(elevations) * np.sin(azimuths)
    z = cam_distance * np.sin(elevations)

    camera.location = x[view_i], y[view_i], z[view_i]

    # Point the camera at the origin
    direction = -camera.location
    rot_quat = direction.to_track_quat('-Z', 'Y')
    camera.rotation_euler = rot_quat.to_euler()
    return camera

# For normalization, using the normalize_scene function
normalize_scene(box_scale=2)  # this is unit-cube normalization; maybe sphere normalization is required

# Camera setup
cam.data.lens = 30  # 24 default for OpenLRM? (note: lens is a focal length in mm, not an FOV in degrees)
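On the commented-out cam_distance line: one option (a sketch, not a confirmed reproduction of the Zero123++/InstantMesh rendering setup; the function name is made up) is to set the Blender camera FOV directly via cam.data.angle and derive the distance from it, rather than setting cam.data.lens, which is a focal length in millimetres rather than an angle in degrees:

import math
import bpy

def setup_validation_camera(cam, fov_deg=30.0, object_radius=0.5):
    # cam.data.angle is the field of view in radians; setting it avoids the
    # mm-vs-degrees confusion of cam.data.lens.
    cam.data.angle = math.radians(fov_deg)
    # Distance at which a sphere of the given radius roughly fills the frame:
    # tan(fov/2) = radius / distance  =>  distance = radius / tan(fov/2)
    return object_radius / math.tan(math.radians(fov_deg) / 2)

cam = bpy.context.scene.camera
cam_distance = setup_validation_camera(cam, fov_deg=30.0)  # ~1.87 for fov=30 and radius 0.5

With fov=30 this gives the same value as the commented-out formula (0.5 / tan(15°) ≈ 1.87), so the open question is really which fov and object radius the training renders actually used.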

Any suggestions would be super helpful
cc: @cwchenwang
Thanks :D


mengxuyiGit commented on June 30, 2024

[@msingh27's comment above, quoted in full]

Hi, have you found a proper scale to reproduce the results shown in InstantMesh? Thanks!


msingh27 commented on June 30, 2024

@mengxuyiGit
I think updating these parameters in the Blender script from OpenLRM can generate images that are consistent with the Zero123++ 6-view images:

import numpy as np

# Camera setup
fov = 49.13
cam.data.lens = 49.13
cam_distance = 0.5 / np.tan(np.radians(fov / 2))

# Cube normalization -> sphere normalization
# scale = box_scale / max(bbox_max - bbox_min)            # old (cube)
scale = box_scale / np.linalg.norm(bbox_max - bbox_min)   # new (sphere)

# Random normalization
normalize_scene(box_scale=0.8)

Not sure if these params were used by the InstantMesh authors for training. cc: @bluestyle97
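As a quick arithmetic check of the two scale formulas above (just a sanity check, not taken from the thread): for a cube-shaped bounding box of edge length 1, the diagonal-norm version shrinks the object by a factor of sqrt(3) ≈ 1.73 relative to the max-edge version, which matches the oversized-cube behaviour described earlier.

import numpy as np

bbox_min, bbox_max = np.zeros(3), np.ones(3)   # cube-shaped bbox, edge length 1
box_scale = 0.8
scale_cube = box_scale / max(bbox_max - bbox_min)               # 0.8
scale_sphere = box_scale / np.linalg.norm(bbox_max - bbox_min)  # 0.8 / sqrt(3) ≈ 0.46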

