Comments (4)
@cwchenwang Hi, we aimed to strictly follow the camera settings of Zero123++ v1.2 (fov=30) during fine-tuning. We asked the authors of Zero123++ about object normalization and camera distance in this issue. The original answer was that the object should be normalized into a unit cube (it has since been corrected to a unit sphere), which was an unintentional mistake that resulted in larger objects in the rendered images.
This will not influence the reconstruction results in most cases. However, if the object's shape is close to a cube, it will occupy a very large region in the generated image, and the reconstruction result will be cropped since it exceeds the [-1, 1] representation range of the triplane. To alleviate this issue temporarily, you can pass a smaller --scale argument to run.py to decrease the size of the reconstructed object. We plan to fix the object normalization issue and provide a new model in the future.
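The unit-sphere normalization described above can be sketched as follows. This is a minimal standalone example, not the project's actual code, and the target radius of 0.5 (so the object fits a unit-diameter sphere) is an assumption:

```python
import numpy as np

def normalize_to_unit_sphere(vertices, radius=0.5):
    """Center a mesh's vertices and scale them so the bounding sphere has the given radius."""
    vertices = np.asarray(vertices, dtype=np.float64)
    # Center on the midpoint of the axis-aligned bounding box
    center = (vertices.min(axis=0) + vertices.max(axis=0)) / 2.0
    vertices = vertices - center
    # Radius of the smallest origin-centered sphere containing all vertices
    max_dist = np.linalg.norm(vertices, axis=1).max()
    return vertices * (radius / max_dist)

# Example: the corners of a unit cube end up on a sphere of radius 0.5
corners = np.array([[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)], dtype=np.float64)
normalized = normalize_to_unit_sphere(corners)
print(np.linalg.norm(normalized, axis=1).max())  # 0.5
```

Note that for a cube this makes the object noticeably smaller in frame than unit-cube normalization, which is exactly the size discrepancy discussed above.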
from instantmesh.
@bluestyle97 Thanks a lot for open-sourcing the codebase for fine-tuning the Zero123++ models.
I am also facing some issues while rendering the 6 views for a 3D model in Blender.
It would be great if you could tell us more about the creation of the training dataset for the Zero123++ models,
specifically the camera_distance and the normalization of the 3D model for Blender rendering.
I am modifying the Blender script from here,
but for some objects the views don't look like Zero123++ (scaling and camera-distance issues).
I am using this for validation views in Blender:
def set_camera_location_validation(camera, view_i):
    # cam_distance = 0.5 / np.tan(np.radians(30 / 2))  # not sure if this is correct
    cam_distance = 2.0
    # Zero123++ six-view layout: azimuth/elevation pairs in degrees
    azimuths = np.deg2rad(np.array([30, 90, 150, 210, 270, 330]))
    elevations = np.deg2rad(np.array([20, -10, 20, -10, 20, -10]))
    # Spherical -> Cartesian camera positions on a sphere of radius cam_distance
    x = cam_distance * np.cos(elevations) * np.cos(azimuths)
    y = cam_distance * np.cos(elevations) * np.sin(azimuths)
    z = cam_distance * np.sin(elevations)
    camera.location = x[view_i], y[view_i], z[view_i]
    # Adjust orientation: point the camera at the origin
    direction = -camera.location
    rot_quat = direction.to_track_quat('-Z', 'Y')
    camera.rotation_euler = rot_quat.to_euler()
    return camera

# For normalization, using the normalize_scene function:
normalize_scene(box_scale=2)  # unit-cube normalization; maybe sphere normalization is required

# Camera setup
cam.data.lens = 30  # 24 is the default for OpenLRM?
Any suggestions would be super helpful
cc: @cwchenwang
Thanks :D
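For reference, the commented-out formula in the snippet above follows from basic perspective geometry: a camera with vertical fov θ sees a half-height of d · tan(θ/2) at distance d, so d = h / tan(θ/2) is the distance at which an object of half-extent h exactly fills the frame. A quick standalone check (h = 0.5 is an assumption matching the snippet, not a confirmed Zero123++ setting):

```python
import numpy as np

def fit_distance(half_extent, fov_deg):
    """Distance at which an object of the given half-extent spans the full (vertical) fov."""
    return half_extent / np.tan(np.radians(fov_deg / 2.0))

d = fit_distance(0.5, 30.0)
# Sanity check: at that distance, the visible half-height equals the half-extent
assert np.isclose(d * np.tan(np.radians(15.0)), 0.5)
print(d)  # ~1.866
```

This shows why the hard-coded cam_distance = 2.0 and the formula disagree slightly (1.866 vs 2.0), which could itself cause a small scale mismatch in the rendered views.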
Hi, have you found a proper scale to reproduce the results shown in InstantMesh? Thanks!
@mengxuyiGit
I think updating these parameters in the Blender script from OpenLRM can generate images that are consistent with the Zero123++ 6-view images:
# Camera setup
fov = 49.13
cam.data.lens = 49.13  # note: cam.data.lens is focal length in mm, not an fov in degrees
cam_distance = 0.5 / np.tan(np.radians(fov / 2))

# Cube normalization -> sphere normalization:
# scale = box_scale / max(bbox_max - bbox_min)            # before (cube)
scale = box_scale / np.linalg.norm(bbox_max - bbox_min)   # after (sphere)

# Normalization
normalize_scene(box_scale=0.8)
I am not sure whether these params were used by the InstantMesh authors for training. cc: @bluestyle97