
river-zhang / gta

90 stars · 7 watchers · 4 forks · 2.37 MB

[NeurIPS 23] Official repository for NeurIPS 2023 paper "Global-correlated 3D-decoupling Transformer for Clothed Avatar Reconstruction"

Home Page: https://river-zhang.github.io/GTA-projectpage/

Python 98.14% Shell 0.53% GLSL 1.34%
3d clothed-humans clothed-people-digitalization digital human reconstruction vision neurips-2023 python pytorch

gta's People

Contributors

river-zhang


gta's Issues

The test results seem to be inconsistent with the paper

I tried your code on Ubuntu 20.04 with CUDA 11.8 and PyTorch 2.0.1, and got the following results:

GTA.mp4

but the corresponding results in your paper are obviously much better, as shown in the following:
GTA

I want to know if my result is reasonable.

ViT encoder input

I found that the front/back normal maps, together with the image, are also used as input to the encoder to generate the tri-plane features. I want to know why. Does this improve the result?
Reading the code, I found that after the tri-plane feature maps are obtained, they are concatenated with the normal features.
What if I only feed the image through ViTPose's pre-trained ViT encoder to get image features, pass those through the three decoders to get the tri-plane features, and then concatenate them with the normal features, as in the sketch below. Is that all right?
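To make the question concrete, here is a minimal sketch of the fusion step as I understand it; all shapes and tensor names are hypothetical, not the repo's actual ones:

import torch

# Hypothetical shapes: one decoded tri-plane feature map plus features
# extracted from the predicted front/back normal maps.
B, H, W = 1, 128, 128
plane_feat = torch.randn(B, 256, H, W)   # output of one tri-plane decoder
normal_feat = torch.randn(B, 64, H, W)   # encoded front/back normal maps

# Channel-wise concatenation after decoding; the question is whether the
# normal maps also need to be part of the ViT encoder input itself.
fused = torch.cat([plane_feat, normal_feat], dim=1)  # (B, 320, H, W)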

inference time

Hi, thanks for your great work.
I read your paper, but didn't see any mention of inference time for a single image.
Do you have a rough idea of what it would be on a modern GPU?

thanks!
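For what it's worth, a rough number can be measured directly; here is a generic PyTorch timing sketch, where the reconstruction call is a hypothetical stand-in for whatever infer.py runs per image:

import time
import torch

torch.cuda.synchronize()              # flush queued GPU work before timing
start = time.perf_counter()

# reconstruct_single_image(...)       # hypothetical: one pass of infer.py

torch.cuda.synchronize()              # wait for the GPU to finish
print(f"single-image inference: {time.perf_counter() - start:.2f}s")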

About the version of pymeshlab

During inference, I first installed the current pymeshlab release (2023.12) and encountered:
AttributeError: 'pymeshlab.pmeshlab.MeshSet' object has no attribute 'laplacian_smooth'

Then I downgraded pymeshlab to the 2022 release and finished the inference successfully.
The new pymeshlab version seems to be incompatible with the code.
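A minimal sketch of the workaround, assuming the failure comes from the filter renaming in newer pymeshlab releases; the pinned version and the renamed filter are my best guesses, not confirmed by the repo:

# pip install "pymeshlab==2022.2.post4"   # release the repo appears to expect
import pymeshlab

ms = pymeshlab.MeshSet()
ms.load_new_mesh("recon.obj")             # hypothetical mesh path
try:
    ms.laplacian_smooth()                 # filter name used by the repo's code
except AttributeError:
    # newer releases renamed the filter; this is the 2022.2+ equivalent
    ms.apply_coord_laplacian_smoothing()
ms.save_current_mesh("recon_smooth.obj")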

Strange surfaces in inference results

Thanks for your great work!
I encountered some problems during inference.
Would you please help me?
My inference results have strange surfaces, just as in #7.

I noticed that an ERROR occurred, although it didn't stop the inference:

Resume MLP weights from ./data/ckpt/GTA.ckpt
Resume normal model from ./data/ckpt/normal.ckpt
Using pixie as HPS Estimator

Dataset Size: 5
  0%|          | 0/5 [00:00<?, ?it/s]
2024-03-02 16:02:28.809516226 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:515 CreateExecutionProviderInstance] Failed to create TensorrtExecutionProvider. Please reference https://onnxruntime.ai/docs/execution-providers/TensorRT-ExecutionProvider.html#requirements to ensure all dependencies are met.
1eca7a73c3c61d9debde493de37c7d99:   0%|          | 0/5 [00:06<?, ?it/s]
Body Fitting --- normal: 0.089 | silhouette: 0.043 | Total: 0.132:  12%|█████▎    | 12/100 [00:01<00:13, 6.32it/s]
1eca7a73c3c61d9debde493de37c7d99:   0%|          | 0/5 [00:08<?, ?it/s]

Is it normal that this error occurred during inference?

I tried changing the versions of onnxruntime-gpu and TensorRT, but it didn't work.

My environment is:
CUDA 11.7
pytorch 1.13.1
onnxruntime-gpu 1.14
TensorRT 8.5.3.1
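For reference, that message is only a provider-creation warning: onnxruntime could not build the TensorRT execution provider and falls back to the next provider in the list, so inference still runs. If the goal is just to avoid the attempt, a minimal sketch is to request only the providers actually installed (the model path here is hypothetical):

import onnxruntime as ort

# Ask only for providers that exist in this environment, so onnxruntime
# never tries (and fails) to create the TensorRT provider.
sess = ort.InferenceSession(
    "model.onnx",                                       # hypothetical path
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
print(sess.get_providers())                             # providers actually created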

There is no .npy file

Thanks for sharing this work.
When I tried to run infer.py, I found that the expected .npy files and directories are missing in a certain place.

Questions about numbers and evaluation

Hello, I have two questions regarding the test results in your two papers, GTA and SIFU.

  1. I have seen Issue #5 and your explanation in it, but I still don't understand why your GTA numbers for THuman 2.0 are different.
     In the GTA paper they are Chamfer 0.814, P2S 0.862, Normal 0.055; in SIFU they are 0.73, 0.72, 0.04.

  2. I noticed that in the evaluation code of both of your papers, you use the GT front and back normals. This differs from ICON's evaluation protocol, which uses estimated normals. (YuliangXiu/ICON#183)
     If the estimated normals are used, your GTA numbers for THuman 2.0 become 1.12, 1.12, 0.065.

Could you please clarify these two points? Thank you!
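For reference, a minimal sketch of how Chamfer and P2S are commonly computed from points sampled on the two surfaces; this is the generic metric definition, not necessarily the exact evaluation code of either paper:

import torch

def chamfer_and_p2s(pred_pts: torch.Tensor, gt_pts: torch.Tensor):
    # pred_pts: (N, 3) points sampled on the reconstruction,
    # gt_pts:   (M, 3) points sampled on the ground-truth scan
    d = torch.cdist(pred_pts, gt_pts)     # (N, M) pairwise distances
    p2s = d.min(dim=1).values.mean()      # reconstruction -> scan
    s2p = d.min(dim=0).values.mean()      # scan -> reconstruction
    chamfer = 0.5 * (p2s + s2p)           # symmetric average
    return chamfer.item(), p2s.item()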

Ask for training code

Excellent work! I would like to ask whether the training code is included in the repository.

About PSNR

Amazing work! Can you provide code for calculating PSNR, or tell me where to find the relevant code?
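In case it helps while waiting for an answer, here is a minimal PSNR sketch under the usual definition, 10·log10(MAX²/MSE); the repo's own evaluation may differ:

import torch

def psnr(pred: torch.Tensor, target: torch.Tensor, max_val: float = 1.0) -> float:
    # peak signal-to-noise ratio between two images with values in [0, max_val]
    mse = torch.mean((pred - target) ** 2)
    return (10.0 * torch.log10(max_val ** 2 / mse)).item()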

Expecting a demo

Hi, River-Zhang,
I'm studying papers on human body reconstruction and have read your paper. It is very nice work! May I ask when you will release the open-source code? Looking forward to your demo~

THuman 2.0 evaluation protocol

Hi authors, I have a question regarding the THuman 2.0 evaluation protocol in your Table 1.

  • How do you create the train/test split?
  • For the test set, how many views do you render per subject, and what is the FOV?

Thank you in advance!

About HGPIFu

When estimating the human body geometry, the query operation is performed in HGPIFuNet.
The first step is to project the sampled point set onto the image plane, but I found that the transforms parameter is None.
So in xyz = self.projection(points, calibs, transforms), the points are only rotated and translated.
Are all the points in the world coordinate system? The projection operation only converts points from the world coordinate system to the camera coordinate system by rotation and translation, and does not project them further onto the image plane. Please give me some help.
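For context, here is a sketch of the PIFu-style orthogonal projection that this kind of query code typically relies on, assuming GTA follows PIFu's convention: with an orthographic calibration matrix, the rotation plus translation already lands the points in normalized image coordinates, so no further perspective division is needed, and transforms is only an optional extra 2D crop/scale:

import torch

def orthogonal(points, calibrations, transforms=None):
    # points: (B, 3, N) query points; calibrations: (B, 4, 4) orthographic
    # calibration matrices; transforms: optional (B, 2, 3) 2D affine.
    rot = calibrations[:, :3, :3]          # (B, 3, 3)
    trans = calibrations[:, :3, 3:4]       # (B, 3, 1)
    # rot @ points + trans: with an orthographic calib, xy are already
    # normalized image coordinates and z is the sampling depth.
    pts = torch.baddbmm(trans, rot, points)            # (B, 3, N)
    if transforms is not None:
        scale = transforms[:, :2, :2]
        shift = transforms[:, :2, 2:3]
        pts[:, :2, :] = torch.baddbmm(shift, scale, pts[:, :2, :])
    return pts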

About the SMPL-X model

In which part of your code do you use a PIXIE-like model to estimate the SMPL-X parameters? I have read your code, and it seems that during training you use the SMPL-X parameters provided with the THuman2.0 dataset as the prior-enhanced query; only at inference time, since the input is not an image from the dataset, is the PIXIE model used to predict the SMPL-X parameters as the prior-enhanced query. Is my understanding correct?
