
Comments (7)

wang-zidu avatar wang-zidu commented on August 27, 2024 3

I tried various 4x4 matrices from these values (focal=1015, znear=5, zfar=15), but none of them match the same result as your code.

reduce unnecessary calculations

for the processor it's nanoseconds, for the programmer and those who will use the repo it's a headache.

I believe you can directly refer to /util/nv_diffrast.py to get the result you want. The camera parameters have also been verified by using PyTorch3D. Let me explain further:

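Starting from line 442 of model/recon.py, assume the homogeneous 3D point v = (x, y, z, 1) is one of the points in v3d. First, we calculate the perspective projection matrix, where tan(fov/2) = 112/1015 follows from the focal length of 1015 and half of the 224-pixel image size: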

$$ \mathrm{Projection\ Matrix} = \begin{bmatrix} \frac{1}{\tan(fov/2)} & 0 & 0 & 0 \\ 0 & \frac{1}{\tan(fov/2)} & 0 & 0 \\ 0 & 0 & \frac{znear + zfar}{znear - zfar} & \frac{2 \cdot znear \cdot zfar}{znear - zfar} \\ 0 & 0 & -1 & 0 \end{bmatrix} = \begin{bmatrix} \frac{1015}{112} & 0 & 0 & 0 \\ 0 & \frac{1015}{112} & 0 & 0 \\ 0 & 0 & -2 & -15 \\ 0 & 0 & -1 & 0 \end{bmatrix} $$

Invert the z direction of v to correspond to the camera's coordinate system. (In /util/nv_diffrast.py, you might also find that the y direction of v is inverted as well, which is done to adapt to the renderer.) So we get (x, y, -z, 1), and then perform the perspective projection to obtain:

$$ v' = \begin{bmatrix} {\frac{{1015}}{{112}}} & 0 & 0 & 0 \\ 0 & {\frac{{1015}}{{112}}} & 0 & 0 \\ 0 & 0 & -2 & -15 \\ 0 & 0 & -1 & 0 \end{bmatrix} \cdot \begin{bmatrix} x\\ y\\ - z\\ 1 \end{bmatrix}=\begin{bmatrix} {\frac{{1015}}{{112}}}x\\ {\frac{{1015}}{{112}}}y\\ 2z -15\\ z \end{bmatrix} $$

Homogenize the coordinates by dividing by the fourth component, z:

$$v'' =\begin{bmatrix} {\frac{{1015x}}{{112z}}}\\ {\frac{{1015y}}{{112z}}}\\ {2 - \frac{{15}}{z}}\\ 1 \end{bmatrix} $$

Finally, the coordinates in NDC space are converted to the image plane:

$$ v_{image} = \begin{bmatrix} \frac{v''_x + 1}{2} \cdot 224 \\ \frac{v''_y + 1}{2} \cdot 224 \end{bmatrix} = \begin{bmatrix} 1015\frac{x}{z} + 112 \\ 1015\frac{y}{z} + 112 \end{bmatrix} $$

You will find that this result is consistent with the result obtained using self.persc_proj and homogenizing in model/recon.py.
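The derivation above can be checked numerically. The following is a minimal NumPy sketch, not the repository's code: the function names `projection_matrix`, `project_4x4` and `project_closed_form` are hypothetical, and the input points are assumed to already be in camera space with z > 0.

```python
import numpy as np

FOCAL, ZNEAR, ZFAR, SIZE = 1015.0, 5.0, 15.0, 224


def projection_matrix(focal=FOCAL, znear=ZNEAR, zfar=ZFAR, size=SIZE):
    """OpenGL-style 4x4 perspective matrix for a square image."""
    s = focal / (size / 2.0)  # equals 1 / tan(fov / 2)
    return np.array([
        [s, 0.0, 0.0, 0.0],
        [0.0, s, 0.0, 0.0],
        [0.0, 0.0, (znear + zfar) / (znear - zfar),
         2.0 * znear * zfar / (znear - zfar)],
        [0.0, 0.0, -1.0, 0.0],
    ])


def project_4x4(v, size=SIZE):
    """Map camera-space points (N, 3) to pixel coordinates (N, 2)."""
    v = np.asarray(v, dtype=np.float64)
    # Flip z and append w = 1, giving (x, y, -z, 1) as in the derivation.
    v_h = np.concatenate(
        [v[:, :2], -v[:, 2:3], np.ones((len(v), 1))], axis=1)
    clip = v_h @ projection_matrix(size=size).T
    ndc = clip[:, :3] / clip[:, 3:4]        # perspective divide
    return (ndc[:, :2] + 1.0) / 2.0 * size  # NDC [-1, 1] -> pixels


def project_closed_form(v, focal=FOCAL, size=SIZE):
    """Closed form from the last equation: focal * (x, y) / z + size / 2."""
    v = np.asarray(v, dtype=np.float64)
    return focal * v[:, :2] / v[:, 2:3] + size / 2.0


points = np.array([[0.0, 0.0, 10.0], [0.5, -0.3, 9.0], [-1.0, 0.8, 11.5]])
assert np.allclose(project_4x4(points), project_closed_form(points))
```

If the two results differ in your own setup, a sign flip on y or z before the matrix multiplication is the first thing to check.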

I believe the above process is detailed enough and hope it helps you. Your incorrect results could be caused by axis inversions (such as the common y-flip in images), which arise because different rendering methods define their coordinate systems differently. Usually, you just need to visualize the results and check which steps need an inversion.

from 3ddfa-v3.

wang-zidu avatar wang-zidu commented on August 27, 2024

Thank you for your support, this issue is valuable.

The parameters of the perspective projection camera we use are: focal=1015, znear=5, zfar=15. The camera is located at (0, 0, 10) and faces the negative direction of the z-axis. The rendered image size is 224×224.
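For anyone who needs a field of view rather than a focal length in pixels, here is a small sketch of the conversion. It assumes a pinhole camera with the principal point at the image center; the helper name `fov_from_focal` is hypothetical and not part of the repository.

```python
import math

FOCAL = 1015.0    # focal length in pixels
IMAGE_SIZE = 224  # rendered image is IMAGE_SIZE x IMAGE_SIZE


def fov_from_focal(focal, image_size):
    """Full field of view in radians for a centered pinhole camera."""
    return 2.0 * math.atan((image_size / 2.0) / focal)


fov = fov_from_focal(FOCAL, IMAGE_SIZE)
# Scale factor used on the diagonal of a perspective projection matrix.
scale = 1.0 / math.tan(fov / 2.0)

print(f"fov = {math.degrees(fov):.2f} degrees, 1/tan(fov/2) = {scale:.4f}")
```

With these values the field of view comes out at roughly 12.59 degrees, and `scale` equals 1015/112.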

I completely agree with what you said about the 4x4 projection matrix being fundamental to some rendering calculations. I believe you can find the process you mentioned in /util/nv_diffrast.py, which is the calculation method for transforming homogeneous 3D points (x, y, z, 1).

However, in model/recon.py, the purpose of self.persc_proj is merely to obtain the x and y coordinates without involving the rendering process, so z is not needed (otherwise the calculation would be redundant). You can verify that the v2d obtained using self.persc_proj is consistent with the first two dimensions of the screen coordinates obtained by transforming vertex_ndc using /util/nv_diffrast.py. In short, self.persc_proj is just a way to slightly reduce unnecessary calculations; a similar approach is common in HRN, Deep3D, etc. Hope this helps.
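As a rough illustration of this shortcut, the sketch below projects camera-space points with a 3x3 pinhole intrinsic matrix and keeps only x and y. The matrix `K` and the function `project_xy` are hypothetical stand-ins; the actual layout of self.persc_proj in the repository may differ (for example, it may be stored transposed).

```python
import numpy as np

FOCAL, CENTER = 1015.0, 112.0

# Hypothetical 3x3 pinhole intrinsic matrix (assumed layout, not the repo's).
K = np.array([
    [FOCAL, 0.0, CENTER],
    [0.0, FOCAL, CENTER],
    [0.0, 0.0, 1.0],
])


def project_xy(v3d, intrinsics=K):
    """Project camera-space points (N, 3) to pixel coordinates (N, 2)."""
    v3d = np.asarray(v3d, dtype=np.float64)
    proj = v3d @ intrinsics.T          # rows: (f*x + c*z, f*y + c*z, z)
    return proj[:, :2] / proj[:, 2:3]  # divide by z; depth is discarded


v2d = project_xy([[0.5, -0.3, 9.0]])
print(v2d)
```

No near or far plane appears here, which is why this route is cheaper when only 2D positions are needed.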


iperov avatar iperov commented on August 27, 2024

I tried various 4x4 matrices from these values (focal=1015, znear=5, zfar=15), but none of them match the same result as your code.

reduce unnecessary calculations

for the processor it's nanoseconds, for the programmer and those who will use the repo it's a headache.


ElliotQi avatar ElliotQi commented on August 27, 2024

@wang-zidu Hi, thanks for your excellent work. I wonder if this work supports CPU inference. I'm trying to use cpu device but get an error with nvdiffrast (RasterizeCudaContext could not use cpu device)


wang-zidu avatar wang-zidu commented on August 27, 2024

@wang-zidu Hi, thanks for your excellent work. I wonder if this work supports CPU inference. I'm trying to use cpu device but get an error with nvdiffrast (RasterizeCudaContext could not use cpu device)

Thank you for your feedback. nvdiffrast can be replaced with a simpler renderer such as face3d. You can try replacing it, or if you only need the mesh results, you can simply remove the corresponding nvdiffrast content. If I have time, I will address this request as soon as possible and let you know.


wang-zidu avatar wang-zidu commented on August 27, 2024

@wang-zidu Hi, thanks for your excellent work. I wonder if this work supports CPU inference. I'm trying to use cpu device but get an error with nvdiffrast (RasterizeCudaContext could not use cpu device)

Thank you for your support and patience. We have added a new fast CPU renderer (based on face3d). The entire pipeline now supports CPU inference (using RetinaFace for the face box).


emlcpfx avatar emlcpfx commented on August 27, 2024

I tried various 4x4 matrices from these values (focal=1015, znear=5, zfar=15), but none of them match the same result as your code.

reduce unnecessary calculations

for the processor it's nanoseconds, for the programmer and those who will use the repo it's a headache.

Did you figure out how to ‘export’ the camera in a way that can be imported to a 3D DCC?

