
Comments (15)

nileshkulkarni commented on June 20, 2024

Hi,
Thank you for your interest in our work. Here are the answers to your questions.

  1. We parameterize the rigid transform for every part as an axis-angle transformation. The axis of each part is a parameter of the network and is not predicted per image. We predict a rotation angle (theta) corresponding to every part, so you can interpret the rigid transform as an axis (x0, y0, z0) plus an angle. Here are more details:
    We can convert this axis-angle representation to a rotation matrix; let's call it R_{0} for part 0.

  2. Given R_{0} as the rotation matrix for the zeroth part, we can transform all of its vertices by computing p_new = R_{0} * p_old. See #1 (comment) for more details on how to blend vertices to get the final mesh.

  3. We learn the angle by using a ResNet to encode the image and predicting the angle corresponding to each part of the mesh.
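Points 1 and 2 above can be sketched numerically. This is a minimal numpy illustration using Rodrigues' formula, not the actual acsm code:

```python
import numpy as np

def axang_to_rotmat(axis, theta):
    """Rodrigues' formula: rotation matrix from a unit axis and an angle."""
    axis = axis / np.linalg.norm(axis)
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])  # cross-product (skew) matrix
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

# Rotating p_old = (1, 0, 0) by 90 degrees about the y-axis gives (0, 0, -1).
R0 = axang_to_rotmat(np.array([0.0, 1.0, 0.0]), np.pi / 2)
p_new = R0 @ np.array([1.0, 0.0, 0.0])
```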

I hope this helps!

Best,
Nilesh

from acsm.

anslt commented on June 20, 2024

Thanks for your reply; your answer made the model much easier to understand!

So the axis is fixed for each category, and we need to find the axis before training.
Is my understanding correct?

nileshkulkarni commented on June 20, 2024

Yes, it is a learned parameter in the network: we initialize it to be the y-axis, and then it gets learned from there.

anslt commented on June 20, 2024

Hello Nilesh,
I would like to ask further about the translation in the part transform.
How is the translation applied in the part transformation?
Is a different translation applied to each part, or is one translation applied to the whole object?
If we apply a translation to each part, could it make the part leave the main body?

Furthermore, you suggest applying regularization losses (entropy) on the transformation.
Thus, we need a probability for the 8 different transformations. How do we obtain this probability?

Thank you for your reply.

nileshkulkarni commented on June 20, 2024

There is a translation prediction for every part. You can consider the transformation (R, T), which is applied as
new_verts = R * verts + T
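For example, in plain numpy (with an arbitrary R and T, not values from the model), applying (R, T) to a part's vertices looks like:

```python
import numpy as np

R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])   # 90-degree rotation about the z-axis
T = np.array([0.1, 0.0, 0.0])
verts = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])  # one row per vertex

# Row-vector form of new_verts = R * verts + T.
new_verts = verts @ R.T + T
```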

The entropy regularization is applied by predicting a probability associated with each camera pose prediction. You can refer to this for more details: https://github.com/nileshkulkarni/csm/blob/848fa12039551de6c7ba796685f568a8bba65ab2/csm/nnutils/icn_net.py#L206-L230
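As a hedged sketch of the probability computation (plain numpy, not the linked implementation; the sign and weight of the entropy term should be taken from the linked code), the network scores each pose hypothesis and a softmax turns the scores into the required probabilities:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))  # subtract max for stability
    return e / e.sum(axis=-1, keepdims=True)

def pose_entropy(scores):
    """Entropy of the distribution over camera-pose hypotheses."""
    p = softmax(scores)
    return -(p * np.log(p + 1e-12)).sum(axis=-1)

# Uniform scores over 8 hypotheses give the maximal entropy, log(8);
# a strongly peaked distribution gives entropy near zero.
uniform = pose_entropy(np.zeros((1, 8)))
peaked = pose_entropy(np.array([[100.0, 0, 0, 0, 0, 0, 0, 0]]))
```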

Best,
Nilesh

anslt commented on June 20, 2024

Do we use a ResNet to learn the translation, or does it act as a learned bias, like the axis?
If there are problems learning the translation, could one of the parts, say, leave the body?

nileshkulkarni commented on June 20, 2024

We use a ResNet to learn the translation, as every image might require a different one. It doesn't happen that the parts leave the body: a) the mask doesn't allow things to move in an arbitrary manner, and b) we also have an L2 regularization loss on the predicted translation, which prevents it from predicting high values that might make the shape look unreasonable.

anslt commented on June 20, 2024

Thanks for your reply.
So, in my understanding, in the articulation part you use a ResNet to learn the translation and the rotation angle?
If there are multiple cameras, do you also need a probability for each articulation?

nileshkulkarni commented on June 20, 2024

The articulation prediction is a single prediction; there are no multiple predictions for it in our model. So if "horse" has 8 parts, we will predict 8 angles and 8 translations.

Camera pose prediction, in contrast, is done by predicting multiple cameras, and hence this prediction has the probability term, as in the Canonical Surface Mapping paper.
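As a shape sketch only (the feature size and the linear head here are hypothetical, not taken from the acsm code), predicting 8 angles and 8 translations from a single image feature could look like:

```python
import numpy as np

num_parts, nz_feat = 8, 256
rng = np.random.default_rng(0)

feat = rng.standard_normal(nz_feat)                 # ResNet image feature
W = rng.standard_normal((num_parts * 4, nz_feat)) * 0.01  # linear head weights

out = (W @ feat).reshape(num_parts, 4)              # one row per part
angles = out[:, 0]                                  # 8 rotation angles
translations = out[:, 1:]                           # 8 translations (x, y, z)
```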

anslt commented on June 20, 2024

Thanks for your reply. Do you use the same ResNet for the part transformation and the camera prediction?
Also, since the axis (x0, y0, z0) must have unit norm, how do you optimize the axis under this constraint?

nileshkulkarni commented on June 20, 2024

Yes, we use the same ResNet for the transformation and the camera prediction.

This is the module that predicts the rotation as an axis-angle prediction. Note that it renormalizes the learned axis to unit length on every forward pass, which is how the unit-norm constraint is handled.


import torch
import torch.nn as nn


class QuatPredictorSingleAxis(nn.Module):
    def __init__(self, nz_feat, nz_rot=2):
        super(QuatPredictorSingleAxis, self).__init__()
        # Predicts a 2-vector per image; normalized, it encodes (cos theta, sin theta).
        self.pred_layer = nn.Linear(nz_feat, nz_rot)
        # The rotation axis is a learned parameter shared across images.
        self.axis = nn.Parameter(torch.FloatTensor([1, 0, 0]))

    def forward(self, feat):
        vec = self.pred_layer.forward(feat)
        vec = torch.nn.functional.normalize(vec)
        angle = torch.atan2(vec[:, 1], vec[:, 0]).unsqueeze(-1)
        # Re-project the learned axis onto the unit sphere after every update.
        self.axis.data = torch.nn.functional.normalize(self.axis.unsqueeze(0)).squeeze(0).data
        axis = self.axis.unsqueeze(0).repeat(len(angle), 1)
        quat = axang2quat(angle, axis)
        return quat


def axang2quat(angle, axis):
    # Quaternion (w, x, y, z) from an angle and a unit axis.
    cangle = torch.cos(angle / 2)
    sangle = torch.sin(angle / 2)
    qw = cangle
    qx = axis[..., None, 0] * sangle
    qy = axis[..., None, 1] * sangle
    qz = axis[..., None, 2] * sangle
    quat = torch.cat([qw, qx, qy, qz], dim=-1)
    return quat
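The (cos, sin) → atan2 construction in the forward pass can be reproduced in plain numpy: the network predicts an unconstrained 2-vector, which is normalized onto the unit circle, and atan2 recovers the angle.

```python
import numpy as np

# A raw 2-vector (e.g. a linear-layer output); any nonzero values work.
vec = np.array([3.0, 3.0])
vec = vec / np.linalg.norm(vec)      # now (cos(theta), sin(theta)) on the unit circle
theta = np.arctan2(vec[1], vec[0])   # recovered angle, here pi / 4
```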

anslt commented on June 20, 2024

Thanks for your prompt answer!

I saw your answer in #1 to my last question, about the rotation center.
Assume we rotate the body of the "horse" about axis = [0, 1, 0] by angle = pi / 2, with no translation.
Then every point except the rotation center of the body part has moved.
Based on this, we next want to rotate the neck. Is the rotation center for the neck changed by the rotation of the body?

nileshkulkarni commented on June 20, 2024

Hi @anslt,
I think I said something incorrect earlier: we have 8 different transform predictors, one corresponding to each camera pose prediction (#2 (comment)). So there are multiple transform predictions from our model.

Answering your other question:
The rotations are applied in a bottom-up fashion: you first apply the neck transformation and then the body transformation, so the rotation center does not change. You apply the rotation for the children first and then for the parent.
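A minimal numpy sketch of this children-first ordering (the part names, centers, and angles here are illustrative, not the acsm implementation): because the child is articulated in the rest pose before the parent transform is applied, every rotation center is used at its rest-pose location.

```python
import numpy as np

def rotmat_y(theta):
    """Rotation matrix about the y-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def rotate_about(points, R, center):
    """Rotate row-vector points about a given center."""
    return (points - center) @ R.T + center

neck_verts = np.array([[0.2, 1.0, 0.0]])   # one neck vertex (rest pose)
neck_center = np.array([0.0, 0.5, 0.0])    # neck joint (rest pose)
body_center = np.zeros(3)                  # body joint (rest pose)

# 1) Child first: articulate the neck about its rest-pose center.
v = rotate_about(neck_verts, rotmat_y(np.pi / 2), neck_center)
# 2) Then the parent: rotate everything about the body's center.
v = rotate_about(v, rotmat_y(np.pi / 2), body_center)
```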

anslt commented on June 20, 2024

Thanks, I have finished this part.
My next question is about the weight of the regularization loss on the translation in the part transformation.

nileshkulkarni commented on June 20, 2024

Hi @anslt,

Not sure what you mean; if you are referring to the lambda corresponding to the translation regularization, then in my case the value was 10.0.

Best,
Nilesh
