
Comments (2)

vietanhdev commented on August 26, 2024

Hello,
Our implementation is a modified version of the original model.
First, for identity_1, we don't know the exact purpose of this branch. Since this architecture is designed for tracking, we guess that this branch predicts whether there is a person in the image. I verified this assumption by running the pre-trained model with the following code:

import tensorflow as tf
import cv2
import numpy as np

# Load the pre-trained 39-keypoint pose landmark model
model = tf.keras.models.load_model('saved_model_full_pose_landmark_39kp')
cap = cv2.VideoCapture(0)

while True:
    ret, origin = cap.read()
    if not ret:
        break
    # Resize to the model's 256x256 input and normalize
    img = cv2.resize(origin, (256, 256))
    img = img.astype(float)
    img = (img - 127) / 255
    img = np.array([img])  # add the batch dimension

    heatmap, classify, regress = model.predict(img)
    confidence = np.reshape(classify, (1,))[0]
    print(confidence)  # higher when a person is present in the frame

cap.release()

For identity_2, as explained here, they have 4 outputs for each keypoint:

x and y: Landmark coordinates normalized to [0.0, 1.0] by the image width and height respectively.
z: Should be discarded as currently the model is not fully trained to predict depth, but this is something on the roadmap.
visibility: A value in [0.0, 1.0] indicating the likelihood of the landmark being visible (present and not occluded) in the image.

That's why their output size is 4 * number_of_keypoints. In the pre-trained model we used to implement this repo, number_of_keypoints = 39, so there are 4 * 39 = 156 outputs. I removed the z dimension from the keypoints, so the shape of our output is 3 * number_of_keypoints.
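The splitting described above can be sketched in NumPy. This is a hypothetical illustration (the variable names and the random stand-in for the model output are assumptions, not the repo's actual code): the flat 4 * 39 = 156 regression vector is reshaped into per-keypoint (x, y, z, visibility) tuples, then the z column is dropped.

```python
import numpy as np

num_keypoints = 39
# Stand-in for the model's flat regression output of size 4 * 39 = 156
regress = np.random.rand(1, 4 * num_keypoints)

# Group into (batch, keypoint, [x, y, z, visibility])
kps = regress.reshape(-1, num_keypoints, 4)

# Keep x, y, visibility; drop the z column as the comment suggests
xy_vis = kps[..., [0, 1, 3]]

print(xy_vis.shape)  # (1, 39, 3), i.e. 3 * number_of_keypoints values
```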

Another difference from the original model is that our heatmap output has the shape (128, 128, number_of_keypoints), while the original model's has the shape (128, 128, 1). We are currently decoding the keypoints from the heatmap output; we plan to revisit this design in the future.
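One common way to decode keypoints from a per-channel heatmap like this is an argmax over each channel. The following is a minimal sketch under that assumption (the source does not specify the repo's exact decoding code; the random stand-in and normalization are illustrative):

```python
import numpy as np

num_keypoints = 39
# Stand-in for a (128, 128, num_keypoints) heatmap output
heatmap = np.random.rand(128, 128, num_keypoints)

coords = []
for k in range(num_keypoints):
    # Index of the peak response in channel k
    flat_idx = np.argmax(heatmap[..., k])
    y, x = np.unravel_index(flat_idx, heatmap.shape[:2])
    coords.append((x / 128.0, y / 128.0))  # normalize to [0, 1]

coords = np.array(coords)
print(coords.shape)  # (39, 2)
```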

from tf-blazepose.

jizhu1023 commented on August 26, 2024

@vietanhdev Thanks for your reply, which addresses my issues well! The other thing I am confused about is why there are 39 keypoints rather than the 33 or 35 mentioned in the paper. By looking into the Mediapipe code, I found that keypoints 34-35 are auxiliary_landmarks used for ROI generation, and keypoints 36-39 are unused. I further visualized the locations of keypoints 36-39 and found they coincide with some keypoints on the hands.

