
Comments (8)

geaxgx commented on July 18, 2024

Hi, lm_input_length x lm_input_length (resp. pd_input_length x pd_input_length) corresponds to the shape of the image used as input by the landmark (resp. pose detection) neural network. So, for instance, the landmark NN is fed with 256x256 images. The raw outputs of the NN are landmark coordinates between 0 and 256, which are then normalized to between 0 and 1 by dividing them by 256.
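For illustration, here is a minimal standalone sketch of that normalization (the variable and array names are mine, not the repo's):

    import numpy as np

    lm_input_length = 256  # the landmark NN is fed 256x256 images

    # Hypothetical raw NN output: landmark (x, y, z) values in the 0..256 range
    raw_landmarks = np.array([[128.0, 192.0, -12.0],
                              [ 64.0,  40.0,   5.0]])

    # Normalize to 0..1 by dividing by the network input size
    norm_landmarks = raw_landmarks / lm_input_length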


bruno-darochac commented on July 18, 2024

Hi, thanks for your answer!

I just want to understand something. You have a parameter to enable or disable the xyz mode, but even without turning on this mode, you get the depth of the landmarks in the image. What I don't understand is that you seem to create the stereo capture in create_pipeline only when the xyz mode is on.

So my questions are: how do you manage to get the depth of each landmark? And for my use case, is it possible to get the raw depth data in your program, or should I create a new pipeline?


geaxgx commented on July 18, 2024

This is what I explain here: https://github.com/geaxgx/depthai_blazepose#inferred-3d-vs-measured-3d

The important point to understand is that the mediapipe landmark model is able to infer 3D landmarks from 2D images (I call these "inferred 3D landmarks"). That may sound like a bit of magic, but I guess Google used synthetic data to train their model. Of course, you can't expect real accuracy from these landmarks. Also, these landmarks are not absolute (the model can't tell you, for instance, that the left shoulder is 5 meters from the camera) but relative to a reference point, the middle point between the hips (the model estimates a delta x,y,z relative to the mid hips).
Now, if you set the xyz mode when running my script, you can, IN ADDITION to the inferred 3D landmarks, get from the raw depth data the real absolute 3D position of this reference point. So you have on one side the body landmarks relative to the reference point, and on the other side the absolute position of the reference point. By combining these two pieces of information, you can get an estimation of the absolute 3D position of each landmark.
This combination is done here:

    if self.show_3d == "mixed":
        if body.xyz_ref:
            """
            Beware, the y value of landmarks_world coordinates is negative for landmarks
            above the mid hips (like shoulders) and positive for landmarks below (like feet).
            The y value of the (x,y,z) coordinates given by the depth sensor is negative in
            the lower part of the image and positive in the upper part.
            """
            translation = body.xyz / 1000  # depth data is in mm; convert to meters
            translation[1] = -translation[1]  # flip y to match the landmark convention
            if body.xyz_ref == "mid_hips":
                points = points + translation
            elif body.xyz_ref == "mid_shoulders":
                # points are relative to the mid hips, so the mean of the two
                # shoulder points is the mid-hips-to-mid-shoulders vector
                mid_hips_to_mid_shoulders = np.mean([
                    points[mpu.KEYPOINT_DICT['right_shoulder']],
                    points[mpu.KEYPOINT_DICT['left_shoulder']]],
                    axis=0)
                points = points + translation - mid_hips_to_mid_shoulders

body.xyz contains the absolute 3D position of the reference point as measured from the raw depth data.
The reference point is the mid hips whenever the mid hips are visible in the image; otherwise we use the middle of the shoulders as the reference point.

So, note that by setting the xyz mode, my script configures the pipeline to get the raw depth data, but I rely on this depth data to measure the position of only one point, the reference point. FYI, the first link above explains why I cannot directly measure the absolute position of every landmark from this depth data.
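To make the combination concrete, here is a minimal standalone sketch of the same computation with made-up numbers (only the mid_hips case, with the mm-to-m conversion and y flip from the snippet above):

    import numpy as np

    # Inferred landmarks, relative to the mid-hips reference point, in meters
    relative_landmarks = np.array([
        [0.00,  0.00, 0.00],   # mid hips (the reference point itself)
        [0.12, -0.48, 0.03],   # a shoulder: above the hips, negative y
        [0.10,  0.85, 0.05],   # a foot: below the hips, positive y
    ])

    # Absolute position of the reference point, measured from the depth
    # data in millimeters (DepthAI spatial coordinates)
    xyz = np.array([150.0, -320.0, 2400.0])

    translation = xyz / 1000          # convert mm to meters
    translation[1] = -translation[1]  # flip y to match the landmark convention

    # Absolute 3D position of every landmark
    absolute_landmarks = relative_landmarks + translation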

I hope I managed to make things a bit clearer :-)


bruno-darochac commented on July 18, 2024

Many thanks! Your explanation was perfectly clear :-D

Your script inspires me to build something more accurate for my use case, and now I understand the workflow of your script better 👍

I have one last question: have you tried to manage the little glitches that the model has? Maybe by thresholding the changes between two frames to avoid those little glitches? And if I want to try something, is it template_manager_script.py that I should look at?


geaxgx commented on July 18, 2024

I am currently on holidays, only reading mail once in a while. What do you mean by "glitch"? If you mean that the drawn landmarks are not superimposed on their corresponding body parts (especially noticeable with the head landmarks), I am afraid it can't be improved. It is due to the process of converting the original tflite model into an OpenVINO float16 model, where precision and accuracy have been lost.


bruno-darochac commented on July 18, 2024

Oh! Enjoy your vacation! And thanks for taking the time to answer me.

What I mean by glitch is that even if the body is static, the landmarks move by 1 or 2 pixels between frames t and t+1, so it looks unstable.


geaxgx commented on July 18, 2024

Thanks :-)
You can try to play with the parameters of the smoothing filter:

self.filter_landmarks = mpu.LandmarksSmoothingFilter(

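If you want to experiment with your thresholding idea outside the repo's filter, here is a minimal sketch of frame-to-frame smoothing with a simple exponential moving average. This is only an illustration of the principle, not the repo's LandmarksSmoothingFilter (which is more sophisticated); all names and values here are hypothetical:

    import numpy as np

    class SimpleLandmarkSmoother:
        """Exponential moving average over a landmark array.
        alpha close to 1 -> light smoothing; close to 0 -> heavy smoothing (more lag)."""
        def __init__(self, alpha=0.5):
            self.alpha = alpha
            self.prev = None

        def apply(self, landmarks):
            # landmarks: (N, 3) numpy array of x, y, z for the current frame
            if self.prev is None:
                self.prev = landmarks.astype(float)
            else:
                self.prev = self.alpha * landmarks + (1 - self.alpha) * self.prev
            return self.prev

    # Usage: smooth each frame's landmarks before drawing them
    smoother = SimpleLandmarkSmoother(alpha=0.4)
    # smoothed = smoother.apply(body.landmarks)  # body.landmarks as in the repo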


bruno-darochac commented on July 18, 2024

I'll take a look at this then! Many thanks for your availability. I think I have all the keys to continue my project ahah

I'll mention you in the acknowledgements of my bachelor's thesis!
Enjoy your vacation :-)

