I am trying to run your code on a live webcam stream instead of a video file, so I adapted demos/video.py into the following script:
# $ python3 demos/video.py --weights runs/DirectMHP/agora_m_1280_e300_t40_lw010/weights/best.pt --data data/agora_coco.yaml --device 0 --conf-thres 0.3 --start 0 --thickness 3
import sys
from pathlib import Path
FILE = Path(__file__).absolute()
sys.path.append(FILE.parents[1].as_posix())
import argparse
import torch
import cv2
import yaml
import imageio
from tqdm import tqdm
import os.path as osp
import numpy as np
from utils.torch_utils import select_device, time_sync
from utils.general import check_img_size, scale_coords, non_max_suppression
from utils.datasets import LoadImages
from utils.plots import plot_3axis_Zaxis
from models.experimental import attempt_load
if __name__ == '__main__':
    parser = argparse.ArgumentParser()

    # video options
    # parser.add_argument('-p', '--video-path', default='', help='path to video file')
    parser.add_argument('--cam', dest='cam_id', type=int, default=0,
                        help='Camera device id to use [0]')
    parser.add_argument('--data', type=str,
                        default='/home/redhwan/2/HPE/DirectMHP/data/agora_coco.yaml'
                        # default='data/agora_coco.yaml'
                        )
    parser.add_argument('--imgsz', type=int, default=1280)
    parser.add_argument('--save-size', type=int, default=1080)
    parser.add_argument('--weights',
                        default='/home/redhwan/2/HPE/DirectMHP/runs/DirectMHP/agora_m_1280_e300_t40_lw010/weights/best.pt')
    parser.add_argument('--device', default=0, help='cuda device, i.e. 0 or cpu')
    parser.add_argument('--conf-thres', type=float, default=0.7, help='confidence threshold')
    parser.add_argument('--iou-thres', type=float, default=0.45, help='NMS IoU threshold')
    parser.add_argument('--scales', type=float, nargs='+', default=[1])
    parser.add_argument('--start', type=int, default=0, help='start time (s)')
    parser.add_argument('--end', type=int, default=-1, help='end time (s), -1 for remainder of video')
    parser.add_argument('--color', type=int, nargs='+', default=[255, 255, 255], help='head bbox color')
    parser.add_argument('--thickness', type=int, default=2, help='thickness of Euler angle lines')
    parser.add_argument('--alpha', type=float, default=0.4, help='head bbox and head pose alpha')
    parser.add_argument('--display', action='store_true', help='display inference results')
    parser.add_argument('--fps-size', type=int, default=1)
    parser.add_argument('--gif', action='store_true', help='create gif')
    parser.add_argument('--gif-size', type=int, nargs='+', default=[480, 270])
    args = parser.parse_args()
    with open(args.data) as f:
        data = yaml.safe_load(f)  # load data dict

    device = select_device(args.device, batch_size=1)
    print('Using device: {}'.format(device))

    model = attempt_load(args.weights, map_location=device)  # load FP32 model
    # stride = int(model.stride.max())  # model stride
    # imgsz = check_img_size(args.imgsz, s=stride)  # check image size
    # if device.type != 'cpu':
    #     # model(torch.zeros(1, 3, imgsz, imgsz).to(device).type_as(next(model.parameters())))  # run once
    #     model(torch.zeros(1, 3, imgsz, imgsz).to(device))  # run once

    # frames = []
    cap = cv2.VideoCapture(0)  # note: args.cam_id is parsed above but the device id is hardcoded here
    while True:
        ret, img = cap.read()
        # if ret:
        #     frames.append(img)
        # dataset = LoadImages(frame, img_size=imgsz, stride=stride, auto=True)
        # video = np.stack(frames, axis=0)  # dimensions (T, H, W, C)
        # dataset = LoadImages(video, img_size=imgsz, stride=stride, auto=True)
        # if device.type != 'cpu':
        #     model(torch.zeros(1, 3, imgsz, imgsz).to(device).type_as(next(model.parameters())))  # run once
        # cap = dataset.cap
        # cap.set(cv2.CAP_PROP_POS_MSEC, args.start * 1000)
        # fps = cap.get(cv2.CAP_PROP_FPS)
        # if args.end == -1:
        #     n = int(cap.get(cv2.CAP_PROP_FRAME_COUNT) - fps * args.start)
        # else:
        #     n = int(fps * (args.end - args.start))
        # h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
        # w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
        # gif_frames = []
        # out_path = '{}_{}'.format(osp.splitext(args.video_path)[0], "DirectMHP")
# print("fps:", fps, "\t total frames:", n, "\t out_path:", out_path)
# write_video = not args.display and not args.gif
# if write_video:
# # writer = cv2.VideoWriter(out_path + '.mp4', cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))
# writer = cv2.VideoWriter(out_path + '.mp4', cv2.VideoWriter_fourcc(*'mp4v'), fps,
# (int(args.save_size * w / h), args.save_size))
# dataset = tqdm(dataset, desc='Running inference', total=n)
# t0 = time_sync()
# for i, (path, img, im0, _) in enumerate(dataset):
print(len(img.shape))
img = torch.from_numpy(img).to(device)
# img = img / 255.0 # 0 - 255 to 0.0 - 1.0
# if len(img.shape) == 3:
# img = img[None] # expand for batch dim
out_ori = model(img, augment=True, scales=args.scales)[0]
# out_ori = model(img, augment=True, scales=args.scales)[1]
# out_ori = model(img, size=321)
out = non_max_suppression(out_ori, args.conf_thres, args.iou_thres, num_angles=data['num_angles'])
        # predictions (Array[N, 9]): x1, y1, x2, y2, conf, class, pitch, yaw, roll
        bboxes = scale_coords(img.shape[2:], out[0][:, :4], im0.shape[:2]).cpu().numpy()  # native-space pred
        scores = out[0][:, 4].cpu().numpy()
        pitchs_yaws_rolls = out[0][:, 6:].cpu().numpy()  # N*3

        im0_copy = im0.copy()
        # draw head bboxes and pose
        for j, [x1, y1, x2, y2] in enumerate(bboxes):
            im0_copy = cv2.rectangle(im0_copy, (int(x1), int(y1)), (int(x2), int(y2)),
                                     args.color, thickness=args.thickness)
            # im0_copy = cv2.putText(im0_copy, str(round(scores[j], 3)), (int(x1), int(y1)),
            #                        cv2.FONT_HERSHEY_PLAIN, 0.7, (255, 255, 255), thickness=2)
            # decode the normalized [0, 1] outputs to degrees: pitch/roll in (-90, 90), yaw in (-180, 180)
            pitch = (pitchs_yaws_rolls[j][0] - 0.5) * 180
            yaw = (pitchs_yaws_rolls[j][1] - 0.5) * 360
            roll = (pitchs_yaws_rolls[j][2] - 0.5) * 180
            im0_copy = plot_3axis_Zaxis(im0_copy, yaw, pitch, roll, tdx=(x1 + x2) / 2, tdy=(y1 + y2) / 2,
                                        size=max(y2 - y1, x2 - x1) * 0.8, thickness=args.thickness)
        im0 = cv2.addWeighted(im0, args.alpha, im0_copy, 1 - args.alpha, gamma=0)
        if i == 0:
            t = time_sync() - t0
        else:
            t = time_sync() - t1

        if not args.gif and args.fps_size:
            cv2.putText(im0, '{:.1f} FPS'.format(1 / t), (5 * args.fps_size, 25 * args.fps_size),
                        cv2.FONT_HERSHEY_SIMPLEX, args.fps_size, (255, 255, 255), thickness=2 * args.fps_size)

        if args.gif:
            gif_img = cv2.cvtColor(cv2.resize(im0, dsize=tuple(args.gif_size)), cv2.COLOR_RGB2BGR)
            if args.fps_size:
                cv2.putText(gif_img, '{:.1f} FPS'.format(1 / t), (5 * args.fps_size, 25 * args.fps_size),
                            cv2.FONT_HERSHEY_SIMPLEX, args.fps_size, (255, 255, 255), thickness=2 * args.fps_size)
            gif_frames.append(gif_img)
        elif write_video:
            im0 = cv2.resize(im0, dsize=(int(args.save_size * w / h), args.save_size))
            writer.write(im0)
        else:
            cv2.imshow('', im0)
            cv2.waitKey(1)

        t1 = time_sync()
        if i == n - 1:
            break
    cv2.destroyAllWindows()
    cap.release()
    if write_video:
        writer.release()
    if args.gif:
        print('Saving GIF...')
        with imageio.get_writer(out_path + '.gif', mode="I", fps=fps) as writer:
            for idx, frame in tqdm(enumerate(gif_frames)):
                writer.append_data(frame)
When I run it, I get this output and error:

/usr/bin/python3.8 /home/redhwan/2/HPE/DirectMHP/demos/demo_Redhwan.py
Using device: cuda:0
3
Traceback (most recent call last):
File "/home/redhwan/2/HPE/DirectMHP/demos/demo_Redhwan.py", line 111, in <module>
out_ori = model(img, augment=True, scales=args.scales)[0]
File "/home/redhwan/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/redhwan/2/HPE/DirectMHP/models/yolo.py", line 131, in forward
return self.forward_augment(x, s=scales, f=flips) # augmented inference, None
File "/home/redhwan/2/HPE/DirectMHP/models/yolo.py", line 142, in forward_augment
yi, train_out_i = self.forward_once(xi) # forward
File "/home/redhwan/2/HPE/DirectMHP/models/yolo.py", line 167, in forward_once
x = m(x) # run
File "/home/redhwan/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/redhwan/2/HPE/DirectMHP/models/common.py", line 206, in forward
return self.conv(torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1))
RuntimeError: Sizes of tensors must match except in dimension 2. Got 1 and 2 (The offending index is 0)
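
The traceback comes from the Focus layer, and it is consistent with the model receiving the raw webcam frame: cap.read() returns an HxWx3 BGR uint8 array (hence the printed shape length of 3), while the model expects a normalized float 1x3xHxW tensor that has been letterboxed to a multiple of the model stride. The original demo gets all of that from LoadImages, which the webcam loop bypasses. Below is a minimal sketch of the missing per-frame preprocessing, assuming the YOLOv5-style letterbox helper this repo inherits (depending on the version it lives in utils/datasets.py or utils/augmentations.py):

    from utils.datasets import letterbox  # adjust the import to wherever letterbox lives in your checkout

    stride = int(model.stride.max())              # re-enable the commented-out stride line
    imgsz = check_img_size(args.imgsz, s=stride)  # make sure imgsz is a stride multiple

    while True:
        ret, im0 = cap.read()                     # im0: HxWx3, BGR, uint8
        if not ret:
            break
        img = letterbox(im0, imgsz, stride=stride, auto=True)[0]  # resize + pad to a stride multiple
        img = img.transpose((2, 0, 1))[::-1]      # HWC BGR -> CHW RGB
        img = np.ascontiguousarray(img)
        img = torch.from_numpy(img).to(device).float() / 255.0    # uint8 0-255 -> float 0.0-1.0
        if img.ndimension() == 3:
            img = img[None]                       # add batch dim -> 1x3xHxW
        out_ori = model(img, augment=True, scales=args.scales)[0]

Keeping im0 as the original frame means the existing scale_coords and drawing code should work unchanged. Note also that the loop still references variables defined only in the commented-out video code (i, n, t0, write_video, gif_frames, out_path, fps, w, h), so the timing, saving, and break logic will need webcam equivalents or can simply be dropped for a live stream.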