ilovepose / fast-human-pose-estimation.pytorch

Official pytorch code for the CVPR 2019 paper "Fast Human Pose Estimation": https://arxiv.org/abs/1811.05419

License: MIT License

Makefile 0.03% Python 33.77% Cuda 63.60% C++ 0.03% Shell 2.58%
human-pose-estimation deep-learning coco-keypoints-detection mpii-dataset mscoco-keypoint fast-pose-distillation knowledge-distillation

fast-human-pose-estimation.pytorch's Introduction

Fast Human Pose Estimation CVPR2019

Introduction

This is an official pytorch implementation of Fast Human Pose Estimation.

In this work, we focus on two problems:

  1. How to reduce the model size and computation using a model-agnostic method.
  2. How to improve the performance of the reduced model.

In our paper

  1. We reduce the model size and computation by reducing the width and depth of a network.
  2. We propose fast pose distillation (FPD) to improve the performance of the reduced model.

The results on the MPII dataset demonstrate the effectiveness of our approach. We re-implemented FPD using the HRNet codebase and provide an additional evaluation on the COCO dataset. Our method (FPD) can also work without ground-truth labels, so it can make use of unlabeled images.

(Figure: Illustrating the architecture of the proposed HRNet.)

For the MPII dataset

  1. We first trained a teacher model (hourglass model, stacks=8, num_features=256, 90.520@MPII PCKh@0.5) and a student model (hourglass model, stacks=4, num_features=128, 87.934@MPII PCKh@0.5).
  2. We then used the teacher model's predictions and the ground-truth labels to co-supervise the student model (hourglass model, stacks=4, num_features=128, 89.040@MPII PCKh@0.5).
  3. Our experiment shows a 1.106% gain from FPD.

For the COCO dataset

  1. We first trained a teacher model (HRNet-W48, input size=256x192, 75.0@COCO-Valid-Set AP) and a student model (HRNet-W32, input size=256x192, 74.4@COCO-Valid-Set AP).
  2. We then used the teacher model's predictions and the ground-truth labels to co-supervise the student model (HRNet-W32, input size=256x192, 75.1@COCO-Valid-Set AP).
  3. Our experiment shows a 0.7% gain from FPD.

If you want to further improve the performance of the student model, you can remove the ground-truth supervision in FPD and instead train on additional unlabeled images; a minimal sketch of this co-supervised loss is shown below.
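
The following is a minimal PyTorch sketch of the FPD co-supervision idea described above. It is illustrative only: the class name, the equal default weighting, and the unlabeled-image branch are assumptions for exposition, not the exact loss module used in this codebase.

import torch.nn as nn

class FPDCoSupervisionLoss(nn.Module):
    # Illustrative sketch of FPD-style co-supervision (not this repo's exact loss).
    # The student is trained against two targets: the ground-truth heatmaps and
    # the heatmaps predicted by a frozen, stronger teacher. alpha balances the
    # two terms; pass gt_heatmaps=None to mimic training on unlabeled images
    # with teacher supervision only.

    def __init__(self, alpha=0.5):
        super().__init__()
        self.alpha = alpha
        self.mse = nn.MSELoss()

    def forward(self, student_heatmaps, teacher_heatmaps, gt_heatmaps=None):
        # Distillation term: match the teacher's (detached) predictions.
        distill_loss = self.mse(student_heatmaps, teacher_heatmaps.detach())
        if gt_heatmaps is None:
            return distill_loss  # unlabeled images: teacher supervision only
        # Labeled images: combine teacher and ground-truth supervision.
        gt_loss = self.mse(student_heatmaps, gt_heatmaps)
        return self.alpha * distill_loss + (1.0 - self.alpha) * gt_loss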

Main Results

Results on MPII val

| Arch | Head | Shoulder | Elbow | Wrist | Hip | Knee | Ankle | Mean | Mean@0.1 |
|------|------|----------|-------|-------|-----|------|-------|------|----------|
| hourglass_teacher | 97.169 | 96.382 | 90.830 | 86.466 | 90.012 | 86.802 | 82.664 | 90.520 | 38.275 |
| hourglass_student | 96.828 | 95.194 | 87.728 | 82.919 | 87.900 | 82.551 | 78.270 | 87.934 | 34.634 |
| hourglass_student_FPD* | 96.385 | 94.905 | 87.847 | 81.875 | 87.225 | 81.906 | 78.955 | 87.598 | 34.359 |
| hourglass_student_FPD | 96.930 | 95.550 | 89.040 | 84.444 | 88.939 | 84.021 | 80.703 | 89.040 | 36.144 |

Note:

  • Flip test is used.
  • Input size is 256x256.
  • hourglass_student_FPD* denotes an FPD student trained without initializing from a pretrained student model.
  • Multi-scale test is not used.
  • Batch size is 4.
  • The PCKh metric implemented in the HRNet codebase for the MPII dataset differs slightly from the one used in our paper.
  • The hourglass model implemented in pytorch performs slightly worse than the torch implementation used in the paper.

Results on COCO val2017, using a person detector with 56.4 AP on the COCO val2017 dataset

| Arch | Input size | #Params | GFLOPs | AP | AP .5 | AP .75 | AP (M) | AP (L) | AR | AR .5 | AR .75 | AR (M) | AR (L) |
|------|------------|---------|--------|----|-------|--------|--------|--------|----|-------|--------|--------|--------|
| pose_hrnet_w48_teacher | 256x192 | 63.6M | 14.6 | 0.750 | 0.906 | 0.824 | 0.713 | 0.819 | 0.803 | 0.941 | 0.867 | 0.760 | 0.866 |
| pose_hrnet_w32_student | 256x192 | 28.5M | 7.1 | 0.744 | 0.905 | 0.819 | 0.708 | 0.810 | 0.798 | 0.942 | 0.865 | 0.757 | 0.858 |
| pose_hrnet_w32_student_FPD | 256x192 | 28.5M | 7.1 | 0.751 | 0.906 | 0.823 | 0.714 | 0.820 | 0.804 | 0.943 | 0.869 | 0.762 | 0.865 |


Development environment

The code was developed with Python 3.5 on Ubuntu 16.04 and requires NVIDIA GPUs; it was developed and tested with 4 TITAN XP GPU cards. Other platforms and GPU cards are not fully tested.

Quick start

1. Preparation

1.1 Prepare the dataset

For the MPII dataset, the original annotation files are in MATLAB format. We have converted them into JSON format; you also need to download the converted files from OneDrive or GoogleDrive. Extract them under {POSE_ROOT}/data, and your directory tree should look like this:

${POSE_ROOT}/data/mpii
├── annot
│   ├── gt_valid.mat
│   ├── test.json
│   ├── train.json
│   ├── trainval.json
│   └── valid.json
├── images
│   ├── 000001163.jpg
│   └── 000003072.jpg
└── mpii_human_pose_v1_u12_1.mat
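
If you want to verify the converted annotations, the sketch below loads one entry of the validation split. The field names (image, center, scale, joints, joints_vis) follow the common HRNet-style MPII JSON format; treat them as an assumption and check them against the downloaded files.

import json

# Inspect one sample of the converted MPII annotations (paths relative to ${POSE_ROOT}).
# Field names are assumed to follow the HRNet-style format.
with open("data/mpii/annot/valid.json") as f:
    annotations = json.load(f)

sample = annotations[0]
print(sample["image"])        # image file name, e.g. an entry under images/
print(sample["center"])       # person center in the original image
print(sample["scale"])        # person scale (relative to a 200 px body height)
print(len(sample["joints"]))  # 16 MPII keypoints as (x, y) coordinates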

For the COCO dataset, your directory tree should look like this:

${POSE_ROOT}/data/coco
├── annotations
├── images
│   ├── test2017
│   ├── train2017
│   └── val2017
└── person_detection_results
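
As a quick sanity check of the COCO layout, the sketch below uses the standard pycocotools API; the file name person_keypoints_val2017.json is the standard COCO release name and is assumed to live under the annotations directory shown above.

from pycocotools.coco import COCO

# Load the val2017 keypoint annotations (paths relative to ${POSE_ROOT}).
coco = COCO("data/coco/annotations/person_keypoints_val2017.json")

# Count images that contain at least one person.
person_cat = coco.getCatIds(catNms=["person"])
img_ids = coco.getImgIds(catIds=person_cat)
print(len(img_ids), "val2017 images contain people")

# Look at the keypoints of the first annotation of the first image.
ann_ids = coco.getAnnIds(imgIds=img_ids[0], catIds=person_cat, iscrowd=None)
anns = coco.loadAnns(ann_ids)
print(anns[0]["keypoints"][:6])  # first two keypoints as (x, y, visibility) triples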

1.2 Prepare the pretrained models

Your directory tree should look like this:

$HOME/models
├── pytorch
│   ├── imagenet
│   │   ├── hrnet_w32-36af842e.pth
│   │   ├── hrnet_w48-8ef0771d.pth
│   │   └── resnet50-19c8e357.pth
│   ├── pose_coco
│   │   ├── pose_hrnet_w32_256x192.pth
│   │   └── pose_hrnet_w48_256x192.pth
│   └── pose_mpii
│       ├── bs4_hourglass_128_4_1_16_0.00025_0_140_87.934_model_best.pth
│       ├── bs4_hourglass_256_8_1_16_0.00025_0_140_90.520_model_best.pth
│       ├── pose_hrnet_w32_256x256.pth
│       └── pose_hrnet_w48_256x256.pth
└── student_FPD
    ├── hourglass_student_FPD*.pth
    ├── hourglass_student_FPD.pth
    └── pose_hrnet_w32_student_FPD.pth

1.3 Prepare the environment

Set the parameters in the file prepare_env.sh as follows:

# DATASET_ROOT=$HOME/datasets
# COCO_ROOT=${DATASET_ROOT}/MSCOCO
# MPII_ROOT=${DATASET_ROOT}/MPII
# MODELS_ROOT=${DATASET_ROOT}/models

Then execute:

bash prepare_env.sh

If you prefer, you can prepare the environment step by step.

2. How to train the model

2.1 Download the pretrained models and place them as described in section 1.2

For MPII dataset: [GoogleDrive] [BaiduDrive]

  1. hourglass student model

  2. hourglass teacher model

For COCO dataset: [GoogleDrive] [BaiduDrive]

  1. HRNet-W32 student model

  2. HRNet-W48 teacher model

2.2 Start training

# COCO dataset training
cd scripts/fpd_coco
bash run_train_hrnet.sh

# MPII dataset training
cd scripts/fpd_mpii
bash run_train_hrnet.sh # using hrnet model
bash run_train_hg.sh # using hourglass model

# General training methods, we also provide script shell
cd scripts/mpii
bash run_train_hrnet.sh # using hrnet model
bash run_train_hg.sh # using hourglass model
bash run_train_resnet.sh # using resnet model
cd scripts/coco
bash run_train_hrnet.sh # using hrnet model
bash run_train_hg.sh # using hourglass model
bash run_train_resnet.sh # using resnet model

3. How to test the model

3.1 Download the trained student models and place them as described in section 1.2

[GoogleDrive] [BaiduDrive]

For MPII dataset:

hourglass student FPD model

For COCO dataset:

HRNet-W32 student FPD model

3.2 FPD training results and logs

[GoogleDrive] [BaiduDrive]

Note:

  • coco_hrnet_w48_fpd_w32_256x256: pose_hrnet_w32_student_FPD model training results.

  • mpii_hourglass_8_256_fpd_hg_4_128_not_pretrained: hourglass_student_FPD* model training results.

  • mpii_hourglass_8_256_fpd_hg_4_128_pretrained: hourglass_student_FPD model training results.

Citation

If you use our code or models in your research, please cite with:

@InProceedings{Zhang_2019_CVPR,
  author    = {Zhang, Feng and Zhu, Xiatian and Ye, Mao},
  title     = {Fast Human Pose Estimation},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2019}
}

Discussion forum

ILovePose

Unofficial implementations

Fast_Human_Pose_Estimation_Pytorch

Acknowledgement

Thanks to the open-source HRNet.

fast-human-pose-estimation.pytorch's People

Contributors

djangogo, huandrew, xizero00


fast-human-pose-estimation.pytorch's Issues

Deployment Cost Computation

Thanks for open-sourcing your work. It helps me a lot.
I checked the deployment cost reported in your paper, but you say little about how to compute it. Would you mind telling me the computation method? @HuAndrew

Inference time?

Hi, and thank you for making this code available.
Is this code suitable for real-time use?
What FPS am I likely to get with a single GTX 1080 GPU?

Thanks!

About target_weight

if self.use_target_weight:
    loss += 0.5 * self.criterion(
        heatmap_pred.mul(target_weight[:, idx]),
        heatmap_gt.mul(target_weight[:, idx])
    )
else:
    loss += 0.5 * self.criterion(heatmap_pred, heatmap_gt)

Q1: Is target_weight introduced to keep a balance between teacher and student?
Q2: In loss += 0.5 * self.criterion(...), why is the loss multiplied by 0.5?

imm in pytorch

I saw in issue #8 of the thomasjakab/imm repository that you reproduced that work in pytorch.
Do you plan to release this imm pytorch implementation? Or can you provide the calculation of the perceptual loss (corresponding to the code you attached in your question)?

Thanks for your help, and congratulations on "fast-human-pose".

Training HRNet-w32 student model from scratch

Thank you for releasing the code.
I am trying to reproduce the result (training HRNet w32 student network on MSCOCO dataset).
It seems that the student network is already pretrained on MSCOCO with 74.4 mAP.

I wonder if the student network can be trained from scratch with FPD setting (pretrained on imagenet / rather than pre-trained on MSCOCO).

Have you tried training the student network from scratch?

RuntimeError: Only tensors and (possibly nested) tuples of tensors are supported as inputs or outputs of traced functions

Python: 3.7.6
CUDA Version 10.2.89
pytorch 1.0.0 py3.7_cuda10.0.130_cudnn7.4.1_1 [cuda100] pytorch
opencv 3.4.2 py37h6fd60c2_1
torchvision 0.2.1 py_2 pytorch

I am trying to train with bash script/mpii/run_train_hg.sh, but I immediately get the following errors. It may or may not have something to do with https://github.com/pytorch/pytorch/issues/24904.

......
RESUME: False
  SHUFFLE: True
  WD: 0.0001
WORKERS: 24
/home/panicpanda/miniconda3/lib/python3.7/site-packages/torch/nn/modules/upsampling.py:129: UserWarning: nn.Upsample is deprecated. Use nn.functional.interpolate instead.
  warnings.warn("nn.{} is deprecated. Use nn.functional.interpolate instead.".format(self.name))
Only tensors and (possibly nested) tuples of tensors are supported as inputs or outputs of traced functions (toIValue at /opt/conda/conda-bld/pytorch_1544202130060/work/torch/csrc/jit/pybind_utils.h:91)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x45 (0x7f46a88c0cc5 in /home/panicpanda/miniconda3/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x343be6 (0x7f46e9278be6 in /home/panicpanda/miniconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #2: <unknown function> + 0x3439eb (0x7f46e92789eb in /home/panicpanda/miniconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #3: <unknown function> + 0x343dca (0x7f46e9278dca in /home/panicpanda/miniconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #4: <unknown function> + 0x39a18c (0x7f46e92cf18c in /home/panicpanda/miniconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #5: <unknown function> + 0x3a6325 (0x7f46e92db325 in /home/panicpanda/miniconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #6: <unknown function> + 0x112176 (0x7f46e9047176 in /home/panicpanda/miniconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame #29: __libc_start_main + 0xe7 (0x7f46f9fb6b97 in /lib/x86_64-linux-gnu/libc.so.6)

Error occurs, No graph saved
Traceback (most recent call last):
  File "tools/train.py", line 255, in <module>
    main()
  File "tools/train.py", line 143, in main
    writer_dict['writer'].add_graph(model, (dump_input, ))
  File "/home/panicpanda/miniconda3/lib/python3.7/site-packages/tensorboardX/writer.py", line 804, in add_graph
    self._get_file_writer().add_graph(graph(model, input_to_model, verbose, profile_with_cuda, **kwargs))
  File "/home/panicpanda/miniconda3/lib/python3.7/site-packages/tensorboardX/pytorch_graph.py", line 335, in graph
    raise e
  File "/home/panicpanda/miniconda3/lib/python3.7/site-packages/tensorboardX/pytorch_graph.py", line 326, in graph
    trace = torch.jit.trace(model, args)
  File "/home/panicpanda/miniconda3/lib/python3.7/site-packages/torch/jit/__init__.py", line 635, in trace
    var_lookup_fn, _force_outplace)
RuntimeError: Only tensors and (possibly nested) tuples of tensors are supported as inputs or outputs of traced functions (toIValue at /opt/conda/conda-bld/pytorch_1544202130060/work/torch/csrc/jit/pybind_utils.h:91)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x45 (0x7f46a88c0cc5 in /home/panicpanda/miniconda3/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x343be6 (0x7f46e9278be6 in /home/panicpanda/miniconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #2: <unknown function> + 0x3439eb (0x7f46e92789eb in /home/panicpanda/miniconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #3: <unknown function> + 0x343dca (0x7f46e9278dca in /home/panicpanda/miniconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #4: <unknown function> + 0x39a18c (0x7f46e92cf18c in /home/panicpanda/miniconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #5: <unknown function> + 0x3a6325 (0x7f46e92db325 in /home/panicpanda/miniconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #6: <unknown function> + 0x112176 (0x7f46e9047176 in /home/panicpanda/miniconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame #29: __libc_start_main + 0xe7 (0x7f46f9fb6b97 in /lib/x86_64-linux-gnu/libc.so.6)

mpii dataset json file

Thanks for your great work!
When I tried to run training on the MPII dataset, mpii/anno.json is required. However, the original MPII annotations are in .mat format, so where can I find this JSON file?

Testing the code on real world images

Hi,

I want to run the code on my own dataset images. I am able to get the keypoint predictions but they are not aligned correctly on the input image. I think the issue is with the center and scale values used in the get_final_preds function. My input image size is 854x480x3. I resize the images to 256x192x3 before feeding to the model. I am using w32_256x192_adam_lr1e-3.yaml as the config file and pose_hrnet_w32_student_FPD.pth as the pretrained model. What should be the values for the variables c and s to be used in get_final_preds in function.py? Any suggestions?

Thanks

how to use the test


  1. How to test the model
    3.1 Download the trained student models and place them like section 1.2
    [GoogleDrive] [BaiduDrive]

For MPII dataset:

hourglass student FPD model

For COCO dataset:

HRNet-W32 student FPD model

3.2 FPD training results and logs
[GoogleDrive] [BaiduDrive]

Which shell should I use?
