Rethinking on Multi-Stage Networks for Human Pose Estimation

This repo is also linked to github.

Introduction

This is a PyTorch implementation of MSPN, proposed in Rethinking on Multi-Stage Networks for Human Pose Estimation, which won the 2018 COCO Keypoints Challenge. The original repo is based on MegBrain, the internal deep learning framework of Megvii Inc.

In this work, we design an effective network, MSPN, for the human pose estimation task.

Existing pose estimation approaches fall into two categories: single-stage and multi-stage methods. While multi-stage methods are seemingly more suited for the task, their performance in current practice is not as good as that of single-stage methods. This work studies this issue. We argue that the current multi-stage methods’ unsatisfactory performance comes from the insufficiency in various design choices. We propose several improvements, including the single-stage module design, cross-stage feature aggregation, and coarse-to-fine supervision.

Overview of MSPN.

The resulting method establishes a new state of the art on both the MS COCO and MPII Human Pose datasets, justifying the effectiveness of a multi-stage architecture.

Results

Results on COCO val dataset

Model       Input Size  AP    AP50  AP75  APM   APL   AR    AR50  AR75  ARM   ARL
1-stg MSPN  256x192     71.5  90.1  78.4  67.4  77.5  77.0  93.2  83.1  72.6  83.1
2-stg MSPN  256x192     74.5  91.2  81.2  70.5  80.4  79.7  94.2  85.6  75.4  85.7
3-stg MSPN  256x192     75.2  91.5  82.2  71.1  81.1  80.3  94.3  86.4  76.0  86.4
4-stg MSPN  256x192     75.9  91.8  82.9  72.0  81.6  81.1  94.9  87.1  76.9  87.0
4-stg MSPN  384x288     76.9  91.8  83.2  72.7  83.1  81.8  94.8  87.3  77.4  87.8

Results on COCO test-dev dataset

Model        Input Size  AP    AP50  AP75  APM   APL   AR    AR50  AR75  ARM   ARL
4-stg MSPN   384x288     76.1  93.4  83.8  72.3  81.5  81.6  96.3  88.1  77.5  87.1
4-stg MSPN+  384x288     78.1  94.1  85.9  74.5  83.3  83.1  96.7  89.8  79.3  88.2

Results on MPII dataset

Model       Split  Input Size  Head  Shoulder  Elbow  Wrist  Hip   Knee  Ankle  Mean
4-stg MSPN  val    256x256     96.8  96.5      92.0   87.0   89.9  88.0  84.0   91.1
4-stg MSPN  test   256x256     98.4  97.1      93.2   89.2   92.0  90.1  85.5   92.6

Note

  • + means using model ensemble.

Repo Structure

This repo is organized as follows:

$MSPN_HOME
|-- cvpack
|
|-- dataset
|   |-- COCO
|   |   |-- det_json
|   |   |-- gt_json
|   |   |-- images
|   |       |-- train2014
|   |       |-- val2014
|   |
|   |-- MPII
|       |-- det_json
|       |-- gt_json
|       |-- images
|   
|-- lib
|   |-- models
|   |-- utils
|
|-- exps
|   |-- exp1
|   |-- exp2
|   |-- ...
|
|-- model_logs
|
|-- README.md
|-- requirements.txt

Quick Start

Installation

  1. Install PyTorch by following the instructions on the PyTorch website.

  2. Clone this repo and set MSPN_HOME in /etc/profile or ~/.bashrc, e.g.

export MSPN_HOME='/path/of/your/cloned/repo'
export PYTHONPATH=$PYTHONPATH:$MSPN_HOME

  3. Install requirements:

pip3 install -r requirements.txt

  4. Install COCOAPI by following the cocoapi website, or:

git clone https://github.com/cocodataset/cocoapi.git $MSPN_HOME/lib/COCOAPI
cd $MSPN_HOME/lib/COCOAPI/PythonAPI
make install

Dataset

COCO

  1. Download the images from the COCO website, and put the train2014 and val2014 splits into $MSPN_HOME/dataset/COCO/images/.

  2. Download the ground truth from Google Drive, and put it into $MSPN_HOME/dataset/COCO/gt_json/.

  3. Download the detection results from Google Drive, and put them into $MSPN_HOME/dataset/COCO/det_json/.

MPII

  1. Download the images from the MPII website, and put them into $MSPN_HOME/dataset/MPII/images/.

  2. Download the ground truth from Google Drive, and put it into $MSPN_HOME/dataset/MPII/gt_json/.

  3. Download the detection results from Google Drive, and put them into $MSPN_HOME/dataset/MPII/det_json/.
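
Before training, it can help to confirm that the dataset directories described above are in place. The following is a minimal sketch and not part of the repo: the helper name `missing_dataset_dirs` is hypothetical, and the list of paths is simply copied from the repo structure and the download steps in this README.

```python
import os

# Sub-directories taken from the repo structure and dataset steps in this README.
EXPECTED_DIRS = [
    "dataset/COCO/det_json",
    "dataset/COCO/gt_json",
    "dataset/COCO/images/train2014",
    "dataset/COCO/images/val2014",
    "dataset/MPII/det_json",
    "dataset/MPII/gt_json",
    "dataset/MPII/images",
]


def missing_dataset_dirs(mspn_home):
    """Return the expected sub-directories that do not exist under mspn_home."""
    return [
        rel for rel in EXPECTED_DIRS
        if not os.path.isdir(os.path.join(mspn_home, *rel.split("/")))
    ]


# Uses MSPN_HOME when it is set, otherwise the current directory.
for rel in missing_dataset_dirs(os.environ.get("MSPN_HOME", ".")):
    print("missing:", rel)
```

An empty output means every listed directory exists; it does not check that the downloaded files inside them are complete.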

Model

Download the ImageNet pretrained ResNet-50 model from Google Drive, and put it into $MSPN_HOME/lib/models/. For your convenience, we also provide a well-trained 2-stage MSPN model for COCO.

Log

Create a directory to save logs and models:

mkdir $MSPN_HOME/model_logs

Train

Go to the specified experiment directory, e.g.

cd $MSPN_HOME/exps/mspn.2xstg.coco

and run:

python config.py -log
python -m torch.distributed.launch --nproc_per_node=gpu_num train.py

Here gpu_num is the number of GPUs.

Test

python -m torch.distributed.launch --nproc_per_node=gpu_num test.py -i iter_num

Here gpu_num is the number of GPUs, and iter_num is the iteration number of the checkpoint you want to test.

Citation

Please consider citing our projects in your publications if they help your research.

@article{li2019rethinking,
  title={Rethinking on Multi-Stage Networks for Human Pose Estimation},
  author={Li, Wenbo and Wang, Zhicheng and Yin, Binyi and Peng, Qixiang and Du, Yuming and Xiao, Tianzi and Yu, Gang and Lu, Hongtao and Wei, Yichen and Sun, Jian},
  journal={arXiv preprint arXiv:1901.00148},
  year={2019}
}

@inproceedings{chen2018cascaded,
  title={Cascaded pyramid network for multi-person pose estimation},
  author={Chen, Yilun and Wang, Zhicheng and Peng, Yuxiang and Zhang, Zhiqiang and Yu, Gang and Sun, Jian},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={7103--7112},
  year={2018}
}

The code for Cascaded Pyramid Network is also available.

Contact

You can contact us at the email addresses published in our paper or at [email protected].

mspn's Issues

Test on a Single custom image

How can I replace the COCO val images with a folder of custom images? Do we have to replace the data loader in the coco.py file?

About MPII test.json, lack of instances

Your test.json has only 7000+ instances, but when we convert the official h5 file to a test.json, we get 11000+ instances. So how should we use your test.json? It seems that you changed the 'scale' and 'center' in your json compared to the official ones. If true, can you tell us how you modified the 'scale' and 'center'?

Error while starting training in single GPU setting

https://github.com/megvii-detection/MSPN/blob/a84f750aaa34e32ded49c44dda6e73a6538c4fde/cvpack/torch_modeling/engine/engine.py#L56

The variable engine.local_rank is not set when training with a single GPU. I fixed that by setting local_rank in the non-distributed setting like this:

        if self.distributed:
            self.local_rank = self.args.local_rank
            self.world_size = int(os.environ['WORLD_SIZE'])
            self.world_rank = int(os.environ['RANK'])
            torch.cuda.set_device(self.local_rank)
            dist.init_process_group(backend="nccl", init_method='env://')
            dist.barrier()
            self.devices = [i for i in range(self.world_size)]
        else:
            # todo check non-distributed training
            self.local_rank = self.args.local_rank
            self.world_rank = 1
            self.devices = parse_torch_devices(self.args.devices)

Can you please let me know if this is the right way to do it?

FYI, without this change I get an error saying that engine has no attribute named local_rank.
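
One way to reason about the fallback values is sketched below. This is not the maintainers' fix: `resolve_ranks` is a hypothetical helper, and the single-process defaults (rank 0, world size 1) are an assumption, chosen so that checks of the form `engine.local_rank == 0` still pass when no launcher is used. Note that it uses 0 rather than 1 for the single-process world rank. The variable names WORLD_SIZE and RANK are the ones read in the snippet above.

```python
import os


def resolve_ranks(args_local_rank=None, environ=None):
    """Return (local_rank, world_rank, world_size) with single-process defaults.

    Hypothetical helper, not part of cvpack: values provided by the launcher
    win, and a run without a launcher falls back to rank 0 in a world of 1.
    """
    env = os.environ if environ is None else environ
    world_size = int(env.get("WORLD_SIZE", 1))
    world_rank = int(env.get("RANK", 0))
    local_rank = 0 if args_local_rank is None else int(args_local_rank)
    return local_rank, world_rank, world_size


# Second process of a two-process launch:
print(resolve_ranks(1, {"WORLD_SIZE": "2", "RANK": "1"}))  # (1, 1, 2)
# No launcher at all:
print(resolve_ranks(None, {}))  # (0, 0, 1)
```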

I want to use this model to test a single image. What should I do?

How should I preprocess a single image and input it into the network?
I took the network model out and ran a single-image test, but there are always errors when I input the image. What should I do?
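
The README evaluates on person detections (det_json) with a fixed input size such as 256x192, so a single image presumably needs a person box that is then cropped to the network's aspect ratio. The sketch below only covers that box adjustment. It is an assumption about a typical top-down pipeline, not the repo's confirmed preprocessing: `fit_box_to_aspect` is a hypothetical helper, 256x192 is read as height x width, and any extra padding or pixel normalization the repo applies is not covered here.

```python
def fit_box_to_aspect(x, y, w, h, input_w=192, input_h=256):
    """Grow a person box (x, y, w, h) around its center to match input_w:input_h."""
    if w <= 0 or h <= 0:
        raise ValueError("box width and height must be positive")
    cx, cy = x + w / 2.0, y + h / 2.0
    target = input_w / float(input_h)
    if w / float(h) > target:
        h = w / target  # box is too wide: grow the height
    else:
        w = h * target  # box is too tall: grow the width
    return cx - w / 2.0, cy - h / 2.0, w, h


# A tall, narrow box is widened; the center stays at (130, 150).
print(fit_box_to_aspect(100, 50, 60, 200))
```

The adjusted box can extend past the image border, so the crop step needs to pad or clamp before resizing to the network input size.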

Why the AP is always 0?

 $ ~/MSPN/exps/mspn.2xstg.coco$ python -m torch.distributed.launch --nproc_per_node=1 test.py -i 10000
loading annotations into memory...
Done (t=5.33s)
creating index...
index created!
2760/2760 [19:29<00:00,  2.36it/s]
2021-04-05 12:39:01 deepserver3 COCO[11713] INFO Accumulating ...
2021-04-05 12:39:01 deepserver3 COCO[11713] INFO Dumping results ...
2021-04-05 12:39:06 deepserver3 COCO[11713] INFO Get all results.
Loading and preparing results...
DONE (t=3.67s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *keypoints*
DONE (t=9.53s).
Accumulating evaluation results...
DONE (t=0.33s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.000
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.001
 Average Recall     (AR) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.002
 Average Recall     (AR) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.001

a question about test

I tried to test on my own data, with the json file set up in the same format as COCO.
This is the test result.

~/MSPN/MSPN_HOME/exps/mspn.2xstg.selfdata$ python -m torch.distributed.launch --nproc_per_node=1 test.py -i 140800
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
100%|██████████████████████████████████████████████████████████████████████████| 4/4 [00:01<00:00, 2.29it/s]
2021-02-04 23:29:52 cuda-500-240jp SELF[9481] INFO Accumulating ...
2021-02-04 23:29:52 cuda-500-240jp SELF[9481] INFO Dumping results ...
results 108
2021-02-04 23:29:52 cuda-500-240jp SELF[9481] INFO Get all results.
Loading and preparing results...
DONE (t=0.01s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type keypoints
DONE (t=0.03s).
Accumulating evaluation results...
DONE (t=0.00s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = -1.000

And I check the results.json file like this:

[{"image_id": 108, "category_id": 1, "keypoints":
[388.43601989746094, -142.69078063964844, 1.340456247329712,
388.43601989746094, -142.69078063964844, 1.3361315727233887,
388.43601989746094, -142.69078063964844, 1.334479570388794,
388.43601989746094, -142.69078063964844, 1.3392943143844604,
388.43601989746094, -142.69078063964844, 1.3306572437286377,
388.43601989746094, -142.69078063964844, 1.3493973016738892,
388.43601989746094, -142.69078063964844, 1.3441457748413086,
388.43601989746094, -142.69078063964844, 1.3449093103408813,
388.43601989746094, -142.69078063964844, 1.3436510562896729,
388.43601989746094, -142.69078063964844, 1.350746989250183,
388.43601989746094, -142.69078063964844, 1.3383842706680298,
388.43601989746094, -142.69078063964844, 1.329496145248413,
388.43601989746094, -142.69078063964844, 1.3269803524017334,
388.43601989746094, 650.0357818603516, 1.2743829488754272,
388.43601989746094, 650.0357818603516, 1.2706503868103027,
388.43601989746094, 650.0357818603516, 1.3351771831512451,
388.43601989746094, 650.0357818603516, 1.3317900896072388],
"score": 0.0},

All the keypoint locations are the same, so I think it is not working correctly. However,

model_file = os.path.join(cfg.OUTPUT_DIR, "iter-{}.pth".format(args.iter))

the model_file is correct, and I also tried testing with "mspn_2xstg_coco.pth", which I downloaded from Google Drive.

Can you help me? Thank you.
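
The pattern in the results above, where every joint shares one x coordinate, can be detected automatically across a whole results file. This is a minimal sketch with hypothetical helper names (`is_degenerate`, `count_degenerate`); the sample entry is a two-joint abbreviation of the output quoted above, and the check says nothing about the cause.

```python
import json


def is_degenerate(keypoints, tol=1e-6):
    """True when every joint in a flat [x, y, score, ...] list shares one x coordinate."""
    xs = keypoints[0::3]
    return len(xs) > 1 and max(xs) - min(xs) <= tol


def count_degenerate(results):
    """Count result entries whose joints have collapsed onto one vertical line."""
    return sum(1 for r in results if is_degenerate(r["keypoints"]))


# Shortened stand-in for the results.json entry quoted above.
sample = json.loads(
    '[{"image_id": 108, "category_id": 1, '
    '"keypoints": [388.4, -142.7, 1.34, 388.4, 650.0, 1.27], "score": 0.0}]'
)
print(count_degenerate(sample))  # 1
```

For a real run, the list would come from `json.load` on the results.json written by test.py.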

Getting wrong key-points detected from running test.py

Hi,
I ran "python -m torch.distributed.launch --nproc_per_node=1 test.py -i 1" successfully.

After it finished running, I read the 'results.json' file and plotted the keypoints on the images using OpenCV.

But the detected keypoints were not correct, and it is plotting multiple keypoints in one image.

Thanks,
Bala
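
In the COCO keypoint results format, each entry describes one person instance (as in the results.json excerpt earlier on this page), so an image with several detected boxes has several entries. The sketch below groups entries by image and drops low-scoring instances before plotting. `group_by_image` is a hypothetical helper, and the 0.3 threshold and the sample values are made-up illustrations, not values from the repo.

```python
from collections import defaultdict


def group_by_image(results, min_score=0.0):
    """Map image_id -> list of per-person joint lists [(x, y, score), ...]."""
    grouped = defaultdict(list)
    for entry in results:
        if entry["score"] < min_score:
            continue  # drop low-confidence person instances
        flat = entry["keypoints"]
        joints = [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]
        grouped[entry["image_id"]].append(joints)
    return dict(grouped)


# Toy entries with two joints each (real COCO entries have 17 joints).
results = [
    {"image_id": 7, "keypoints": [1.0, 2.0, 0.9, 3.0, 4.0, 0.8], "score": 0.95},
    {"image_id": 7, "keypoints": [5.0, 6.0, 0.2, 7.0, 8.0, 0.1], "score": 0.05},
    {"image_id": 8, "keypoints": [9.0, 9.0, 0.7, 9.5, 9.5, 0.6], "score": 0.60},
]
people = group_by_image(results, min_score=0.3)
print({image_id: len(persons) for image_id, persons in people.items()})  # {7: 1, 8: 1}
```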

Why is the test precision > 1?

OrderedDict([('Head', 1.193724420190996), ('Shoulder', 0.0), ('Elbow', 0.0), ('Wrist', 0.034376074252320386), ('Hip', 0.034614053305642094), ('Knee', 0.403153383578306), ('Ankle', 1.2752569625040582), ('Mean', 0.3070517824616185), ('[email protected]', 0.010408534998698933)])
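
The MPII table in the README reports per-joint scores on a 0-100 scale (for example, Head 96.8), so values such as 1.19 here are plausibly percentages rather than fractions above 1. The sketch below shows a percentage-valued PCKh under the usual definition, where a joint counts as correct when its distance to the ground truth is at most a threshold fraction of the head size. `pckh_percent` is a hypothetical helper, and the repo's evaluation may differ in details such as how the head size is scaled.

```python
import math


def pckh_percent(preds, gts, head_sizes, thr=0.5):
    """Percentage (0-100) of joints within thr * head_size of the ground truth."""
    correct = total = 0
    for pred, gt, head in zip(preds, gts, head_sizes):
        for (px, py), (gx, gy) in zip(pred, gt):
            total += 1
            if math.hypot(px - gx, py - gy) <= thr * head:
                correct += 1
    return 100.0 * correct / total if total else 0.0


# One person, two joints: the first is 2 px off, the second 30 px off.
preds = [[(10.0, 10.0), (50.0, 50.0)]]
gts = [[(12.0, 10.0), (80.0, 50.0)]]
print(pckh_percent(preds, gts, head_sizes=[20.0]))  # 50.0
```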

About a training question

Hello, thanks for this well-done work. I ran into a problem when running the training command as given:
Traceback (most recent call last):
  File "train.py", line 117, in <module>
    main()
  File "train.py", line 53, in main
    data_loader = get_train_loader(cfg, num_gpu=num_gpu, is_dist=True)
  File "/home/zhong/MSPN-master/lib/utils/dataloader.py", line 31, in get_train_loader
    sampler = torch_samplers.DistributedSampler(dataset, shuffle=is_shuffle)
  File "/home/zhong/MSPN-master/cvpack/dataset/torch_samplers/distributed.py", line 29, in __init__
    num_replicas = dist.get_world_size()
  File "/usr/local/lib/python3.5/dist-packages/torch/distributed/distributed_c10d.py", line 584, in get_world_size
    return _get_group_size(group)
  File "/usr/local/lib/python3.5/dist-packages/torch/distributed/distributed_c10d.py", line 200, in _get_group_size
    _check_default_pg()
  File "/usr/local/lib/python3.5/dist-packages/torch/distributed/distributed_c10d.py", line 191, in _check_default_pg
    "Default process group is not initialized"
AssertionError: Default process group is not initialized
Traceback (most recent call last):
  File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.5/dist-packages/torch/distributed/launch.py", line 235, in <module>
    main()
  File "/usr/local/lib/python3.5/dist-packages/torch/distributed/launch.py", line 231, in main
    cmd=process.args)
subprocess.CalledProcessError: Command '['/usr/bin/python', '-u', 'train.py', '--local_rank=0']' returned non-zero exit status 1

About a problem setting base_lr

Hello! I want to retrain on another dataset. I changed base_lr (base_lr == 1e-3) in the "config.py" file, but the base_lr shown in the printed log does not change to the value I set. How should I set base_lr correctly?

subprocess.CalledProcessError

When I run the command python -m torch.distributed.launch --nproc_per_node=4 train.py, I get:

Traceback (most recent call last):
  File "/home/yh/anaconda3/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/yh/anaconda3/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/yh/anaconda3/lib/python3.7/site-packages/torch/distributed/launch.py", line 253, in <module>
    main()
  File "/home/yh/anaconda3/lib/python3.7/site-packages/torch/distributed/launch.py", line 249, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/home/yh/anaconda3/bin/python', '-u', 'train.py', '--local_rank=0']' returned non-zero exit status 1.

How can I solve this problem? Thanks!

TypeError: 'list' object is not callable

Hi, running your code produces the following problem. Could you help me? Thanks a lot!

2019-06-19 23:03:14 cvlab-think train[24177] WARNING A exception occurred during Engine initialization, give up running process
2019-06-19 23:03:14 cvlab-think train[24178] WARNING A exception occurred during Engine initialization, give up running process
Traceback (most recent call last):
  File "train.py", line 121, in <module>
    main()
  File "train.py", line 50, in main
    broadcast_buffers=False, )
  File "/home/andrew/projects/MSPN/venv/lib/python3.6/site-packages/torch/nn/parallel/distributed.py", line 288, in __init__
    self._ddp_init_helper()
  File "/home/andrew/projects/MSPN/venv/lib/python3.6/site-packages/torch/nn/parallel/distributed.py", line 341, in _ddp_init_helper
    self._passing_sync_batchnorm_handle(self._module_copies)
  File "/home/andrew/projects/MSPN/venv/lib/python3.6/site-packages/torch/nn/parallel/distributed.py", line 452, in _passing_sync_batchnorm_handle
    for layer in module.modules():
TypeError: 'list' object is not callable
Traceback (most recent call last):
  File "train.py", line 121, in <module>
    main()
  File "train.py", line 50, in main
    broadcast_buffers=False, )
  File "/home/andrew/projects/MSPN/venv/lib/python3.6/site-packages/torch/nn/parallel/distributed.py", line 288, in __init__
    self._ddp_init_helper()
  File "/home/andrew/projects/MSPN/venv/lib/python3.6/site-packages/torch/nn/parallel/distributed.py", line 341, in _ddp_init_helper
    self._passing_sync_batchnorm_handle(self._module_copies)
  File "/home/andrew/projects/MSPN/venv/lib/python3.6/site-packages/torch/nn/parallel/distributed.py", line 452, in _passing_sync_batchnorm_handle
    for layer in module.modules():
TypeError: 'list' object is not callable
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/andrew/projects/MSPN/venv/lib/python3.6/site-packages/torch/distributed/launch.py", line 235, in <module>
    main()
  File "/home/andrew/projects/MSPN/venv/lib/python3.6/site-packages/torch/distributed/launch.py", line 231, in main
    cmd=process.args)
subprocess.CalledProcessError: Command '['/home/andrew/projects/MSPN/venv/bin/python', '-u', 'train.py', '--local_rank=0']' returned non-zero exit status 1.
~/projects/MSPN ~/projects/MSPN/scripts
~/projects/MSPN/scripts

AttributeError: 'Engine' object has no attribute 'local_rank'

When running 'python3 -m torch.distributed.launch --nproc_per_node=1 train.py', I got the following issue:

Traceback (most recent call last):
  File "train.py", line 119, in <module>
    main()
  File "train.py", line 87, in main
    if engine.local_rank == 0:
AttributeError: 'Engine' object has no attribute 'local_rank'

I noticed that the folder 'cvpack/torch_modeling/engine' contains engine.py.
Line 87: if engine.local_rank == 0:
The error happened at line 87.

Could I know how to fix this issue? Thank you very much.

About a Test question

After I installed MSPN according to README.md and ran the test command "python -m torch.distributed.launch --nproc_per_node=1 test.py -i 19200",
all the test results are zero, like this:
loading annotations into memory...
################
Done (t=5.50s)
creating index...

index created!
100%|███████████████████████████████████████| 2760/2760 [25:13<00:00, 1.82it/s]
2019-08-01 16:10:33 zhong COCO[12891] INFO Accumulating ...
2019-08-01 16:10:33 zhong COCO[12891] INFO Dumping results ...
2019-08-01 16:10:39 zhong COCO[12891] INFO Get all results.
Loading and preparing results...
DONE (t=3.50s)
creating index...

index created!
Running per image evaluation...
Evaluate annotation type keypoints
DONE (t=10.82s).
Accumulating evaluation results...
DONE (t=0.33s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.003
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.001
Average Recall (AR) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.000

All the test data was downloaded from your Google Drive.
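
When an evaluation completes but every AP is close to zero, a cheap first check is whether the result entries refer to images that exist in the ground truth and whether any instance has a positive score. The sketch below assumes COCO-style files: a ground-truth dict with an "images" list of {"id": ...} records and a list of result entries with "image_id" and "score". `summarize_results` is a hypothetical helper and the toy data is made up; it narrows down where to look and does not identify the cause.

```python
def summarize_results(results, gt):
    """Basic consistency numbers for COCO-style keypoint results vs. ground truth."""
    gt_ids = {img["id"] for img in gt["images"]}
    matched = sum(1 for r in results if r["image_id"] in gt_ids)
    positive = sum(1 for r in results if r["score"] > 0)
    return {
        "entries": len(results),
        "matched_image_ids": matched,
        "positive_scores": positive,
    }


# Toy data: one entry matches the ground truth, one refers to an unknown image.
gt = {"images": [{"id": 1}, {"id": 2}]}
results = [
    {"image_id": 1, "score": 0.9},
    {"image_id": 3, "score": 0.0},
]
print(summarize_results(results, gt))
```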

An OOM problem in training with 1080Ti

I tried to run this code on an Nvidia 1080Ti with a mini-batch size of 32, as mentioned in the paper, but an OOM error appears in the first training iteration. Did you decrease the batch size or make some other changes?

What is the license of this project?

Hi,

Everything is in the question! I'm currently trying to build a project and may use MSPN as a backbone for it. I'd like to know what regulations apply to this project.

Thanks for your help !

Greg
