
[WACV 2024] MotionAGFormer: Enhancing 3D Human Pose Estimation With a Transformer-GCNFormer Network


This is the official PyTorch implementation of the paper "MotionAGFormer: Enhancing 3D Human Pose Estimation With a Transformer-GCNFormer Network" (WACV 2024).

Environment

The project is developed under the following environment:

  • Python 3.8.10
  • PyTorch 2.0.0
  • CUDA 12.2

For installation of the project dependencies, please run:

pip install -r requirements.txt
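To quickly confirm that the local setup matches these versions, you can run a short check (purely illustrative; it only prints what PyTorch reports):

```python
import torch

print(torch.__version__)           # expected: 2.0.0
print(torch.version.cuda)          # CUDA version PyTorch was built against
print(torch.cuda.is_available())   # True if a usable GPU is visible
```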

Dataset

Human3.6M

Preprocessing

  1. Download the fine-tuned Stacked Hourglass detections of MotionBERT's preprocessed H3.6M data here and unzip it to 'data/motion3d'.
  2. Slice the motion clips by running the following command in the data/preprocess directory (a conceptual sketch of the slicing step follows the commands below):

For MotionAGFormer-Base and MotionAGFormer-Large:

python h36m.py --n-frames 243

For MotionAGFormer-Small:

python h36m.py --n-frames 81

For MotionAGFormer-XSmall:

python h36m.py --n-frames 27
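Conceptually, the --n-frames option splits each pose sequence into fixed-length clips. A minimal sketch of that slicing (illustrative only; the actual h36m.py also handles the train/test split and writes the clips to data/motion3d):

```python
import numpy as np

def slice_clips(poses, n_frames, stride=None):
    """Split a (T, J, C) pose array into fixed-length clips."""
    stride = stride or n_frames  # non-overlapping clips by default
    starts = range(0, len(poses) - n_frames + 1, stride)
    return np.stack([poses[s:s + n_frames] for s in starts])

# e.g. a 1000-frame sequence of 17 joints in 2D -> 4 clips of 243 frames
print(slice_clips(np.zeros((1000, 17, 2)), 243).shape)  # (4, 243, 17, 2)
```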

Visualization

Run the following command in the data/preprocess directory (it expects 243 frames):

python visualize.py --dataset h36m --sequence-number <AN ARBITRARY NUMBER>

This should create a GIF file named h36m_pose<SEQ_NUMBER>.gif in the data directory.
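For instance, assuming the sliced data contains a sequence with index 1:

python visualize.py --dataset h36m --sequence-number 1

should produce data/h36m_pose1.gif.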

MPI-INF-3DHP

Preprocessing

Please refer to P-STMO for dataset setup. After preprocessing, the generated .npz files (data_train_3dhp.npz and data_test_3dhp.npz) should be located in the data/motion3d directory.

Visualization

Run the same command as in the Human3.6M visualization, but set --dataset to mpi:
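python visualize.py --dataset mpi --sequence-number <AN ARBITRARY NUMBER>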

Training

After dataset preparation, you can train the model as follows:

Human3.6M

You can train Human3.6M with the following command:

python train.py --config <PATH-TO-CONFIG>

where the config files are located in configs/h36m. You can also use Weights & Biases to log the training and validation error by adding --use-wandb at the end. When using it, you can set the run name with --wandb-name, e.g.:

python train.py --config configs/h36m/MotionAGFormer-base.yaml --use-wandb --wandb-name MotionAGFormer-base

MPI-INF-3DHP

You can train MPI-INF-3DHP with the following command:

python train_3dhp.py --config <PATH-TO-CONFIG>

where the config files are located in configs/mpi. As with Human3.6M, Weights & Biases can be used for logging.
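For example (the exact filename under configs/mpi is an assumption; check the directory for the available configs):

python train_3dhp.py --config configs/mpi/MotionAGFormer-base.yaml --use-wandb --wandb-name MotionAGFormer-base-3dhp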

Evaluation

| Method | # frames | # Params | # MACs | H3.6M weights | MPI-INF-3DHP weights |
|--------|----------|----------|--------|---------------|----------------------|
| MotionAGFormer-XS | 27 | 2.2M | 1.0G | download | download |
| MotionAGFormer-S | 81 | 4.8M | 6.6G | download | download |
| MotionAGFormer-B | 243 / 81 | 11.7M | 48.3G / 16G | download | download |
| MotionAGFormer-L | 243 / 81 | 19.0M | 78.3G / 26G | download | download |
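The # Params column can be reproduced for any PyTorch model with a generic count of trainable parameters (a standard utility, not specific to this repository):

```python
def count_params(model):
    """Return the number of trainable parameters in a PyTorch module."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)
```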

After downloading the weights from the table above, you can evaluate the Human3.6M models with:

python train.py --eval-only --checkpoint <CHECKPOINT-DIRECTORY> --checkpoint-file <CHECKPOINT-FILE-NAME> --config <PATH-TO-CONFIG>

For example, if the MotionAGFormer-L checkpoint for H3.6M is downloaded and placed in the checkpoint directory, we can run:

python train.py --eval-only --checkpoint checkpoint --checkpoint-file motionagformer-l-h36m.pth.tr --config configs/h36m/MotionAGFormer-large.yaml

Similarly, MPI-INF-3DHP can be evaluated as follows:

python train_3dhp.py --eval-only --checkpoint <CHECKPOINT-DIRECTORY> --checkpoint-file <CHECKPOINT-FILE-NAME> --config <PATH-TO-CONFIG>
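If you want to inspect a downloaded checkpoint before evaluation, note that the .pth.tr file is an ordinary torch-serialized object. The snippet below only prints its structure; the exact top-level keys are unknown here, so inspect them before assuming a layout:

```python
import torch

ckpt = torch.load('checkpoint/motionagformer-l-h36m.pth.tr', map_location='cpu')
# Checkpoints are typically dicts of tensors and metadata; print the keys
# to see the actual layout before calling load_state_dict.
print(list(ckpt.keys()) if isinstance(ckpt, dict) else type(ckpt))
```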

Demo

Our demo is a modified version of the one provided by the MHFormer repository. First, download the YOLOv3 and HRNet pretrained models here and put them in the './demo/lib/checkpoint' directory. Next, download our base model checkpoint from here and put it in the './checkpoint' directory. Then put your in-the-wild videos in the './demo/video' directory.
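Before running, you can sanity-check that the files from the steps above are in place (a minimal sketch; sample_video.mp4 stands in for your own video name):

```python
import os

# Paths taken from the setup steps above; sample_video.mp4 is a placeholder.
for path in ['demo/lib/checkpoint', 'checkpoint', 'demo/video/sample_video.mp4']:
    print(f"{path}: {'found' if os.path.exists(path) else 'MISSING'}")
```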

Run the command below:

python demo/vis.py --video sample_video.mp4

Sample demo output:

Acknowledgement

Our code refers to the repositories of MotionBERT, P-STMO, and MHFormer, among others.

We thank the authors for releasing their code.

Citation

If you find our work useful for your project, please consider citing the paper:

@inproceedings{motionagformer2024,
  title     = {MotionAGFormer: Enhancing 3D Human Pose Estimation with a Transformer-GCNFormer Network},
  author    = {Mehraban, Soroush and Adeli, Vida and Taati, Babak},
  booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  year      = {2024}
}
