mv3dpose

Off-the-shelf Multiple Person Multiple View 3D Pose Estimation.

Cite

If this repository is useful to you, please cite:

@inproceedings{tanke2019iterative,
  title={Iterative Greedy Matching for 3D Human Pose Tracking from Multiple Views},
  author={Tanke, Julian and Gall, Juergen},
  booktitle={German Conference on Pattern Recognition},
  year={2019}
}

Abstract

In this work we propose an approach for estimating 3D human poses of multiple people from a set of calibrated cameras. Estimating 3D human poses from multiple views has several compelling properties: human poses are estimated within a global coordinate space and multiple cameras provide an extended field of view which helps in resolving ambiguities, occlusions and motion blur. Our approach builds upon a real-time 2D multi-person pose estimation system and greedily solves the association problem between multiple views. We utilize bipartite matching to track multiple people over multiple frames. This proves to be especially efficient as problems associated with greedy matching such as occlusion can be easily resolved in 3D. Our approach achieves state-of-the-art results on popular benchmarks and may serve as a baseline for future work.

Install

This project requires nvidia-docker and NVIDIA drivers that support CUDA 10.

Clone this repository with its submodules as follows:

git clone --recursive https://github.com/jutanke/mv3dpose.git

Usage

Your dataset must reside in a pre-defined folder structure:

  • dataset
    • dataset.json
    • cameras
      • camera00
        • frame00xxxxxxm.json
      • camera01
        • frame00xxxxxxm.json
      • ...
      • camera_n
        • frame00xxxxxxm.json
    • videos
      • camera00
        • frame00xxxxxxm.png
      • camera01
        • frame00xxxxxxm.png
      • ...
      • camera_n
        • frame00xxxxxxm.png

The file names per frame utilize the following schema:

"frame%09d.{png/json}"
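
As a quick illustration of the schema, the name for a given frame index can be produced with Python's `%` formatting. This is only a sketch: the helper name `frame_filename` is hypothetical and not part of this repository.

```python
def frame_filename(frame, ext):
    """Build a per-frame file name following the "frame%09d.{png/json}" schema."""
    if ext not in ("png", "json"):
        raise ValueError("ext must be 'png' or 'json'")
    # %09d pads the frame index with zeros to nine digits.
    return "frame%09d.%s" % (frame, ext)


print(frame_filename(42, "json"))  # frame000000042.json
```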

The camera JSON files follow one of two structures. A simple camera contains only the projection matrix, the image width, and the image height:

{
  "P" : [ 3 x 4 ],
  "w" : int(width),
  "h" : int(height)
}

Alternatively, a more complex camera with distortion coefficients can be used. This camera is based on the OpenCV camera model:

{
  "K" : [ 3 x 3 ], /* intrinsic parameters */
  "rvec": [ 1 x 3 ], /* rotation vector */
  "tvec": [ 1 x 3 ], /* translation vector */
  "discCoef": [ 1 x 5 ], /* distortion coefficients */
  "w" : int(width),
  "h" : int(height)
}

The system expects a camera for each view at each point in time. If your dataset uses fixed cameras, simply repeat the same camera file for every frame.
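
For fixed cameras, the repetition can be scripted. The following is a minimal sketch and not part of this repository: the helper `write_fixed_camera` is hypothetical, the projection matrix and image size are placeholder values rather than a real calibration, and frame indices are assumed to run from 0 to `n_frames - 1`.

```python
import json
import os
import tempfile


def write_fixed_camera(dataset_dir, camera_name, camera, n_frames):
    """Write the same camera dict to one JSON file per frame (hypothetical helper)."""
    cam_dir = os.path.join(dataset_dir, "cameras", camera_name)
    os.makedirs(cam_dir, exist_ok=True)
    for frame in range(n_frames):
        path = os.path.join(cam_dir, "frame%09d.json" % frame)
        with open(path, "w") as f:
            json.dump(camera, f)
    return cam_dir


# Placeholder values for illustration only; use your own calibration.
simple_camera = {
    "P": [[1.0, 0.0, 0.0, 0.0],
          [0.0, 1.0, 0.0, 0.0],
          [0.0, 0.0, 1.0, 0.0]],
    "w": 640,
    "h": 480,
}

dataset_dir = tempfile.mkdtemp()  # stand-in for /path/to/your/dataset
cam_dir = write_fixed_camera(dataset_dir, "camera00", simple_camera, n_frames=3)
print(sorted(os.listdir(cam_dir)))
```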

The dataset.json file contains general information for the model:

{
  "n_cameras": int(#cameras), /* number of cameras */
  "scale_to_mm": 1, /* scales the calibration to mm */
}

The variable scale_to_mm is needed because the system operates in millimetres [mm], while calibrations may be given in other units. For example, when the calibration is done in meters, scale_to_mm must be set to 1000.
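
The choice of scale_to_mm can be sketched as a simple unit lookup. The helper `make_dataset_config` and the table `MM_PER_UNIT` are hypothetical names used for illustration, and the camera count is a placeholder; only the two keys shown in the structure above are assumed.

```python
import json

# Millimetres per calibration unit; these are plain unit conversions.
MM_PER_UNIT = {"mm": 1, "cm": 10, "m": 1000}


def make_dataset_config(n_cameras, calibration_unit):
    """Build dataset.json content for a calibration given in `calibration_unit`."""
    return {
        "n_cameras": n_cameras,
        "scale_to_mm": MM_PER_UNIT[calibration_unit],
    }


# Placeholder example: four cameras calibrated in meters.
config = make_dataset_config(n_cameras=4, calibration_unit="m")
print(json.dumps(config, indent=2))
```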

Optional Parameters

  • valid_frames: if frames do not start at 0 and/or are not contiguous, you can set a list of frames here
  • epi_threshold: epipolar line distance threshold in pixels
  • max_distance_between_tracks: maximal distance in [mm] between tracks so that they can be associated
  • min_track_length: drop any track that is shorter than min_track_length frames
  • last_seen_delay: allow up to last_seen_delay frames to be skipped when reconnecting a lost track
  • smoothing_sigma: sigma value for Gaussian smoothing of tracks
  • smoothing_interpolation_range: defines how far fill-ins should reach
  • do_smoothing: whether smoothing should be applied at all (default is True)
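
When frames are not contiguous, the valid_frames list can be derived from the files on disk rather than typed by hand. This is a minimal sketch, assuming every camera folder holds the same frames so that inspecting one folder is enough. The helper `list_valid_frames` is hypothetical and not part of this repository.

```python
import os
import re

# Matches the "frame%09d.json" schema and captures the nine-digit frame index.
FRAME_PATTERN = re.compile(r"^frame(\d{9})\.json$")


def list_valid_frames(camera_dir):
    """Return the sorted frame indices that have a camera file in `camera_dir`."""
    frames = []
    for name in os.listdir(camera_dir):
        match = FRAME_PATTERN.match(name)
        if match:
            frames.append(int(match.group(1)))
    return sorted(frames)
```

The returned list can then be used as the value of valid_frames.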

Run the system

./mvpose.sh /path/to/your/dataset

The resulting tracks are written to tracks3d inside your dataset folder. Each track represents a single person. The files are organised as follows:

{
  "J": int(joint number), /* number of joints */
  "frames": [int, int], /* ordered list of the frames that this track covers */
  "poses": [ n_frames x J x 3 ] /* 3D poses, 3d location OR None, if joint is missing */
}
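
A track file with this structure can be read with the standard json module. The sketch below assumes the file is plain JSON, so a missing joint (None above) is read back as Python's `None`. The helpers `load_track` and `pose_at` are hypothetical and not part of this repository, and the file names inside tracks3d are not assumed here.

```python
import json


def load_track(path):
    """Load one track file and check that frames and poses line up."""
    with open(path) as f:
        track = json.load(f)
    if len(track["frames"]) != len(track["poses"]):
        raise ValueError("frames and poses must have the same length")
    return track


def pose_at(track, frame):
    """Return the J joints for `frame`, or None if the track does not cover it.

    Each joint is either a 3D location [x, y, z] or None when it is missing.
    """
    try:
        idx = track["frames"].index(frame)
    except ValueError:
        return None
    return track["poses"][idx]
```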
