
Contextual Attention for Hand Detection in the Wild

This repository contains the code and datasets related to the following paper:

Contextual Attention for Hand Detection in the Wild, International Conference on Computer Vision (ICCV), 2019.

Please also see the project website.

Contents

This repository contains the following:

  • A Keras implementation of the proposed attention method.
  • Code to train and evaluate the Hand-CNN. This code is based on Matterport's Keras implementation of Mask-RCNN.
  • Two large-scale annotated hand datasets, TV-Hand and COCO-Hand, containing hands in unconstrained images for training and evaluation.
  • Models trained on TV-Hand and COCO-Hand datasets.

Installation

Install the required dependencies listed in requirements.txt.
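This can typically be done with pip, for example:

pip install -r requirements.txt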

Data annotation format

Create annotations in a .txt file in the following format:

/path/to/image/, x_min, x_max, y_min, y_max, x1, y1, x2, y2, x3, y3, x4, y4, hand, where

  • (x1, y1), (x2, y2), (x3, y3) and (x4, y4) are the coordinates of the polygon (rotated) bounding box for a hand in the anti-clockwise order.
  • The wrist is given by the line joining (x1, y1) and (x2, y2).
  • x_min = min(x1, x2, x3, x4), x_max = max(x1, x2, x3, x4), y_min = min(y1, y2, y3, y4), y_max = max(y1, y2, y3, y4).
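As a reference, below is a minimal parsing sketch for one such annotation line. The function name and the strictly comma-separated layout are assumptions based on the description above; this is not part of the repository's code.

def parse_annotation_line(line):
    # Split one comma-separated annotation line as described above.
    fields = [f.strip() for f in line.split(",")]
    image_path = fields[0]
    x_min, x_max, y_min, y_max = map(float, fields[1:5])
    # Polygon corners (x1, y1) ... (x4, y4) in anti-clockwise order;
    # the wrist is the segment joining (x1, y1) and (x2, y2).
    polygon = [(float(fields[i]), float(fields[i + 1])) for i in range(5, 13, 2)]
    label = fields[13]  # "hand"
    return image_path, (x_min, x_max, y_min, y_max), polygon, label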

Folder structure

The code is organized in the following structure:

datasets/
  annotations/
    train_annotations.txt
    val_annotations.txt
    test_annotations.txt

mrcnn/
  config.py
  model.py
  parallel_model.py
  utils.py
  visualize.py
  contextual_attention.py

samples/
  hand/
    results/
    hand.py
    load_weights.py

model/
  mask_rcnn_coco.h5
  trained_weights.h5

requirements.txt
setup.cfg
setup.py
  • ./datasets/annotations/: create the annotations for the training, validation, and testing data in the format described above.
  • ./mrcnn/: this directory contains the main Mask-RCNN code. The file ./mrcnn/model.py is modified to add a new orientation branch to Mask-RCNN that predicts the orientation of the hand (an illustrative sketch follows after this list).
  • ./mrcnn/contextual_attention.py: this file implements the proposed attention method.
  • ./samples/hand/hand.py: code for training and evaluating Hand-CNN.
  • ./model/mask_rcnn_coco.h5: pretrained Mask-RCNN weights on COCO dataset.
  • ./model/trained_weights.h5: model trained on TV-Hand and COCO-Hand data.
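For illustration only, the snippet below shows one common way such an orientation branch could be implemented in Keras: regressing the sine and cosine of the hand orientation from the pooled ROI features. The layer names, shapes, and loss here are assumptions made for the sketch, not the code actually used in ./mrcnn/model.py.

import keras.layers as KL
import keras.backend as K

def build_orientation_head(roi_features, fc_dim=1024):
    # roi_features: pooled ROI features of shape [batch, num_rois, H, W, C].
    x = KL.TimeDistributed(KL.Flatten())(roi_features)
    x = KL.TimeDistributed(KL.Dense(fc_dim, activation="relu"))(x)
    # Predict (sin, cos) of the hand angle so the target wraps smoothly;
    # the angle can be recovered with atan2 at inference time.
    sincos = KL.TimeDistributed(KL.Dense(2, activation="tanh"),
                                name="mrcnn_orientation")(x)
    return sincos

def orientation_loss(target_sincos, pred_sincos):
    # Smooth L1 loss between target and predicted (sin, cos) pairs.
    diff = K.abs(target_sincos - pred_sincos)
    less_than_one = K.cast(K.less(diff, 1.0), "float32")
    loss = less_than_one * 0.5 * diff ** 2 + (1.0 - less_than_one) * (diff - 0.5)
    return K.mean(loss)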

Models

Download the models from models and place them in ./model/.

Training

Use the following command to train Hand-CNN:

python -W ignore samples/hand/hand.py --weight coco --command train

  • --weight: If coco, training starts from the pretrained COCO weights. Alternatively, a /path/to/weights/ can be specified to start training from other weights.

The training set to be used can be specified in the train function of the file ./samples/hand/hand.py.
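For example, to continue training from previously saved weights instead of the COCO initialization (the path below is only a placeholder):

python -W ignore samples/hand/hand.py --weight /path/to/weights.h5 --command train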

Detection

Use the following to run detection on images and visualize them:

python -W ignore detect.py --image_dir /path/to/folder_containing_images/

The outputs will be stored in ./outputs/.

Evaluation

Use the following command to evaluate a trained Hand-CNN:

python -W ignore samples/hand/hand.py --weight path/to/weights --command test --testset oxford

  • --weight: Path to the trained weights to be used for evaluation.
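For example, assuming the released weights have been placed in ./model/ as described above:

python -W ignore samples/hand/hand.py --weight model/trained_weights.h5 --command test --testset oxford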

Datasets

As part of this project, we release two datasets: TV-Hand and COCO-Hand. The TV-Hand dataset contains hand annotations for 9.5K image frames extracted from the ActionThread dataset. The COCO-Hand dataset contains hand annotations for 25K images from Microsoft's COCO dataset.

Some images and annotations from the TV-Hand data

Some images and annotations from the COCO-Hand data

Citation

If you use the code or datasets in this repository, please cite the following paper:

@inproceedings{Hand-CNN,
  title={Contextual Attention for Hand Detection in the Wild},
  author={Supreeth Narasimhaswamy and Zhengwei Wei and Yang Wang and Justin Zhang and Minh Hoai},
  booktitle={International Conference on Computer Vision (ICCV)},
  year={2019},
  url={https://arxiv.org/pdf/1904.04882.pdf} 
}
