
human-object-relation-network's Introduction

Human-object Relation Network for Action Recognition in Still Images

Introduction

Source code for the ICME 2020 paper "Human-object Relation Network for Action Recognition in Still Images".

The paper is available in this repo or in the IEEE Digital Library.

Surrounding object information has been widely used for action recognition. However, the relation between human and object, an important cue, is usually ignored in still image action recognition. In this paper, we propose a novel approach for action recognition. The key component of our approach is a human-object relation module. By using the appearance as well as the spatial location of the human and the objects, the module computes pair-wise relation information between human and object to enhance the features used for action classification, and it can be trained jointly with our action recognition network. Experimental results on two popular datasets demonstrate the effectiveness of the proposed approach. Moreover, our method achieves new state-of-the-art results of 92.8% and 94.6% mAP on the PASCAL VOC 2012 Action and Stanford 40 Actions datasets, respectively.
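
To make the idea of a pair-wise relation concrete, here is a minimal pure-Python sketch. It is not the module from the paper or from this repository: the function names, the dot-product appearance score, and the center-distance spatial term are all assumptions chosen for illustration, and the learned projections of a real network are omitted.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def box_center(box):
    """Center of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def relation_weights(human_feat, human_box, object_feats, object_boxes):
    """Hypothetical relation weights: one weight per object, summing to 1.

    Appearance term: scaled dot product of human and object features.
    Spatial term (assumption): center distance, normalized by human box size.
    """
    dim = len(human_feat)
    hx, hy = box_center(human_box)
    scale = max(human_box[2] - human_box[0], human_box[3] - human_box[1], 1e-6)
    logits = []
    for feat, box in zip(object_feats, object_boxes):
        appearance = sum(h * o for h, o in zip(human_feat, feat)) / math.sqrt(dim)
        ox, oy = box_center(box)
        distance = math.hypot(ox - hx, oy - hy) / scale
        logits.append(appearance - distance)
    return softmax(logits)


def enhance_human_feature(human_feat, human_box, object_feats, object_boxes):
    """Add the relation-weighted sum of object features to the human feature."""
    weights = relation_weights(human_feat, human_box, object_feats, object_boxes)
    enhanced = list(human_feat)
    for weight, feat in zip(weights, object_feats):
        for i, value in enumerate(feat):
            enhanced[i] += weight * value
    return enhanced
```

In this sketch, an object that looks related to the human and lies close to them receives a larger weight, so its feature contributes more to the enhanced human feature passed on to classification.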

Installation

This project is developed on Python 3.6 with the MXNet framework.

Python Packages

mxnet==1.6.0
gluoncv==0.7.0 [optional]
pycocotools==2.0 [optional]
numpy==1.15.4
matplotlib==2.2.2
tqdm==4.23.4

The optional packages are only required if you want to detect object bounding boxes for your own dataset.
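
A quick way to compare an environment against the pinned versions above is sketched below. The helper names (`parse_pins`, `installed_version`, `check_pins`) are hypothetical and not part of this repository; the sketch assumes the distribution names match the package names listed, and it treats a missing optional package as acceptable.

```python
PINS = """
mxnet==1.6.0
gluoncv==0.7.0 [optional]
pycocotools==2.0 [optional]
numpy==1.15.4
matplotlib==2.2.2
tqdm==4.23.4
"""


def parse_pins(text):
    """Parse 'name==version [optional]' lines into {name: (version, optional)}."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        optional = line.endswith("[optional]")
        spec = line.replace("[optional]", "").strip()
        name, version = spec.split("==")
        pins[name] = (version, optional)
    return pins


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        import pkg_resources  # setuptools fallback for Python 3.6 / 3.7
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def check_pins(pins, installed):
    """Return (name, wanted, found) for every package that does not match its pin."""
    problems = []
    for name, (wanted, optional) in pins.items():
        found = installed.get(name)
        if found is None and optional:
            continue  # optional packages may be missing
        if found != wanted:
            problems.append((name, wanted, found))
    return problems


if __name__ == "__main__":
    pins = parse_pins(PINS)
    installed = {name: installed_version(name) for name in pins}
    for name, wanted, found in check_pins(pins, installed):
        print("{}: expected {}, found {}".format(name, wanted, found))
```

An empty report means every required package matches its pin. Newer versions may still work, but that is untested here.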

Datasets

Name        | Dataset Download Link | Detected Object BBoxes
VOC 2012    | Dataset Website       | Dropbox OR Baidu Net Disk (PassCode: z53z)
Stanford 40 | Dataset Website       | Dropbox OR Baidu Net Disk (PassCode: z53z)
HICO        | Dataset Website       | Dropbox OR Baidu Net Disk (PassCode: z53z)

Note: For ease of use, we provide the object bounding boxes used in our paper, which were detected by Faster R-CNN.

  1. VOC 2012 dataset:
    1.1 Download the dataset and extract it to ~/Data/.
    1.2 Download the BBoxes and extract them to ~/Data/VOCdevkit/VOC2012/.

  2. Stanford 40 dataset:
    2.1 Download the dataset and extract it to ~/Data/.
    2.2 Download the BBoxes and extract them to ~/Data/Stanford40/.

  3. HICO dataset:
    3.1 Download the dataset and extract it to ~/Data/.
    3.2 Move all images in ~/Data/hico/images/train2015 and ~/Data/hico/images/test2015 into their parent folder ~/Data/hico/images/.
    3.3 Download the BBoxes and extract them to ~/Data/hico/.
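
Step 3.2 can be scripted. The sketch below is one possible way to do it and is not part of this repository: `flatten_hico_images` is a hypothetical helper that moves every file from the two split folders into their parent folder, and it refuses to overwrite a file that already exists.

```python
import shutil
from pathlib import Path


def flatten_hico_images(images_dir, splits=("train2015", "test2015")):
    """Move all files from each split folder into images_dir; return the count moved."""
    images_dir = Path(images_dir).expanduser()
    moved = 0
    for split in splits:
        split_dir = images_dir / split
        if not split_dir.is_dir():
            continue  # nothing to do if the split folder is missing
        for src in sorted(split_dir.iterdir()):
            if not src.is_file():
                continue
            dst = images_dir / src.name
            if dst.exists():
                raise FileExistsError("{} already exists; refusing to overwrite".format(dst))
            shutil.move(str(src), str(dst))
            moved += 1
    return moved
```

Calling `flatten_hico_images("~/Data/hico/images")` would apply it to the path used in the steps above. The now-empty split folders are left in place.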

Training

  1. Download the pretrained ResNet-50/101 weights and put them into ~/.mxnet/models/.
  2. Execute the shell script in ./experiments/[dataset]/, such as:
    sh ./experiments/VOC2012/train.sh
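
Before launching a run, it can help to confirm that the weights are in place and to build the script path programmatically. The following is a small sketch with hypothetical helper names; it assumes the weight files use the `.params` extension and that each dataset folder under ./experiments/ contains a train.sh and a test.sh, as the VOC2012 example suggests. Folder names other than VOC2012 are not confirmed here.

```python
from pathlib import Path


def find_pretrained_weights(models_dir="~/.mxnet/models"):
    """List the *.params files found in models_dir (empty list if it is missing)."""
    models_dir = Path(models_dir).expanduser()
    if not models_dir.is_dir():
        return []
    return sorted(p.name for p in models_dir.glob("*.params"))


def experiment_command(dataset, stage="train", root="./experiments"):
    """Build the shell command for a dataset folder and stage ('train' or 'test')."""
    if stage not in ("train", "test"):
        raise ValueError("stage must be 'train' or 'test', got {!r}".format(stage))
    return ["sh", "{}/{}/{}.sh".format(root, dataset, stage)]
```

The returned list can be passed to `subprocess.run` from the repository root once `find_pretrained_weights()` reports the expected files.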
    

Evaluation

  1. Download the pretrained models or prepare your own trained models.
  2. Set the parameter file path in the test.sh script under ./experiments/[dataset]/.
  3. Execute the testing script, such as:
    sh ./experiments/VOC2012/test.sh
    

Models & Results

Pretrained Models: Dropbox OR Baidu Net Disk (PassCode: kjok)

File Name                                    | Dataset     | Split | Backbone   | mAP(%)
horelation_resnet50_v1d_voc_2012.params      | VOC 2012    | Val   | ResNet-50  | 91.9
horelation_resnet50_v1d_stanford_40.params   | Stanford 40 | Test  | ResNet-50  | 93.1
horelation_resnet101_v1d_stanford_40.params  | Stanford 40 | Test  | ResNet-101 | 94.6
horelation_resnet50_v1d_hico.params          | HICO        | Test  | ResNet-50  | 42.6
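
For readers unfamiliar with the mAP column, the sketch below shows one common way to compute it: average precision per class from ranked scores, then the mean over classes, reported in percent. This is an illustration only and not the evaluation code of this repository; it uses non-interpolated average precision, whereas each benchmark defines its own official protocol, so the numbers it produces may differ slightly from the official ones.

```python
def average_precision(scores, labels):
    """Non-interpolated AP: mean of precision at the rank of each positive sample."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    num_pos = sum(1 for _, label in ranked if label)
    if num_pos == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / num_pos


def mean_average_precision(per_class):
    """per_class maps a class name to (scores, labels); returns mAP in percent."""
    aps = [average_precision(scores, labels) for scores, labels in per_class.values()]
    return 100.0 * sum(aps) / len(aps)
```

For example, with scores [0.9, 0.8, 0.3] and labels [1, 0, 1], the positives sit at ranks 1 and 3, so the AP is (1/1 + 2/3) / 2.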

Citation

If you find our code or models helpful in your research, please cite our paper:

@INPROCEEDINGS{horelation,
author={Wentao Ma and Shuang Liang},
booktitle={2020 IEEE International Conference on Multimedia and Expo (ICME)},
title={Human-Object Relation Network For Action Recognition In Still Images},
year={2020}}

Disclaimer

This repository uses code from MXNet and GluonCV.


human-object-relation-network's Issues

Unable to access pan.baidu

Thank you for releasing the code. I am unable to access the model weights and detection files that are stored on pan.baidu. How can I access these files?
