bcnet's Introduction

Temporal Action Proposal Generation with Background Constraint (AAAI 2022)

This repository contains the source code of BCNet (Background Constraint Network).

TODOs

  • add Inference code
  • add Training code

Overview

Temporal action proposal generation (TAPG) is a challenging task that aims to locate action instances, with their temporal boundaries, in untrimmed videos. To evaluate the confidence of proposals, existing works typically predict an action score for each proposal, supervised by the temporal Intersection-over-Union (tIoU) between the proposal and the ground truth. In this paper, we propose a general auxiliary Background Constraint idea that further suppresses low-quality proposals by using the background prediction score to restrict the confidence of proposals. The Background Constraint concept can thus be plugged into existing TAPG methods (e.g., BMN, G-TAD). From this perspective, we propose the Background Constraint Network (BCNet) to further exploit the rich information of action and background. Specifically, we introduce an Action-Background Interaction module for reliable confidence evaluation, which models the inconsistency between action and background by attention mechanisms at the frame and clip levels. Extensive experiments are conducted on two popular benchmarks, i.e., ActivityNet-1.3 and THUMOS14. The results demonstrate that our method outperforms state-of-the-art methods. Equipped with an existing action classifier, our method also achieves remarkable performance on the temporal action localization task.
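The Background Constraint idea can be illustrated with a minimal sketch. It assumes, purely for illustration, that a proposal's action score is multiplied by one minus its background score. The function name `constrain_confidence`, the example proposals, and the combination rule itself are assumptions, and are not necessarily the fusion used in BCNet.

```python
def constrain_confidence(action_score: float, background_score: float) -> float:
    """Suppress a proposal's confidence using its background score.

    Assumption: both scores lie in [0, 1] and are combined as
    action * (1 - background). This rule is illustrative only.
    """
    if not (0.0 <= action_score <= 1.0 and 0.0 <= background_score <= 1.0):
        raise ValueError("scores must lie in [0, 1]")
    return action_score * (1.0 - background_score)


# Two hypothetical proposals with the same action score but
# different background scores.
proposals = [
    {"segment": (10.0, 30.0), "action": 0.8, "background": 0.7},
    {"segment": (2.0, 7.5), "action": 0.8, "background": 0.1},
]
for p in proposals:
    p["confidence"] = constrain_confidence(p["action"], p["background"])

# Rank proposals by the constrained confidence, highest first.
ranked = sorted(proposals, key=lambda p: p["confidence"], reverse=True)
```

Under this assumed rule, the proposal with the high background score drops in the ranking even though its action score is unchanged, which is the suppression effect the overview describes.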

Installation

  • Create conda environment

    conda create -n bcnet python=3.7 -y
    source activate bcnet
  • Requirements

    git clone https://github.com/happy-lifi/BCNet.git
    cd BCNet
    pip install -r requirements.txt

Data setup

We use the features provided by G-TAD. To reproduce the results on THUMOS14 without further changes:

  • Download the data from Google Drive or OneDrive.

  • Place it into a folder named TSN_pretrain_avepool_allfrms_hdf5 inside data/thumos_feature.

Alternatively, if the script accepts a --feature_path argument, you can pass the folder containing the HDF5 files directly.
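As a rough sketch of how such a path option might be handled, the snippet below resolves the feature folder from a `--feature_path` argument and falls back to the default location described above. The helper names `parse_feature_path` and `check_feature_dir` are hypothetical and are not taken from the repository's scripts.

```python
import argparse
from pathlib import Path

# Default location described in the data setup steps.
DEFAULT_FEATURE_PATH = Path("data/thumos_feature/TSN_pretrain_avepool_allfrms_hdf5")


def parse_feature_path(argv=None) -> Path:
    """Return the feature folder given on the command line, or the default."""
    parser = argparse.ArgumentParser(description="Locate THUMOS14 features")
    parser.add_argument(
        "--feature_path",
        type=Path,
        default=DEFAULT_FEATURE_PATH,
        help="folder containing the HDF5 feature files",
    )
    # Ignore unrelated options so the helper can sit next to other flags.
    args, _unknown = parser.parse_known_args(argv)
    return args.feature_path


def check_feature_dir(path: Path) -> bool:
    """True if the folder exists and contains at least one entry."""
    return path.is_dir() and any(path.iterdir())
```

For example, `parse_feature_path(["--feature_path", "/mnt/features"])` returns that folder, while `parse_feature_path([])` returns the default path. Calling `check_feature_dir` before training gives an early failure if the download was placed elsewhere.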

Training

sh ./run/bcn_train.sh

Testing

sh ./run/bcn_infer.sh

THUMOS Results

Method  Feature  mAP@0.3  mAP@0.4  mAP@0.5  mAP@0.6  mAP@0.7  checkpoint
BCNet   TSN      67.4     61.0     52.5     42.4     29.9     [Google Drive]

Acknowledgement

We especially thank the contributors of BMN and G-TAD for providing helpful code.

Citing

@article{yang2021temporal,
  title={Temporal Action Proposal Generation with Background Constraint},
  author={Yang, Haosen and Wu, Wenhao and Wang, Lining and Jin, Sheng and Xia, Boyang and Yao, Hongxun and Huang, Hujie},
  journal={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2022}
}

Contact

For any questions, please file an issue or contact:

Haosen Yang: [email protected]
Wenhao Wu: [email protected]

bcnet's Issues

About the background label setting

In the paper, the label of the background score is generated using the temporal Intersection-over-Anchor, while the code uses the maximum IoU. Why?
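To make the two quantities in this question concrete, here is a minimal sketch of the commonly used definitions: temporal IoU divides the overlap by the union of the two segments, while temporal Intersection-over-Anchor divides it by the anchor's own length. The function names and the example segments are hypothetical, and whether these definitions match the repository's label-generation code has not been verified.

```python
def _intersection(seg_a, seg_b):
    """Length of the temporal overlap between two (start, end) segments."""
    return max(0.0, min(seg_a[1], seg_b[1]) - max(seg_a[0], seg_b[0]))


def temporal_iou(anchor, gt):
    """Overlap divided by the union of the two segments."""
    inter = _intersection(anchor, gt)
    union = (anchor[1] - anchor[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def temporal_ioa(anchor, gt):
    """Overlap divided by the length of the anchor only."""
    inter = _intersection(anchor, gt)
    length = anchor[1] - anchor[0]
    return inter / length if length > 0 else 0.0


def max_over_ground_truths(metric, anchor, gts):
    """Best value of `metric` between one anchor and all ground truths."""
    return max((metric(anchor, gt) for gt in gts), default=0.0)


# A short anchor fully inside a long ground-truth action.
anchor = (2.0, 4.0)
ground_truths = [(0.0, 10.0), (20.0, 25.0)]
max_iou = max_over_ground_truths(temporal_iou, anchor, ground_truths)
max_ioa = max_over_ground_truths(temporal_ioa, anchor, ground_truths)
```

In this example the anchor lies entirely inside an action, so its Intersection-over-Anchor is 1.0 while its maximum IoU is only 0.2. The two choices can therefore produce very different labels for short anchors.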
