
Proposal-based Multiple Instance Learning for Weakly-supervised Temporal Action Localization (CVPR 2023)

Huan Ren, Wenfei Yang, Tianzhu Zhang, Yongdong Zhang (USTC)


Requirements

  • Python 3.8
  • Pytorch 1.8.0
  • CUDA 11.1

Required packages are listed in requirements.txt. You can install them by running:

conda create -n P-MIL python=3.8
conda activate P-MIL
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge
pip3 install -r requirements.txt

Data Preparation

  1. Prepare THUMOS14 dataset.

    • We recommend using features and annotations provided by W-TALC or CO2-Net.
    • You can also download them from Google Drive.
  2. Prepare proposals generated from pre-trained S-MIL model.

    • We recommend using the official code of an S-MIL method (such as CO2-Net) to generate proposals.
    • Alternatively, you can download the proposals used in our paper from Google Drive.
  3. Place the features and annotations inside a data/Thumos14reduced/ folder and proposals inside a proposals folder. Make sure the data structure is as below.

    ├── data
        └── Thumos14reduced
            ├── Thumos14reduced-I3D-JOINTFeatures.npy
            └── Thumos14reduced-Annotations
                ├── Ambiguous_test.txt
                ├── classlist.npy
                ├── duration.npy
                ├── extracted_fps.npy
                ├── labels_all.npy
                ├── labels.npy
                ├── original_fps.npy
                ├── segments.npy
                ├── subset.npy
                └── videoname.npy
    ├── proposals
        ├── detection_result_base_test.json
        └── detection_result_base_train.json
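For reference, the proposal files are ActivityNet-style detection JSON as exported by S-MIL methods such as CO2-Net. The sketch below illustrates that layout with a hypothetical video and field names; the exact keys in detection_result_base_train.json and detection_result_base_test.json are an assumption, so check them against the downloaded files.

```python
import json

# Hypothetical example of the ActivityNet-style detection format;
# the "label", "score", and "segment" keys are assumptions to verify
# against the actual detection_result_base_*.json files.
example = {
    "results": {
        "video_test_0000004": [
            {"label": "GolfSwing", "score": 0.87, "segment": [12.4, 18.9]},
            {"label": "GolfSwing", "score": 0.42, "segment": [30.1, 33.5]},
        ]
    }
}

# Round-trip through JSON, then iterate over per-video proposals.
proposals = json.loads(json.dumps(example))["results"]
for video, detections in proposals.items():
    for det in detections:
        start, end = det["segment"]
        print(video, det["label"], round(end - start, 1))
```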

Running

Training

CUDA_VISIBLE_DEVICES=0 python main.py --run_type train

Testing

The pre-trained model can be downloaded from Google Drive and should be placed inside a checkpoints folder.

CUDA_VISIBLE_DEVICES=0 python main.py --run_type test --pretrained_ckpt checkpoints/best_model.pkl

Results

The experimental results on THUMOS14 are as below. Note that the performance of the provided checkpoints differs slightly from the original paper!

Method \ mAP@IoU (%)   0.1    0.2    0.3    0.4    0.5    0.6    0.7    AVG
P-MIL                  70.8   66.5   57.8   48.6   39.8   27.0   14.3   46.4
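As a quick sanity check on the table, the AVG column is the arithmetic mean of the seven per-IoU mAP values:

```python
# mAP@IoU thresholds 0.1 through 0.7 for P-MIL on THUMOS14 (from the table above).
maps = [70.8, 66.5, 57.8, 48.6, 39.8, 27.0, 14.3]
avg = sum(maps) / len(maps)
print(round(avg, 1))  # -> 46.4, matching the AVG column
```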

Citation

@InProceedings{Ren_2023_CVPR,
    author    = {Ren, Huan and Yang, Wenfei and Zhang, Tianzhu and Zhang, Yongdong},
    title     = {Proposal-Based Multiple Instance Learning for Weakly-Supervised Temporal Action Localization},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
    pages     = {2394-2404}
}

Acknowledgement

We referenced the following repositories for the code.


Issues

Abnormal results

Hello author, during training I found that the mAP values are all very small. Is it normal that they need to be multiplied by 100?

proposals generation

Hello author, first of all, thank you for your excellent work. Can you tell me how to obtain the JSON result file for proposals, similar to the template you provided for CO2-Net?

Feature Extraction

Hi, congrats for the work.

Can you share a feature extraction script for testing our own videos?

Missing files for the ActivityNet v1.3 dataset

Hello!
In your dataset.py, I saw the -I3D-JOINTFeatures.npy file:

self.path_to_features = os.path.join(args.dataset_root, self.dataset_name + "-I3D-JOINTFeatures.npy")

Could you provide the ActivityNet1.3-I3D-JOINTFeatures.npy file and the files in the ActivityNet1.3-Annotation folder?

Alternatively, how can I build a combined ActivityNet1.3-I3D-JOINTFeatures.npy file myself, along with the classlist.npy, labels_all.npy, and other files in the ActivityNet1.3-Annotation folder?

Sorry for the intrusion, and thank you!
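The question above can be sketched in code: given an ActivityNet-style annotation dict, the per-dataset arrays (videoname.npy, subset.npy, duration.npy, classlist.npy, ...) can be assembled as NumPy object arrays. The annotation dict and the exact array layout that dataset.py expects are assumptions here; compare against the provided THUMOS14 .npy files before relying on this.

```python
import numpy as np

# Hypothetical ActivityNet-style annotation dict (video names, subsets,
# durations, and labeled segments are made up for illustration).
annotations = {
    "v_abc123": {"subset": "training", "duration": 120.5,
                 "annotations": [{"label": "Skiing", "segment": [4.0, 9.5]}]},
    "v_def456": {"subset": "validation", "duration": 63.2,
                 "annotations": [{"label": "Skiing", "segment": [1.0, 7.8]}]},
}

# Assemble one array per annotation file, keeping video order consistent.
videoname = np.array(list(annotations.keys()))
subset = np.array([v["subset"] for v in annotations.values()])
duration = np.array([v["duration"] for v in annotations.values()])
labels = np.array([[a["label"] for a in v["annotations"]]
                   for v in annotations.values()], dtype=object)
classlist = np.array(sorted({a["label"] for v in annotations.values()
                             for a in v["annotations"]}))

# Each array would then be saved with np.save, e.g.:
# np.save("ActivityNet1.3-Annotations/videoname.npy", videoname)
print(list(videoname), list(classlist))
```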
