MARS: Motion-Augmented RGB Stream for Action Recognition

By Nieves Crasto, Philippe Weinzaepfel, Karteek Alahari and Cordelia Schmid

MARS is a strategy to learn a stream that takes only RGB frames as input but leverages both appearance and motion information from them. This is achieved by training the network to minimize the distance between its features and those of the Flow stream, in addition to the cross-entropy loss for action recognition. For more details, please refer to our CVPR 2019 paper and our website.
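
In code, this objective combines a feature-matching term with the usual classification term. The PyTorch lines below are only a minimal sketch of that combination, not the released training code: the function name, the MSE choice for the feature loss, and the weight alpha are illustrative assumptions.

import torch.nn.functional as F

def mars_objective(rgb_features, rgb_logits, flow_features, labels, alpha=1.0):
    # Cross-entropy on the RGB stream's class predictions (recognition loss).
    ce = F.cross_entropy(rgb_logits, labels)
    # Feature-matching loss towards the Flow-stream features; detach()
    # keeps the Flow teacher fixed while the RGB stream is trained.
    match = F.mse_loss(rgb_features, flow_features.detach())
    return ce + alpha * match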

We release the testing code along with trained models.

Citing MARS

@inproceedings{crasto2019mars,
  title={{MARS: Motion-Augmented RGB Stream for Action Recognition}},
  author={Crasto, Nieves and Weinzaepfel, Philippe and Alahari, Karteek and Schmid, Cordelia},
  booktitle={CVPR},
  year={2019}
}

Contents

  1. Requirements
  2. Datasets
  3. Models
  4. Testing

Requirements

  • PyTorch and torchvision

    conda install pytorch torchvision cudatoolkit=9.0 -c pytorch

  • ffmpeg version 3.2.4

  • OpenCV with GPU support (we do not provide support for compiling this part)

  • Directory tree

   dataset/
       HMDB51/ 
           ../(dirs of class names)
               ../(dirs of video names)
       HMDB51_labels/
   results/
       test.txt
   trained_models/
       HMDB51/
           ../(.pth files)
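
The top-level layout above can be prepared with a few lines of Python. This is only a convenience sketch; the class and video sub-directories are produced later by the extraction scripts.

import os

# Create the top-level directories shown in the tree above.
for d in ["dataset/HMDB51", "dataset/HMDB51_labels",
          "results", "trained_models/HMDB51"]:
    os.makedirs(d, exist_ok=True)

# results/test.txt appears in the tree above; create it empty if missing.
open("results/test.txt", "a").close()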

Datasets

  • To extract frames from videos

    python utils1/extract_frames.py path_to_video_files path_to_extracted_frames start_class end_class

  • To extract optical flows + frames from videos
    • Build
    export OPENCV=path_where_opencv_is_installed
    
    g++ -std=c++11 tvl1_videoframes.cpp -o tvl1_videoframes -I${OPENCV}include/opencv4/ -L${OPENCV}lib64 -lopencv_objdetect -lopencv_features2d -lopencv_imgproc -lopencv_highgui -lopencv_core -lopencv_imgcodecs -lopencv_cudaoptflow -lopencv_cudaarithm
    
    python utils1/extract_frames_flows.py path_to_video_files path_to_extracted_flows_frames start_class end_class gpu_id
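
If the CUDA build above is not an option, the per-frame-pair TVL1 flow can also be computed (more slowly) on the CPU with OpenCV's contrib bindings. The sketch below is an assumption about the processing done by tvl1_videoframes.cpp, not its exact settings; the clipping bound and uint8 quantisation follow common practice, and it requires the opencv-contrib-python package.

import cv2
import numpy as np

def tvl1_flow(prev_frame, next_frame, bound=20.0):
    # CPU TVL1 optical flow between two consecutive BGR frames.
    prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
    next_gray = cv2.cvtColor(next_frame, cv2.COLOR_BGR2GRAY)
    tvl1 = cv2.optflow.DualTVL1OpticalFlow_create()
    flow = tvl1.calc(prev_gray, next_gray, None)            # HxWx2 float32
    flow = np.clip(flow, -bound, bound)                     # clip large displacements
    flow = ((flow + bound) / (2 * bound) * 255).astype(np.uint8)
    return flow[..., 0], flow[..., 1]                       # flow_x, flow_y images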
    

Models

Trained models can be found here. The names of the models are in the form of

stream_dataset_frames.pth     

RGB_Kinetics_16f.pth indicates --modality RGB --dataset Kinetics --sample_duration 16

For HMDB51 and UCF101, we have only provided trained models for the first split.
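
The naming convention can be made explicit with a small helper; this is only an illustrative sketch (for MARS models the first token names the trained stream, even though the input at test time is RGB).

def parse_model_name(filename):
    # Map "stream_dataset_framesf.pth" to the corresponding testing flags,
    # e.g. "RGB_Kinetics_16f.pth" -> modality RGB, dataset Kinetics, 16 frames.
    stream, dataset, frames = filename.rsplit(".", 1)[0].split("_")
    return {"modality": stream,
            "dataset": dataset,
            "sample_duration": int(frames.rstrip("f"))}

# parse_model_name("RGB_Kinetics_16f.pth")
# -> {'modality': 'RGB', 'dataset': 'Kinetics', 'sample_duration': 16}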

Testing script

For the RGB stream:

python test_single_stream.py --batch_size 1 --n_classes 51 --model resnext --model_depth 101 \
--log 0 --dataset HMDB51 --modality RGB --sample_duration 16 --split 1 --only_RGB  \
--resume_path1 "trained_models/HMDB51/RGB_HMDB51_16f.pth" \
--frame_dir "dataset/HMDB51" \
--annotation_path "dataset/HMDB51_labels" \
--result_path "results/"

For the Flow stream:

python test_single_stream.py --batch_size 1 --n_classes 51 --model resnext --model_depth 101 \
--log 0 --dataset HMDB51 --modality Flow --sample_duration 16 --split 1  \
--resume_path1 "trained_models/HMDB51/Flow_HMDB51_16f.pth" \
--frame_dir "dataset/HMDB51" \
--annotation_path "dataset/HMDB51_labels" \
--result_path "results/"

For the single-stream MARS:

python test_single_stream.py --batch_size 1 --n_classes 51 --model resnext --model_depth 101 \
--log 0 --dataset HMDB51 --modality RGB --sample_duration 16 --split 1 --only_RGB  \
--resume_path1 "trained_models/HMDB51/MARS_HMDB51_16f.pth" \
--frame_dir "dataset/HMDB51" \
--annotation_path "dataset/HMDB51_labels" \
--result_path "results/"

For the two-stream RGB+MARS:

python test_two_stream.py --batch_size 1 --n_classes 51 --model resnext --model_depth 101 \
--log 0 --dataset HMDB51 --modality RGB --sample_duration 16 --split 1 --only_RGB  \
--resume_path1 "trained_models/HMDB51/RGB_HMDB51_16f.pth" \
--resume_path2 "trained_models/HMDB51/MARS_HMDB51_16f.pth" \
--frame_dir "dataset/HMDB51" \
--annotation_path "dataset/HMDB51_labels" \
--result_path "results/"

For the two-stream RGB+Flow:

python test_two_stream.py --batch_size 1 --n_classes 51 --model resnext --model_depth 101 \
--log 0 --dataset HMDB51 --modality RGB_Flow --sample_duration 16 --split 1 \
--resume_path1 "trained_models/HMDB51/RGB_HMDB51_16f.pth" \
--resume_path2 "trained_models/HMDB51/Flow_HMDB51_16f.pth" \
--frame_dir "dataset/HMDB51/HMDB51_frames/" \
--annotation_path "dataset/HMDB51_labels" \
--result_path "results/"
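
For reference, two-stream testing typically averages the class scores of the two networks. The snippet below is a minimal sketch of such late fusion under that assumption; it is not a copy of what test_two_stream.py does internally.

import torch.nn.functional as F

def fuse_two_streams(logits_stream1, logits_stream2):
    # Average per-clip softmax scores of the two streams
    # (shape: num_clips x num_classes for each input).
    clip_scores = (F.softmax(logits_stream1, dim=1) +
                   F.softmax(logits_stream2, dim=1)) / 2.0
    # Average over the clips of a video and return the predicted class index.
    video_scores = clip_scores.mean(dim=0)
    return video_scores.argmax().item()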
