afft's Introduction

Anticipative Feature Fusion Transformer for Multi-Modal Action Anticipation (WACV 2023)

This repository contains the official source code and data for our AFFT paper. If you find our code or paper useful, please consider citing:

Z. Zhong, D. Schneider, M. Voit, R. Stiefelhagen and J. Beyerer. Anticipative Feature Fusion Transformer for Multi-Modal Action Anticipation. In WACV, 2023.

@InProceedings{Zhong_2023_WACV,
    author    = {Zhong, Zeyun and Schneider, David and Voit, Michael and Stiefelhagen, Rainer and Beyerer, J\"urgen},
    title     = {Anticipative Feature Fusion Transformer for Multi-Modal Action Anticipation},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2023},
    pages     = {6068-6077}
}

Installation

First clone the repo and set up the required packages in a conda environment.

$ git clone https://github.com/zeyun-zhong/AFFT.git
$ cd AFFT
$ conda env create -f environment.yaml python=3.7
$ conda activate afft

Download Data

Dataset features

AFFT operates on pre-extracted features, so you need to download them first. The TSN features for EK100 and for EGTEA Gaze+ can be downloaded from RULSTM. The RGB-Swin features are available here and the audio features are available here.

Please make sure that your data structure follows the structure shown below. Note that dataset_root_dir in config.yaml should be changed to your specific data path.

Dataset root path (e.g., /home/user/datasets)
├── epickitchens100
│   └── features
│       │── rgb
│       │   └── data.mdb
│       │── rgb_omnivore
│       │   └── data.mdb
│       │── obj
│       │   └── data.mdb
│       │── audio
│       │   └── data.mdb
│       └── flow
│           └── data.mdb
└── egtea
    └── features
        │── TSN-C_3_egtea_action_CE_s1_rgb_model_best_fcfull_hd
        │   └── data.mdb
        │── TSN-C_3_egtea_action_CE_s1_flow_model_best_fcfull_hd
        │   └── data.mdb
        │── TSN-C_3_egtea_action_CE_s2_rgb_model_best_fcfull_hd
        │   └── data.mdb
        │── TSN-C_3_egtea_action_CE_s2_flow_model_best_fcfull_hd
        │   └── data.mdb
        │── TSN-C_3_egtea_action_CE_s3_rgb_model_best_fcfull_hd
        │   └── data.mdb
        └── TSN-C_3_egtea_action_CE_s3_flow_model_best_fcfull_hd
            └── data.mdb

If you organize the data differently, you will need to edit rulstm_feats_dir in EK100-common and EGTEA-common.
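Before training, it can help to confirm that the feature folders are where the tree above expects them. The following is a minimal sketch, not part of the repository: the helper name find_missing_features and the example root path are hypothetical, and the folder names are copied from the tree above.

```python
from pathlib import Path

# Expected feature folders, copied from the directory tree in this README.
EXPECTED_FEATURE_DIRS = {
    "epickitchens100": ["rgb", "rgb_omnivore", "obj", "audio", "flow"],
    "egtea": [
        f"TSN-C_3_egtea_action_CE_s{split}_{modality}_model_best_fcfull_hd"
        for split in (1, 2, 3)
        for modality in ("rgb", "flow")
    ],
}


def find_missing_features(dataset_root_dir):
    """Return the expected data.mdb paths that do not exist under the root.

    Hypothetical helper for illustration; it is not part of the AFFT code.
    """
    root = Path(dataset_root_dir)
    missing = []
    for dataset, feature_dirs in EXPECTED_FEATURE_DIRS.items():
        for feature_dir in feature_dirs:
            mdb_path = root / dataset / "features" / feature_dir / "data.mdb"
            if not mdb_path.is_file():
                missing.append(mdb_path)
    return missing


if __name__ == "__main__":
    # Hypothetical path; use the value you set for dataset_root_dir in config.yaml.
    for path in find_missing_features("/home/user/datasets"):
        print(f"missing: {path}")
```

An empty result means every data.mdb file listed in the tree was found. If you only use one dataset or a subset of modalities, the entries for the others will be reported as missing and can be ignored.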

Model Zoo

Dataset  Modalities        Performance (Actions)  Config                                        Model
EK100    R-Swin, O, AU, F  18.5 (MT5R)            expts/01_SA-Fuser_ek100_val_Swin.txt          link
EK100    R-TSN, O, AU, F   17.0 (MT5R)            expts/01_SA-Fuser_ek100_val_TSN.txt           link
EK100    R-TSN, O, F       16.4 (MT5R)            expts/01_SA-Fuser_ek100_val_TSN_wo_audio.txt  link
EGTEA    RGB-TSN, Flow     42.5 (Top-1)           expts/02_ek100_avt_tsn.txt                    link
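
The EK100 numbers above are reported as MT5R, read here as mean top-5 recall: recall is computed per action class and then averaged over classes, so rare classes count as much as frequent ones. The sketch below illustrates that computation in plain Python. The function name and input format are assumptions for illustration, and the official evaluation code may differ in details such as which classes are included.

```python
from collections import defaultdict


def mean_top_k_recall(scores, labels, k=5):
    """Class-mean top-k recall, in percent.

    scores: list of per-sample score lists (one score per class).
    labels: list of ground-truth class indices, same length as scores.
    Illustrative helper; not taken from the AFFT code base.
    """
    if not labels:
        raise ValueError("at least one sample is required")
    hits = defaultdict(int)    # per class: samples whose label is in the top k
    totals = defaultdict(int)  # per class: number of samples
    for sample_scores, label in zip(scores, labels):
        # Rank class indices by descending score and keep the k best.
        ranked = sorted(
            range(len(sample_scores)),
            key=lambda c: sample_scores[c],
            reverse=True,
        )
        totals[label] += 1
        if label in ranked[:k]:
            hits[label] += 1
    # Average the per-class recalls over the classes present in the labels.
    recalls = [hits[c] / totals[c] for c in totals]
    return 100.0 * sum(recalls) / len(recalls)
```

With three samples of class 0 that are all ranked correctly and one sample of class 1 that is missed, this returns 50.0 for k=1, whereas plain accuracy would be 75%.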

Training

Recall that dataset_root_dir in config.yaml should be changed to your specific path.

EpicKitchens-100

python run.py -c expts/01_SA-Fuser_ek100_train.txt --mode train --nproc_per_node 2

EGTEA Gaze+

python run.py -c expts/06_SA-Fuser_egtea_train.txt --mode train --nproc_per_node 2

Validation

EpicKitchens-100

python run.py -c expts/01_SA-Fuser_ek100_val_TSN_wo_audio.txt --mode test --nproc_per_node 1

EGTEA Gaze+

python run.py -c expts/06_SA-Fuser_egtea_val.txt --mode test --nproc_per_node 1

Test / Challenge (EK100)

# save logits
python run.py -c expts/01_SA-Fuser_ek100_test_TSN_wo_audio.txt --mode test --nproc_per_node 1

# generate test / challenge file
python challenge.py --prefix_h5 test --models fusion_ek100_tsn_wo_audio_4h_18s --weights 1.

License

This codebase is released under the license terms specified in the LICENSE file. Any imported libraries, datasets or other code follow the license terms set by their respective authors.

Acknowledgements

Many thanks to Rohit Girdhar and Antonino Furnari for providing their code and data.

afft's People

Contributors

simplexsigil, zeyun-zhong

afft's Issues

Issues with training

Hi, I'm trying to recreate your results, however I'm having some issues with training.

When trying to run the TSN ek100 training, I get this:

python run.py -c expts/00_RGB_TSN_ek100_train.txt --mode train -n 1 

hydra.errors.MissingConfigException: In 'config': Could not find 'model/fuser/cmfuser'

Available options in 'model/fuser':
        CA-Fuser
        MATT
        SA-Fuser
        SA-Fuser_wo_token
        T-SA-Fuser
Config search path:
        provider=hydra, path=pkg://hydra.conf
        provider=main, path=file:///media/lucas/Linux SSD/AFFT/conf
        provider=schema, path=structured://
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 1921100) of binary: /home/lucas/anaconda3/envs/afft/bin/python

When trying to run any of the other options I get an error similar to this:

python run.py -c expts/01_SA-Fuser_ek100_train.txt --mode train --nproc_per_node 2

hydra.errors.MissingConfigException: In 'config': Could not find 'model/backbone/identity'

Config search path:
        provider=hydra, path=pkg://hydra.conf
        provider=main, path=file:///media/lucas/Linux SSD/AFFT/conf
        provider=schema, path=structured://
  File "/home/lucas/anaconda3/envs/afft/lib/python3.7/site-packages/hydra/_internal/defaults_list.py", line 485, in _create_defaults_tree_impl
    config_not_found_error(repo=repo, tree=root)
  File "/home/lucas/anaconda3/envs/afft/lib/python3.7/site-packages/hydra/_internal/defaults_list.py", line 804, in config_not_found_error
    options=options,
hydra.errors.MissingConfigException: In 'config': Could not find 'model/backbone/identity'

Config search path:
        provider=hydra, path=pkg://hydra.conf
        provider=main, path=file:///media/lucas/Linux SSD/AFFT/conf
        provider=schema, path=structured://
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 1923711) of binary: /home/lucas/anaconda3/envs/afft/bin/python

Do you know why this might be happening? Apologies if I'm missing something obvious, and thanks in advance.
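
One way to narrow down a MissingConfigException like the ones above is to compare the option named in the error with the YAML files that actually exist in the config directory, since Hydra resolves file-based config group options from files in the group's folder. The following is a minimal sketch under that assumption. The helper names are hypothetical, and the conf path must be adapted to your checkout.

```python
from pathlib import Path


def list_config_options(conf_dir, group):
    """Return the option names (YAML file stems) found under conf_dir/group.

    Hypothetical helper for illustration; it only inspects files on disk.
    """
    group_dir = Path(conf_dir) / group
    if not group_dir.is_dir():
        return []
    return sorted(
        p.stem for p in group_dir.iterdir() if p.suffix in (".yaml", ".yml")
    )


def option_exists(conf_dir, group, option):
    """Check whether an option such as 'identity' exists in a config group."""
    return option in list_config_options(conf_dir, group)


if __name__ == "__main__":
    # Path and group/option names are taken from the error messages above.
    conf_dir = "conf"
    for group, option in [("model/fuser", "cmfuser"), ("model/backbone", "identity")]:
        status = "found" if option_exists(conf_dir, group, option) else "missing"
        print(f"{group}/{option}: {status} (available: {list_config_options(conf_dir, group)})")
```

If an option is reported as missing, the experiment file references a config that is not present in the checked-out conf directory.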

Release date

Hi! Thanks for your great work!
I was wondering whether you plan to release the code in the near future?

I can't find the audio data for the EGTEA Gaze+ dataset

I admire the wonderful work you have all done. I'm interested in the two datasets you used, but the .mp4 files I downloaded from the link (https://cbs.ic.gatech.edu/fpv/#egtea_gaze_plus) don't contain audio. Is this because the EGTEA Gaze+ dataset doesn't contain audio? Where can I get the audio data from?
