Temporal Aggregate Representations for Long-Range Video Understanding

This repository provides the official PyTorch implementation for our papers:

F. Sener, D. Singhania and A. Yao, "Temporal Aggregate Representations for Long-Range Video Understanding", ECCV 2020 [paper]

F. Sener, D. Chatterjee and A. Yao, "Technical Report: Temporal Aggregate Representations", arXiv:2106.03152, 2021 [paper]

If you use the code/models hosted in this repository, please cite the following papers:

@inproceedings{sener2020temporal,
  title={Temporal aggregate representations for long-range video understanding},
  author={Sener, Fadime and Singhania, Dipika and Yao, Angela},
  booktitle={European Conference on Computer Vision},
  pages={154--171},
  year={2020},
  organization={Springer}
}

@article{sener2021technical,
  title={Technical Report: Temporal Aggregate Representations},
  author={Sener, Fadime and Chatterjee, Dibyadip and Yao, Angela},
  journal={arXiv preprint arXiv:2106.03152},
  year={2021}
}

Dependencies

Python3
PyTorch
Numpy, Pandas, PIL
lmdb, tqdm
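
Before running any of the commands below, it may help to confirm that the listed packages are importable. The following is a minimal sketch, not part of the repository. The import names (`torch` for PyTorch, `PIL` for the imaging library) are the usual ones, and the helper `missing_packages` is a hypothetical name introduced here for illustration.

```python
import importlib.util

# Import names for the dependencies listed above
# (assumed mapping: PyTorch -> torch, PIL -> PIL).
REQUIRED = ("torch", "numpy", "pandas", "PIL", "lmdb", "tqdm")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be located by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

The check only locates the packages; it does not verify versions or CUDA availability.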

Overview

This repository provides code to train, validate and test our models on the EPIC-KITCHENS-55 and EPIC-KITCHENS-100 datasets for the tasks of action anticipation and action recognition.

Features

Follow the instructions in the RU-LSTM repository to download the RGB, Flow and Obj features and the train/val/test splits, and place them in the data/ek55 or data/ek100 folder, depending on the dataset.

For ROI features, we take the union of the hand-object interaction bbox annotations provided by the authors of EPIC-KITCHENS-100 (link) as input and extract RGB features with TSN, as explained here.

Pretrained Models

Pretrained models are available only for the EPIC-KITCHENS-100 dataset, trained on its train split. They are provided in the folders models_anticipation and models_recognition.

Validation

To validate our model, run the following:

EPIC-KITCHENS-55

Action Anticipation

RGB: python main_anticipation.py --mode validate --path_to_data data/ek55 --path_to_models models_anticipation/ek55 --modality rgb --video_feat_dim 1024
Flow: python main_anticipation.py --mode validate --path_to_data data/ek55 --path_to_models models_anticipation/ek55 --modality flow --video_feat_dim 1024
Obj: python main_anticipation.py --mode validate --path_to_data data/ek55 --path_to_models models_anticipation/ek55 --modality obj --video_feat_dim 352
ROI: python main_anticipation.py --mode validate --path_to_data data/ek55 --path_to_models models_anticipation/ek55 --modality roi --video_feat_dim 1024
Late Fusion: python main_anticipation.py --mode validate --path_to_data data/ek55 --path_to_models models_anticipation/ek55 --modality late_fusion

Action Recognition

RGB: python main_recognition.py --mode validate --path_to_data data/ek55 --path_to_models models_recognition/ek55 --modality rgb --video_feat_dim 1024
Flow: python main_recognition.py --mode validate --path_to_data data/ek55 --path_to_models models_recognition/ek55 --modality flow --video_feat_dim 1024
Obj: python main_recognition.py --mode validate --path_to_data data/ek55 --path_to_models models_recognition/ek55 --modality obj --video_feat_dim 352
ROI: python main_recognition.py --mode validate --path_to_data data/ek55 --path_to_models models_recognition/ek55 --modality roi --video_feat_dim 1024
Late Fusion: python main_recognition.py --mode validate --path_to_data data/ek55 --path_to_models models_recognition/ek55 --modality late_fusion

EPIC-KITCHENS-100

Action Anticipation

RGB: python main_anticipation.py --mode validate --ek100 --path_to_data data/ek100 --path_to_models models_anticipation/ek100/ --modality rgb --video_feat_dim 1024 --num_class 3806 --verb_class 97 --noun_class 300
Flow: python main_anticipation.py --mode validate --ek100 --path_to_data data/ek100 --path_to_models models_anticipation/ek100/ --modality flow --video_feat_dim 1024 --num_class 3806 --verb_class 97 --noun_class 300
Obj: python main_anticipation.py --mode validate --ek100 --path_to_data data/ek100 --path_to_models models_anticipation/ek100/ --modality obj --video_feat_dim 352 --num_class 3806 --verb_class 97 --noun_class 300
ROI: python main_anticipation.py --mode validate --ek100 --path_to_data data/ek100 --path_to_models models_anticipation/ek100/ --modality roi --video_feat_dim 1024 --num_class 3806 --verb_class 97 --noun_class 300
Late Fusion: python main_anticipation.py --mode validate --ek100 --path_to_data data/ek100 --path_to_models models_anticipation/ek100/ --modality late_fusion --num_class 3806 --verb_class 97 --noun_class 300

Action Recognition

RGB: python main_recognition.py --mode validate --ek100 --path_to_data data/ek100 --path_to_models models_recognition/ek100/ --modality rgb --video_feat_dim 1024 --num_class 3806 --verb_class 97 --noun_class 300
Flow: python main_recognition.py --mode validate --ek100 --path_to_data data/ek100 --path_to_models models_recognition/ek100/ --modality flow --video_feat_dim 1024 --num_class 3806 --verb_class 97 --noun_class 300
Obj: python main_recognition.py --mode validate --ek100 --path_to_data data/ek100 --path_to_models models_recognition/ek100/ --modality obj --video_feat_dim 352 --num_class 3806 --verb_class 97 --noun_class 300
ROI: python main_recognition.py --mode validate --ek100 --path_to_data data/ek100 --path_to_models models_recognition/ek100/ --modality roi --video_feat_dim 1024 --num_class 3806 --verb_class 97 --noun_class 300
Late Fusion: python main_recognition.py --mode validate --ek100 --path_to_data data/ek100 --path_to_models models_recognition/ek100/ --modality late_fusion --num_class 3806 --verb_class 97 --noun_class 300
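
The `late_fusion` modality combines the predictions of the single-modality models. As a rough illustration only, the sketch below assumes that fusion is a weighted sum of per-class scores; this README does not confirm that rule, so the repository's actual fusion may differ. The weights are the `weight_rgb`, `weight_flow`, `weight_obj` and `weight_roi` values that appear in the argument dump of the issue log further down this page. The function names and the toy scores are hypothetical.

```python
# Weights as they appear in the argument dump of the issue log on this page.
DEFAULT_WEIGHTS = {"rgb": 0.4, "flow": 0.1, "obj": 0.25, "roi": 0.25}


def late_fusion(scores_by_modality, weights=DEFAULT_WEIGHTS):
    """Weighted sum of per-class scores across modalities (assumed fusion rule)."""
    modalities = list(scores_by_modality)
    num_classes = len(scores_by_modality[modalities[0]])
    fused = [0.0] * num_classes
    for modality in modalities:
        scores = scores_by_modality[modality]
        if len(scores) != num_classes:
            raise ValueError(
                f"{modality} has {len(scores)} scores, expected {num_classes}"
            )
        for i, score in enumerate(scores):
            fused[i] += weights[modality] * score
    return fused


def top1(scores):
    """Index of the highest-scoring class."""
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    # Toy two-class scores, for illustration only.
    toy = {
        "rgb": [0.2, 0.8],
        "flow": [0.9, 0.1],
        "obj": [0.5, 0.5],
        "roi": [0.6, 0.4],
    }
    print(top1(late_fusion(toy)))
```

With these toy scores the RGB model dominates because it carries the largest weight, even though the flow model strongly prefers the other class.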

Here are the validation results on EPIC-KITCHENS-100 as provided in our paper.

[Figures: Anticipation results; Recognition results]

Testing and submitting the results to the server

To test your model on the EPIC-KITCHENS-100 test split, run the following:

Action Anticipation

mkdir -p jsons/anticipation
python main_anticipation.py --mode test --json_directory jsons/anticipation --ek100 --path_to_data data/ek100 --path_to_models models_anticipation/ek100/ --modality late_fusion --num_class 3806 --verb_class 97 --noun_class 300

Action Recognition

mkdir -p jsons/recognition
python main_recognition.py --mode test --json_directory jsons/recognition --ek100 --path_to_data data/ek100 --path_to_models models_recognition/ek100/ --modality late_fusion --num_class 3806 --verb_class 97 --noun_class 300

Custom Training

To train the model, run the following:

EPIC-KITCHENS-55

Action Anticipation

RGB: python main_anticipation.py --mode train --path_to_data data/ek55 --path_to_models models_anticipation/ek55 --modality rgb --video_feat_dim 1024
Flow: python main_anticipation.py --mode train --path_to_data data/ek55 --path_to_models models_anticipation/ek55 --modality flow --video_feat_dim 1024
Obj: python main_anticipation.py --mode train --path_to_data data/ek55 --path_to_models models_anticipation/ek55 --modality obj --video_feat_dim 352
ROI: python main_anticipation.py --mode train --path_to_data data/ek55 --path_to_models models_anticipation/ek55 --modality roi --video_feat_dim 1024

Action Recognition

RGB: python main_recognition.py --mode train --path_to_data data/ek55 --path_to_models models_recognition/ek55 --modality rgb --video_feat_dim 1024
Flow: python main_recognition.py --mode train --path_to_data data/ek55 --path_to_models models_recognition/ek55 --modality flow --video_feat_dim 1024
Obj: python main_recognition.py --mode train --path_to_data data/ek55 --path_to_models models_recognition/ek55 --modality obj --video_feat_dim 352
ROI: python main_recognition.py --mode train --path_to_data data/ek55 --path_to_models models_recognition/ek55 --modality roi --video_feat_dim 1024

EPIC-KITCHENS-100

Action Anticipation

RGB: python main_anticipation.py --mode train --ek100 --path_to_data data/ek100 --path_to_models models_anticipation/ek100/ --modality rgb --video_feat_dim 1024 --num_class 3806 --verb_class 97 --noun_class 300
Flow: python main_anticipation.py --mode train --ek100 --path_to_data data/ek100 --path_to_models models_anticipation/ek100/ --modality flow --video_feat_dim 1024 --num_class 3806 --verb_class 97 --noun_class 300
Obj: python main_anticipation.py --mode train --ek100 --path_to_data data/ek100 --path_to_models models_anticipation/ek100/ --modality obj --video_feat_dim 352 --num_class 3806 --verb_class 97 --noun_class 300
ROI: python main_anticipation.py --mode train --ek100 --path_to_data data/ek100 --path_to_models models_anticipation/ek100/ --modality roi --video_feat_dim 1024 --num_class 3806 --verb_class 97 --noun_class 300

Action Recognition

RGB: python main_recognition.py --mode train --ek100 --path_to_data data/ek100 --path_to_models models_recognition/ek100/ --modality rgb --video_feat_dim 1024 --num_class 3806 --verb_class 97 --noun_class 300
Flow: python main_recognition.py --mode train --ek100 --path_to_data data/ek100 --path_to_models models_recognition/ek100/ --modality flow --video_feat_dim 1024 --num_class 3806 --verb_class 97 --noun_class 300
Obj: python main_recognition.py --mode train --ek100 --path_to_data data/ek100 --path_to_models models_recognition/ek100/ --modality obj --video_feat_dim 352 --num_class 3806 --verb_class 97 --noun_class 300
ROI: python main_recognition.py --mode train --ek100 --path_to_data data/ek100 --path_to_models models_recognition/ek100/ --modality roi --video_feat_dim 1024 --num_class 3806 --verb_class 97 --noun_class 300
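
The commands in this README differ only in the task, mode, dataset and modality. The sketch below assembles them programmatically, which can be convenient when looping over modalities. It is not part of the repository: `build_command` is a hypothetical helper, and it only reproduces the flags and values shown in the commands above.

```python
# Feature dimension per modality, as used in the commands above.
FEAT_DIM = {"rgb": 1024, "flow": 1024, "obj": 352, "roi": 1024}

# Extra class-count flags passed in the EPIC-KITCHENS-100 commands above.
EK100_FLAGS = ["--num_class", "3806", "--verb_class", "97", "--noun_class", "300"]


def build_command(task, mode, dataset, modality, json_directory=None):
    """Return the argv list for one of the commands listed in this README."""
    if task not in ("anticipation", "recognition"):
        raise ValueError(f"unknown task: {task}")
    if dataset not in ("ek55", "ek100"):
        raise ValueError(f"unknown dataset: {dataset}")

    cmd = ["python", f"main_{task}.py", "--mode", mode]
    if json_directory is not None:
        cmd += ["--json_directory", json_directory]
    if dataset == "ek100":
        cmd.append("--ek100")

    # The EK100 commands above write the model path with a trailing slash.
    models = f"models_{task}/{dataset}" + ("/" if dataset == "ek100" else "")
    cmd += ["--path_to_data", f"data/{dataset}",
            "--path_to_models", models,
            "--modality", modality]

    # The late-fusion commands above do not pass a feature dimension.
    if modality != "late_fusion":
        cmd += ["--video_feat_dim", str(FEAT_DIM[modality])]
    if dataset == "ek100":
        cmd += EK100_FLAGS
    return cmd


if __name__ == "__main__":
    for modality in ("rgb", "flow", "obj", "roi"):
        print(" ".join(build_command("anticipation", "train", "ek100", modality)))
```

The returned list can be passed directly to `subprocess.run` from the repository root.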

Please refer to the papers for more technical details.

Acknowledgements

This code is based on RU-LSTM; we are grateful to the collaborators and maintainers of that repository.

epic-kitchens_55 error

Hi,

I'm getting the following error when running EPIC-KITCHENS-55. Have you encountered this before? Thanks.

Save file name anti_mod_rgb_span_6_s1_5_s2_3_s3_2_recent_2_r1_1.6_r2_1.2_r3_0.8_r4_0.4_bs_10_drop_0.3_lr_0.0001_dimLa_512_dimLi_512_epoc_15_vb_nn
Printing Arguments
Namespace(add_noun_loss=True, add_verb_loss=True, alpha=1, batch_size=10, best_model='best', debug_on=False, display_every=10, dropout_rate=0.3, ek100=False, epochs=15, img_tmpl='frame_{:010d}.jpg', json_directory='tempAgg_ant_rec//models_anticipation/', latent_dim=512, linear_dim=512, lr=0.0001, modality='rgb', mode='train', noun_class=352, noun_loss_weight=1.0, num_class=2513, num_workers=0, past_attention=True, path_to_data='/content/drive/MyDrive/Individual_Project/Models/RULSTM/rulstm-master/RULSTM/data/ek55', path_to_models='models_anticipation/ek55', recent_dim=2, recent_sec1=1.6, recent_sec2=1.2, recent_sec3=0.8, recent_sec4=0.4, resume=False, scale=True, scale_factor=-0.5, schedule_epoch=10, schedule_on=1, span_dim1=5, span_dim2=3, span_dim3=2, spanning_sec=6, task='action_anticipation', topK=1, trainval=False, verb_class=125, verb_loss_weight=1.0, verb_noun_scores=True, video_feat_dim=1024, weight_flow=0.1, weight_obj=0.25, weight_rgb=0.4, weight_roi=0.25)
Populating Dataset: 100% 23493/23493 [00:33<00:00, 694.22it/s]
Populating Dataset: 100% 4979/4979 [00:07<00:00, 689.38it/s]
Add verb losses
Add noun losses
/usr/local/lib/python3.7/dist-packages/torch/optim/lr_scheduler.py:134: UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
"https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate", UserWarning)
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [1,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [4,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:455: nll_loss_backward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [7,0,0] Assertion `t >= 0 && t < n_classes` failed.
Traceback (most recent call last):
  File "main_anticipation.py", line 674, in <module>
    main()
  File "main_anticipation.py", line 531, in main
    start_epoch, start_best_perf, schedule_on)
  File "main_anticipation.py", line 400, in train_validation
    loss.backward()
  File "/usr/local/lib/python3.7/dist-packages/torch/_tensor.py", line 307, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/usr/local/lib/python3.7/dist-packages/torch/autograd/__init__.py", line 156, in backward
    allow_unreachable=True, accumulate_grad=True) # allow_unreachable flag
RuntimeError: CUDA error: device-side assert triggered
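
The assertion `t >= 0 && t < n_classes` in the log above is raised by the NLL loss kernel when a target label lies outside the range `[0, n_classes)`. One way to narrow this down is to compare the label values in the annotation files with the class counts passed to the script (`num_class=2513`, `verb_class=125` and `noun_class=352` in the argument dump above). The sketch below is a generic check and not a confirmed diagnosis of this report; `out_of_range` is a hypothetical helper and the labels are toy values.

```python
def out_of_range(labels, n_classes):
    """Return (index, label) pairs whose label is not in [0, n_classes)."""
    return [(i, t) for i, t in enumerate(labels) if not 0 <= t < n_classes]


if __name__ == "__main__":
    # Toy verb labels for illustration only; real labels would be read
    # from the annotation files of the dataset in use.
    verb_labels = [0, 12, 124, 125]
    print(out_of_range(verb_labels, n_classes=125))  # [(3, 125)]
```

Re-running on the CPU, or with the environment variable `CUDA_LAUNCH_BLOCKING=1` set, usually produces a stack trace that points more precisely at the failing loss call.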
