Object level Visual Reasoning in Videos

This repository contains a PyTorch implementation of "Object level Visual Reasoning in Videos", F. Baradel, N. Neverova, C. Wolf, J. Mille, G. Mori, in ECCV 2018.

Links: Project page | Camera-ready | Complementary Mask Data

Code

We release code for training and testing our implementation. We encourage you to follow the steps below:

Masks

Please download the mask predictions from the Complementary Mask Data page listed under Links above.

Requirements

  • pytorch 0.4.0
  • numpy
  • lintel - make sure that you have already installed this library (important for decoding videos on the fly)
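Before training, it can help to confirm that the three requirements are importable. The snippet below is a minimal sketch, not part of the released code: the helper name `missing_modules` is hypothetical, and the import names `torch`, `numpy` and `lintel` are assumed to match the packages listed above.

```python
import importlib.util

# Import names assumed for the three requirements listed above.
REQUIRED_MODULES = ["torch", "numpy", "lintel"]


def missing_modules(names):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

This only checks that each module can be found; it does not verify that the installed PyTorch version is 0.4.0.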

Citation

If you find this paper or our implementation useful for your research or if you use the precomputed masks, please cite our paper.

@InProceedings{Baradel_2018_ECCV,
author = {Baradel, Fabien and Neverova, Natalia and Wolf, Christian and Mille, Julien and Mori, Greg},
title = {Object Level Visual Reasoning in Videos},
booktitle = {ECCV},
year = {2018}
}

Acknowledgements

This work was funded by grant Deepvision (ANR-15-CE23-0029, STPGP-479356-15), a joint French/Canadian call by ANR & NSERC.

Licence

MIT License

object_level_visual_reasoning's Issues

Training / inference time?

Hi Fabien,

May I ask for the average training and inference time of an iteration with batch size 8, and what GPU did you use?

Thank you very much!

Annotation of the epic-55 validation set in the paper.

Thank you and your team for your contribution in bringing such an excellent paper to the video understanding community. Could you provide me with the annotation file for the epic-55 validation set in your article? Although you have provided the division in your article, I'm sorry that I still don't know how to do it myself.

Demo on own video

Thank you for releasing this repo. Would it be possible to use this model to do a demo on my own video? What would the steps to do so be?

Thanks,

I want to use this project on "Something-something"

I have some questions.
1. I want to use this project on "Something-something", but I don't have the mask data.
2. The "something-something" dataset from 20bn has 174 classes, but I find only 157 classes in your paper, so I don't know whether "SS" refers to the 20bn dataset.

dataloader for EPIC kitchen

Hi Fabien,

Thanks so much for putting up the code! May I ask if you could also provide the loader file for EPIC kitchen please?

how to train mask-r-cnn

Hi Fabien,
Thank you for your work!
The paper doesn't mention how to train Mask R-CNN, since the VLOG dataset does not have any mask annotations. I wonder whether your team annotated the masks yourselves or used other open datasets to train Mask R-CNN?
Thanks again!

Using mask rcnn data

Hi Fabien,

I want to use the mask-rcnn predictions for the EPIC-Kitchen dataset. Can you elaborate on how to interpret the pickle files provided as Complementary data-masks? To begin with, I would like to know a way to overlay masks and bounding boxes on the corresponding frames. Any tutorial or readme for doing so would be highly appreciated.

Thanks,
Nirat

i have a question

Hello, I ran your project, but I can only get maskRCNN__png. I cannot get the other results that you show in your article. I would like to know what I need to do.
