
UniFormerV2

This repo is the official implementation of "UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer". By Kunchang Li, Yali Wang, Yinan He, Yizhuo Li, Yi Wang, Limin Wang and Yu Qiao.

Update

07/14/2023

UniFormerV2 has been accepted to ICCV 2023! 🎉

02/13/2023

UniFormerV2 has been integrated into MMAction2. Training code will be provided soon! 😄

11/20/2022

A video demo is available on Hugging Face. Give it a try! 😄

11/19/2022

We published a blog post in Chinese on Zhihu.

11/18/2022

All the code, models and configs are provided. Don't hesitate to open an issue if you run into any problems! 🙋🏻

Introduction

In UniFormerV2, we propose a generic paradigm for building a powerful family of video networks by arming pre-trained ViTs with efficient UniFormer designs. It inherits the concise style of the UniFormer block but contains brand-new local and global relation aggregators, which allow for a preferable accuracy-computation balance by seamlessly integrating the advantages of both ViTs and UniFormer.

It achieves state-of-the-art recognition performance on 8 popular video benchmarks, including the scene-related Kinetics-400/600/700 and Moments in Time, the temporal-related Something-Something V1/V2, and the untrimmed ActivityNet and HACS. In particular, it is the first model to achieve 90% top-1 accuracy on Kinetics-400.
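For readers unfamiliar with the metric, top-1 accuracy is the fraction of videos whose highest-scoring predicted class matches the ground-truth label. The following is a minimal sketch on made-up toy scores; the helper `top1_accuracy` and the data are hypothetical illustrations and are not taken from this repository's evaluation code.

```python
def top1_accuracy(scores, labels):
    """Return the fraction of samples whose arg-max class equals the true label.

    `scores` is a list of per-class score lists (one per sample) and
    `labels` is a list of integer class indices. Both names are
    hypothetical and used only for this illustration.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    correct = 0
    for row, label in zip(scores, labels):
        # Index of the highest score is the top-1 prediction.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)


# Toy data (assumed, not real benchmark outputs): 4 samples, 3 classes.
scores = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
labels = [1, 0, 2, 0]

accuracy = top1_accuracy(scores, labels)
print(f"top-1 accuracy: {accuracy:.2%}")
```

On this toy input three of the four predictions match, so the reported accuracy is 75%. A 90% top-1 result means nine in ten test clips receive the correct class as their single best guess.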


Model Zoo

All the models can be found in MODEL_ZOO.

Instructions

See INSTRUCTIONS for more details about:

  • Environment installation
  • Dataset preparation
  • Training and validation

Cite Uniformer

If you find this repository useful, please use the following BibTeX entry for citation.

@misc{li2022uniformerv2,
      title={UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer}, 
      author={Kunchang Li and Yali Wang and Yinan He and Yizhuo Li and Yi Wang and Limin Wang and Yu Qiao},
      year={2022},
      eprint={2211.09552},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

License

This project is released under the MIT license. Please see the LICENSE file for more information.

Acknowledgement

This repository is built on the UniFormer and SlowFast repositories.
