eva's Introduction

We launch EVA, a vision-centric foundation model to Explore the limits of Visual representation at scAle using only publicly accessible data and academic resources. EVA is a vanilla ViT pre-trained to reconstruct the masked-out image-text aligned vision features (i.e., CLIP features) conditioned on visible image patches. Via this pretext task, we can efficiently scale up EVA to one billion parameters and set new records on a broad range of representative vision downstream tasks.
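The pretext task described above can be sketched in a few lines of plain Python. This is only an illustration, not the repository's training code: the function names (`masked_feature_loss`, `random_patch_mask`), the negative cosine similarity objective, the uniform random masking, and the 0.4 mask ratio are all assumptions made for the example, and random vectors stand in for the CLIP target features.

```python
import math
import random


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def masked_feature_loss(predicted, target, mask):
    """Mean negative cosine similarity over masked patches only.

    predicted, target: one feature vector per image patch.
    mask: one bool per patch; True means the patch was hidden from the model.
    (Hypothetical helper; the objective is an assumption of this sketch.)
    """
    masked = [i for i, hidden in enumerate(mask) if hidden]
    if not masked:
        raise ValueError("mask must hide at least one patch")
    total = sum(-cosine_similarity(predicted[i], target[i]) for i in masked)
    return total / len(masked)


def random_patch_mask(num_patches, mask_ratio, rng):
    """Hide round(num_patches * mask_ratio) patches chosen uniformly at random."""
    num_masked = round(num_patches * mask_ratio)
    hidden = set(rng.sample(range(num_patches), num_masked))
    return [i in hidden for i in range(num_patches)]


if __name__ == "__main__":
    rng = random.Random(0)
    num_patches, dim = 16, 8
    # Random stand-ins for the per-patch features of a frozen teacher (e.g. CLIP).
    target = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_patches)]
    mask = random_patch_mask(num_patches, mask_ratio=0.4, rng=rng)
    # A perfect prediction drives the loss to approximately -1, its minimum.
    print(masked_feature_loss(target, target, mask))
```

Only the masked positions contribute to the loss, so the model is scored on features it has to infer from the visible patches rather than on patches it could simply copy.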

EVA is the first open-sourced billion-scale vision foundation model that achieves state-of-the-art performance on a broad range of downstream tasks.

News

All the code and dozens of state-of-the-art billion-scale models are open-sourced!

Catalog

All EVA model checkpoints are now available at 🤗 Hugging Face Models and BAAI ModelHub (EVA & EVA-CLIP). Try them out!

Summary of EVA's performance

image & video classification

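Image classification: IN-1K and 12 avg. zero-shot columns. Video classification: K400, K600, K700.

| model | #param. | IN-1K, e2e ft | IN-1K, linear | IN-1K, zero-shot | 12 avg. zero-shot | K400 | K600 | K700 |
|---|---|---|---|---|---|---|---|---|
| EVA or EVA-CLIP | 1.0B | 89.7 | 86.5 | 78.5 | 75.7 | 89.7 | 89.8 | 82.9 |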

object detection & segmentation

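COCO det & ins seg, LVIS det & ins seg, and sem seg (COCO-Stuff, ADE20K):

| model | #param. | COCO det (test) | COCO det (val) | COCO seg (test) | COCO seg (val) | LVIS det | LVIS seg | COCO-Stuff | ADE20K |
|---|---|---|---|---|---|---|---|---|---|
| EVA | 1.0B | 64.7 | 64.5 | 55.5 | 55.0 | 62.2 | 55.0 | 53.4 | 62.3 |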

Citation

If you find our work helpful, please star🌟 this repo and cite📑 our paper. Thanks for your support!

@article{EVA,
  title={EVA: Exploring the Limits of Masked Visual Representation Learning at Scale},
  author={Fang, Yuxin and Wang, Wen and Xie, Binhui and Sun, Quan and Wu, Ledell and Wang, Xinggang and Huang, Tiejun and Wang, Xinlong and Cao, Yue},
  journal={arXiv preprint arXiv:2211.07636},
  year={2022}
}

If you find our open-sourced code & models helpful to your research, please also consider citing📑 this repo.

@misc{EVA_code_models,
  author={Fang, Yuxin and Wang, Wen and Xie, Binhui and Sun, Quan and Wu, Ledell and Wang, Xinggang and Huang, Tiejun and Wang, Xinlong and Cao, Yue},
  title={Code and Models of EVA: Exploring the Limits of Masked Visual Representation Learning at Scale},
  year={2022},
  howpublished = {\url{https://github.com/baaivision/EVA}}
}

License

The content of this project itself is licensed under LICENSE.

Contact

  • For help and issues associated with EVA, or reporting a bug, please open a GitHub Issue. Let's build a better & stronger EVA together :)

  • We are hiring at all levels at BAAI Vision Team, including full-time researchers, engineers and interns. If you are interested in working with us on foundation models, self-supervised learning and multimodal learning, please contact Yue Cao ([email protected]) and Xinlong Wang ([email protected]).

