
bike's Introduction

Hi, I'm Wenhao Wu 👋


Wenhao Wu (吴文灏🇨🇳) is a Ph.D. student in the School of Computer Science at The University of Sydney, supervised by Prof. Wanli Ouyang. I collaborate closely with the Department of Computer Vision Technology (VIS) at Baidu, led by Dr. Jingdong Wang (IEEE Fellow). I received my M.S.E. degree from the Multimedia Laboratory (MMLab@SIAT), University of Chinese Academy of Sciences, supervised by Prof. Shifeng Chen and Prof. Yu Qiao. I was also fortunate to work as an intern or research assistant at MMLab@CUHK, Baidu, iQIYI, SenseTime, Samsung Research and the Chinese Academy of Sciences. I am honored to have been awarded the 11th Baidu Scholarship (2023).

My current research interests include Cross-Modal Learning and Video Understanding. I have published 20+ papers at top international CV/AI conferences and journals such as CVPR/ICCV/ECCV/AAAI/IJCAI/ACMMM/IJCV.


🔭 Research Interest

My research interests broadly lie in the areas of Computer Vision and Deep Learning, including:

  • Cross-Modal Learning (2022-Present): Video-Language Matching, Multimodal Large Language Model (MLLM)
  • Video Foundation Model (2017-Present): Video Recognition, Efficient Video Tuning
  • Video-related Applications (2017-2022): Video Sampler, Temporal Action Detection, Anomaly Detection in Video
  • Self-supervised Learning (2021-2022): Contrastive Video Learning, Masked Video Modeling
  • Low-level Vision (2021-2022): Image Colorization, Style Transfer, Image Rescaling

🔥 News

  • 2024.01: I am honored to receive the 11th🎖Baidu Scholarship🎖, a prestigious fellowship awarding 200,000 RMB (about $30,000) to 10 PhD students worldwide in Artificial Intelligence, chosen from thousands of applicants.
  • 2023.11: We release GPT4Vis, which provides a Quantitative Evaluation of GPT-4 for Visual Understanding across images, videos and point clouds, spanning 16 popular datasets.
  • 2023.11: We release Side4Video, a Spatial-Temporal Side Network for Memory-Efficient Image-to-Video Transfer Learning, which significantly reduces the training memory cost for action recognition (↓75%) and text-video retrieval (↓30%).
  • 2023.08: The extension of Text4Vis has been accepted by IJCV.
  • 2023.07: Two first-author papers (Temporal Modeling: ATM, Cross-Modal Retrieval: UA) are accepted by ICCV 2023.
  • 2023.02: Two first-author papers for video understanding (BIKE, Cap4Video) are accepted by CVPR 2023. Cap4Video, which uses GPT to enhance text-video learning, is selected as a 🎉Highlight paper🎉 (Top 2.5%).
  • 2022.11: Two papers (Video Recognition: Text4Vis, Style Transfer: AdaCM) are accepted by AAAI 2023.
  • 2022.07: Three papers (Video Sampling: NSNet, TSQNet, Cross-Modal Learning: CODER) are accepted by ECCV 2022.
  • 2022.06: Our MaMiCo, a new video self-supervised learning work, is accepted by ACMMM 2022 (🎉Oral Presentation🎉).
