Comments (5)
I appreciate your interest in our work. According to the VideoChat2 paper, they report an average of 60.4 on MVBench with the Mistral-7B LLM. In our case, VideoGPT+ obtains an average score of 58.7 on MVBench with the Phi-3-mini-3.8B LLM.
We have released all the model checkpoints, along with the training and evaluation code, to reproduce our reported results. I hope this helps.
Please let me know if you have any questions. Thank you.
from videogpt-plus.
Can eight V100 GPUs train the model?
@mmaaz60 From the first picture, VideoGPT+ surpasses VideoChat2 by a clear margin, but VideoChat2 with Mistral actually gets a better result as of now.
These days, Video MLLMs don't really seem to care about which LLM size they are using...
Can eight V100 GPUs train the model?
I appreciate your interest in our work. As we are using Phi-3-mini (3.8B parameters) as the LLM, the model can be trained easily on 8 V100 GPUs with 32GB of memory per GPU. However, flash attention has to be turned off, as it is not supported on V100 GPUs.
I hope this helps. Good luck! And please let me know if you face any issues.
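As a rough illustration of the flash attention point above, here is a minimal sketch of choosing an attention backend from the GPU's compute capability. The helper name `pick_attn_implementation` is hypothetical and not part of this repository. The returned strings mirror the `attn_implementation` values accepted by Hugging Face Transformers, which is an assumption about how the model is loaded, since the repository's own training scripts may expose this setting differently.

```python
from typing import Tuple

# FlashAttention-2 kernels target NVIDIA Ampere or newer GPUs
# (compute capability 8.0 and above). V100 is Volta, capability 7.0.
MIN_FLASH_ATTN_MAJOR = 8


def pick_attn_implementation(capability: Tuple[int, int]) -> str:
    """Return an attention backend name for a (major, minor) compute capability.

    Hypothetical helper for illustration only.
    """
    major, _minor = capability
    if major >= MIN_FLASH_ATTN_MAJOR:
        return "flash_attention_2"
    # Older GPUs such as V100 fall back to the plain attention path.
    return "eager"


if __name__ == "__main__":
    print(pick_attn_implementation((7, 0)))  # V100 (Volta)
    print(pick_attn_implementation((8, 0)))  # A100 (Ampere)
```

On a real machine, the capability tuple can be read with `torch.cuda.get_device_capability()`, and the resulting string could then be passed wherever the training setup selects its attention implementation.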
@mmaaz60 From the first picture, VideoGPT+ surpasses VideoChat2 by a clear margin, but VideoChat2 with Mistral actually gets a better result as of now.
These days, Video MLLMs don't really seem to care about which LLM size they are using...
Thank you for your interest in our work. VideoGPT+ uses the Phi-3-mini LLM with only 3.8B parameters, which is relatively weaker than Mistral-7B.
On the other hand, if we compare the Vicuna-7B-based models for both VideoGPT+ and VideoChat2, VideoChat2 obtains an average of 51.1 on MVBench, while our Vicuna-7B-based variant obtains an average of 53.1.
Further, there are gains on the VCGBench and VCGBench-Diverse evaluations as well.
We acknowledge that VideoChat2 is a strong video conversation model. However, VideoGPT+ obtains better results on multiple benchmarks, as discussed in our technical report, and all the code to reproduce our reported results is released on GitHub.