ys-zong / vl-icl
Code for paper: VL-ICL Bench: The Devil in the Details of Benchmarking Multimodal In-Context Learning
Home Page: https://ys-zong.github.io/VL-ICL/
Can I SFT an MLLM with your train split?
Good job, your work inspires me a lot.
I have a small question about the dataset.
Looking forward to your response!
Here is my prompt
Induce the concept from the in-context examples. Answer the question with a single word or phase.
We name this is a slation
We name this is a dax
Based on the in-context examples, we name this is a
gt: blicket
model: School bus
But if I change the last part of the question to "Can you name this is a blicket or a dax?",
the model says "blicket".
Is this prompt OK for the VL-ICL benchmark?
Thanks for the awesome paper and repo!
I was trying out LLaVA and noticed that the model predictions were always empty strings. I was able to narrow it down to this line:
VL-ICL/utils/model_inference.py
Line 76 in 6ad043d
It seems LLaVA already outputs only the tokens it generated, and not the whole context tokens + generated tokens.
The fix is quite easy: decode the whole generated sequence without first truncating it by input_token_len.
Is my thinking correct or am I missing something?
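To make the suspected failure mode concrete, here is a minimal sketch of a decoding helper that works whether or not the model echoes the prompt tokens in its output. It is not the repository's code: `decode_generated`, `toy_decode`, and the toy vocabulary are hypothetical names used only for illustration, and plain integer lists stand in for token tensors.

```python
from typing import Callable, Sequence


def decode_generated(
    input_ids: Sequence[int],
    output_ids: Sequence[int],
    decode: Callable[[Sequence[int]], str],
) -> str:
    """Decode only the newly generated tokens.

    Hypothetical helper: strips the prompt prefix only when the output
    actually starts with it, so it is safe for both generation styles.
    """
    n = len(input_ids)
    if len(output_ids) >= n and list(output_ids[:n]) == list(input_ids):
        # The model returned prompt + generation: drop the prompt part.
        new_tokens = output_ids[n:]
    else:
        # The model returned only the generation: keep everything.
        new_tokens = output_ids
    return decode(new_tokens).strip()


# Toy stand-in for a tokenizer's decode method (assumption, for illustration).
vocab = {1: "name", 2: "this", 3: "blicket"}


def toy_decode(ids: Sequence[int]) -> str:
    return " ".join(vocab[i] for i in ids)


prompt = [1, 2]
print(decode_generated(prompt, [1, 2, 3], toy_decode))  # prompt echoed in output
print(decode_generated(prompt, [3], toy_decode))        # only new tokens returned
print(repr(toy_decode([3][len(prompt):])))              # unconditional slicing
```

Both calls to `decode_generated` print `blicket`, while the last line prints an empty string: slicing by the prompt length when the output holds only new tokens leaves nothing to decode, which would match the empty predictions described above.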
Could you please provide the requirements.txt for llava?
Thanks!