mbzuai-oryx / palo
Vision-language conversation in 10 languages including English, Chinese, French, Spanish, Russian, Japanese, Arabic, Hindi, Bengali and Urdu.
License: Apache License 2.0
Hi @mmaaz60,
Thanks for your great work and open sourcing!
I am trying to evaluate PALO-7B (loaded from transformers) on the multilingual-llava-in-the-wild benchmark, but the scores I get are much lower than the reported numbers. Here are my results:
Model | English | Chinese |
---|---|---|
PALO-7B (paper) | 64.2 | 55.7 |
PALO-7B (my results) | 54.0 | 43.0 |
Here are the generated content files:
PALO-7B_English_content.json
PALO-7B_Chinese_content.json
Here are the evaluation files with scores:
PALO-7B_English.json
PALO-7B_Chinese.json
Summaries produced by palo/eval/summarize_gpt_review.py
PALO-7B_English
all 54.0 85.2 46.0
llava_bench_complex 62.8 82.5 51.8
llava_bench_conv 52.4 86.5 45.3
llava_bench_detail 40.6 88.7 36.0
PALO-7B_Chinese
all 43.0 86.0 37.0
llava_bench_complex 55.6 82.9 46.1
llava_bench_conv 27.2 88.8 24.1
llava_bench_detail 39.1 88.7 34.7
Is there a significant discrepancy between the content I generated and yours, or is there an issue in my evaluation? Do you have any idea what might cause this, or could you share your generated result files with me?
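For anyone reading the summaries above, here is a minimal sketch of how the three numbers on each row appear to relate. It assumes the columns are, in order, the relative score, the reference (GPT-4) score scaled to 100, and the model score scaled to 100, so that the first column is the model score divided by the reference score. That reading is inferred from the numbers themselves, not confirmed by the authors; the names `ROWS`, `relative_score` and `find_mismatches` are hypothetical helpers for illustration only.

```python
# Rows copied from the summaries above: (category, relative, reference, model).
# ASSUMPTION: column meanings are inferred from the numbers, not from the authors.
ROWS = {
    "PALO-7B_English": [
        ("all", 54.0, 85.2, 46.0),
        ("llava_bench_complex", 62.8, 82.5, 51.8),
        ("llava_bench_conv", 52.4, 86.5, 45.3),
        ("llava_bench_detail", 40.6, 88.7, 36.0),
    ],
    "PALO-7B_Chinese": [
        ("all", 43.0, 86.0, 37.0),
        ("llava_bench_complex", 55.6, 82.9, 46.1),
        ("llava_bench_conv", 27.2, 88.8, 24.1),
        ("llava_bench_detail", 39.1, 88.7, 34.7),
    ],
}


def relative_score(reference: float, model: float) -> float:
    """Model score as a percentage of the reference score, rounded to 0.1."""
    return round(model / reference * 100, 1)


def find_mismatches(rows, tol: float = 0.15):
    """Return rows whose reported relative score differs from the recomputed one.

    The tolerance absorbs rounding, since the printed columns are already
    rounded to one decimal place.
    """
    mismatches = []
    for name, table in rows.items():
        for category, reported, reference, model in table:
            recomputed = relative_score(reference, model)
            if abs(recomputed - reported) > tol:
                mismatches.append((name, category, reported, recomputed))
    return mismatches


if __name__ == "__main__":
    for name, table in ROWS.items():
        for category, reported, reference, model in table:
            print(name, category, reported, relative_score(reference, model))
    print("mismatches:", find_mismatches(ROWS))
```

Under this reading, the reference scores (roughly 82 to 89) look normal, so the gap to the paper would come from the model-side scores rather than from the reference answers.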
Dear authors,
Thanks for your great work! I am very interested in the multilingual abilities of LMMs. Do you have any plans to release the dataset and checkpoints from the paper? They would be really helpful to me!
Hi
Will you be releasing the dataset? I'm especially looking for the Bengali one.
Hi,
I'm really excited to try your pretrained models, but it seems they haven't been integrated into the transformers library yet.
I tried loading MBZUAI/PALO-7B using both the "image-to-text" pipeline and using AutoModel with the latest version of transformers (v4.39.3) and got this error:
The checkpoint you are trying to load has model type `palo` but Transformers does not recognize this architecture
Am I missing something?
Thanks!