Comments (15)
Yes, that's right.
from ask-anything.
Hello, all of the image data comes from what M3IT provides.
Thanks. I looked at M3IT, and the image field in its JSON is a long string. How do I map those strings to paths of the form “train/39065.jpg” that VideoChat2 uses?
We followed the annotations provided by M3IT and generated idx.jpg from the sequence idx.
I don't quite follow... Could you explain how to match the "image_str" in M3IT to the specific image names in the CLEVR dataset?
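image_str is a base64 string and can be read directly. We converted it to an RGB image. The image name is generated from the idx obtained by iterating over the M3IT data in a for loop, not from the original CLEVR data.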
Got it! Your idx is the index you get when iterating over the data after loading it with datasets, right?
OK, thank you for the explanation.
I still ran into some problems when writing out the images, so I need to ask you again. Here is my code:
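import os
import base64
import datasets

save_dir = "clevr_M3IT"
ds = datasets.load_dataset("./datasets/M3IT/", "clevr", split="train", streaming=True)
cur_dir = os.path.join(save_dir, "train")
os.makedirs(cur_dir, exist_ok=True)  # create the output directory if it is missing

# Name each file after its position in the iteration order.
for i, d in enumerate(ds):
    # image_base64_str is a list; decode its first entry to raw image bytes.
    image = base64.decodebytes(d["image_base64_str"][0].encode())
    with open(os.path.join(cur_dir, f"{i}.jpg"), "wb") as fh:
        fh.write(image)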
After writing out some images, I manually checked a few of them and found that they do not match the QA in OpenGVLab/VideoChat2-IT that you published on HF. For example, train/90.jpg:
[ { "a": "The answer is cylinder.", "i": "Analyze the given image and respond to the associated question with a correct answer.", "q": "There is a green object that is behind the small rubber cylinder that is to the left of the matte cylinder to the right of the gray thing; what is its shape?" } ]
Strange, that is not the image we have on our side. I'll ask the teammate who processed the data to take a look.
OK, thanks~
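Hello, I asked my teammate to check. For some datasets (such as CLEVR), the meta information provided by M3IT contains image_index; for the other datasets, the name comes from the for-loop index.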
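The naming rule described in this reply could be sketched roughly as follows. This is only an illustration under assumptions: the helper name `resolve_image_name`, the `"meta"` container key, and the exact place where `image_index` is stored are hypothetical, since the thread does not show where that field actually lives.

```python
from typing import Any, Mapping


def resolve_image_name(sample: Mapping[str, Any], loop_idx: int,
                       split: str = "train",
                       index_key: str = "image_index") -> str:
    """Return '<split>/<idx>.jpg', preferring an index stored with the sample."""
    # Assumption: the index may sit at the top level or under a "meta" mapping.
    idx = sample.get(index_key)
    meta = sample.get("meta")
    if idx is None and isinstance(meta, Mapping):
        idx = meta.get(index_key)
    # Fall back to the position in the for loop when no stored index exists.
    if idx is None:
        idx = loop_idx
    return f"{split}/{int(idx)}.jpg"
```

If a dataset carries a stored index, files named purely by loop position will not line up with the published QA files, which would be consistent with the mismatch reported above.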
I see. However, I don't seem to find image_index in the CLEVR metadata. The code is:
ds = datasets.load_dataset("./datasets/M3IT/", "clevr", split="train", streaming=True)
ds.info
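One way to double-check which fields each record really carries is to look at the first few records themselves rather than only at `ds.info`. The helper below is a minimal sketch; `summarize_fields` is a hypothetical name, and it works on any iterable of dict-like records, so it assumes nothing about the M3IT schema.

```python
from itertools import islice
from typing import Any, Dict, Iterable, List


def summarize_fields(records: Iterable[Dict[str, Any]], n: int = 1,
                     max_len: int = 40) -> List[Dict[str, str]]:
    """Describe the keys, value types, and truncated values of the first n records."""
    summaries = []
    for record in islice(records, n):
        summary = {}
        for key, value in record.items():
            text = repr(value)
            # Truncate long values such as base64 image strings.
            if len(text) > max_len:
                text = text[:max_len] + "..."
            summary[key] = f"{type(value).__name__}: {text}"
        summaries.append(summary)
    return summaries
```

With the streaming dataset from the snippet above, `summarize_fields(ds)` would list every key of the first record, which shows directly whether a field such as image_index is present.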