Comments (9)
Hello @remvanthull,
Thank you so much for your interest in our work! We are planning to release the GranD dataset along with the pre-training code very soon; the release is scheduled for just a week from now, so you'll be able to experiment with training the model from scratch!
Stay tuned, and thank you again for your support.
from groundinglmm.
Hi @hanoonaR,
Thank you so much for your response!
That is amazing! Very much looking forward to that 😄 Are you by any chance also planning on releasing the code for the automated annotation pipeline for the GranD dataset? I'm very interested in that too! Thanks again!
Hi @remvanthull,
The upcoming release will indeed include the code for the automated pipeline used to construct the GranD dataset. Additionally, we'll provide detailed instructions on how to convert it into the GLAMM dataset format. Thank you.
Hi @remvanthull
We are pleased to announce the release of our dataset and the code for the automatic annotation pipeline. You can access the dataset here and find details on the annotation pipeline here.
Please let us know if you have any questions. Thank you.
Hi @hanoonaR
Thank you so much for letting me know! I am very excited to delve into this right away. Just one more question: are you also releasing instructions on how to pre-train GLaMM on GranD? Currently, all the released code/models are for fine-tuning the pre-trained models, and I would very much like to experiment with training from scratch! 😄
Thank you again!
Hi @remvanthull,
We've pre-trained the model using the GranD dataset along with some open-source datasets.
From GranD, we cover:
- Referring Expression Segmentation: GrandReferSegmDataset
- Region-Level Captioning: GrandReferRegDataset
- Short Captioning: GrandShortCaptionDataset
- Captioning with Groundings: refer to the GCG dataset for the dataset class. The annotations are prepared the same way as for Short Captioning, using prepare_grand_caption_grounding.py.
- Object-Level Segmentation: refer to the SemanticSegmentation dataset for the dataset class. The annotations can be prepared using prepare_object_lvl_data.py.
From Open Source Datasets, we use:
- For Region Understanding: COCO-2017, RefCOCO, RefCOCO+ (similar to the pre-training of GPT4RoI)
- For Segmentation: the datasets defined in the SemanticSegmentation dataset classes.
- For Instruction Following: LLaVA Instruct 150k.
If you encounter any challenges while configuring these datasets, or if you have specific questions related to any of the details provided, please feel free to reach out. Thank you.
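The data mix above can be summarized as a simple task list for a batch sampler. This is only a hedged sketch: the task names paraphrase the list in this comment, and uniform sampling is an assumption, since the thread does not state the actual mixing ratios.

```python
import random

# Tasks in the GLaMM pre-training mix, as described in the thread.
# Comments note where each task's data comes from.
TASKS = [
    "referring_expression_segmentation",  # GranD
    "region_level_captioning",            # GranD
    "short_captioning",                   # GranD
    "captioning_with_groundings",         # GranD (GCG-style annotations)
    "object_level_segmentation",          # GranD
    "region_understanding",               # COCO-2017, RefCOCO, RefCOCO+
    "semantic_segmentation",              # open-source segmentation datasets
    "instruction_following",              # LLaVA Instruct 150k
]

def sample_task(rng: random.Random) -> str:
    """Pick the next task to draw a batch from (uniform draw; an assumption)."""
    return rng.choice(TASKS)

task = sample_task(random.Random(0))
```

A real training loop would map each task name to its dataset class and weight the draws, but the list itself is the part the thread pins down.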
Hi @hanoonaR
Thank you for responding so quickly once again! :)
What I was wondering is which checkpoint you used to pre-train the model, i.e. what I should set --version to, so that I can replicate the pre-training myself; the checkpoints in the model zoo are all already pre-trained, but I'd like to experiment with pre-training :)
Thank you,
Rachel
We initialize from LLaVA-1.5 (as that's our base LLM).
Hi @remvanthull,
Sorry, I forgot to mention that you will need to set --pretrained to False. Please let me know if you face any issues loading the model from LLaVA-1.5 or setting up any dataset. Thank you.
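Combining the last two replies, the initialization logic might look like this minimal sketch. The flag names (--version, --pretrained) come from the thread; the default checkpoint ID, the boolean parsing, and the checkpoint path are assumptions, not the repo's actual argument parser.

```python
import argparse

# Hedged sketch of the two flags discussed above.
parser = argparse.ArgumentParser()
# Assumed Hugging Face model ID for the LLaVA-1.5 base.
parser.add_argument("--version", default="liuhaotian/llava-v1.5-7b")
# Assumed string-to-bool parsing; the real script may differ.
parser.add_argument("--pretrained", type=lambda v: v.lower() == "true",
                    default=True)

# "--pretrained False", per the thread: initialize from the LLaVA-1.5
# base weights instead of an already pre-trained GLaMM checkpoint.
args = parser.parse_args(["--pretrained", "False"])
init_checkpoint = (args.version if not args.pretrained
                   else "path/to/glamm_checkpoint")  # hypothetical path
```

In short: point --version at the LLaVA-1.5 base checkpoint and pass --pretrained False, and loading should start from the LLaVA-1.5 weights rather than a finished GLaMM checkpoint.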