Comments (6)
Hello, I have the same problem. Do you know how to fix it now?
from geochat.
Are you sure that your mismatched size is the same as mine?
I remember that I closed the issue because it was caused by an incorrect change I had made somewhere in the original code.
But I really can't remember the details.
from geochat.
I fixed this problem today. Thanks for your answer.
The reason I got this problem is that I used the wrong model to fine-tune.
from geochat.
Can you help me do the same? When I use the GeoChat 7B model, it reports a size mismatch, and when I use llava-v1.5-7b, it reports that files are not found in the directory. I have double-checked on Hugging Face that every file is present. I am unable to train on my data because of this. If my question is unclear, could you please explain how I can train on my own data?
from geochat.
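One way to narrow down a size mismatch like the one discussed above is to compare tensor shapes by name before loading any weights. The following is a minimal sketch, not code from the geochat repo: the function name `find_shape_mismatches` and the parameter name in the usage are hypothetical and only for illustration.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    model_shapes: Dict[str, Shape],
    checkpoint_shapes: Dict[str, Shape],
) -> List[str]:
    """Return one readable line per tensor that differs between model and checkpoint."""
    problems: List[str] = []
    # Walk the union of names so that missing and extra tensors are reported too.
    for name in sorted(set(model_shapes) | set(checkpoint_shapes)):
        expected = model_shapes.get(name)
        found = checkpoint_shapes.get(name)
        if expected is None:
            problems.append(f"{name}: only in checkpoint, shape {found}")
        elif found is None:
            problems.append(f"{name}: missing from checkpoint, model expects {expected}")
        elif expected != found:
            problems.append(f"{name}: model expects {expected}, checkpoint has {found}")
    return problems


# Illustrative shapes only (hypothetical parameter name).
report = find_shape_mismatches(
    {"position_embedding.weight": (1297, 1024)},
    {"position_embedding.weight": (577, 1024)},
)
for line in report:
    print(line)
```

With PyTorch, the two dictionaries could be built as `{k: tuple(v.shape) for k, v in model.state_dict().items()}` for the model and the same comprehension over `torch.load(path, map_location="cpu")` for a checkpoint file, assuming that file holds a plain state dict. A mismatch in every vision or projector tensor usually points to the wrong base model rather than a single corrupted file.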
Hi @Hoteryoung,
After running the command you provided above:
srun --jobid $SLURM_JOBID \
bash -c "python -m torch.distributed.run \
--nproc_per_node $GPUS_PER_NODE \
--nnodes $SLURM_NNODES \
--node_rank $SLURM_PROCID \
--master_addr $MASTER_ADDR \
--master_port $MASTER_PORT \
geochat/train/train_mem.py \
--lora_enable True \
--model_name_or_path $CODE_DIR/llava-v1.5-7b/ \
--version $PROMPT_VERSION \
--data_path $DATASET_DIR/GeoChat_Instruct.json \
--image_folder $DATASET_DIR/share/softwares/kartik/GeoChat_finetuning/final_images_llava/ \
--vision_tower openai/clip-vit-large-patch14-336/ \
--mm_projector_type mlp2x_gelu \
--pretrain_mm_mlp_adapter $CODE_DIR/llava-v1.5-7b/mm_projector.bin \
--mm_vision_select_layer -2 \
--mm_use_im_start_end False \
--mm_use_im_patch_token False \
--image_aspect_ratio pad \
--bf16 True \
--output_dir $OUTPUT_DIR \
--num_train_epochs 1 \
--per_device_train_batch_size 32 \
--per_device_eval_batch_size 4 \
--gradient_accumulation_steps 1 \
--evaluation_strategy 'no' \
--save_strategy 'epoch' \
--save_steps 10000 \
--save_total_limit 1 \
--learning_rate 2e-4 \
--weight_decay 0. \
--warmup_ratio 0.03 \
--lr_scheduler_type 'cosine' \
--logging_steps 1 \
--tf32 True \
--model_max_length 2048 \
--gradient_checkpointing True \
--lazy_preprocess True \
--dataloader_num_workers 16 \
--report_to wandb \
--deepspeed ./scripts/zero2.json"
Do we need any further fine-tuning? (I read in the paper and in this repo that there is a separate grounding fine-tuning stage.) I am a bit confused about the procedure now, and I would really appreciate any suggestions from you on this.
Best,
from geochat.
I did not do any further fine-tuning, because the paper does not give enough details.
Based on my tests, the VQA and scene classification metrics are close to the results reported in the paper. However, there is an obvious gap in the region grounding metric. Just so you know, I did not test the referring task.
from geochat.