Comments (6)
@Luo-Z13, could you check the "model_type" field in the config.json of both your base model and the saved checkpoint? If it is "llava", please change it to "geochat". Let me know if that works.
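A minimal sketch of that fix, assuming a standard Hugging Face checkpoint layout where config.json sits at the top of the checkpoint directory (the helper name and path are illustrative, not part of GeoChat):

```python
# Flip "model_type" from "llava" to "geochat" in a checkpoint's config.json.
import json
from pathlib import Path

def patch_model_type(ckpt_dir, new_type="geochat"):
    """Rewrite config.json in place if its model_type is still 'llava'."""
    cfg_path = Path(ckpt_dir) / "config.json"
    cfg = json.loads(cfg_path.read_text())
    if cfg.get("model_type") == "llava":
        cfg["model_type"] = new_type
        cfg_path.write_text(json.dumps(cfg, indent=2))
    return cfg["model_type"]
```

Run it on both the base-model directory and the saved LoRA checkpoint directory before loading.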
Thank you very much, it works now.
from geochat.
Hi @Luo-Z13, thank you for your interest.
You need to change the image size from 336 to 504:
image = processor.preprocess(image, do_resize=True, crop_size={'height': 504, 'width': 504}, size={'shortest_edge': 504}, return_tensors='pt')['pixel_values'][0]
Please change this line in train.py (lines 690-691).
I have made the changes in the codebase as well. Let me know if it works now.
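Geometrically, that preprocess call follows the usual CLIP-style pipeline: resize so the shorter side is 504, then center-crop to 504x504. A pure-Python sketch of that arithmetic (the function names are illustrative, not GeoChat APIs):

```python
# Sketch of the "resize shortest edge, then center crop" geometry behind the
# preprocess call above, assuming the standard CLIP image-processor behavior.

def resized_dims(w, h, shortest_edge=504):
    """Scale so the shorter side equals shortest_edge, preserving aspect ratio."""
    scale = shortest_edge / min(w, h)
    return round(w * scale), round(h * scale)

def center_crop_dims(w, h, crop=504):
    """Dimensions after a center crop to at most crop x crop."""
    return min(w, crop), min(h, crop)

w, h = resized_dims(1024, 768)   # e.g. a 1024x768 tile -> (672, 504)
print(center_crop_dims(w, h))    # -> (504, 504)
```

Every input thus ends up as a 504x504 tensor, which is why the hard-coded 336 in train.py must be updated to match.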
Thank you for the response, the previous issue is now resolved. However, I am encountering CUDA OOM when training on 4×A100 (40 GB); details are as follows:
File "/miniconda-3/envs/geochat/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 216, in forward
down_proj = self.down_proj(self.act_fn(self.gate_proj(x)) * self.up_proj(x))
File "/miniconda-3/envs/geochat/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/miniconda-3/envs/geochat/lib/python3.10/site-packages/peft/tuners/lora.py", line 822, in forward
self.lora_B[self.active_adapter](
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.06 GiB (GPU 0; 39.39 GiB total capacity; 29.99 GiB already allocated; 719.12 MiB free; 36.89 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.07 GiB (GPU 1; 39.39 GiB total capacity; 30.12 GiB already allocated; 397.12 MiB free; 37.16 GiB reserved in total by PyTorch)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.04 GiB (GPU 2; 39.39 GiB total capacity; 29.76 GiB already allocated; 911.12 MiB free; 36.66 GiB reserved in total by PyTorch)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.04 GiB (GPU 3; 39.39 GiB total capacity; 29.67 GiB already allocated; 1.02 GiB free; 36.57 GiB reserved in total by PyTorch)
0%| | 0/2413 [00:44<?, ?it/s]
[2024-03-04 22:06:58,942] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 176970
[2024-03-04 22:06:59,647] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 176971
[2024-03-04 22:06:59,665] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 176972
[2024-03-04 22:06:59,681] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 176973
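A few generic mitigations for this kind of OOM, sketched below; none are GeoChat-specific, and the exact knobs in your launch script may differ. The batch-size numbers are illustrative assumptions, not values from the repo. As the error message itself suggests, PYTORCH_CUDA_ALLOC_CONF must be set before torch initializes CUDA:

```python
# Generic CUDA OOM mitigations (illustrative values, not GeoChat defaults).
import os

# 1) Reduce allocator fragmentation; the 128 MiB split size is a common starting
#    guess, not a fixed rule. Set this before importing torch / starting training.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

# 2) Trade a smaller per-device batch for more gradient-accumulation steps so the
#    effective batch size stays constant.
per_device_batch, grad_accum = 4, 4
effective_batch = per_device_batch * grad_accum   # 16, same as 16 x 1

# 3) Other common levers: enable gradient checkpointing, or use a DeepSpeed ZeRO
#    stage that offloads optimizer state, at some speed cost.
```

With only 40 GB per card, some combination of these is usually needed for 7B-scale LoRA fine-tuning at 504x504 resolution.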
The script merge_lora_weights.py seems to have an issue at the beginning (it still imports from llava). After I changed it from
from llava.model.builder import load_pretrained_model
from llava.mm_utils import get_model_name_from_path
to
from geochat.model.builder import load_pretrained_model
from geochat.mm_utils import get_model_name_from_path
an error occurred:
Traceback (most recent call last):
File "GeoChat/scripts/merge_lora_weights.py", line 24, in <module>
merge_lora(args)
File "GeoChat/scripts/merge_lora_weights.py", line 10, in merge_lora
tokenizer, model, image_processor, context_len = load_pretrained_model(args.model_path, args.model_base, model_name, device_map='cpu')
File "GeoChat/geochat/model/builder.py", line 110, in load_pretrained_model
model = AutoModelForCausalLM.from_pretrained(model_base, torch_dtype=torch.float16, low_cpu_mem_usage=True, device_map="auto")
File "miniconda-3/envs/geochat/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 461, in from_pretrained
config, kwargs = AutoConfig.from_pretrained(
File "miniconda-3/envs/geochat/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 998, in from_pretrained
config_class = CONFIG_MAPPING[config_dict["model_type"]]
File "miniconda-3/envs/geochat/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 710, in __getitem__
raise KeyError(key)
KeyError: 'llava'
Closing this issue for now, please reopen if you find any difficulties.