
llama2-lora-fine-tuning's People

Contributors

little51


llama2-lora-fine-tuning's Issues

Multi-GPU + DeepSpeed + 4-bit QLoRA

Many thanks to the author! My situation: when I use 8 GPUs + DeepSpeed ZeRO-3 + 4-bit QLoRA, I hit the same error as microsoft/DeepSpeed#3775:

RuntimeError: expected there to be only one unique element in <generator object Init._convert_to_deepspeed_param.<locals>.all_gather_coalesced.<locals>.<genexpr> at 0x7f7019a30890>

In that thread the reporter tried patching it and still got the error, and suspects DeepSpeed simply does not support 4-bit QLoRA yet. If I run 4-bit QLoRA + DeepSpeed on a single GPU there is no error; it only appears once multiple GPUs are used. I see the repo offers a 4-bit quantized fine-tune, but the actual default parameters use 8-bit. Have you ever successfully fine-tuned with two GPUs + DeepSpeed + 4-bit QLoRA?
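
A note on a commonly reported workaround (an assumption on my part, not a fix confirmed by this repo): ZeRO-3 partitions model parameters across ranks, which conflicts with bitsandbytes 4-bit quantized weights, whereas letting each rank hold a full quantized copy of the model (plain DDP, or ZeRO-2, which shards only optimizer state and gradients) sidesteps the error. A minimal loading sketch, with the base-model path as a placeholder:

```python
# Multi-GPU 4-bit QLoRA loading that avoids ZeRO-3 parameter sharding.
# Assumes a launch like: torchrun --nproc_per_node=8 finetune.py
import os
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

local_rank = int(os.environ.get("LOCAL_RANK", 0))
model = AutoModelForCausalLM.from_pretrained(
    "llama-2-7b-chat-hf",         # placeholder base-model path
    quantization_config=bnb_config,
    device_map={"": local_rank},  # full quantized copy per rank; DDP syncs LoRA gradients
)
```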

A question about fine-tuning

I fine-tuned on an A10 with 60k samples (about 30 MB of data), hyperparameters close to yours, batch size 32, for 24 hours, and the model still cannot answer Chinese questions well. Do you have any advice or experience to share?

ValueError: Attention mask should be of size (4, 1, 240, 480), but is torch.Size([4, 1, 240, 240])

I hit this issue when fine-tuning Llama-2-7b-chat-hf with the example dataset:

Traceback (most recent call last):
  File "finetune-lora.py", line 656, in <module>
    train()
  File "finetune-lora.py", line 622, in train
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/transformers/trainer.py", line 1537, in train
    return inner_training_loop(
  File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/transformers/trainer.py", line 1854, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/transformers/trainer.py", line 2732, in training_step
    self.accelerator.backward(loss)
  File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/accelerate/accelerator.py", line 1905, in backward
    loss.backward(**kwargs)
  File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/torch/_tensor.py", line 488, in backward
    torch.autograd.backward(
  File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/torch/autograd/__init__.py", line 197, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/torch/autograd/function.py", line 267, in apply
    return user_fn(self, *args)
  File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/torch/utils/checkpoint.py", line 141, in backward
    outputs = ctx.run_function(*detached_inputs)
  File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 789, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 423, in forward
    raise ValueError(
ValueError: Attention mask should be of size (4, 1, 240, 480), but is torch.Size([4, 1, 240, 240])
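
The size mismatch (key/value length 480 vs. mask length 240) suggests stale key/value caching during the gradient-checkpointed backward pass: the attention layer re-runs with cached states concatenated to fresh ones. A common workaround for this class of error (an assumption, not a fix confirmed by this repo) is to disable the KV cache while training:

```python
# Turn off key/value caching during training; caching is only useful for
# generation and is known to clash with gradient checkpointing in some
# transformers versions.
model.config.use_cache = False
model.gradient_checkpointing_enable()
# Re-enable before inference: model.config.use_cache = True
```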

Is there a limit on the decoder output length?

parser.add_argument('--base_model', default="llama-2-7b-chat-hf/", type=str)
parser.add_argument('--lora_weights', default="tloen/alpaca-lora-7b", type=str,
                    help="If None, perform inference on the base model")
# Note: type=bool is a trap in argparse -- the command-line value arrives as a
# string, and any non-empty string (including "False") is truthy, so the flag
# could never actually be turned off. Parse the string explicitly instead.
parser.add_argument('--load_8bit', default=True,
                    type=lambda s: str(s).lower() in ("true", "1", "yes"),
                    help='load the base model in 8-bit for inference')

You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████| 3/3 [00:15<00:00, 5.12s/it]
Question: Write me a user login/registration system with a Vue front end, a Go back end, and a MySQL database design, and include the code.
This is a friendly reminder - the current text generation call will exceed the model's predefined maximum length (2048). Depending on the model, you may observe exceptions, performance degradation, or nothing at all.

Is there a limit on the output length? 2048 feels too short; how can I raise it?
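
The 2048 cap comes from the default max_length in the generation call or generation config, not from a hard decoder limit; Llama-2 itself supports a 4096-token context (model.config.max_position_embeddings). A sketch of raising it, assuming the script calls model.generate directly (the values here are illustrative):

```python
output = model.generate(
    **inputs,
    max_new_tokens=2048,   # tokens generated beyond the prompt
    # or: max_length=4096, # prompt + generated tokens combined (don't pass both)
)
```

Going past max_position_embeddings will not always crash, but quality degrades beyond the trained context window, which is what the friendly-reminder warning in the log is about.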

validation_files

How should I prepare the validation_files argument from my own training data?
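
One straightforward approach is to hold out a small random slice of your training set as the validation file. A minimal sketch, under the assumption that your data follows the same JSON format as the repo's example training file (the file names below are placeholders):

```python
import json
import random

# Load the full training set and carve out a small validation split.
with open("train_all.json", encoding="utf-8") as f:
    records = json.load(f)

random.seed(42)
random.shuffle(records)
cut = int(len(records) * 0.98)  # hold out ~2% for validation

with open("train.json", "w", encoding="utf-8") as f:
    json.dump(records[:cut], f, ensure_ascii=False, indent=2)
with open("validation.json", "w", encoding="utf-8") as f:
    json.dump(records[cut:], f, ensure_ascii=False, indent=2)
```

Then point the script's validation_files argument at validation.json; the validation split only needs to match the training file's format.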

ImportError: cannot import name 'import_path' from '_pytest.doctest'

How do I solve this? pytest is on the latest version.

Traceback (most recent call last):
  File "/data/home/scv9515/A_suke_file/ALLMs/llama2-lora-fine-tuning/finetune-lora.py", line 45, in <module>
    from transformers.testing_utils import CaptureLogger
  File "/data/home/scv9515/miniconda3/envs/bili/lib/python3.10/site-packages/transformers/testing_utils.py", line 131, in <module>
    from _pytest.doctest import (
ImportError: cannot import name 'import_path' from '_pytest.doctest' (/data/home/scv9515/miniconda3/envs/bili/lib/python3.10/site-packages/_pytest/doctest.py)
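
The import fails because transformers.testing_utils reaches into pytest internals (_pytest.doctest), and whether import_path is importable from there depends on the installed pytest version, so version skew between transformers and pytest triggers this. Matching the pytest version to what your transformers release expects (or upgrading transformers) is the usual remedy. Alternatively, since finetune-lora.py only needs CaptureLogger, a minimal drop-in removes the pytest dependency entirely; this is a from-scratch sketch of what I believe the helper does, not the library's own code:

```python
import logging
from io import StringIO

class CaptureLogger:
    """Context manager that captures a logger's output into self.out."""

    def __init__(self, logger: logging.Logger):
        self.logger = logger
        self.io = StringIO()
        self.sh = logging.StreamHandler(self.io)
        self.out = ""

    def __enter__(self):
        self.logger.addHandler(self.sh)  # start capturing
        return self

    def __exit__(self, *exc):
        self.logger.removeHandler(self.sh)
        self.out = self.io.getvalue()    # captured text, as the real helper exposes it
```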
