Comments (6)
If you fine-tune the model with full parameters it usually does not cause problems. But if you fine-tune the model with a PEFT method such as LoRA, it may cause problems.
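Roughly, the reason is that LoRA adapters alone never update the newly added embedding rows. A minimal sketch of the usual workaround, assuming peft's LoraConfig (a later comment in this thread uses the same idea):

```python
from peft import LoraConfig

# Sketch: after setup_chat_format adds new tokens, also train (and save) full
# copies of the resized embedding and output layers alongside the LoRA adapters;
# otherwise the new tokens keep their freshly initialized embeddings.
peft_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    modules_to_save=["embed_tokens", "lm_head"],  # full fine-tune of these layers
)
```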
@AIR-hl I see. I can see why that is the case, as full-parameter tuning updates the embeddings of the new tokens in the input embedding layer. I have one closely related question: how is training a model with a chat template enabled different from using formatting_func and data_collator? Conceptually I feel they aim to achieve the same goal, and the latter is easily found in a lot of tutorials/code online. However, I feel the official Hugging Face documentation does not address their distinction explicitly. Is there something special that only using a chat template can achieve?
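For illustration, a rough sketch of the two approaches as I understand them (the exact SFTTrainer arguments are an assumption and differ between trl versions):

```python
from trl import SFTTrainer

# Approach 1 (assumed behavior): pass a conversational dataset with a "messages"
# column and let SFTTrainer apply tokenizer.chat_template when building the text.
trainer = SFTTrainer(model=model, tokenizer=tokenizer, train_dataset=chat_dataset)

# Approach 2: build the training text yourself with formatting_func; here the chat
# template is applied explicitly, but any prompt format could be assembled instead.
def formatting_func(example):
    return tokenizer.apply_chat_template(example["messages"], tokenize=False)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=chat_dataset,
    formatting_func=formatting_func,
)
```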
Update: actually there might still be a problem. If setup_chat_format only added additional special tokens for the beginning/end of a turn in a dialogue, that would be fine. But the current implementation also replaces the original bos and eos tokens regardless of what model is used. I think this would render pre-training useless.
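To make the concern concrete, a small sketch of what setup_chat_format does to a tokenizer, as far as I can tell (the model name is just an example; the tokens shown are the ChatML defaults in current trl):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import setup_chat_format

name = "meta-llama/Llama-2-7b-hf"  # example model
model = AutoModelForCausalLM.from_pretrained(name)
tokenizer = AutoTokenizer.from_pretrained(name)

print(len(tokenizer), tokenizer.bos_token, tokenizer.eos_token)  # 32000 <s> </s>

# Installs a ChatML chat_template, adds <|im_start|>/<|im_end|> as new special
# tokens, sets them as bos/eos, and resizes the model's embeddings to match.
model, tokenizer = setup_chat_format(model, tokenizer)

print(len(tokenizer), tokenizer.bos_token, tokenizer.eos_token)  # 32002 <|im_start|> <|im_end|>
```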
RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
size mismatch for base_model.model.model.embed_tokens.weight: copying a param with shape torch.Size([32002, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).
size mismatch for base_model.model.lm_head.weight: copying a param with shape torch.Size([32002, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).
any idea?
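One guess, sketched below: if the adapter was trained after setup_chat_format (vocab 32000 -> 32002), the base model has to be resized to the checkpoint's vocabulary before the adapter is attached (paths and model names here are placeholders):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

adapter_dir = "path/to/your/adapter"  # placeholder: directory with the saved LoRA checkpoint

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")  # example base model
tokenizer = AutoTokenizer.from_pretrained(adapter_dir)  # tokenizer saved with the two added tokens

# Grow embed_tokens / lm_head to 32002 rows so the adapter's saved weights fit.
base.resize_token_embeddings(len(tokenizer))

model = PeftModel.from_pretrained(base, adapter_dir)
```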
Were you using LoRA to fine-tune your model?
@zyzhang1130 In fact, setup_chat_format just provides a convenient way to format chat data stored as JSON; you can also customize the chat template based on the model's existing bos and eos tokens. The above is just my understanding.
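For example, a custom template that keeps the model's original special tokens and leaves the vocabulary untouched could look like this (the template itself is just an illustration, not one shipped with trl):

```python
# A minimal Jinja chat template that only uses the existing bos/eos tokens.
tokenizer.chat_template = (
    "{{ bos_token }}"
    "{% for message in messages %}"
    "{{ message['role'] }}: {{ message['content'] }}{{ eos_token }}\n"
    "{% endfor %}"
)

text = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hello"}], tokenize=False
)
print(text)
```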
@zyzhang1130 yes
peft_config = LoraConfig(
    lora_alpha=16,
    lora_dropout=0.05,
    r=16,
    bias="none",
    task_type="CAUSAL_LM",
    base_model_name_or_path=model_id,
    modules_to_save=["lm_head", "embed_tokens"],
)
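If so, a likely explanation for the error above: modules_to_save=["lm_head", "embed_tokens"] stores full copies of those layers, already resized to 32002 rows by setup_chat_format, inside the adapter checkpoint, so the base model needs to be resized to the same vocabulary size (or setup_chat_format re-applied) before PeftModel.from_pretrained can load it.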
Related Issues (20)
- Possible bug in SFTTrainer HOT 5
- Overflow with padding left warning.
- TR-DPO : Why is the loss not changing at all, and reward/accuracies and reward/margins always = 0? HOT 1
- scripts/dpo.py : Unable to train custom gpt2 model
- Support packing for pretokenized datasets
- DPO Evalution with WandB triggers a `cannot pickle '_thread.lock' object` failure HOT 11
- Will long text be truncated and split into different examples when using packing?
- Some questions about PPO trainer
- Processing issue in Anthropic HH dataset HOT 1
- prompts are not used in `WinRateCallback` HOT 1
- Use `WinRateCallback` without `ref_model` HOT 3
- Bugs in examples/scripts/chat.py
- CUDA error in PPO Trainer
- RPO Loss Inconsistency
- Incorrect reference responses when using PEFT with PPOTrainer HOT 1
- Online DPO scheduler step before optimizer step HOT 1
- push_to_hub from local model HOT 1
- ImportError: cannot import name 'DDPOConfig' from 'trl' (unknown location)
- Why are instructions not masked when performing VSFT for LLaVa?
- SFTTrainer for non-packed dataset HOT 1