hiyouga / FastEdit
🩹 Editing large language models within 10 seconds ⚡
License: Apache License 2.0
See title: after editing, are the model's weights simply the original weights modified in place?
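For context, a ROME-style edit is a rank-one update added directly to a single weight matrix, so the edited model is indeed the original weights plus a small in-place delta. A minimal conceptual sketch (not FastEdit's actual code):

```python
import torch

# Sketch: a ROME-style edit is a rank-one, in-place change to one weight matrix.
# W_new = W_old + delta_u (outer product) delta_v; all other weights are untouched.
W = torch.randn(8, 8)
W_old = W.clone()

delta_u = torch.randn(8)
delta_v = torch.randn(8)

with torch.no_grad():
    W += delta_u.unsqueeze(1) @ delta_v.unsqueeze(0)

# The difference from the original weights is exactly rank one.
diff = W - W_old
assert int(torch.linalg.matrix_rank(diff)) == 1
```

Because the update is stored back into the same parameter, saving the edited model with `save_pretrained()` saves modified weights, not an adapter.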
Is there complete documentation for the config and template arguments? Specifying them as follows gives me an error:
CUDA_VISIBLE_DEVICES=0 python -m fastedit.editor \
    --data data/example.json \
    --model baichuan-inc/Baichuan-7B \
    --config Baichuan-7B \
    --template default
Traceback (most recent call last):
File "/home/ec2-user/SageMaker/conda/chatglm_etuning/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/ec2-user/SageMaker/conda/chatglm_etuning/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/ec2-user/SageMaker/FastEdit/fastedit/editor.py", line 71, in <module>
fire.Fire(test_rome)
File "/home/ec2-user/SageMaker/conda/chatglm_etuning/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/home/ec2-user/SageMaker/conda/chatglm_etuning/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/home/ec2-user/SageMaker/conda/chatglm_etuning/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/home/ec2-user/SageMaker/FastEdit/fastedit/editor.py", line 43, in test_rome
hparams = ROMEHyperParams.from_name(config)
File "/home/ec2-user/SageMaker/FastEdit/fastedit/rome/rome_hparams.py", line 91, in from_name
raise NotImplementedError
NotImplementedError
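The `NotImplementedError` comes from `ROMEHyperParams.from_name`: the value passed to `--config` is not a registered config name (the supported names live in FastEdit's `rome_hparams.py`; `llama-7b` and `gpt2-xl` appear elsewhere in this thread). A hypothetical sketch of the dispatch pattern, with example names that are not FastEdit's authoritative list:

```python
# Hypothetical illustration of a name -> hyperparameter registry that raises
# NotImplementedError for unknown names, as ROMEHyperParams.from_name does.
# The names and values below are examples only, not FastEdit's real registry.
KNOWN_CONFIGS = {
    "llama-7b": {"layers": [5]},
    "gpt2-xl": {"layers": [17]},
}

def from_name(name: str) -> dict:
    if name not in KNOWN_CONFIGS:
        raise NotImplementedError(f"no hyperparameters registered for {name!r}")
    return KNOWN_CONFIGS[name]

print(from_name("llama-7b"))  # {'layers': [5]}
```

So the fix is usually just to pass one of the recognized config names rather than an arbitrary model identifier.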
I'm new to this, many thanks!

Running the command:
python -m fastedit.editor --data data/fastedit.json --model ../../weights/Baichuan-13B-Chat/ --config llama-7b --template baichuan
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/conda/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/workspace/llm/baichuan/code/FastEdit/fastedit/editor.py", line 79, in <module>
    fire.Fire(test_rome)
  File "/opt/conda/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/opt/conda/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/opt/conda/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/workspace/llm/baichuan/code/FastEdit/fastedit/editor.py", line 55, in test_rome
    model_new, _ = apply_rome_to_model(
  File "/workspace/llm/baichuan/code/FastEdit/fastedit/rome/rome_main.py", line 56, in apply_rome_to_model
    deltas = execute_rome(model, tokenizer, request, hparams, batch_first)
  File "/workspace/llm/baichuan/code/FastEdit/fastedit/rome/rome_main.py", line 97, in execute_rome
    weights = {f"{hparams.rewrite_module_tmp.format(layer)}.weight":
  File "/workspace/llm/baichuan/code/FastEdit/fastedit/rome/rome_main.py", line 98, in <dictcomp>
    nethook.get_parameter(model, f"{hparams.rewrite_module_tmp.format(layer)}.weight")
  File "/workspace/llm/baichuan/code/FastEdit/fastedit/utils/nethook.py", line 372, in get_parameter
    raise LookupError(name)
How can I run editing across multiple GPUs? Even with the --checkpointing flag, larger models such as 13B and 70B still hit OOM.

After saving the model with save_pretrained(), inference produces NaN errors.
CUDA_VISIBLE_DEVICES=7 python -m fastedit.editor \
--data data/example.json \
--model /path/to/Llama-2-7b-chat-hf \
--config llama-7b \
--template default
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
The tensor becomes inf during generation. Does this method only work with pretrained models like Llama-2-7b-hf?
loss 3.28 = 3.28 + 0.0 avg prob of [Rishi Sunak] 0.0498
loss nan = nan + nan avg prob of [Rishi Sunak] nan
loss nan = nan + nan avg prob of [Rishi Sunak] nan
loss nan = nan + nan avg prob of [Rishi Sunak] nan
The gradient of the delta weight becomes NaN after the first backward pass.
By wrapping the backward call:
with torch.autograd.detect_anomaly():
    loss.backward()
we caught a runtime error:
RuntimeError: Function 'MmBackward0' returned nan values in its 0th output.
I suspect it may be related to the ALiBi attention masks of Baichuan-13B.
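The `detect_anomaly` technique used above can be reproduced in isolation. The following self-contained sketch shows how it pinpoints the first backward function that emits NaN, using a contrived `sqrt`-at-zero graph rather than Baichuan's ALiBi path:

```python
import torch

# torch.autograd.detect_anomaly() raises as soon as a backward function
# returns NaN, naming the offending node (here "SqrtBackward0").
x = torch.zeros(1, requires_grad=True)

msg = ""
try:
    with torch.autograd.detect_anomaly():
        y = x.sqrt()         # y = 0, but dy/dx = inf at x = 0
        z = (y * y).sum()    # dz/dy = 0, so dz/dx = 0 * inf = nan in backward
        z.backward()
except RuntimeError as err:
    msg = str(err)

print("anomaly:", msg)
```

Running with anomaly detection is slow, so it is best enabled only while hunting a NaN like the one reported here.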
Example:
[{"prompt": "{} was born in a city ", "subject": "Ada Yonath", "target": "Frankfurt",
"queries": ["The birth city of Ada Yonath was "]}]
Command:
CUDA_VISIBLE_DEVICES=0 python -m fastedit.editor --data nobel_dataset.json --model bigscience/bloom-7b1 --config bloom-7b1
Output:
################################
#                              #
#  Retrieving hyperparameters  #
#                              #
################################
ROMEHyperParams(layers=[5], fact_token='subject_last', v_num_grad_steps=20, v_lr=0.2, v_loss_layer=29, v_weight_decay=0.001, clamp_norm_factor=4, kl_factor=0.0625, mom2_adjustment=False, rewrite_module_tmp='transformer.h.{}.mlp.dense_4h_to_h', layer_module_tmp='transformer.h.{}', mlp_module_tmp='transformer.h.{}.mlp', attn_module_tmp='transformer.h.{}.self_attention', ln_f_module='transformer.ln_f', lm_head_module='lm_head', mom2_dataset='wikipedia', mom2_n_samples=100000, mom2_dtype='float16')
################################
#                              #
#  Generating pre-update text  #
#                              #
################################
The birth city of Ada Yonath was Tel Aviv, Israel. She was born in the Tel Aviv neighborhood of Neve Shalom. Her father, Yitzhak Yonath, was a professor of physics at the Technion, and her mother, Shulamit, was a teacher. She has two brothers, Yaron and Yitzhak, and two sisters, Shira and Shulamit. She has a younger sister, Yael, who is a mathematician. She has a
############################
#                          #
#  Applying rome to model  #
#                          #
############################
Executing ROME algorithm for the update: [Ada Yonath was born in a city ] -> [Frankfurt]
Computing left vector (u)...
Selected u projection object Ada Yonath
Left vector shape: torch.Size([16384])
Computing right vector (v)
Traceback (most recent call last):
File "/opt/conda/envs/fastedit/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/opt/conda/envs/fastedit/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/maxim758/FastEdit/fastedit/editor.py", line 79, in <module>
fire.Fire(test_rome)
File "/opt/conda/envs/fastedit/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/opt/conda/envs/fastedit/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/opt/conda/envs/fastedit/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/home/maxim758/FastEdit/fastedit/editor.py", line 55, in test_rome
model_new, _ = apply_rome_to_model(
File "/home/maxim758/FastEdit/fastedit/rome/rome_main.py", line 56, in apply_rome_to_model
deltas = execute_rome(model, tokenizer, request, hparams, batch_first)
File "/home/maxim758/FastEdit/fastedit/rome/rome_main.py", line 118, in execute_rome
right_vector: torch.Tensor = compute_v(
File "/home/maxim758/FastEdit/fastedit/rome/compute_v.py", line 47, in compute_v
rewriting_targets[i, -target_len-1:-1] = input_tok["input_ids"][i, -target_len:].clone() # build labels
RuntimeError: The expanded size of the tensor (0) must match the existing size (18) at non-singleton dimension 0. Target sizes: [0]. Tensor sizes: [18]
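The shape mismatch can be reproduced with the same slice-assignment pattern `compute_v.py` uses: when the target span is longer than the row it is being written into, the negative slice silently clamps at the boundary and no longer matches the source length. A minimal illustration with an 18-token target, loosely mirroring the reported sizes:

```python
import torch

# Slice-assignment with negative indices clamps at the row boundary, so the
# left-hand side can end up shorter than the right-hand side being assigned.
row = torch.zeros(10)           # e.g. a padded prompt row with 10 positions
target = torch.arange(18.0)     # 18 target tokens

msg = ""
try:
    row[-19:-1] = target        # lhs slice has only 9 elements -> mismatch
except RuntimeError as err:
    msg = str(err)

print(msg)
```

In other words, the error suggests the tokenized target is longer than the space reserved for it in the padded batch, so checking prompt/target tokenization lengths for this dataset entry is a reasonable first step.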
Hello,
Thank you very much for this implementation.
In the Llama implementation, I wonder why and how you chose to edit the down_proj layer rather than gate_proj or up_proj in the MLP module? Thank you very much!
Best,
Wenyue
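For reference, the Llama MLP computes `down_proj(act(gate_proj(x)) * up_proj(x))`, so `down_proj` is the matrix that writes the MLP's output back into the residual stream, which is where ROME inserts its key-value association. A stripped-down sketch of that block (simplified from the Hugging Face implementation; dimensions here are toy values, llama-7b uses 4096/11008):

```python
import torch
import torch.nn as nn

class LlamaStyleMLP(nn.Module):
    """Simplified sketch of the Llama MLP block (no bias, SiLU gate)."""
    def __init__(self, hidden: int, intermediate: int):
        super().__init__()
        self.gate_proj = nn.Linear(hidden, intermediate, bias=False)
        self.up_proj = nn.Linear(hidden, intermediate, bias=False)
        # down_proj maps back to the residual stream; this is the
        # weight matrix that ROME rewrites.
        self.down_proj = nn.Linear(intermediate, hidden, bias=False)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.down_proj(self.act(self.gate_proj(x)) * self.up_proj(x))

mlp = LlamaStyleMLP(hidden=64, intermediate=172)
out = mlp(torch.randn(1, 64))
print(out.shape)  # torch.Size([1, 64])
```

The "Left vector shape: torch.Size([11008])" lines in this thread correspond to the intermediate dimension, i.e. the input side of `down_proj`.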
Command executed:
python -m fastedit.editor \
--data data/example.json \
--model ../internlm-chat-7b \
--config llama-7b \
--template intern
Output:
Loading checkpoint shards: 100%|██████████| 2/2 [00:08<00:00, 4.37s/it]
################################
# #
# Retrieving hyperparameters #
# #
################################
ROMEHyperParams(layers=[5], fact_token='subject_last', v_num_grad_steps=20, v_lr=0.1, v_loss_layer=31, v_weight_decay=0.001, clamp_norm_factor=4, kl_factor=0.0625, mom2_adjustment=False, rewrite_module_tmp='model.layers.{}.mlp.down_proj', layer_module_tmp='model.layers.{}', mlp_module_tmp='model.layers.{}.mlp', attn_module_tmp='model.layers.{}.self_attn', ln_f_module='model.norm', lm_head_module='lm_head', mom2_dataset='wikipedia', mom2_n_samples=100000, mom2_dtype='float16')
################################
# #
# Generating pre-update text #
# #
################################
The prime minister of the United Kingdom is David Cameron<eoa>
The name of prime minister of the UK is The current prime minister of the UK is Boris Johnson.<eoa>
What is Japan's prime minister called? Shinzo Abe<eoa>
The name of Japan's prime minister is Fumio Kishida<eoa>
############################
# #
# Applying rome to model #
# #
############################
Executing ROME algorithm for the update: [The prime minister of the UK is] -> [Rishi Sunak]
Computing left vector (u)...
Selected u projection object UK
Left vector shape: torch.Size([11008])
Computing right vector (v)
Lookup index found: -6 | Sentence: The prime minister of the UK isRishi Sunak | Token: UK
Rewrite layer is 5
Tying optimization objective to 31
Recording initial value of v*
loss 5.91 = 5.91 + 0.0 avg prob of [Rishi Sunak] 0.016
loss 3.773 = 3.752 + 0.021 avg prob of [Rishi Sunak] 0.0514
loss 2.498 = 2.473 + 0.025 avg prob of [Rishi Sunak] 0.1038
loss 1.481 = 1.454 + 0.027 avg prob of [Rishi Sunak] 0.2539
loss 0.769 = 0.738 + 0.031 avg prob of [Rishi Sunak] 0.4997
loss 0.273 = 0.235 + 0.037 avg prob of [Rishi Sunak] 0.804
loss 0.083 = 0.039 + 0.043 avg prob of [Rishi Sunak] 0.9628
loss 0.054 = 0.01 + 0.044 avg prob of [Rishi Sunak] 0.9896
loss 0.05 = 0.005 + 0.045 avg prob of [Rishi Sunak] 0.9952
loss 0.05 = 0.004 + 0.047 avg prob of [Rishi Sunak] 0.9965
loss 0.05 = 0.003 + 0.047 avg prob of [Rishi Sunak] 0.9971
loss 0.049 = 0.003 + 0.047 avg prob of [Rishi Sunak] 0.9974
loss 0.048 = 0.002 + 0.046 avg prob of [Rishi Sunak] 0.9977
loss 0.049 = 0.002 + 0.047 avg prob of [Rishi Sunak] 0.9978
loss 0.048 = 0.002 + 0.046 avg prob of [Rishi Sunak] 0.9979
loss 0.046 = 0.002 + 0.044 avg prob of [Rishi Sunak] 0.9979
loss 0.045 = 0.002 + 0.043 avg prob of [Rishi Sunak] 0.998
loss 0.043 = 0.002 + 0.041 avg prob of [Rishi Sunak] 0.9982
loss 0.04 = 0.002 + 0.038 avg prob of [Rishi Sunak] 0.9982
loss 0.037 = 0.002 + 0.035 avg prob of [Rishi Sunak] 0.9983
Delta norm: 34.503
Change in target norm: 9.031 to 35.53 => 26.499
Division Factor: 4.312
Traceback (most recent call last):
File "/var/chris/anaconda3/envs/fastedit/lib/python3.10/site-packages/numpy/core/fromnumeric.py", line 59, in _wrapfunc
return bound(*args, **kwds)
TypeError: round() received an invalid combination of arguments - got (out=NoneType, decimals=int, ), but expected one of:
* ()
* (*, int decimals)
didn't match because some of the keywords were incorrect: out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/var/chris/anaconda3/envs/fastedit/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/var/chris/anaconda3/envs/fastedit/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/raid/Chris_yuzhang/FastEdit/fastedit/editor.py", line 71, in <module>
fire.Fire(test_rome)
File "/var/chris/anaconda3/envs/fastedit/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/var/chris/anaconda3/envs/fastedit/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/var/chris/anaconda3/envs/fastedit/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/raid/Chris_yuzhang/FastEdit/fastedit/editor.py", line 52, in test_rome
model_new, _ = apply_rome_to_model(
File "/raid/Chris_yuzhang/FastEdit/fastedit/rome/rome_main.py", line 56, in apply_rome_to_model
deltas = execute_rome(model, tokenizer, request, hparams, batch_first)
File "/raid/Chris_yuzhang/FastEdit/fastedit/rome/rome_main.py", line 118, in execute_rome
right_vector: torch.Tensor = compute_v(
File "/raid/Chris_yuzhang/FastEdit/fastedit/rome/compute_v.py", line 161, in compute_v
print(f"Right vector norm: {np.round(right_vector.norm(), 3)}")
File "/var/chris/anaconda3/envs/fastedit/lib/python3.10/site-packages/numpy/core/fromnumeric.py", line 3360, in round
return _wrapfunc(a, 'round', decimals=decimals, out=out)
File "/var/chris/anaconda3/envs/fastedit/lib/python3.10/site-packages/numpy/core/fromnumeric.py", line 68, in _wrapfunc
return _wrapit(obj, method, *args, **kwds)
File "/var/chris/anaconda3/envs/fastedit/lib/python3.10/site-packages/numpy/core/fromnumeric.py", line 45, in _wrapit
result = getattr(asarray(obj), method)(*args, **kwds)
File "/var/chris/anaconda3/envs/fastedit/lib/python3.10/site-packages/torch/_tensor.py", line 970, in __array__
return self.numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
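This one has a straightforward workaround: `np.round` fails because NumPy tries to convert the CUDA tensor to a host array. Extracting the scalar with `.item()` (which also copies it to the host) and rounding in pure Python avoids both the `round(out=...)` and the CUDA-conversion errors. A sketch of the fix (the failing line lives in `compute_v.py`):

```python
import torch

# np.round(cuda_tensor) fails because NumPy internally calls tensor.numpy(),
# which is not allowed for CUDA tensors. Pull the scalar out first instead.
device = "cuda" if torch.cuda.is_available() else "cpu"
right_vector = torch.randn(4096, device=device)

norm = round(right_vector.norm().item(), 3)
print(f"Right vector norm: {norm}")
```

`Tensor.item()` works for any 0-dim tensor regardless of device, so the print works for both CPU and GPU runs.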
Is there currently a plan to add support for the Qwen models?
I'd like to understand GPU memory usage: when editing Baichuan-7B on a 24 GB card, some edits succeed while others OOM, and even slightly longer sentences trigger OOM. What is the relationship between the length of the dataset examples and memory consumption?
The model is: Llama-2-7b-chat (https://huggingface.co/meta-llama/Llama-2-7b-chat)
{'prompt': 'A patient diagnosed with carcinoma of {} presented with a serum calcium level of 16.4 mmol/L. What will be the first step in management?', 'subject': 'lung', 'target': 'IV fluids and furosemide', 'queries': []}
Executing ROME algorithm for the update: [A patient diagnosed with carcinoma of lung presented with a serum calcium level of 16.4 mmol/L. What will be the first step in management?] -> [IV fluids and furosemide]
Computing left vector (u)...
Selected u projection object lung
Left vector shape: torch.Size([11008])
Computing right vector (v)
Lookup index found: -37 | Sentence: A patient diagnosed with carcinoma of lung presented with a serum calcium level of 16.4 mmol/L. What will be the first step in management?IV fluids and furosemide | Token: lung
Rewrite layer is 5
Tying optimization objective to 31
Recording initial value of v*
Traceback (most recent call last):
File "/mnt/lustre/bo/medical_llm/evaluate_model_with_multiple_datasets.py", line 300, in
File "/mnt/lustre/bo/medical_llm/edit_util.py", line 50, in edit_model
model_new, _ = apply_rome_to_model(
File "/mnt/lustre/bo/medical_llm/FastEdit/fastedit/rome/rome_main.py", line 56, in apply_rome_to_model
deltas = execute_rome(model, tokenizer, request, hparams, batch_first)
File "/mnt/lustre/bo/medical_llm/FastEdit/fastedit/rome/rome_main.py", line 118, in execute_rome
right_vector: torch.Tensor = compute_v(
File "/mnt/lustre/bo/medical_llm/FastEdit/fastedit/rome/compute_v.py", line 97, in compute_v
logits = model(**input_tok).logits
File "/home/bo/anaconda3/envs/lora/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/bo/anaconda3/envs/lora/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 1038, in forward
outputs = self.model(
File "/home/bo/anaconda3/envs/lora/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/bo/anaconda3/envs/lora/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 925, in forward
layer_outputs = decoder_layer(
File "/home/bo/anaconda3/envs/lora/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/bo/anaconda3/envs/lora/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 632, in forward
hidden_states = self.input_layernorm(hidden_states)
File "/home/bo/anaconda3/envs/lora/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/bo/anaconda3/envs/lora/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 113, in forward
return self.weight * hidden_states.to(input_dtype)
RuntimeError: Inference tensors cannot be saved for backward. To work around you can make a clone to get a normal tensor and use it in autograd.
thx
Does the dataset have to follow exactly the format of the provided example file?
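From the examples in this thread, the data file is a JSON list of edit requests with `prompt` (optionally containing a `{}` placeholder for the subject), `subject`, `target`, and a possibly empty `queries` list. A minimal sketch of writing such a file; the field names are taken from `data/example.json`, and fields beyond these are not assumed to be supported:

```python
import json

# One edit request per list entry, following the format of data/example.json.
requests = [
    {
        "prompt": "The prime minister of the {} is",  # "{}" is filled with subject
        "subject": "UK",
        "target": "Rishi Sunak",
        "queries": ["The name of the prime minister of the UK is"],
    }
]

with open("my_edits.json", "w", encoding="utf-8") as f:
    json.dump(requests, f, ensure_ascii=False, indent=2)
```

The file is then passed via `--data my_edits.json`.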
################################
#                              #
#  Retrieving hyperparameters  #
#                              #
################################
Traceback (most recent call last):
File "/home/lyn/miniconda3/envs/fastedit/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/lyn/miniconda3/envs/fastedit/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/lyn/FastEdit/fastedit/editor.py", line 79, in <module>
fire.Fire(test_rome)
File "/home/lyn/miniconda3/envs/fastedit/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/home/lyn/miniconda3/envs/fastedit/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/home/lyn/miniconda3/envs/fastedit/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/home/lyn/FastEdit/fastedit/editor.py", line 46, in test_rome
hparams = ROMEHyperParams.from_name(config)
File "/home/lyn/FastEdit/fastedit/rome/rome_hparams.py", line 97, in from_name
raise NotImplementedError
NotImplementedError
Could you explain why the following command errors out? I have already downloaded the model locally.
CUDA_VISIBLE_DEVICES=0 python -m fastedit.editor \
    --data data/example.json \
    --model /home/lyn/gpt2-xl \
    --config gpt2-xl \
    --template default
Loading the 32-bit Llama-2-7b-chat-hf model directly:
model = AutoModelForCausalLM.from_pretrained(
model_path
)
produces the following error:
Executing ROME algorithm for the update: [A patient diagnosed with carcinoma of lung presented with a serum calcium level of 16.4 mmol/L. What will be the first step in management?] -> [IV fluids and furosemide]
Computing left vector (u)...
Selected u projection object lung
Left vector shape: torch.Size([11008])
Computing right vector (v)
Lookup index found: -37 | Sentence: A patient diagnosed with carcinoma of lung presented with a serum calcium level of 16.4 mmol/L. What will be the first step in management?IV fluids and furosemide | Token: lung
Rewrite layer is 5
Tying optimization objective to 31
Recording initial value of v*
loss 3.252 = 3.252 + 0.0 avg prob of [IV fluids and furosemide] 0.0395
loss 2.999 = 2.996 + 0.003 avg prob of [IV fluids and furosemide] 0.0508
loss 2.518 = 2.51 + 0.009 avg prob of [IV fluids and furosemide] 0.0823
loss 2.148 = 2.056 + 0.092 avg prob of [IV fluids and furosemide] 0.1295
loss 1.609 = 1.539 + 0.07 avg prob of [IV fluids and furosemide] 0.2176
loss 1.005 = 0.935 + 0.07 avg prob of [IV fluids and furosemide] 0.395
loss 0.443 = 0.349 + 0.094 avg prob of [IV fluids and furosemide] 0.7071
loss 0.168 = 0.09 + 0.079 avg prob of [IV fluids and furosemide] 0.9143
loss 0.059 = 0.025 + 0.034 avg prob of [IV fluids and furosemide] 0.9755
loss 0.055 = 0.019 + 0.036 avg prob of [IV fluids and furosemide] 0.9812
loss 0.042 = 0.008 + 0.035 avg prob of [IV fluids and furosemide] 0.9923
loss 0.037 = 0.005 + 0.032 avg prob of [IV fluids and furosemide] 0.9954
loss 0.035 = 0.004 + 0.031 avg prob of [IV fluids and furosemide] 0.9957
loss 0.032 = 0.004 + 0.028 avg prob of [IV fluids and furosemide] 0.9963
loss 0.029 = 0.003 + 0.026 avg prob of [IV fluids and furosemide] 0.9969
loss 0.026 = 0.003 + 0.023 avg prob of [IV fluids and furosemide] 0.9973
loss 0.023 = 0.002 + 0.02 avg prob of [IV fluids and furosemide] 0.9976
loss 0.02 = 0.002 + 0.018 avg prob of [IV fluids and furosemide] 0.9979
loss 0.019 = 0.002 + 0.017 avg prob of [IV fluids and furosemide] 0.998
loss 0.017 = 0.002 + 0.015 avg prob of [IV fluids and furosemide] 0.9982
Delta norm: 17.499
Change in target norm: 4.375 to 18.048 => 13.673
Division Factor: 3.688
Right vector norm: 4.746
Right vector shape: torch.Size([4096])
Traceback (most recent call last):
File "/data/a/zhangbo/CAP_medical_LLM/evaluate_model_with_multiple_datasets.py", line 300, in
edit_model(global_model, global_tokenizer, list_of_dicts, 'llama-7b')
File "/data/a/zhangbo/CAP_medical_LLM/edit_util.py", line 50, in edit_model
model_new, _ = apply_rome_to_model(
File "/data/a/zhangbo/CAP_medical_LLM/FastEdit/fastedit/rome/rome_main.py", line 56, in apply_rome_to_model
deltas = execute_rome(model, tokenizer, request, hparams, batch_first)
File "/data/a/zhangbo/CAP_medical_LLM/FastEdit/fastedit/rome/rome_main.py", line 134, in execute_rome
upd_matrix = left_vector.unsqueeze(1) @ right_vector.unsqueeze(0)
RuntimeError: expected scalar type Float but found Half
======
If the model is loaded in 16-bit:
model = AutoModelForCausalLM.from_pretrained(
model_path,
torch_dtype=torch.float16,
).bfloat16()
a similar error occurs:
RuntimeError: expected scalar type BFloat16 but found Half
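Both the `Float`/`Half` and `BFloat16`/`Half` errors come from combining an update computed in one dtype with weights stored in another. Casting the rank-one update to the weight's dtype before applying it sidesteps the mismatch; a hedged sketch of that idea (not FastEdit's actual patch):

```python
import torch

# Weights stored in half precision, ROME vectors computed in float32:
weight = torch.randn(8, 8, dtype=torch.float16)
left_vector = torch.randn(8, dtype=torch.float32)
right_vector = torch.randn(8, dtype=torch.float32)

# Build the rank-one update in float32 for accuracy, then cast it to the
# weight's dtype before the in-place addition, so no mixed-dtype op occurs.
upd_matrix = left_vector.unsqueeze(1) @ right_vector.unsqueeze(0)
with torch.no_grad():
    weight += upd_matrix.to(weight.dtype)

print(weight.dtype)  # torch.float16
```

The same cast would apply for bfloat16 weights; chaining `torch_dtype=torch.float16` with `.bfloat16()` at load time, as in the snippet above, mixes two half-precision formats in one model and is best avoided.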
My goal is to apply this to a model that has been further pretrained (pt) on a small dataset; how does that differ from the pretraining in LLaMA-Efficient-Tuning-main?
FastEdit/fastedit/rome/compute_u.py
Lines 73 to 82 in 76a8cf6
I noticed that get_inv_cov is not implemented; this value corresponds to the constant C in the original paper.
And for this code snippet:
FastEdit/fastedit/rome/compute_v.py
Line 156 in 76a8cf6
the calculation of Δ simply ignores this constant.
In my experiments, this can cause a small fraction of edits to fail to apply.
I wonder why the get_inv_cov function was left unimplemented. If it is tricky, is there an alternative solution, such as directly adding the constants for each model to the hyperparameters?
Looking forward to your reply. 🙂