hitz-zentroa / GoLLIE
Guideline following Large Language Model for Information Extraction
Home Page: https://hitz-zentroa.github.io/GoLLIE/
License: Apache License 2.0
Hi,
I want to run this notebook. Is it possible to do so with the Hugging Face model instead of using the load_model method from the GoLLIE repo's source code?
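Loading the published checkpoint straight from the Hub is usually possible without the repo's helper. A minimal sketch, assuming the checkpoint id `HiTZ/GoLLIE-7B` and a `transformers` version that accepts these arguments (imports are kept inside the function so the sketch has no import-time dependencies):

```python
def load_gollie(model_id: str = "HiTZ/GoLLIE-7B"):
    """Load GoLLIE directly from the Hugging Face Hub (sketch)."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # the released weights are bfloat16
        device_map="auto",           # shard layers across available devices
        trust_remote_code=True,      # in case the repo ships custom modeling code
    )
    return model, tokenizer
```

From there the notebook's prompt-building and `model.generate(...)` cells should work unchanged.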
The link to the preprocessing code for ACE05 (Preprocessing script) seems to be broken. Could you provide it again, please?
Hi,
does the Span F1 score in your evaluation script consider the span index, similar to https://github.com/chakki-works/seqeval?
That is, 'spaceX' vs ('spaceX', 0, 1).
If not, how should I compare against CoNLL F1 scores from the literature? Thanks!
https://github.com/hitz-zentroa/GoLLIE/blob/main/src/tasks/utils_scorer.py#L44
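The two conventions the question contrasts can be made concrete with a small sketch: a positionless scorer compares label strings only, while a seqeval-style scorer compares `(label, start, end)` tuples, so a prediction with the right label but wrong offsets counts as an error only in the second case.

```python
from collections import Counter

def span_f1(gold, pred):
    """Micro F1 over multisets of spans (or of plain labels)."""
    gold_c, pred_c = Counter(gold), Counter(pred)
    tp = sum((gold_c & pred_c).values())          # exact matches
    precision = tp / max(sum(pred_c.values()), 1)
    recall = tp / max(sum(gold_c.values()), 1)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

gold = [("ORG", 0, 1), ("PER", 3, 5)]
pred = [("ORG", 2, 3), ("PER", 3, 5)]  # right labels, one wrong offset

print(span_f1(gold, pred))  # 0.5 -- offsets matter, ORG span is wrong
# Dropping the offsets makes both predictions count as correct:
print(span_f1([g[0] for g in gold], [p[0] for p in pred]))  # 1.0
```

If GoLLIE's scorer matches offsets, its numbers are directly comparable to seqeval-based CoNLL results; if not, the label-only score is an upper bound on the offset-aware one.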
The EE task in your paper is actually ED (event detection).
According to src/tasks/wikievents/prompts.py and src/tasks/wikievents/data_loader.py,
your COARSE_EVENT definitions contain only the trigger, and your EE task uses only COARSE_EVENT.
The model therefore only generates the trigger for each event, which makes this an event detection task, not event extraction.
Hi,
Thank you for uploading your code and awesome work to GitHub.
I have downloaded the ACE'05 dataset and would like to generate the code representation for it. Following your suggestions, I ran:
python preprocess_ace.py -i <path_to_raw_ace_files> -o <output_dir> -s <path_to_ACE05-E>
However, I ran into some issues, so I made the following changes:
(line 915)
    if language == "english":
        sgm_files = glob.glob(os.path.join(input_path, "*.sgm"))
(line 1171)
    input_dir = args.input  # was: os.path.join(args.input, args.lang.title())
after which I was able to run your code and get the following five files in the output directory:
dev.sentence.json
english.json
english.sentence.json
test.sentence.json
train.sentence.json
My first question: are the steps above correct?
If they are, I next ran the following command (because I only want to generate data for the ACE dataset):
python -m src.generate_data \
--configs \
${CONFIG_DIR}/ace_config.json \
--output ${OUTPUT_DIR} \
--overwrite_output_dir \
--include_examples
but I get the following errors:
Traceback (most recent call last):
File "/opt/sw/spack/apps/linux-centos8-x86_64/gcc-9.3.0/python-3.8.6-ff/lib/python3.8/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/opt/sw/spack/apps/linux-centos8-x86_64/gcc-9.3.0/python-3.8.6-ff/lib/python3.8/multiprocessing/pool.py", line 51, in starmapstar
return list(itertools.starmap(args[0], args[1]))
File "/scratch/ssrivas6/meta_events/GoLLIE-main/src/generate_data.py", line 35, in multicpu_generator
dataloader = dataloader_cls(config["train_file"], **config)
File "/scratch/ssrivas6/meta_events/GoLLIE-main/src/tasks/ace/data_loader.py", line 465, in __init__
raise ValueError(f"Argument {event['event_type']}:{argument['role']} not found!")
ValueError: Argument Movement:Transport:Person not found!
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/sw/spack/apps/linux-centos8-x86_64/gcc-9.3.0/python-3.8.6-ff/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/opt/sw/spack/apps/linux-centos8-x86_64/gcc-9.3.0/python-3.8.6-ff/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/scratch/ssrivas6/meta_events/GoLLIE-main/src/generate_data.py", line 246, in <module>
main(args)
File "/scratch/ssrivas6/meta_events/GoLLIE-main/src/generate_data.py", line 188, in main
pool.starmap(generator_fn, enumerate(configs))
File "/opt/sw/spack/apps/linux-centos8-x86_64/gcc-9.3.0/python-3.8.6-ff/lib/python3.8/multiprocessing/pool.py", line 372, in starmap
return self._map_async(func, iterable, starmapstar, chunksize).get()
File "/opt/sw/spack/apps/linux-centos8-x86_64/gcc-9.3.0/python-3.8.6-ff/lib/python3.8/multiprocessing/pool.py", line 771, in get
raise self._value
ValueError: Argument Movement:Transport:Person not found!
How can I resolve this error?
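One way to debug this class of error is to enumerate every `(event_type, role)` pair in the preprocessed data that the loader's mapping does not know, so they can all be fixed in one pass instead of crashing one at a time. This is a hypothetical sketch: `KNOWN_ROLES` stands in for the real role tables in `src/tasks/ace/data_loader.py`, and the `event_mentions`/`arguments` field names assume OneIE-style preprocessed JSON; adjust both if yours differ.

```python
# Placeholder for the loader's real event-type -> role tables.
KNOWN_ROLES = {
    ("Movement:Transport", "Artifact"),
    ("Movement:Transport", "Destination"),
}

def missing_roles(sentences, known=KNOWN_ROLES):
    """Collect every (event_type, role) pair absent from the mapping."""
    missing = set()
    for sentence in sentences:
        for event in sentence.get("event_mentions", []):
            for argument in event.get("arguments", []):
                pair = (event["event_type"], argument["role"])
                if pair not in known:
                    missing.add(pair)
    return missing

# Tiny fixture mimicking one preprocessed sentence:
sample = [{"event_mentions": [{"event_type": "Movement:Transport",
                               "arguments": [{"role": "Person"},
                                             {"role": "Artifact"}]}]}]
print(missing_roles(sample))  # {('Movement:Transport', 'Person')}
```

Running this over `english.sentence.json` would show whether `Movement:Transport:Person` is an isolated gap or one of several roles the mapping needs.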
RuntimeError: Internal Triton PTX codegen error:
ptxas /tmp/compile-ptx-src-38da7f, line 91; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-38da7f, line 91; error : Feature 'cvt with .bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-38da7f, line 102; error : Feature 'cvt with .f32.bf16' requires .target sm_80 or higher
... (the same '.bf16', 'cvt with .bf16', and 'cvt with .f32.bf16' errors repeat for PTX lines 92-170) ...
ptxas fatal : Ptx assembly aborted due to errors
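These errors mean the GPU's compute capability is below sm_80: bf16 PTX instructions exist only on Ampere (A100, RTX 30xx) and newer, so bf16 kernels cannot be compiled for older cards such as a T4 or V100. A common workaround is to pick the dtype from the device's capability; this helper is a sketch (fp16 may behave differently from the bf16 the model was trained in, so results can vary).

```python
def pick_dtype(capability):
    """Return a dtype name for a (major, minor) CUDA compute capability.

    bf16 requires sm_80 (compute capability major >= 8); older GPUs
    fall back to fp16.
    """
    major, _minor = capability
    return "bfloat16" if major >= 8 else "float16"

# With torch installed, you would wire it up roughly as:
#   import torch
#   dtype = getattr(torch, pick_dtype(torch.cuda.get_device_capability(0)))
print(pick_dtype((8, 0)))  # bfloat16  (A100)
print(pick_dtype((7, 5)))  # float16   (T4 / RTX 20xx)
```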
It's mentioned in the documentation but not in the source code, if I've understood correctly.
If I want to change the base model to something else for fine-tuning, what should I be aware of and what should I modify? I see the codebase has flash-attention modeling code for Llama 2. I'm just curious whether it is possible to simply load models from Hugging Face, or whether I still need to add the flash-attention code manually?
Thanks a lot!
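One plausible answer, assuming a recent `transformers` release: Flash Attention 2 is now built in for many architectures via the `attn_implementation` argument, so the repo's hand-patched Llama modeling code may not be needed at all. A sketch (the model id is only an illustrative placeholder, and imports are kept inside the function so the sketch has no import-time dependencies):

```python
def load_base_model(model_id: str, use_flash_attention: bool = True):
    """Load an arbitrary causal-LM base model from the Hub (sketch)."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        # flash_attention_2 needs the flash-attn package and an Ampere+ GPU;
        # otherwise fall back to PyTorch's built-in SDPA kernels.
        attn_implementation="flash_attention_2" if use_flash_attention else "sdpa",
        device_map="auto",
    )
    return model, tokenizer
```

Beyond attention, the main things to check when swapping the base model are the tokenizer (padding/eos tokens used by the training scripts) and the context length, since GoLLIE's code-style prompts can be long.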
Hi GoLLIE research team, I am currently in a group of Vietnamese university students who want to present your paper for an upcoming seminar in our "Introduction to Natural Language Processing" course. Our task is to summarize and explain the contents of your paper to our fellow students and lecturers.
To make it easier to understand for our classmates, we are interested in training GoLLIE using Vietnamese datasets. If it's possible, we would greatly appreciate it if you could provide us with some instructions on how to proceed with this. We sincerely enjoyed reading your paper and believe that it would greatly benefit our presentation.
Here are some datasets for the named-entity-recognition subtask that I found on Hugging Face:
We would be extremely grateful if you could provide us with any guidance or assistance on our endeavor. Please feel free to reach out if you have any questions or require more information from us. We are more than willing to cooperate to make this collaboration successful.
The checkpoint keeps getting killed. It seems to need 33 GB of memory because it is being loaded in fp32. Help!
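The arithmetic checks out: a 7B-parameter model in fp32 needs about 28 GB for the weights alone, which plus overhead matches the ~33 GB observed. Passing an explicit dtype halves that, and 4-bit quantization shrinks it to a few GB. A sketch, assuming the `HiTZ/GoLLIE-7B` checkpoint and that `bitsandbytes` is installed for the 4-bit path (imports live inside the function so the sketch is dependency-free until called):

```python
def approx_weight_gb(n_params: float, bytes_per_param: float) -> float:
    """Back-of-the-envelope memory for model weights, in GB."""
    return n_params * bytes_per_param / 1e9

print(approx_weight_gb(7e9, 4))    # 28.0 GB -- fp32, weights alone
print(approx_weight_gb(7e9, 2))    # 14.0 GB -- bf16/fp16
print(approx_weight_gb(7e9, 0.5))  # 3.5 GB  -- 4-bit

def load_gollie_low_memory(model_id: str = "HiTZ/GoLLIE-7B",
                           four_bit: bool = True):
    """Load with an explicit dtype or 4-bit quantization instead of fp32."""
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    kwargs = {"torch_dtype": torch.bfloat16, "device_map": "auto"}
    if four_bit:
        kwargs["quantization_config"] = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_compute_dtype=torch.bfloat16,
        )
    return AutoModelForCausalLM.from_pretrained(model_id, **kwargs)
```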
When generating the datasets, it seems to be taking a very long time, and I'm not sure whether it is actually completing. The log is full of warnings like:
WARNING:huggingface_hub.repocard:Repo card metadata block was not found. Setting CardData to empty.
NcbiDisease-NER-dev: 100%|███████████████████| 924/924 [00:03<00:00, 238.68it/s]
NcbiDisease-NER-test: 100%|██████████████████| 941/941 [00:04<00:00, 196.24it/s]
BC5CDR-NER-train-0: 100%|██████████████████| 4561/4561 [00:22<00:00, 199.49it/s]
BC5CDR-NER-test: 100%|█████████████████████| 4798/4798 [00:36<00:00, 130.22it/s]
WNUT17-NER-train-0: 100%|██████████████████| 3394/3394 [00:25<00:00, 133.29it/s]
BroadTwitter-NER-test: 100%|███████████████| 2002/2002 [00:13<00:00, 146.04it/s]
FabNER-NER-dev: 100%|███████████████████████| 2183/2183 [00:46<00:00, 46.57it/s]
CoNLL03-NER-train-0: 100%|███████████████| 14041/14041 [01:33<00:00, 149.84it/s]
OntoNotes5-NER-test: 100%|████████████████| 12217/12217 [04:51<00:00, 41.97it/s]
MultiNERD-NER-test: 100%|█████████████████| 32908/32908 [10:59<00:00, 49.93it/s]
... (interleaved progress bars for the remaining dataset splits and seeds omitted) ...
Describe the task
create custom task.ipynb file
Describe the bug
I set use_flash_attention=False in
model, tokenizer = load_model(
    inference=True,
    model_weights_name_or_path="/data2/home/ruiqi/GoLLIE/model",
    quantization=None,
    use_lora=False,
    force_auto_device_map=True,
    use_flash_attention=False,
    torch_dtype="auto",
    # torch_dtype="bfloat16"
)
Then everything went well until the "Run GoLLIE" step:
model_output = model.generate(
    **model_input.to(model.device),
    max_new_tokens=128,
    do_sample=False,
    min_new_tokens=0,
    num_beams=1,
    num_return_sequences=1,
)
and there was an error message:
RuntimeError                              Traceback (most recent call last)
File <timed exec>:1

File ~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/utils/_contextlib.py:115, in context_decorator.<locals>.decorate_context(*args, **kwargs)
    112 @functools.wraps(func)
    113 def decorate_context(*args, **kwargs):
    114     with ctx_factory():
--> 115         return func(*args, **kwargs)

File ~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/generation/utils.py:1673, in GenerationMixin.generate(self, inputs, generation_config, logits_processor, stopping_criteria, prefix_allowed_tokens_fn, synced_gpus, assistant_model, streamer, negative_prompt_ids, negative_prompt_attention_mask, **kwargs)
   1671 if generation_mode == GenerationMode.GREEDY_SEARCH:
   1672     # 11. run greedy search
-> 1673     return self.greedy_search(
   1674         input_ids,
   1675         logits_processor=logits_processor,
   1676         stopping_criteria=stopping_criteria,
   1677         pad_token_id=generation_config.pad_token_id,
   1678         eos_token_id=generation_config.eos_token_id,
   1679         output_scores=generation_config.output_scores,
   1680         return_dict_in_generate=generation_config.return_dict_in_generate,
   1681         synced_gpus=synced_gpus,
   1682         streamer=streamer,
   1683         **model_kwargs,
   1684     )

File ~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/generation/utils.py:2521, in GenerationMixin.greedy_search(self, input_ids, logits_processor, stopping_criteria, max_length, pad_token_id, eos_token_id, output_attentions, output_hidden_states, output_scores, return_dict_in_generate, synced_gpus, streamer, **model_kwargs)
   2518 model_inputs = self.prepare_inputs_for_generation(input_ids, **model_kwargs)
   2520 # forward pass to get next token
-> 2521 outputs = self(
   2522     **model_inputs,
   2523     return_dict=True,
   2524     output_attentions=output_attentions,
   2525     output_hidden_states=output_hidden_states,
   2526 )

File ~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
-> 1501     return forward_call(*args, **kwargs)

File ~/anaconda3/envs/llm/lib/python3.11/site-packages/accelerate/hooks.py:164, in add_hook_to_module.<locals>.new_forward(module, *args, **kwargs)
    163 else:
--> 164     output = module._old_forward(*args, **kwargs)
    165 return module._hf_hook.post_forward(module, output)

File ~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:1034, in LlamaForCausalLM.forward(self, input_ids, attention_mask, position_ids, past_key_values, inputs_embeds, labels, use_cache, output_attentions, output_hidden_states, return_dict)
-> 1034 outputs = self.model(
   1035     input_ids=input_ids,
   1036     attention_mask=attention_mask,
   1037     position_ids=position_ids,
   1038     past_key_values=past_key_values,
   1039     inputs_embeds=inputs_embeds,
   1040     use_cache=use_cache,
   1041     output_attentions=output_attentions,
   1042     output_hidden_states=output_hidden_states,
(traceback truncated)
[1043](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:1043) return_dict=return_dict,
[1044](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:1044) )
[1046](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:1046) hidden_states = outputs[0]
[1047](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:1047) if self.config.pretraining_tp > 1:
File [~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1501](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1501), in Module._call_impl(self, *args, **kwargs)
[1496](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1496) # If we don't have any hooks, we want to skip the rest of the logic in
[1497](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1497) # this function, and just call forward.
[1498](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1498) if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
[1499](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1499) or _global_backward_pre_hooks or _global_backward_hooks
[1500](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1500) or _global_forward_hooks or _global_forward_pre_hooks):
-> [1501](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1501) return forward_call(*args, **kwargs)
[1502](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1502) # Do not call functions when jit is used
[1503](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1503) full_backward_hooks, non_full_backward_hooks = [], []
File [~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:922](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:922), in LlamaModel.forward(self, input_ids, attention_mask, position_ids, past_key_values, inputs_embeds, use_cache, output_attentions, output_hidden_states, return_dict)
[912](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:912) layer_outputs = self._gradient_checkpointing_func(
[913](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:913) decoder_layer.__call__,
[914](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:914) hidden_states,
ref='~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:0'>0</a>;32m (...)
[919](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:919) use_cache,
[920](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:920) )
[921](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:921) else:
--> [922](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:922) layer_outputs = decoder_layer(
[923](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:923) hidden_states,
[924](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:924) attention_mask=attention_mask,
[925](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:925) position_ids=position_ids,
[926](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:926) past_key_value=past_key_value,
[927](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:927) output_attentions=output_attentions,
[928](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:928) use_cache=use_cache,
[929](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:929) )
[931](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:931) hidden_states = layer_outputs[0]
[933](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:933) if use_cache:
File [~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1501](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1501), in Module._call_impl(self, *args, **kwargs)
[1496](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1496) # If we don't have any hooks, we want to skip the rest of the logic in
[1497](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1497) # this function, and just call forward.
[1498](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1498) if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
[1499](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1499) or _global_backward_pre_hooks or _global_backward_hooks
[1500](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1500) or _global_forward_hooks or _global_forward_pre_hooks):
-> [1501](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1501) return forward_call(*args, **kwargs)
[1502](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1502) # Do not call functions when jit is used
[1503](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1503) full_backward_hooks, non_full_backward_hooks = [], []
File [~/anaconda3/envs/llm/lib/python3.11/site-packages/accelerate/hooks.py:164](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/accelerate/hooks.py:164), in add_hook_to_module.<locals>.new_forward(module, *args, **kwargs)
[162](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/accelerate/hooks.py:162) output = module._old_forward(*args, **kwargs)
[163](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/accelerate/hooks.py:163) else:
--> [164](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/accelerate/hooks.py:164) output = module._old_forward(*args, **kwargs)
[165](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/accelerate/hooks.py:165) return module._hf_hook.post_forward(module, output)
File [~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:672](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:672), in LlamaDecoderLayer.forward(self, hidden_states, attention_mask, position_ids, past_key_value, output_attentions, use_cache, **kwargs)
[669](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:669) hidden_states = self.input_layernorm(hidden_states)
[671](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:671) # Self Attention
--> [672](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:672) hidden_states, self_attn_weights, present_key_value = self.self_attn(
[673](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:673) hidden_states=hidden_states,
[674](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:674) attention_mask=attention_mask,
[675](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:675) position_ids=position_ids,
[676](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:676) past_key_value=past_key_value,
[677](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:677) output_attentions=output_attentions,
[678](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:678) use_cache=use_cache,
[679](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:679) **kwargs,
[680](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:680) )
[681](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:681) hidden_states = residual + hidden_states
[683](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:683) # Fully Connected
File [~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1501](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1501), in Module._call_impl(self, *args, **kwargs)
[1496](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1496) # If we don't have any hooks, we want to skip the rest of the logic in
[1497](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1497) # this function, and just call forward.
[1498](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1498) if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
[1499](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1499) or _global_backward_pre_hooks or _global_backward_hooks
[1500](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1500) or _global_forward_hooks or _global_forward_pre_hooks):
-> [1501](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1501) return forward_call(*args, **kwargs)
[1502](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1502) # Do not call functions when jit is used
[1503](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1503) full_backward_hooks, non_full_backward_hooks = [], []
File [~/anaconda3/envs/llm/lib/python3.11/site-packages/accelerate/hooks.py:164](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/accelerate/hooks.py:164), in add_hook_to_module.<locals>.new_forward(module, *args, **kwargs)
[162](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/accelerate/hooks.py:162) output = module._old_forward(*args, **kwargs)
[163](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/accelerate/hooks.py:163) else:
--> [164](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/accelerate/hooks.py:164) output = module._old_forward(*args, **kwargs)
[165](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/accelerate/hooks.py:165) return module._hf_hook.post_forward(module, output)
File [~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:366](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:366), in LlamaAttention.forward(self, hidden_states, attention_mask, position_ids, past_key_value, output_attentions, use_cache, **kwargs)
[363](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:363) value_states = torch.cat(value_states, dim=-1)
[365](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:365) else:
--> [366](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:366) query_states = self.q_proj(hidden_states)
[367](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:367) key_states = self.k_proj(hidden_states)
[368](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:368) value_states = self.v_proj(hidden_states)
File [~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1501](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1501), in Module._call_impl(self, *args, **kwargs)
[1496](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1496) # If we don't have any hooks, we want to skip the rest of the logic in
[1497](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1497) # this function, and just call forward.
[1498](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1498) if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
[1499](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1499) or _global_backward_pre_hooks or _global_backward_hooks
[1500](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1500) or _global_forward_hooks or _global_forward_pre_hooks):
-> [1501](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1501) return forward_call(*args, **kwargs)
[1502](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1502) # Do not call functions when jit is used
[1503](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py:1503) full_backward_hooks, non_full_backward_hooks = [], []
File [~/anaconda3/envs/llm/lib/python3.11/site-packages/accelerate/hooks.py:164](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/accelerate/hooks.py:164), in add_hook_to_module.<locals>.new_forward(module, *args, **kwargs)
[162](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/accelerate/hooks.py:162) output = module._old_forward(*args, **kwargs)
[163](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/accelerate/hooks.py:163) else:
--> [164](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/accelerate/hooks.py:164) output = module._old_forward(*args, **kwargs)
[165](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/accelerate/hooks.py:165) return module._hf_hook.post_forward(module, output)
File [~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/linear.py:114](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/linear.py:114), in Linear.forward(self, input)
[113](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/linear.py:113) def forward(self, input: Tensor) -> Tensor:
--> [114](https://vscode-remote+ssh-002dremote-002b7b22686f73744e616d65223a224c6162373038227d.vscode-resource.vscode-cdn.net/data2/home/ruiqi/GoLLIE/notebooks/~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/linear.py:114) return F.linear(input, self.weight, self.bias)
RuntimeError: expected scalar type Float but found BFloat16
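The failing frame is `F.linear` inside the attention projections, so the input hidden states and the layer weights are in different dtypes (float32 vs. bfloat16). A minimal sketch of the same mismatch on a single `torch.nn.Linear`, and the usual fix of keeping inputs and weights in one dtype; this says nothing about GoLLIE's own loading code, where the equivalent fix would likely be passing a consistent `torch_dtype` (e.g. `torch.bfloat16`) to `from_pretrained`:

```python
import torch

# A bfloat16 linear layer fed a default-float32 input reproduces the
# "expected scalar type Float but found BFloat16" class of RuntimeError.
layer = torch.nn.Linear(4, 4).to(torch.bfloat16)
x = torch.randn(1, 4)  # float32 by default

try:
    layer(x)  # dtype mismatch between input and weight
except RuntimeError as e:
    print(f"RuntimeError: {e}")

# Fix: make input and weights agree on dtype before the matmul.
out = layer(x.to(torch.bfloat16))
print(out.dtype)  # torch.bfloat16
```

Casting the whole model to one dtype at load time avoids having to cast activations by hand at every boundary.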
System Info