the original LLaMA max sequence length is 2048 but why is it that the finetuning.sh sc


Why use 512 as the max sequence length for fine tuning alpaca? (llama-adapter issue, open)

tetratorus commented on May 27, 2024

Why use 512 as the max sequence length for fine tuning alpaca?


Comments (2)

Qubitium commented on May 27, 2024

From various sources:

1. 512 covers 95% of the alpaca data.
2. It reduces the VRAM cost of training.
3. It allows a larger batch size, because of point 2 (the reduced VRAM cost).

From what I understand, 512 was chosen as an optimal value that balances training output, cost, and speed. Obviously you can change it to 1024 or 2048 if you have a larger GPU and want better training output.
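To make the trade-off above concrete, here is a minimal sketch of how one might check coverage and batch size for a candidate max length. The token counts are made-up placeholders (not measured from the alpaca data), the function names are hypothetical, and the fixed token budget is only a crude stand-in for VRAM, since real memory use also depends on the model, the attention cost, and the optimizer state.

```python
def coverage(token_lengths, max_len):
    """Fraction of samples whose token count fits within max_len."""
    if not token_lengths:
        raise ValueError("token_lengths must not be empty")
    fitting = sum(1 for n in token_lengths if n <= max_len)
    return fitting / len(token_lengths)


def batch_size_for_budget(token_budget, max_len):
    """Largest batch size whose padded token count stays within token_budget.

    Assumes every sample is padded to max_len, so one batch holds
    batch_size * max_len tokens. This is a rough proxy for VRAM, not a model of it.
    """
    return token_budget // max_len


# Hypothetical token counts for ten samples (NOT real alpaca statistics).
example_lengths = [120, 85, 300, 48, 510, 700, 64, 230, 150, 95]

for max_len in (512, 1024, 2048):
    print(
        max_len,
        coverage(example_lengths, max_len),
        batch_size_for_budget(8192, max_len),
    )
```

In practice you would replace `example_lengths` with the lengths produced by the tokenizer you actually train with, then pick the smallest max length whose coverage you are comfortable with.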


gaopengpjlab commented on May 27, 2024

@diegomontoya Thanks for your perfect explanation.

For alpaca instruction tuning, we choose 512 as the max sequence length.
For dialog instruction tuning, we choose 2048 as the max sequence length.
For image-text alignment in LLaMA-Adapter V2, we choose 96 as the max sequence length.
For multimodal instruction tuning, we choose 512 as the max sequence length.
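As an illustration only, the four settings listed above could be kept in a small lookup table and applied when preparing samples. The task keys, the `fit_to_length` helper, and the pad id of 0 are all hypothetical names chosen for this sketch; the repository's own scripts may handle truncation and padding differently.

```python
# Max sequence lengths as stated in this thread; the dictionary keys are
# hypothetical labels, not identifiers from the llama-adapter code.
MAX_SEQ_LEN = {
    "alpaca_instruction": 512,
    "dialog_instruction": 2048,
    "image_text_alignment": 96,
    "multimodal_instruction": 512,
}


def fit_to_length(token_ids, task, pad_id=0):
    """Truncate or right-pad a list of token ids to the task's max length.

    pad_id=0 is an assumption made for this sketch.
    """
    max_len = MAX_SEQ_LEN[task]
    clipped = token_ids[:max_len]
    return clipped + [pad_id] * (max_len - len(clipped))


# A 600-token sample is cut to 512; a 3-token sample is padded to 96.
long_sample = fit_to_length(list(range(600)), "alpaca_instruction")
short_sample = fit_to_length([1, 2, 3], "image_text_alignment")
print(len(long_sample), len(short_sample))
```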

