
Comments (4)

LudwigStumpp commented on May 19, 2024

@touhi99 Great points you are mentioning, thanks for that!

Here are some remarks from my side to keep the discussion going and to find a suitable spot for the information you requested:

Adding evals results right inside the table

  • there are many different benchmarks which would require us to add many more additional columns to the table
  • furthermore, each row in the table currently covers all available variations of a model, and each variation performs differently on the eval benchmarks. Adding evals to the table would therefore require splitting these rows apart. I currently don't think this is a direction we want to go, but I will keep it in the back of my mind and discuss it with @eugeneyan and @Muhtasham
  • for identifying well-performing models, I created the LLM-Leaderboard, which covers both open and closed models

GPU memory requirements

  • Roughly speaking, the memory requirements to load a model depend on two things:
    • the number of parameters
    • the precision used (float32, float16, bfloat16, int8) for these parameters
  • while the number of parameters is fixed, the precision you use to load the model is generally not
  • you can easily calculate the memory requirements. For simplicity, assume 1 Giga (G) ~= 1 Billion (B), example:
    • 7B model
    • multiply with:
      • ×1 (e.g. for int8) = 7 GB
      • ×2 (e.g. for float16) = 14 GB
      • ×4 (e.g. for float32) = 28 GB
    • and you get the rough estimate for the GPU memory requirements
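The back-of-the-envelope estimate above can be sketched in a few lines of Python (the bytes-per-parameter values are the standard storage sizes for these dtypes):

```python
# Rough GPU memory needed just to load a model's weights.
BYTES_PER_PARAM = {"int8": 1, "float16": 2, "bfloat16": 2, "float32": 4}

def load_memory_gb(params_billion: float, precision: str) -> float:
    """Approximate memory in GB to hold the weights, using 1 Giga ~= 1 Billion."""
    return params_billion * BYTES_PER_PARAM[precision]

# 7B model at different precisions:
print(load_memory_gb(7.0, "int8"))     # 7.0
print(load_memory_gb(7.0, "float16"))  # 14.0
print(load_memory_gb(7.0, "float32"))  # 28.0
```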

EDIT: This is a little naive, as one also needs to account for

  • gradients for backprop (assume roughly the same size as the model parameters)
  • first- and second-order momentum terms of the Adam optimizer (assume roughly 2× the size of the model parameters)
  • feature maps in the forward pass (depends on the architecture; ignored here)
  • batch size (affects gradient and feature-map storage; ignored here)

Taking above into account, we can get a very naive estimate for fine-tuning with:
MODEL_SIZE [Billion] * PRECISION [Bytes] * 4 (model weights + gradients + ADAM)

So for our 7B model above:

  • float32: 7 * 4 * 4 = 112 GB
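This naive fine-tuning estimate (weights + gradients + Adam's two momentum buffers, i.e. 4× the weight memory, ignoring activations and batch size) can be sketched as:

```python
BYTES_PER_PARAM = {"float16": 2, "float32": 4}

def finetune_memory_gb(params_billion: float, precision: str) -> float:
    """MODEL_SIZE [Billion] * PRECISION [Bytes] * 4
    (model weights + gradients + Adam first/second moments)."""
    return params_billion * BYTES_PER_PARAM[precision] * 4

# 7B model fine-tuned in float32:
print(finetune_memory_gb(7.0, "float32"))  # 112.0
```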

More on this topic:

from open-llms.

LudwigStumpp commented on May 19, 2024

Hi,

do you mean adding the performance results directly to the table?

For now, we have listed some external resources under the Evals on open LLMs section, which cover the performances of the models on various baselines. Do you think this is enough?


touhi99 commented on May 19, 2024

> Hi,
>
> do you mean adding the performance results directly to the table?
>
> For now, we have listed some external resources under the Evals on open LLMs section, which cover the performances of the models on various baselines. Do you think this is enough?

Yes, that would give a rough idea of which model to choose among the many available. I will have a look at the Evals, thanks.

Besides performance, the GPU/hardware requirements would personally also be an interesting benchmark for estimating a solution. If I propose an LLM-based solution, what would the minimum hardware requirements be for training/fine-tuning/inference? So far many models are coming and going, but I haven't found any reliable data on this, e.g. "model X needs at least Y GB of GPU RAM for inference."


martinezpl commented on May 19, 2024

For anyone interested in this topic:

https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard

