
Comments (4)

LudwigStumpp commented on May 19, 2024

@touhi99 Great points you are mentioning, thanks for that!

Here are some remarks from my side to keep the discussion going and to find a suitable spot for the information you requested:

Adding evals results right inside the table

  • there are many different benchmarks which would require us to add many more additional columns to the table
  • furthermore, each row in the table currently covers all available variations of a model, and each variation performs differently on the eval benchmarks. Adding evals to the table would therefore require splitting these rows apart. I currently don't think this is a direction we want to go, but I will keep it in the back of my mind and discuss it with @eugeneyan and @Muhtasham
  • for identifying well-performing models, I created the LLM-Leaderboard, which covers both open and closed models

GPU memory requirements

  • Roughly speaking, the memory requirements to load a model depend on two things:
    • the number of parameters
    • the precision used (float32, float16, bfloat16, int8) for these parameters
  • while the number of parameters is fixed, the precision you use to load the model is generally not
  • you can easily calculate the memory requirements. For simplicity, assume 1 Giga (G) ~= 1 Billion (B), example:
    • 7B model
    • multiply with:
      • ×1 (e.g. for int8) = 7 GB
      • ×2 (e.g. for float16) = 14 GB
      • ×4 (e.g. for float32) = 28 GB
    • and you get the rough estimate for the GPU memory requirements
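The back-of-the-envelope estimate above can be sketched in a few lines of Python (the bytes-per-parameter values are the standard storage sizes for these dtypes):

```python
# Rough GPU memory needed just to load a model's weights.
BYTES_PER_PARAM = {"int8": 1, "float16": 2, "bfloat16": 2, "float32": 4}

def load_memory_gb(params_billion: float, precision: str) -> float:
    """Approximate memory in GB to hold the weights, using 1 Giga ~= 1 Billion."""
    return params_billion * BYTES_PER_PARAM[precision]

# 7B model at different precisions:
print(load_memory_gb(7.0, "int8"))     # 7.0
print(load_memory_gb(7.0, "float16"))  # 14.0
print(load_memory_gb(7.0, "float32"))  # 28.0
```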

EDIT: This is a little naive, as one also needs to account for

  • gradients for backprop (assume roughly the same size as the model parameters)
  • first- and second-order momentum terms of the Adam optimizer (assume roughly 2× the size of the model parameters)
  • feature maps in the forward pass (depends on the architecture; ignored here)
  • batch size (affects gradient and feature-map storage; ignored here)

Taking above into account, we can get a very naive estimate for fine-tuning with:
MODEL_SIZE [Billion] * PRECISION [Bytes] * 4 (model weights + gradients + ADAM)

So for our 7B model above:

  • float32: 7 * 4 * 4 = 112 GB
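This naive fine-tuning estimate (weights + gradients + Adam's two momentum buffers, i.e. 4× the weight memory, ignoring activations and batch size) can be sketched as:

```python
BYTES_PER_PARAM = {"float16": 2, "float32": 4}

def finetune_memory_gb(params_billion: float, precision: str) -> float:
    """MODEL_SIZE [Billion] * PRECISION [Bytes] * 4
    (model weights + gradients + Adam first/second moments)."""
    return params_billion * BYTES_PER_PARAM[precision] * 4

# 7B model fine-tuned in float32:
print(finetune_memory_gb(7.0, "float32"))  # 112.0
```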

More on this topic:

from open-llms.

LudwigStumpp commented on May 19, 2024

Hi,

do you mean adding the performance results directly to the table?

For now, we have listed some external resources under the Evals on open LLMs section, which cover the performances of the models on various baselines. Do you think this is enough?


touhi99 commented on May 19, 2024

> Hi,
>
> do you mean adding the performance results directly to the table?
>
> For now, we have listed some external resources under the Evals on open LLMs section, which cover the performances of the models on various baselines. Do you think this is enough?

Yes, that would give a rough idea of which model to choose among the many available. I will have a look at the Evals, thanks.

Besides performance, the GPU/hardware requirements would personally also be an interesting benchmark for estimating a solution. If I propose an LLM-based solution, what would the minimum hardware requirements be for training/fine-tuning/inference? So far many models are coming and going, but I haven't found any reliable data on this, e.g. "model X needs at least Y GB of GPU RAM for inference."


martinezpl commented on May 19, 2024

For anyone interested in this topic:

https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard

