Comments (11)

epwalsh avatar epwalsh commented on September 16, 2024

I tried this out on the branch Torch2-RMSNorm using torch.autocast() to manually control precision. It was much slower.
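For readers unfamiliar with the setup, here is a minimal sketch of what manually controlling precision inside an RMSNorm with `torch.autocast()` can look like. It is not the code from the `Torch2-RMSNorm` branch; the class name, `eps` default, and the choice to force float32 inside the norm are assumptions made for illustration.

```python
import torch
from torch import nn


class RMSNorm(nn.Module):
    """Hypothetical RMSNorm: x / sqrt(mean(x^2) + eps) * weight."""

    def __init__(self, size: int, eps: float = 1e-5):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(size))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Opt out of any surrounding autocast region so the reduction
        # below runs in float32 regardless of the outer autocast dtype.
        with torch.autocast(device_type=x.device.type, enabled=False):
            orig_dtype = x.dtype
            x = x.float()
            mean_sq = x.pow(2).mean(dim=-1, keepdim=True)
            x = x * torch.rsqrt(mean_sq + self.eps)
            # Cast back so downstream layers see the dtype they passed in.
            return (self.weight * x).to(orig_dtype)


torch.manual_seed(0)
norm = RMSNorm(8)
x = torch.randn(2, 4, 8, dtype=torch.bfloat16)
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    y = norm(x)
print(y.dtype, tuple(y.shape))
```

The extra casts and the nested context manager are one plausible source of overhead, though the thread does not say what actually caused the slowdown.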

image

However, when I stopped messing with autocast, it was faster:

image

from olmo.

epwalsh avatar epwalsh commented on September 16, 2024

Final results: https://wandb.ai/ai2-llm/petew-benchmarks-2/reports/LayerNorm-Benchmarks--VmlldzozOTI1NTA4

These results tell us that to optimize throughput we should use --model.layer_norm_type=rms when compiling and --model.layer_norm_type=low_precision when not compiling.
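As a rough sketch, that rule of thumb could be encoded in a launcher script like this. The helper names are hypothetical; only the flag name and its two values come from the comment above.

```python
def pick_layer_norm_type(compile_model: bool) -> str:
    """Return the layer norm type that benchmarked fastest for this mode."""
    # "rms" was fastest with compilation, "low_precision" without it.
    return "rms" if compile_model else "low_precision"


def layer_norm_flag(compile_model: bool) -> str:
    """Build the command-line override for the chosen layer norm type."""
    return f"--model.layer_norm_type={pick_layer_norm_type(compile_model)}"


print(layer_norm_flag(compile_model=True))
print(layer_norm_flag(compile_model=False))
```

This only captures the throughput recommendation; it says nothing about downstream quality or training stability.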

ananyahjha93 avatar ananyahjha93 commented on September 16, 2024

@epwalsh at this point we should also look at end-task performance rather than only absolute speedup results.

Some of these decisions in BLOOM were made based on downstream eval and not just throughput: https://arxiv.org/pdf/2210.15424.pdf

ananyahjha93 avatar ananyahjha93 commented on September 16, 2024

For example, I think SwiGLU and ALiBi are non-negotiable based on end-task performance.

epwalsh avatar epwalsh commented on September 16, 2024

Absolutely

epwalsh avatar epwalsh commented on September 16, 2024

Only problem with ALiBi is that it incurs a significant performance hit since it doesn't work with the current Flash Attention implementation.
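For context, ALiBi adds a per-head linear penalty to the attention scores before the softmax, so it needs an attention kernel that accepts an additive bias; that is presumably the incompatibility here. A minimal pure-Python sketch of the bias, assuming a power-of-two head count and causal attention (function names are illustrative):

```python
def alibi_slopes(n_heads: int) -> list[float]:
    """Per-head slopes 2^(-8/n), 2^(-16/n), ..., one per head."""
    if n_heads <= 0 or n_heads & (n_heads - 1):
        raise ValueError("this sketch only handles power-of-two head counts")
    base = 2.0 ** (-8.0 / n_heads)
    return [base ** (h + 1) for h in range(n_heads)]


def alibi_bias(n_heads: int, seq_len: int) -> list[list[list[float]]]:
    """bias[h][i][j] = -slope_h * (i - j) for keys j at or before query i.

    Future positions (j > i) get -inf, which folds the causal mask
    into the same additive term.
    """
    neg_inf = float("-inf")
    return [
        [
            [-m * (i - j) if j <= i else neg_inf for j in range(seq_len)]
            for i in range(seq_len)
        ]
        for m in alibi_slopes(n_heads)
    ]


bias = alibi_bias(n_heads=8, seq_len=3)
print(alibi_slopes(8)[:2])
print(bias[0][2])
```

The resulting tensor is added to the query-key scores, which is the step a fused attention kernel must support for ALiBi to work.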

dirkgr avatar dirkgr commented on September 16, 2024

Is low_precision stable enough for use? I thought doing LN in 32-bits was a major stability hack in BLOOM.
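To make the two options concrete, here is a small sketch contrasting a layer norm that upcasts to 32 bits with one that stays in the reduced dtype. It assumes `low_precision` means running the norm directly in bfloat16, which may not match the actual implementation; the function names are hypothetical.

```python
import torch
import torch.nn.functional as F


def layer_norm_fp32(x, weight, bias, eps=1e-5):
    """Upcast to float32, normalize, then cast back to the input dtype."""
    out = F.layer_norm(x.float(), x.shape[-1:], weight.float(), bias.float(), eps)
    return out.to(x.dtype)


def layer_norm_low_precision(x, weight, bias, eps=1e-5):
    """Normalize directly in the input's (reduced) dtype."""
    return F.layer_norm(x, x.shape[-1:], weight.to(x.dtype), bias.to(x.dtype), eps)


torch.manual_seed(0)
x = torch.randn(4, 1024).to(torch.bfloat16)
w = torch.ones(1024)
b = torch.zeros(1024)

full = layer_norm_fp32(x, w, b)
low = layer_norm_low_precision(x, w, b)
max_diff = (full.float() - low.float()).abs().max().item()
print(f"max abs difference: {max_diff}")
```

How large the gap is depends on the kernel and hardware, and a small difference on random inputs does not settle whether training stays stable over a long run.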

epwalsh avatar epwalsh commented on September 16, 2024

> Is low_precision stable enough for use? I thought doing LN in 32-bits was a major stability hack in BLOOM.

That remains to be seen.

dirkgr avatar dirkgr commented on September 16, 2024

What about ALiBi vs. RoPE?

ananyahjha93 avatar ananyahjha93 commented on September 16, 2024

End-task performance or throughput?

dirkgr avatar dirkgr commented on September 16, 2024
