
Comments (5)

chu-tianxiang commented on June 10, 2024

Act-order can be easily supported using the same trick as exllama. I ran a few tests converting off-the-shelf GPTQ models to gpt-fast format using the sample code here: python convert_hf_checkpoint.py --checkpoint_dir Llama-2-7B-GPTQ --model_name llama-7B. I got 193 tokens/s on a single A100, which is really impressive.
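For readers unfamiliar with the trick mentioned above, here is a minimal sketch of the general idea, not the actual exllama or gpt-fast code. With act-order, the group index of each input channel (called g_idx here, following AutoGPTQ naming) is not sorted, so channels of one quantization group are scattered. Sorting the weight rows by group once at load time, and applying the same permutation to the input at runtime, leaves the result unchanged while making each group contiguous. The helper names and the tiny integer matrices are assumptions for illustration only.

```python
import random

def argsort(values):
    # Python's sort is stable, so channels in the same group keep their relative order.
    return sorted(range(len(values)), key=lambda i: values[i])

def matvec(x, w):
    # x has in_features entries; w is a list of in_features rows, each with out_features entries.
    out_features = len(w[0])
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(out_features)]

def reorder_for_act_order(w, g_idx):
    # One-time preprocessing: sort weight rows so each group's channels are contiguous.
    perm = argsort(g_idx)
    w_sorted = [w[i] for i in perm]
    g_sorted = [g_idx[i] for i in perm]
    return perm, w_sorted, g_sorted

random.seed(0)
in_features, out_features = 8, 3
w = [[random.randint(-8, 7) for _ in range(out_features)] for _ in range(in_features)]
g_idx = [1, 0, 1, 0, 0, 1, 1, 0]  # hypothetical act-order group assignment
x = [random.randint(-3, 3) for _ in range(in_features)]

perm, w_sorted, g_sorted = reorder_for_act_order(w, g_idx)
x_sorted = [x[i] for i in perm]  # runtime cost: one gather on the activations

reference = matvec(x, w)
reordered = matvec(x_sorted, w_sorted)
print(reference == reordered, g_sorted)
```

The dot product is a sum over input channels, so applying the same permutation to both the activations and the weight rows cannot change the output. The only runtime overhead in this sketch is the gather on x.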

from gpt-fast.

Chillee commented on June 10, 2024

It should share the same group size support and such. I’m not sure about activation order.

One note is that for 4-bit support we do require the weights to be packed in a certain order, so we will need to preprocess the weights a bit. I’ll look into it.
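To illustrate the kind of preprocessing this implies, here is a minimal sketch of repacking 4-bit values between two nibble orders inside a 32-bit word. The exact layout the gpt-fast kernel expects is not described in this thread, so both orders below (SEQUENTIAL and INTERLEAVED) are hypothetical and only show the unpack-then-repack pattern.

```python
def pack_int4(values, order):
    # Pack eight 4-bit values into one 32-bit word.
    # order[slot] gives the index of the value stored in nibble `slot` (slot 0 = lowest bits).
    word = 0
    for slot, src in enumerate(order):
        v = values[src]
        if not 0 <= v <= 15:
            raise ValueError("value does not fit in 4 bits")
        word |= v << (4 * slot)
    return word

def unpack_int4(word, order):
    # Inverse of pack_int4 for the same nibble order.
    values = [0] * 8
    for slot, src in enumerate(order):
        values[src] = (word >> (4 * slot)) & 0xF
    return values

def repack(word, src_order, dst_order):
    # Convert a packed word from one nibble order to another.
    return pack_int4(unpack_int4(word, src_order), dst_order)

SEQUENTIAL = [0, 1, 2, 3, 4, 5, 6, 7]   # assumed source layout
INTERLEAVED = [0, 2, 4, 6, 1, 3, 5, 7]  # hypothetical target layout

nibbles = [1, 2, 3, 4, 5, 6, 7, 8]
packed_seq = pack_int4(nibbles, SEQUENTIAL)
packed_alt = repack(packed_seq, SEQUENTIAL, INTERLEAVED)
print(hex(packed_seq), hex(packed_alt))
```

A real converter would apply the same idea across whole weight tensors and would also need to account for any tiling the target kernel uses, which this sketch does not attempt.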


cwsappeal commented on June 10, 2024

Test

awesome


TheBloke commented on June 10, 2024

Oh this is very interesting! Thanks for the ping. The performance numbers look very impressive! It's great to see the PyTorch team working on this.

And yes it'd be awesome if this could also support the thousands of existing GPTQ models out there.

All my recent GPTQs have act_order / desc_act (as AutoGPTQ calls it) on. In the early days of my releasing models I also released GPTQs without act_order, as back then there were clients/libraries that didn't support it with group size, or had performance issues. But I stopped doing that 2-3 months ago.

This project reminds me of @turboderp's ExLlama and ExLlamaV2 - they are PyTorch-only inference systems with highly optimised performance. Turboderp got act_order working with no performance drop, so it should definitely be possible.


TheBloke commented on June 10, 2024

Great to hear!

