Comments (5)
Act-order can be easily supported using the same trick as ExLlama. I did a few tests converting off-the-shelf GPTQ models to the gpt-fast format, using the sample code here:

python convert_hf_checkpoint.py --checkpoint_dir Llama-2-7B-GPTQ --model_name llama-7B

and got 193 tokens/s on a single A100, which is really impressive.
It should share the same group-size support and such; I'm not sure about activation order. One note: for 4-bit support we do require the weights to be packed in a certain order, so we will need to preprocess the weights a bit. I'll look into it.
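(For illustration only, not code from the thread.) A minimal sketch of the unpacking half of that preprocessing, assuming the common AutoGPTQ layout in which qweight is an int32 tensor of shape (in_features // 8, out_features) with eight 4-bit values packed per int32 along the input-channel dimension; the packing order gpt-fast expects isn't stated here, so the repacking step is left as a comment:

```python
import torch

def unpack_gptq_int4(qweight: torch.Tensor) -> torch.Tensor:
    """Unpack AutoGPTQ-style 4-bit weights to one row per input channel.

    qweight: int32 tensor of shape (in_features // 8, out_features),
             eight 4-bit values per int32 along dim 0.
    Returns an int32 tensor of shape (in_features, out_features),
    one quantized value (0..15) per element.
    """
    shifts = torch.arange(0, 32, 4, dtype=torch.int32, device=qweight.device)
    # Broadcast each packed word against all eight nibble positions:
    # (K//8, 1, N) >> (1, 8, 1) -> (K//8, 8, N), then mask the low nibble.
    nibbles = (qweight.unsqueeze(1) >> shifts.view(1, -1, 1)) & 0xF
    # Flatten so channel i = 8*p + j lands on row i.
    return nibbles.reshape(-1, qweight.shape[1])

# Repacking into gpt-fast's required order would go here; that layout is
# exactly what the comment above says still needs to be worked out.
```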
Test
Oh this is very interesting! Thanks for the ping. The performance numbers look very impressive! It's great to see the PyTorch team working on this.
And yes it'd be awesome if this could also support the thousands of existing GPTQ models out there.
All my recent GPTQs have act_order / desc_act (as AutoGPTQ calls it) enabled. In the early days of releasing models I also put out GPTQs without act_order, because back then some clients/libraries didn't support it in combination with group size, or had performance issues with it. But I stopped doing that 2-3 months ago.
This project reminds me of @turboderp's ExLlama and ExLlamaV2: they are PyTorch-only inference systems with highly optimised performance. Turboderp got act_order working with no performance drop, so it should definitely be possible.
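(A sketch for illustration, not code from ExLlama or gpt-fast.) The act_order trick referred to here is usually described as sorting input channels by their GPTQ group index once at load time and moving the inverse permutation onto the activations, so the matmul kernel never does a per-row g_idx lookup. Assuming the weight is already unpacked to one row per input channel, and with dequantize as a hypothetical placeholder:

```python
import torch

def fold_act_order(weight_rows: torch.Tensor, g_idx: torch.Tensor):
    """Sort input channels so quantization groups are contiguous.

    weight_rows: (in_features, out_features) quantized weight,
                 one row per input channel.
    g_idx:       (in_features,) group index per input channel;
                 non-monotonic when act_order is on.
    """
    perm = torch.argsort(g_idx)        # channels reordered by group
    return weight_rows[perm], perm     # groups now contiguous in memory

# At inference time the permutation applies to the activations instead
# (`dequantize` is a placeholder, not a real gpt-fast function):
#   y = x[:, perm] @ dequantize(sorted_rows, scales, zeros)
# which is correct because (x[:, perm] @ W[perm]) == x @ W.
```

The sort is a one-time cost at load, and x[:, perm] is a single gather per matmul, which would be consistent with the "no performance drop" observation above.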
Great to hear!
Related Issues (20)
- repeat sentence and non-complete sentence in the end
- 'Triton Error [CUDA]: device kernel image is invalid' while compiling
- 'Device-side assertions' error when speculative decoding with different length of prompts.
- Does `gpt-fast` work on V100 GPUs?
- TypeError: __init__() got an unexpected keyword argument 'mmap'
- Error when running convert_hf_checkpoint.py for TinyLlama-1.1B-intermediate-step-480k-1T
- Inference on a dataset instead of an individual prompt
- Code is extremely slow!
- torch.compile leads to OOM with different prompts.
- How is llama-7b trained, what is the verification accuracy?
- RuntimeError: CUDA error: named symbol not found
- Size mismatch error occurs when loading models quantized by GPTQ
- `eval.py` uses older version of lm_eval
- Can GPT-Fast support larger batch sizes
- I try to speed up with llava,but this it slower then eager mode,why?
- pass@1 score extremely low using GPT-fast API
- AssertionError: assert model_map_json.is_file()
- token/s speed
- Problem with NVLink setup
- Bandwidth achieved for INT8 is much smaller than FP16