I've noticed that Alpaca's accuracy takes a hit with our submodule.

Alpaca accuracy about ampere_model_library HOT 9 CLOSED

jan-grzybek-ampere commented on August 28, 2024

Alpaca accuracy

from ampere_model_library.

Comments (9)

dkupnicki commented on August 28, 2024 1

Ok, I got F1 = 0.601 when I changed the max_new_tokens from 10 to 100, which is understandable considering how this is calculated.

The best performance I got is 5.03.
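The thread does not show how F1 is computed, so the following is only a minimal sketch, assuming a SQuAD-style token-overlap F1. The function name `token_f1` and the sample strings are hypothetical. It illustrates why capping generation at a small max_new_tokens can depress the score: a truncated answer may keep perfect precision but loses recall.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a prediction and a reference (assumed metric)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference answer and two candidate generations.
reference = "the quick brown fox jumps over the lazy dog near the river bank"
full_output = reference
truncated_output = " ".join(reference.split()[:4])  # simulates a small token cap

print(token_f1(full_output, reference))
print(token_f1(truncated_output, reference))
```

Under this assumed metric, the truncated output scores well below the full one even though every token it contains is correct.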

jan-grzybek-ampere commented on August 28, 2024

Oh, and the accuracy values I provided were gathered with --num_runs=5.

kkontny commented on August 28, 2024

Which transformers branch are we using for this? Here:
https://github.com/AmpereComputingAI/transformers/commits/karol/llama-compile-v2
the only change from the original repo is the removal of some validation checks that fail with PyTorch 2.1.

jan-grzybek-ampere commented on August 28, 2024

That's what we use. Somehow it's very slow compared to the latest upstream. Maybe they've introduced some significant improvements since we branched off? @dkupnicki please check when possible.

kkontny commented on August 28, 2024

Ok, you are comparing against the latest upstream. I think we may just rebase my branch on it. I see there was some work going on over there, but I didn't expect such a difference. If AIO is generating the same graph, we can safely switch to the newer implementation. Also, the numbers you are getting are much lower than what I observed. Which versions of pytorch-AIO and native PyTorch did you use for these benchmarks? How long are the sequences you are generating?

I was getting around 9 tps in FP16 mode and around 5 tps in FP32 mode with AIO.
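For comparing numbers like these, it helps to agree on how tps is measured. The thread does not say, so this is a rough sketch assuming tps means newly generated tokens divided by wall-clock generation time. The helpers `throughput_tps` and `measure_tps`, and the callable passed to them, are hypothetical and not part of ampere_model_library.

```python
import time
from typing import Callable


def throughput_tps(num_new_tokens: int, elapsed_seconds: float) -> float:
    """Tokens per second, given a token count and an elapsed time in seconds."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_new_tokens / elapsed_seconds


def measure_tps(generate_fn: Callable[[], int]) -> float:
    """Time one call to generate_fn, which must return the number of new tokens."""
    start = time.perf_counter()
    num_new_tokens = generate_fn()
    elapsed = time.perf_counter() - start
    return throughput_tps(num_new_tokens, elapsed)


def fake_generate() -> int:
    # Stand-in for a real model call: pretend 100 tokens took about 50 ms.
    time.sleep(0.05)
    return 100


# Deterministic check: 100 new tokens in 20 seconds is 5 tps.
print(throughput_tps(100, 20.0))
print(measure_tps(fake_generate))
```

Whether prompt tokens and warm-up runs are included changes the result noticeably, so both sides of a comparison should use the same sequence lengths and the same counting rule.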

dkupnicki commented on August 28, 2024

I can’t reproduce this issue. I tried with different versions of transformers, with and without AIO, with and without torch.compile() and I’m always getting the same 0.313 result.

jan-grzybek-ampere commented on August 28, 2024

Ok, I will check. What about performance? Can you get to 9 tps?

dkupnicki commented on August 28, 2024

It was fp32 BTW, checking fp16 now

jan-grzybek-ampere commented on August 28, 2024

I guess we've solved that one. Closing. Thanks, Daniel and Karol, for your input.
