
Comments (5)

Beomi commented on June 9, 2024

Hello!

I noticed that you're working with the KoAlpaca-65B-LoRA repository on Hugging Face, which contains only the LoRA-finetuned additional weights.
To load them on top of the original LLaMA weights, you can use other code, such as alpaca-lora, found here: https://github.com/tloen/alpaca-lora.
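For reference, here is a minimal sketch of how the base weights and the LoRA adapter can be combined with the Hugging Face `peft` library. This is an untested outline that assumes `transformers`, `peft`, and `bitsandbytes` are installed; the model IDs are the ones from the repositories above.

```python
# Sketch: load the 8-bit base LLaMA model, then attach the KoAlpaca LoRA
# adapter on top of it with peft. Untested at 65B scale.

BASE_MODEL = "decapoda-research/llama-65b-hf"
LORA_WEIGHTS = "beomi/KoAlpaca-65B-LoRA"


def load_koalpaca(base_model: str = BASE_MODEL, lora_weights: str = LORA_WEIGHTS):
    """Load the quantized base model and wrap it with the LoRA adapter."""
    # Imports are kept inside the function so this sketch can be read
    # (and the module imported) without transformers/peft installed.
    import torch
    from peft import PeftModel
    from transformers import LlamaForCausalLM, LlamaTokenizer

    tokenizer = LlamaTokenizer.from_pretrained(base_model)
    model = LlamaForCausalLM.from_pretrained(
        base_model,
        load_in_8bit=True,        # bitsandbytes int8 quantization
        torch_dtype=torch.float16,
        device_map="auto",        # spread layers across available devices
    )
    # PeftModel wraps the base model and injects the LoRA weight matrices.
    model = PeftModel.from_pretrained(model, lora_weights, torch_dtype=torch.float16)
    model.eval()
    return tokenizer, model
```

This is essentially what alpaca-lora-style loaders do internally: the base model is loaded first, and the small LoRA adapter is layered on top rather than being a standalone checkpoint.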

If you're looking to load the model and test it on your own machine, please note that you'll need an A100 80G or H100 GPU to fit it on a single device, even with 8-bit quantization.
I won't be discussing pipeline parallelism or tensor parallelism in this repository, as it isn't the right place for that.
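As a rough sanity check on that requirement, here is a back-of-envelope estimate (not a measurement) of the weight memory alone:

```python
# Back-of-envelope estimate of weight memory for a 65B-parameter model.
# Real usage is higher: activations, the KV cache, and the LoRA adapter
# all add on top of this.
params = 65e9  # 65 billion parameters

for name, bytes_per_param in [("fp16", 2), ("int8", 1)]:
    gb = params * bytes_per_param / 1e9
    print(f"{name}: ~{gb:.0f} GB of weights")

# int8 (8-bit quantization) still needs ~65 GB for the weights alone,
# which is why an 80 GB card is needed for a single-device load.
```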

Assuming you have the necessary GPU, you can load the model and try it out with Alpaca-LoRA-Serve: https://github.com/deep-diver/Alpaca-LoRA-Serve

To install alpaca-lora-serve, just follow the instructions in the repository.
Once that's done, you can run the following commands:

export BASE_URL=decapoda-research/llama-65b-hf
export FINETUNED_CKPT_URL=beomi/KoAlpaca-65B-LoRA

python app.py --base_url $BASE_URL --ft_ckpt_url $FINETUNED_CKPT_URL --port 6006

After that, you can access the chatbot-like web UI from your browser at http://localhost:6006. Enjoy and happy coding!

from koalpaca.

SoroorMa commented on June 9, 2024

I did everything you mentioned, but I got this error:
unsupported model type. only llamastack, alpaca, flan, and baize are supported

Did I miss something?
(Alpaca-LoRA as a Chatbot Service works fine.)

Beomi commented on June 9, 2024

I used the alpaca-lora framework to train the LoRA model, but I haven't used it for loading or inference.
My only experience loading the LoRA checkpoint has been through the chatbot service mentioned above.

What are you trying to do exactly?

SoroorMa commented on June 9, 2024

Based on your guidance, I installed alpaca-lora-serve and then ran the exact same commands you shared to try it on my machine, but I got that error.
For now I just want to test it on my system, and maybe later use the model for a specific task :)
