
Comments (22)

K-Alex13 avatar K-Alex13 commented on May 23, 2024

Update: a new question here. When I run inference with the following GPU, how can I move the input IDs to another GPU?
image

from bigdl.

K-Alex13 avatar K-Alex13 commented on May 23, 2024

The machine I am currently using is an A770, and the GPU memory should be sufficient. I hope you can provide some guidance.


hkvision avatar hkvision commented on May 23, 2024

Could you provide more details?

  • Are you running baichuan1 or baichuan2?
  • What sequence lengths in and out are you using that have this memory issue?


hkvision avatar hkvision commented on May 23, 2024

Update: a new question here. When I run inference with the following GPU, how can I move the input IDs to another GPU? image

If you have multiple GPUs, you can specify them with xpu:0, xpu:1.


K-Alex13 avatar K-Alex13 commented on May 23, 2024

Could you provide more details?

  • Are you running baichuan1 or baichuan2?
  • What sequence lengths in and out are you using that have this memory issue?

I am using baichuan2 and the sequence length should be the default.
The model is downloaded from the following page:
https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat/tree/main


K-Alex13 avatar K-Alex13 commented on May 23, 2024

Update: a new question here. When I run inference with the following GPU, how can I move the input IDs to another GPU? image

If you have multiple GPUs, you can specify them with xpu:0, xpu:1.

Can you please give me a sample of where and how to place the model on a different xpu?


hkvision avatar hkvision commented on May 23, 2024

Update: a new question here. When I run inference with the following GPU, how can I move the input IDs to another GPU? image

If you have multiple GPUs, you can specify them with xpu:0, xpu:1.

Can you please give me a sample of where and how to place the model on a different xpu?

https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/baichuan2/generate.py#L50
https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/baichuan2/generate.py#L59
For example, you can modify these lines to use xpu:0/1 if you wish.
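As a minimal sketch of the idea (the helper name and fallback behavior here are my own, not part of BigDL), the device string passed to the `.to(...)` calls in the linked example could be chosen like this, which also guards against requesting a device index that does not exist:

```python
def pick_xpu_device(requested_index: int, available: int) -> str:
    """Return an 'xpu:<i>' device string for torch-style .to(...) calls.

    Clamps the requested index to the number of devices actually
    detected, and falls back to 'cpu' when no XPU is available.
    """
    if available <= 0:
        return "cpu"
    index = min(max(requested_index, 0), available - 1)
    return f"xpu:{index}"
```

With only one card detected, `pick_xpu_device(1, 1)` returns `"xpu:0"` rather than the out-of-range `"xpu:1"`.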


hkvision avatar hkvision commented on May 23, 2024

Could you provide more details?

  • Are you running baichuan1 or baichuan2?
  • What sequence lengths in and out are you using that have this memory issue?

I am using baichuan2 and the sequence length should be the default. The model is downloaded from the following page: https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat/tree/main

Which script are you using and what is the default? Is it this one? https://github.com/intel-analytics/BigDL/tree/main/python/llm/dev/benchmark/all-in-one


K-Alex13 avatar K-Alex13 commented on May 23, 2024

Could you provide more details?

  • Are you running baichuan1 or baichuan2?
  • What sequence lengths in and out are you using that have this memory issue?

I am using baichuan2 and the sequence length should be the default. The model is downloaded from the following page: https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat/tree/main

Which script are you using and what is the default? Is it this one? https://github.com/intel-analytics/BigDL/tree/main/python/llm/dev/benchmark/all-in-one

I just downloaded the baichuan2-13b model from HF and ran model.chat. This is what I mean by the default.


jason-dai avatar jason-dai commented on May 23, 2024

I just downloaded the baichuan2-13b model from HF and ran model.chat. This is what I mean by the default.

Does model.chat use BigDL?


K-Alex13 avatar K-Alex13 commented on May 23, 2024

image
model = model.to('xpu:1') is not working.


K-Alex13 avatar K-Alex13 commented on May 23, 2024

I just downloaded the baichuan2-13b model from HF and ran model.chat. This is what I mean by the default.

Does model.chat use BigDL?

Yes, I do use BigDL.


K-Alex13 avatar K-Alex13 commented on May 23, 2024

Update: a new question here. When I run inference with the following GPU, how can I move the input IDs to another GPU? image

If you have multiple GPUs, you can specify them with xpu:0, xpu:1.

Can you please give me a sample of where and how to place the model on a different xpu?

https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/baichuan2/generate.py#L50 https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/baichuan2/generate.py#L59 For example, you can modify these lines to use xpu:0/1 if you wish.

I tried xpu:0 and xpu:1 in two different runs. With xpu:1 there is an error that the device_id is out of range, while xpu:0 is in the original state. What can I do next?


hkvision avatar hkvision commented on May 23, 2024

Update: a new question here. When I run inference with the following GPU, how can I move the input IDs to another GPU? image

If you have multiple GPUs, you can specify them with xpu:0, xpu:1.

Can you please give me a sample of where and how to place the model on a different xpu?

https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/baichuan2/generate.py#L50 https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/baichuan2/generate.py#L59 For example, you can modify these lines to use xpu:0/1 if you wish.

I tried xpu:0 and xpu:1 in two different runs. With xpu:1 there is an error that the device_id is out of range, while xpu:0 is in the original state. What can I do next?

So are there multiple GPU cards on your machine? After sourcing oneAPI, you can use sycl-ls to check the GPU cards on your machine:
image


K-Alex13 avatar K-Alex13 commented on May 23, 2024

Update: a new question here. When I run inference with the following GPU, how can I move the input IDs to another GPU? image

If you have multiple GPUs, you can specify them with xpu:0, xpu:1.

Can you please give me a sample of where and how to place the model on a different xpu?

https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/baichuan2/generate.py#L50 https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/baichuan2/generate.py#L59 For example, you can modify these lines to use xpu:0/1 if you wish.

I tried xpu:0 and xpu:1 in two different runs. With xpu:1 there is an error that the device_id is out of range, while xpu:0 is in the original state. What can I do next?

So are there multiple GPU cards on your machine? After sourcing oneAPI, you can use sycl-ls to check the GPU cards on your machine: image

image
This is the output of sycl-ls.


hkvision avatar hkvision commented on May 23, 2024

It seems only one GPU is detected... Are the other GPUs properly set up?


K-Alex13 avatar K-Alex13 commented on May 23, 2024

I am not sure why only one GPU is detected; I see gpu:2 in this figure?


hkvision avatar hkvision commented on May 23, 2024

image
You mean gpu:2 here? These two lines refer to the same GPU; there is only one.
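To illustrate the point (a hedged sketch; it assumes the usual `[backend:type:index]` tag format printed by sycl-ls): the number in a tag like `[opencl:gpu:2]` is sycl-ls's enumeration index, and the same physical card is listed once per backend (e.g. opencl and level_zero), so two gpu lines can still mean a single card.

```python
def parse_sycl_ls_tag(tag: str) -> tuple[str, str, int]:
    """Split a sycl-ls device tag like '[opencl:gpu:2]' into
    (backend, device_type, index).

    The index is the position in sycl-ls's device enumeration,
    not a count of physical GPU cards.
    """
    backend, device_type, index = tag.strip("[]").split(":")
    return backend, device_type, int(index)
```

For example, `parse_sycl_ls_tag("[opencl:gpu:2]")` and `parse_sycl_ls_tag("[level_zero:gpu:0]")` can both refer to the one A770, seen through two different backends.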


K-Alex13 avatar K-Alex13 commented on May 23, 2024

Update: a new question here. When I run inference with the following GPU, how can I move the input IDs to another GPU? image

Then what does this figure mean? It seems to show a 32G GPU.


qiuxin2012 avatar qiuxin2012 commented on May 23, 2024

It looks like your driver (released in 2023.7) is a little old. Please update your driver to the latest version and try again.
You can download it from https://www.intel.com/content/www/us/en/download/785597/intel-arc-iris-xe-graphics-windows.html


WeiguangHan avatar WeiguangHan commented on May 23, 2024

Could you provide more details?

  • Are you running baichuan1 or baichuan2?
  • What sequence lengths in and out are you using that have this memory issue?

I am using baichuan2 and the sequence length should be the default. The model is downloaded from the following page: https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat/tree/main

Which script are you using and what is the default? Is it this one? https://github.com/intel-analytics/BigDL/tree/main/python/llm/dev/benchmark/all-in-one

I just downloaded the baichuan2-13b model from HF and ran model.chat. This is what I mean by the default.

Hi, I have tested it on my side using BigDL and model.chat from HF, and it worked fine. But I am a bit curious about the Thread log output in your screenshot, which seemed strange.

from bigdl.llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer
# import intel_extension_for_pytorch as ipex
from transformers.generation.utils import GenerationConfig
model = AutoModelForCausalLM.from_pretrained(r"D:\llm-models\Baichuan2-13B-Chat", optimize_model=True, load_in_low_bit="sym_int4",
                                                trust_remote_code=True, use_cache=True, cpu_embedding=False).eval()
tokenizer = AutoTokenizer.from_pretrained(r"D:\llm-models\Baichuan2-13B-Chat", trust_remote_code=True)
model.to("xpu")
model.generation_config = GenerationConfig.from_pretrained(r"D:\llm-models\Baichuan2-13B-Chat", revision="v2.0")
messages = []
messages.append({"role": "user", "content": "解释一下“温故而知新”"})
response = model.chat(tokenizer, messages)
print(response)

image


shane-huang avatar shane-huang commented on May 23, 2024

Update: a new question here. When I run inference with the following GPU, how can I move the input IDs to another GPU? image

Then what does this figure mean? It seems to show a 32G GPU.

The GPU memory the Arc A770 can actually use is only 16G, as shown in your device screenshot.

image

