
Comments (8)

Maximilian-Winter commented on September 1, 2024

I would try this: CMAKE_ARGS="-DLLAMA_CUDA=on" pip install llama-cpp-python

from llama-cpp-agent.

imalros commented on September 1, 2024

You are right, CMAKE_ARGS="-DLLAMA_CUDA=on" is the correct CMake argument.
I came across this one-liner, and it seems the issue is fixed now:
CUDACXX=/usr/local/cuda-12.4/bin/nvcc CMAKE_ARGS="-DLLAMA_CUDA=on" FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir --force-reinstall --upgrade
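As a quick sanity check after a rebuild like this, one can ask the bindings whether GPU offload was actually compiled in — a sketch, assuming a recent llama-cpp-python that exposes llama_supports_gpu_offload in its low-level API:

```shell
# Prints True only if the installed wheel was built with GPU (CUDA) offload;
# a CPU-only build prints False.
python3 -c "from llama_cpp import llama_supports_gpu_offload; print(llama_supports_gpu_offload())"
```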


Maximilian-Winter commented on September 1, 2024

@imalros I think you installed llama-cpp-python as a dependency of llama-cpp-agent? In that case you don't have GPU acceleration, and you have to reinstall llama-cpp-python. On Windows with CUDA, do it like this:

pip uninstall llama-cpp-python
set CMAKE_ARGS=-DLLAMA_CUBLAS=on
set FORCE_CMAKE=1
pip install llama-cpp-python --no-cache-dir

On Linux it should be something like this, I think (the variables have to be exported, or set on the same line as the pip command, so that pip's build process actually sees them):

pip uninstall llama-cpp-python
export CMAKE_ARGS="-DLLAMA_CUBLAS=on"
export FORCE_CMAKE=1
pip install llama-cpp-python --no-cache-dir


Maximilian-Winter commented on September 1, 2024

@imalros I think I broke the Phi 3 template by accident. Will fix it now.


Maximilian-Winter commented on September 1, 2024

@imalros If you install the new version of my framework, 0.2.1, you will get the correct Phi 3 template.


imalros commented on September 1, 2024

Thanks for the hint. I followed the steps you mentioned for Linux, but I got the same result. Here is the terminal output:

(myenv) (base) user1@myserver:~/Code/llama-cpp-agent$ CMAKE_ARGS=-DLLAMA_CUBLAS=on
(myenv) (base) user1@myserver:~/Code/llama-cpp-agent$ FORCE_CMAKE=1
(myenv) (base) user1@myserver:~/Code/llama-cpp-agent$ pip install llama-cpp-python --no-cache-dir
Collecting llama-cpp-python
  Downloading llama_cpp_python-0.2.74.tar.gz (49.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 49.2/49.2 MB 26.7 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: typing-extensions>=4.5.0 in /home/user1/miniconda3/envs/myenv/lib/python3.11/site-packages (from llama-cpp-python) (4.9.0)
Requirement already satisfied: numpy>=1.20.0 in /home/user1/miniconda3/envs/myenv/lib/python3.11/site-packages (from llama-cpp-python) (1.26.4)
Requirement already satisfied: diskcache>=5.6.1 in /home/user1/miniconda3/envs/myenv/lib/python3.11/site-packages (from llama-cpp-python) (5.6.3)
Requirement already satisfied: jinja2>=2.11.3 in /home/user1/miniconda3/envs/myenv/lib/python3.11/site-packages (from llama-cpp-python) (3.1.3)
Requirement already satisfied: MarkupSafe>=2.0 in /home/user1/miniconda3/envs/myenv/lib/python3.11/site-packages (from jinja2>=2.11.3->llama-cpp-python) (2.1.5)
Building wheels for collected packages: llama-cpp-python
  Building wheel for llama-cpp-python (pyproject.toml) ... done
  Created wheel for llama-cpp-python: filename=llama_cpp_python-0.2.74-cp311-cp311-linux_x86_64.whl size=3657012 sha256=d57bf14f448439ea216ef0ee6450e4370c2eb59533cefa3a0b851a97d158c49b
  Stored in directory: /tmp/pip-ephem-wheel-cache-8nehys8t/wheels/9b/ac/9a/7232ddf82e013b7234571c8ed5011125fb0ef4750d347306b8
Successfully built llama-cpp-python
Installing collected packages: llama-cpp-python
Successfully installed llama-cpp-python-0.2.74
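Note that in this transcript the assignments were run as separate commands without `export`, so the shell never passed them to pip and the wheel was built without the CUDA flags. A minimal demonstration of the difference, using a hypothetical variable FOO:

```shell
# A plain assignment creates a shell variable that is NOT passed to
# child processes such as pip or python:
FOO=bar
python3 -c 'import os; print(os.environ.get("FOO"))'    # prints None

# Prefixing the assignment to the command (or exporting the variable)
# puts it in the child process's environment:
FOO=bar python3 -c 'import os; print(os.environ.get("FOO"))'    # prints bar
```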


imalros commented on September 1, 2024

> @imalros If you install the new version of my framework 0.2.1, you will have the correct Phi 3 template.

Great! Thanks for the quick fix.


Maximilian-Winter commented on September 1, 2024

@imalros Maybe this will help you:
https://github.com/abetlen/llama-cpp-python?tab=readme-ov-file#installation-configuration

