Comments (7)
WSL2 should be fine; I am also using WSL2 for GPU inference on llama2 and gptq models.
GPTQ models load much faster.
from llama2-webui.
Try modifying your model path in .env.
Do you mind listing more specific details of your env?
Hi, I got the same error:
HFValidationError(
huggingface_hub.utils._validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/mnt/d/WorkTable/Projects/Temp/llama2/llama2-webui/models/Llama-2-7B-GGML/llama-2-7b.ggmlv3.q8_0.bin'. Use `repo_type` argument if needed.
And here's my env:
MODEL_PATH = "/mnt/d/WorkTable/Projects/Temp/llama2/llama2-webui/models/Llama-2-7B-GGML/llama-2-7b.ggmlv3.q8_0.bin"
LOAD_IN_8BIT = True
LOAD_IN_4BIT = False
LLAMA_CPP = False
.........
However, if I change nothing except setting LLAMA_CPP to True, it works well in CPU mode:
Running on CPU with llama.cpp.
llama.cpp: loading model from /mnt/d/WorkTable/Projects/Temp/llama2/llama2-webui/models/Llama-2-7B-GGML/llama-2-7b.ggmlv3.q8_0.bin
.....
....
Running on local URL: http://127.0.0.1:7860
I'm using WSL2, and I can run PyTorch with CUDA:
import torch
torch.cuda.is_available()
# True
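The error above makes sense given how the two backends resolve MODEL_PATH: when LLAMA_CPP is False, the path is handed to Hugging Face transformers, which expects either a local model directory or a 'namespace/repo_name' repo id, while a single ggml .bin file is only loadable by llama.cpp. A minimal sketch of that distinction (the helper `pick_backend` is hypothetical, for illustration only, not the app's actual code):

```python
import os

def pick_backend(model_path: str) -> str:
    """Hypothetical illustration: a single ggml .bin file can only be
    served by llama.cpp; transformers needs a model directory or a
    'namespace/repo_name' repo id."""
    name = os.path.basename(model_path).lower()
    if name.endswith(".bin") and "ggml" in name:
        return "llama.cpp"      # set LLAMA_CPP = True for paths like this
    if os.path.isdir(model_path) or model_path.count("/") == 1:
        return "transformers"   # local HF model dir or repo id
    raise ValueError(f"Not a valid repo id or model directory: {model_path!r}")

print(pick_backend("models/Llama-2-7B-GGML/llama-2-7b.ggmlv3.q8_0.bin"))
# llama.cpp
```

With your current .env, the ggml path falls into the first branch, which is why flipping LLAMA_CPP to True makes it work.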
Do you have any suggestion?
Thank you
@smithlai ggml models only work on llama.cpp (usually CPU). If you want to run models on GPU, try llama2 or gptq models.
Another way to run a ggml model on GPU is through llama.cpp, by installing a GPU-enabled build of it.
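For the ggml-era models discussed here, one common way to get a GPU-enabled llama.cpp is to rebuild the llama-cpp-python bindings with cuBLAS support. A sketch, assuming a working CUDA toolkit inside WSL2 (flags match the ggml-era releases of llama-cpp-python):

```shell
# Rebuild llama-cpp-python with cuBLAS (GPU) support.
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 \
    pip install --force-reinstall --no-cache-dir llama-cpp-python
```

After the rebuild, llama.cpp can offload model layers to the GPU instead of running purely on CPU.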
Thanks @liltom-eth, @smithlai. Able to run on CPU and WSL2 as per the above.
@blackhawkee @smithlai you're welcome to contribute your benchmark performance here.