Comments (6)
Hi, can you show your .env
file? "CUDA not found"
means you are trying to load Llama 2 GPTQ models.
If you are loading GGML models, the startup output should indicate that llama.cpp is being used.
Make sure you have set LLAMA_CPP = True
in your .env
file.
from llama2-webui.
Hey, thanks for the quick reply. I used the example for llama-2-7b-chat.ggmlv3.q4_0.bin.
This is my .env
file:
MODEL_PATH = "/home/user/project/models/llama-2-7b-chat.ggmlv3.q4_0.bin"
LOAD_IN_8BIT = False
LOAD_IN_4BIT = True
LLAMA_CPP = True
MAX_MAX_NEW_TOKENS = 2048
DEFAULT_MAX_NEW_TOKENS = 1024
MAX_INPUT_TOKEN_LENGTH = 4000
DEFAULT_SYSTEM_PROMPT = "\
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.\
"
Hmm, I didn't really change anything; I mostly reloaded my venv (which I had also done before), and now it says it is running on CPU with llama.cpp. However, it now does not find the model and seemingly doesn't pick up the path properly from .env.
Now it seems to work. Apparently, inside a venv it only picks up changes to .env
after exiting the venv/environment and reloading it?
That is weird. The .env file should be loaded every time you run app.py.
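One possible explanation for the behavior above (an assumption on my part: the project presumably reads .env via python-dotenv, whose load_dotenv() by default does not override variables already present in the process environment): if a variable was exported earlier in the shell session, edits to .env stay masked until the environment is reset, which re-entering the venv effectively does. A minimal stdlib sketch of that non-override behavior:

```python
import os
import tempfile

def load_env(path, override=False):
    """Minimal .env reader (stdlib only) mimicking python-dotenv's
    default: variables already in the environment win unless
    override=True."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            key, value = key.strip(), value.strip().strip('"')
            if override or key not in os.environ:
                os.environ[key] = value

# Simulate a value left over from the current shell session.
os.environ["MODEL_PATH"] = "stale-model.bin"

with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as f:
    f.write('MODEL_PATH = "models/llama-2-7b-chat.ggmlv3.q4_0.bin"\n')
    env_file = f.name

load_env(env_file)                 # default: the stale value survives
print(os.environ["MODEL_PATH"])    # -> stale-model.bin

load_env(env_file, override=True)  # force the .env value through
print(os.environ["MODEL_PATH"])    # -> models/llama-2-7b-chat.ggmlv3.q4_0.bin
```

If this is indeed the cause, python-dotenv supports the same fix directly: load_dotenv(override=True).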
@step21 You are welcome to contribute your benchmark performance here.