Comments (7)
Always use `launch.sh` to start the containers; this script contains the commands that set the required environment variables. If you want to skip the script and run `docker compose up` directly, you can:
- replace `${GPUS}` with a comma-separated, zero-based list matching the number of GPUs in your device. For example, with 1 GPU it is `0`; with 2 GPUs it is `0,1`.
- replace `${MODEL_DIR}` with the path where the model is located, going one level deeper than the download directory. For example, if your model is at `/home/user/fauxpilot/model`, the model in use is `codegen-6B-multi`, and your device has 1 GPU, the value should be `/home/user/fauxpilot/model/codegen-6B-multi-1gpu`.

Setting `${MODEL_DIR}` to a `./`-relative path is wrong, because in the scripts this is usually an absolute path.
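As a concrete sketch of those two substitutions (the paths and model name below are the hypothetical ones from the example above, not required values):

```shell
# Derive the comma-separated, zero-based GPU list from the GPU count.
NUM_GPUS=2
GPUS=$(seq -s, 0 $((NUM_GPUS - 1)))
echo "$GPUS"        # 0,1

# MODEL_DIR must point one level deeper than the download directory:
#   <download dir>/<model>-<gpu count>gpu
MODEL_DIR=/home/user/fauxpilot/model/codegen-6B-multi-${NUM_GPUS}gpu
echo "$MODEL_DIR"   # /home/user/fauxpilot/model/codegen-6B-multi-2gpu
```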
from fauxpilot.
Ah! I see: I did use `launch.sh` at first. I probably set `MODEL_DIR` incorrectly in the `setup.sh` script and it went downhill from there. Like this:
~/dev/fauxpilot$ ./setup.sh
Models available:
[1] codegen-350M-mono (2GB total VRAM required; Python-only)
[2] codegen-350M-multi (2GB total VRAM required; multi-language)
[3] codegen-2B-mono (7GB total VRAM required; Python-only)
[4] codegen-2B-multi (7GB total VRAM required; multi-language)
[5] codegen-6B-mono (13GB total VRAM required; Python-only)
[6] codegen-6B-multi (13GB total VRAM required; multi-language)
[7] codegen-16B-mono (32GB total VRAM required; Python-only)
[8] codegen-16B-multi (32GB total VRAM required; multi-language)
Enter your choice [6]: 1
Enter number of GPUs [1]:
Where do you want to save the model [/home/username/dev/path/to/models]? models
Downloading the model from HuggingFace, this will take a while...
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 287 100 287 0 0 2023 0 --:--:-- --:--:-- --:--:-- 2035
100 803M 100 803M 0 0 40.8M 0 0:00:19 0:00:19 --:--:-- 38.4M
Done! Now run ./launch.sh to start the FauxPilot server.
~/dev/fauxpilot$ ./launch.sh
ERROR: Named volume "models/codegen-350M-mono-1gpu:/model:rw" is used in service "triton" but no declaration was found in the volumes section.
~/dev/fauxpilot$ cat config.env
MODEL=codegen-350M-mono
NUM_GPUS=1
MODEL_DIR=models
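The error above comes from how Compose parses volume sources: a source that does not start with `/` or `./` is read as a named volume rather than a bind-mount path, so the relative `models/...` value built from this `config.env` triggers the "no declaration was found" error. A minimal sketch of the distinction (the exact values are taken from the transcript above):

```shell
# Values from the config.env above.
MODEL=codegen-350M-mono
NUM_GPUS=1
MODEL_DIR=models

VOLUME="${MODEL_DIR}/${MODEL}-${NUM_GPUS}gpu:/model:rw"
# Compose treats sources starting with / or ./ as bind mounts;
# anything else is parsed as a named volume name.
case "$VOLUME" in
  /*|./*) echo "bind mount" ;;
  *)      echo "named volume (this is what triggers the error)" ;;
esac
```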
In that case, maybe converting the input to an absolute path would make `setup.sh` foolproof.
index 0a8e35b..f10d477 100755
--- a/setup.sh
+++ b/setup.sh
@@ -40,6 +40,7 @@ read -p "Where do you want to save the model [$(pwd)/models]? " MODEL_DIR
if [ -z "$MODEL_DIR" ]; then
MODEL_DIR="$(pwd)/models"
fi
+MODEL_DIR=$(cd $MODEL_DIR && pwd) || exit $?
# Write config.env
echo "MODEL=${MODEL}" > config.env
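A quick sketch of what that added line does, assuming the directory already exists (the `/tmp/fauxpilot-demo` path is purely for illustration):

```shell
# Set up a throwaway directory layout for the demo.
mkdir -p /tmp/fauxpilot-demo/models
cd /tmp/fauxpilot-demo

MODEL_DIR=models
# cd into the directory and print its absolute path;
# this fails (and exits) if the directory does not exist.
MODEL_DIR=$(cd "$MODEL_DIR" && pwd) || exit $?
echo "$MODEL_DIR"   # /tmp/fauxpilot-demo/models
```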
Good idea, but `cd` requires that the target path already exist, which is not guaranteed. Perhaps explicitly asking the user to enter an absolute path here is a better option?
I think we could also use `readlink -f` or `realpath` to get an absolute path to the model directory based on what the user enters in `setup.sh`, and that should fix this pretty comprehensively.
Or `-m` should be used, to avoid problems when users enter multi-component paths like `foo/foo/foo` that do not exist yet. 🤔
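A sketch of the difference, assuming GNU coreutils (`realpath -m` normalizes to an absolute path without requiring any component to exist):

```shell
cd /tmp
# Without -m, realpath fails on a path whose components do not exist:
realpath foo/foo/foo 2>/dev/null || echo "realpath failed"
# With -m, the path is normalized anyway:
realpath -m foo/foo/foo   # e.g. /tmp/foo/foo/foo
```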
I think that PR fixes this issue, I'll close it now :)
Yes, thanks both.