Comments (8)
Here are some thoughts:
But maybe a simpler way would be to declare a path as a tmpfs
in the docker-compose file, and have the API code copy the files into that tmpfs location at startup.
Another thought: using https://github.com/hyperonym/basaran
I'm curious what other ideas people will come up with on this.
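To make the tmpfs idea concrete, here is a minimal sketch of the "copy at startup" step. The function name `preload_weights`, the `*.bin` pattern, and the paths in the usage note are hypothetical and not taken from serge's code; it only assumes that the tmpfs mount already exists at the target path.

```python
import shutil
from pathlib import Path


def preload_weights(src_dir: Path, tmpfs_dir: Path, pattern: str = "*.bin") -> list[Path]:
    """Copy model files from src_dir into tmpfs_dir and return the files copied.

    Both directories and the pattern are illustrative assumptions.
    """
    tmpfs_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(src_dir.glob(pattern)):
        dst = tmpfs_dir / src.name
        # Skip files that are already present with the same size,
        # so a restart does not copy everything again.
        if dst.exists() and dst.stat().st_size == src.stat().st_size:
            continue
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

It could be called once before the API starts serving, e.g. `preload_weights(Path("/models"), Path("/models-ram"))` with placeholder paths. Note that the tmpfs has to be large enough to hold the model files, and that memory is then used on top of whatever the inference process itself allocates.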
from serge.
Hi, on my side I have already tested mounting the /var/lib/docker directory and the repository directory in tmpfs (manually, at the Unix system level). It is hardly any faster if you already have an NVMe drive; in any case, I didn't notice much difference.
from serge.
On the other hand, I wonder whether raising the priority of the process at startup, when it generates a response, would be worthwhile, e.g. "chrt -f 90 llama".
And above all, wouldn't it be more efficient to run the process directly outside of Docker?
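As a rough sketch of the `chrt` idea above, the launcher could wrap the command it spawns. The helper names here are hypothetical, and `llama` is just the placeholder command from the comment. Setting a real-time policy normally requires root or the `CAP_SYS_NICE` capability, which a container does not have by default.

```python
import subprocess


def with_fifo_priority(command: list[str], priority: int = 90) -> list[str]:
    """Prefix a command with `chrt -f <priority>` (SCHED_FIFO policy)."""
    # SCHED_FIFO priorities on Linux range from 1 to 99.
    if not 1 <= priority <= 99:
        raise ValueError("SCHED_FIFO priority must be between 1 and 99")
    return ["chrt", "-f", str(priority), *command]


def run_prioritised(command: list[str], priority: int = 90) -> subprocess.CompletedProcess:
    """Run the command under chrt; raises if chrt or the command fails."""
    return subprocess.run(with_fifo_priority(command, priority), check=True)
```

Whether this helps at all is an open question: a CPU-bound process on an otherwise idle machine is unlikely to gain much from a higher scheduling priority.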
from serge.
I just cloned and built the branch with mmap allocation, and yes, it is faster than the main branch: it is almost instant once the process is running.
https://github.com/ggerganov/llama.cpp/tree/mmap
I am also trying some modifications to the memory allocation in main.cpp.
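For readers unfamiliar with why mmap makes loading feel instant, here is a small illustration of the general technique in Python. It is not the llama.cpp code from that branch; `read_header_mmap` is a hypothetical helper. Mapping a file does not read it up front: pages are loaded on first access and stay in the OS page cache, so later runs can reuse them.

```python
import mmap
from pathlib import Path


def read_header_mmap(path: Path, n: int = 4) -> bytes:
    """Map a file read-only and return its first n bytes.

    Only the pages that are actually touched get read from disk,
    no matter how large the file is.
    """
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ makes the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[:n]
```

By contrast, a plain `f.read()` of a multi-gigabyte weights file copies all of it into the process's own memory before anything else can happen.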
from serge.
Here is the main version:
from serge.
And here is the custom version with mmap allocation (I added some code to re-enable AVX512 on this version).
from serge.
Sorry, here is the latest main version; the previous screenshot was of an older main version, from about two days earlier, I think.
from serge.
Closed via #129
from serge.