Comments (7)
30 minutes is definitely better than 5, but honestly my use case is probably not the typical one. I have an RTX 3090, so plenty of VRAM to keep a 6-7B model loaded all the time. The ollama devs justified their default of 5 minutes as a way to avoid hogging VRAM on smaller GPUs, which is probably the more common use case.
For me, I would prefer to disable unloading entirely (keep_alive: -1), since I can just stop the ollama docker container when I'm done with my development session. I imagine it would be useful to configure this per-model, since the size of the model is probably the biggest factor affecting this setting. For example, just exposing this as a configuration option in config.json for ModelDescription would be super useful for me.
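To make the request concrete, here is a minimal Python sketch of a request body carrying keep_alive: -1. It assumes Ollama's /api/generate endpoint, which accepts a keep_alive field where a negative value keeps the model loaded indefinitely. The model name and prompt are placeholders, and build_generate_payload is a hypothetical helper, not part of Continue or Ollama.

```python
import json


def build_generate_payload(model: str, prompt: str, keep_alive=-1) -> dict:
    """Build a JSON-serializable body for a generate request.

    keep_alive=-1 asks the server to keep the model loaded indefinitely;
    0 would unload it right after the request (assumed Ollama semantics).
    """
    return {
        "model": model,            # placeholder model name
        "prompt": prompt,
        "stream": False,
        "keep_alive": keep_alive,  # the setting discussed in this thread
    }


payload = build_generate_payload("codellama:7b", "def fib(n):")
body = json.dumps(payload)  # string that would be sent as the POST body
```

Nothing is sent over the network here; the sketch only shows where the field would sit in the body.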
Thanks for your work on this extension, it's been great watching its development and has been fun to use!
from continue.
Love the idea! Working on tab-autocomplete release probably for the next couple of days, but will add this later this week
@noahhaon @johnozbay I can't believe it took me this long to make such a simple commit, but here it is: 2614465. It will be in the next pre-release.
I was a bit concerned about adding tons of parameters, but to be honest this one feels pretty fundamental. In the future, if there are any others you want that we don't support, we also recently added requestOptions.body to config.json, so you can add arbitrary params to the HTTP POST request body.
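As a rough illustration of the requestOptions.body idea, the sketch below merges user-supplied parameters into a request body before it is sent. The helper name and the rule that user-supplied keys win on conflict are assumptions for illustration; the actual merge behaviour in Continue may differ.

```python
from typing import Any, Dict, Optional


def merge_request_body(
    base: Dict[str, Any], extra: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Return a new body with the extra params layered on top of the base."""
    merged = dict(base)         # shallow copy so the base body is not mutated
    merged.update(extra or {})  # assumed precedence: extra params win
    return merged


base_body = {"model": "codellama:7b", "prompt": "def fib(n):"}  # placeholders
extra_body = {"keep_alive": -1}  # e.g. what a user might put under requestOptions.body
merged_body = merge_request_body(base_body, extra_body)
```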
Now we can all sip piña coladas by the pool while the GPUs/CPUs are writing our code : )
Joining @noahhaon here to say that, I too have plenty of vram sitting around idle sipping piña coladas, and would love to make better use of it! 🍹
Having a keep_alive: -1 or a similar config parameter that allows keeping the model loaded for at least 6-8 hours to cover a working day would be amazing. Otherwise, going to a 30-minute meeting and coming back to the desk means a much longer wait for the model to load up again. It would improve Continue's responsiveness by a very large margin.
Btw, thanks a lot for making Continue, @sestinj! This is a total game changer, especially for coders who have businesses or clients etc. that are more privacy-sensitive. 🙏🏻
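For the working-day case, a finite value may be enough instead of -1. A small sketch, assuming keep_alive is passed as a number of seconds (Ollama also accepts duration strings such as "8h"); workday_keep_alive_seconds is a hypothetical helper.

```python
def workday_keep_alive_seconds(hours: float = 8) -> int:
    """Convert a number of hours into a keep_alive value in seconds."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return int(hours * 3600)


keep_alive = workday_keep_alive_seconds(8)  # covers an 8-hour working day
```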
@sestinj You're the legend! Thank you so much!
Also, love the requestOptions addition idea, will definitely help a lot of users!
😍
Thank you @sestinj ! I thought it would be about as simple as your commit when I first looked through the code, but I didn't want to assume 😅 This will be very valuable, and give some nice forward compatibility as ollama develops too. Thanks!!
@noahhaon out of curiosity, what would you plan on setting this to? I am going to set a higher default even if the user doesn't configure it, and if this would be enough for your case, I tend to like avoiding too much extra configuration if it's not necessary. I was thinking 30min for default.
Either way will add this default today and if config is necessary, then will do that later
changes: 2fab0e2
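A minimal sketch of the fallback described here: use the configured value when there is one, otherwise a 30-minute default. The names and the unit (seconds) are assumptions for illustration, not a description of what the commit actually does.

```python
from typing import Optional

DEFAULT_KEEP_ALIVE_SECONDS = 30 * 60  # the 30-minute default floated above


def resolve_keep_alive(configured: Optional[int] = None) -> int:
    """Prefer the user's setting; fall back to the default when unset.

    The check is against None rather than truthiness, so an explicit 0
    (unload immediately) or -1 (never unload) is respected.
    """
    return DEFAULT_KEEP_ALIVE_SECONDS if configured is None else configured
```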