Comments (1)
Thanks for reporting. This could relate to the thrust size estimation and the maximum batch size; if you have more details, feel free to post them. As of now the example is hard to reproduce, so we are closing this issue for now, as the latest GPU sampler has likely addressed it. Feel free to open a new issue.
from mlc-llm.