❓ General Questions I have a pruned model which delete some qkv he


[Question] Can PagedKVCache support different size of kvcache in different layers? (mlc-llm, open issue)

BenchuYee commented on June 12, 2024

[Question] Can PagedKVCache support different size of kvcache in different layers?

from mlc-llm.

Comments (4)

Hzfengsy commented on June 12, 2024

The current version does not support this yet, and it might be hard to modify it to do so.


BenchuYee commented on June 12, 2024

Hi @Hzfengsy, so if I want to deploy my pruned model, is the only solution for now to use nn.KVCache in place of PagedKVCache, as the old commit did? Or can I create more than one PagedKVCache and use a different PagedKVCache in different layers? Could you give me some suggestions?
Thank you for your reply.


tqchen commented on June 12, 2024

The KV cache is a common interface. The solution right now would be to create a different KV cache implementation that follows the same interface and swap it in.
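As a rough illustration of that suggestion, here is a minimal pure-Python sketch of one interface with an implementation that allows a different number of KV heads per layer. It is not the mlc-llm API: `LayerKVCache`, `PerLayerKVCache`, `kv_heads_per_layer`, and the `append`/`get` methods are all hypothetical names chosen for this example, and the sketch does no paging or batching.

```python
from abc import ABC, abstractmethod
from typing import List, Tuple

Head = List[float]   # one head's vector for one token
Token = List[Head]   # all KV heads of one token in one layer


class LayerKVCache(ABC):
    """Hypothetical minimal KV cache interface (an assumption, not mlc-llm's)."""

    @abstractmethod
    def append(self, layer: int, k: Token, v: Token) -> None:
        """Store the keys and values of one new token for the given layer."""

    @abstractmethod
    def get(self, layer: int) -> Tuple[List[Token], List[Token]]:
        """Return all cached keys and values for the given layer."""


class PerLayerKVCache(LayerKVCache):
    """Same interface, but each layer may keep a different number of KV heads."""

    def __init__(self, kv_heads_per_layer: List[int], head_dim: int) -> None:
        self.kv_heads_per_layer = kv_heads_per_layer
        self.head_dim = head_dim
        # One independent key list and value list per layer.
        self._k: List[List[Token]] = [[] for _ in kv_heads_per_layer]
        self._v: List[List[Token]] = [[] for _ in kv_heads_per_layer]

    def append(self, layer: int, k: Token, v: Token) -> None:
        expected = self.kv_heads_per_layer[layer]
        for name, token in (("k", k), ("v", v)):
            # Reject tokens whose head count does not match this layer.
            if len(token) != expected:
                raise ValueError(
                    f"layer {layer}: {name} has {len(token)} heads, expected {expected}"
                )
            if any(len(head) != self.head_dim for head in token):
                raise ValueError(f"layer {layer}: {name} has a wrong head_dim")
        self._k[layer].append(k)
        self._v[layer].append(v)

    def get(self, layer: int) -> Tuple[List[Token], List[Token]]:
        return self._k[layer], self._v[layer]


# A pruned model with 4, 2, and 1 KV heads left in its three layers.
cache = PerLayerKVCache(kv_heads_per_layer=[4, 2, 1], head_dim=8)
for layer, n_heads in enumerate(cache.kv_heads_per_layer):
    token = [[0.0] * cache.head_dim for _ in range(n_heads)]
    cache.append(layer, token, token)
```

Model code written against `LayerKVCache` would not need to know which implementation is behind it, which is the point of swapping implementations under one interface. A real replacement would additionally have to match whatever interface the runtime expects.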


DeclK commented on June 12, 2024

@BenchuYee Hi, I want to adjust the KVCache for more flexible usage. Which old commit did you use to build the nn.KVCache model? And by the way, have you observed an obvious performance drop when using nn.KVCache instead of PagedKVCache (without considering batched requests)?

