System Info: text-generation-launcher --model-id "/data2/ollama7b"

Error for reshape_and_cache: cache_ops.reshape_and_cache(key, value, key_cache, value_cache, slots, "auto", 1.0) (text-generation-inference issue, closed)

hellangleZ commented on June 5, 2024

Error for reshape_and_cache: cache_ops.reshape_and_cache(key, value, key_cache, value_cache, slots, "auto", 1.0)

from text-generation-inference.

Comments (3)

OlivierDehaene commented on June 5, 2024

You need to re-install vllm and flash-attention-v2

cd text-generation-inference/server
rm -rf vllm
make install-vllm-cuda

rm -rf flash-attention-v2
make install-flash-attention-v2-cuda

Sorry we forgot to add this to the release notes. Since we mainly ship a container we forget about local installs.
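The steps above can be collected into one small script for local installs. This is only a sketch: the `make` targets and directory names are the ones quoted in this comment, while the `run` and `reinstall_kernels` helpers and the `DRY_RUN` variable are hypothetical names introduced here for illustration. It assumes a source checkout with a working CUDA toolchain.

```shell
#!/usr/bin/env bash
# Sketch: rebuild the vllm and flash-attention-v2 kernels for a local
# text-generation-inference install. Helper names and DRY_RUN are
# illustrative assumptions, not part of the project.
set -euo pipefail

# run CMD...: print the command, then execute it unless DRY_RUN=1.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    "$@"
  fi
}

# reinstall_kernels [SERVER_DIR]: remove the stale checkouts and rebuild.
# The body runs in a subshell so the caller's working directory is unchanged.
reinstall_kernels() (
  server_dir="${1:-text-generation-inference/server}"
  run cd "$server_dir"
  run rm -rf vllm
  run make install-vllm-cuda
  run rm -rf flash-attention-v2
  run make install-flash-attention-v2-cuda
)
```

Calling `DRY_RUN=1 reinstall_kernels path/to/server` prints the commands without running them; dropping `DRY_RUN` performs the actual rebuild, which may take a long time because the kernels are compiled from source.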


Narsil commented on June 5, 2024

You need to install our version of vllm (cd server && make install-vllm) as we optimized the kernels for our codebase.

That's why we recommend using the docker layer: it makes it easier to navigate the dependencies. (We use the CLI a lot for development; there is just no easy way to give users a clean environment on an arbitrary machine with a pre-existing environment.)


hellangleZ commented on June 5, 2024

Quoting OlivierDehaene's reply above (re-install vllm and flash-attention-v2 with make install-vllm-cuda and make install-flash-attention-v2-cuda):

Thank you, it works for me.
