Comments (14)
Might as well just use xformers instead — everything flash-attn has, xformers has too, and xformers supports things flash-attn doesn't...
from qwen.
@jackaihfia2334 Also, the model runs fine without FlashAttention. Just make sure transformers==4.31.0 and you're good.
from qwen.
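The version pin above can be sanity-checked programmatically; a minimal sketch (the helper name `meets_pin` is made up here, and the check deliberately ignores local build suffixes like `+cu117`):

```python
def meets_pin(installed: str, required: str = "4.31.0") -> bool:
    """Return True when the installed version exactly matches the pinned one.

    Hypothetical helper for the thread above; strips a local build suffix
    such as "+cu117" before comparing.
    """
    return installed.split("+")[0] == required

# Example: compare against what `import transformers; transformers.__version__` reports.
print(meets_pin("4.31.0"))         # True
print(meets_pin("4.31.0+cu117"))   # True (suffix ignored)
print(meets_pin("4.32.1"))         # False
```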
@shujun1992 I've run into this before — an old gcc really does block a lot of installs. You can set up a conda virtual environment and then upgrade gcc like this:
conda install gcc_linux-64
conda install gxx_linux-64
cd /path/to/anaconda3/envs/xxx/bin (change to your own conda environment's bin directory)
ln -s gcc x86_64-conda_cos6-linux-gnu-gcc
ln -s g++ x86_64-conda_cos6-linux-gnu-g++
Then reactivate the environment and try again?
from qwen.
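After the relinking steps above, it may be worth confirming which compilers the environment actually resolves; a small sketch (the triplet name is the one created by the ln -s steps, and paths will differ per machine):

```shell
# Print where each compiler resolves on PATH after activating the conda env.
# "x86_64-conda_cos6-linux-gnu-gcc" is the triplet created by the symlinks above.
for tool in gcc g++ x86_64-conda_cos6-linux-gnu-gcc; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool -> $(command -v "$tool")"
  else
    echo "$tool not found on PATH"
  fi
done
```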
+1
from qwen.
@jackaihfia2334 @DingSiuyo There should be a long log before this, and the key error message is probably in it. You can refer to this issue to troubleshoot.
from qwen.
@jackaihfia2334 @DingSiuyo There should be a long log before this, and the key error message is probably in it. You can refer to this issue to troubleshoot.

Hi, per https://github.com/Dao-AILab/flash-attention#upgrading-from-flashattention-1x-to-flashattention-2, FlashAttention has been updated. At the point where the model imports it, should the method be renamed according to the installed version?
from qwen.
Is installing this mandatory?
from qwen.
Building wheels for collected packages: flash-attn
Building wheel for flash-attn (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [127 lines of output]
torch.version = 2.1.0.dev20230621+cu117
fatal: detected dubious ownership in repository at '/data/llm/code/Qwen-7B/flash-attention'
To add an exception for this directory, call:
git config --global --add safe.directory /data/llm/code/Qwen-7B/flash-attention
running bdist_wheel
/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py:478: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
warnings.warn(msg.format('we could not find ninja.'))
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.10
creating build/lib.linux-x86_64-3.10/flash_attn
copying flash_attn/bert_padding.py -> build/lib.linux-x86_64-3.10/flash_attn
copying flash_attn/flash_attention.py -> build/lib.linux-x86_64-3.10/flash_attn
copying flash_attn/flash_attn_interface.py -> build/lib.linux-x86_64-3.10/flash_attn
copying flash_attn/flash_attn_triton.py -> build/lib.linux-x86_64-3.10/flash_attn
copying flash_attn/flash_attn_triton_og.py -> build/lib.linux-x86_64-3.10/flash_attn
copying flash_attn/flash_blocksparse_attention.py -> build/lib.linux-x86_64-3.10/flash_attn
copying flash_attn/flash_blocksparse_attn_interface.py -> build/lib.linux-x86_64-3.10/flash_attn
copying flash_attn/fused_softmax.py -> build/lib.linux-x86_64-3.10/flash_attn
copying flash_attn/__init__.py -> build/lib.linux-x86_64-3.10/flash_attn
creating build/lib.linux-x86_64-3.10/flash_attn/layers
copying flash_attn/layers/patch_embed.py -> build/lib.linux-x86_64-3.10/flash_attn/layers
copying flash_attn/layers/rotary.py -> build/lib.linux-x86_64-3.10/flash_attn/layers
copying flash_attn/layers/__init__.py -> build/lib.linux-x86_64-3.10/flash_attn/layers
creating build/lib.linux-x86_64-3.10/flash_attn/losses
copying flash_attn/losses/cross_entropy.py -> build/lib.linux-x86_64-3.10/flash_attn/losses
copying flash_attn/losses/__init__.py -> build/lib.linux-x86_64-3.10/flash_attn/losses
creating build/lib.linux-x86_64-3.10/flash_attn/models
copying flash_attn/models/bert.py -> build/lib.linux-x86_64-3.10/flash_attn/models
copying flash_attn/models/gpt.py -> build/lib.linux-x86_64-3.10/flash_attn/models
copying flash_attn/models/gptj.py -> build/lib.linux-x86_64-3.10/flash_attn/models
copying flash_attn/models/gpt_neox.py -> build/lib.linux-x86_64-3.10/flash_attn/models
copying flash_attn/models/llama.py -> build/lib.linux-x86_64-3.10/flash_attn/models
copying flash_attn/models/opt.py -> build/lib.linux-x86_64-3.10/flash_attn/models
copying flash_attn/models/vit.py -> build/lib.linux-x86_64-3.10/flash_attn/models
copying flash_attn/models/__init__.py -> build/lib.linux-x86_64-3.10/flash_attn/models
creating build/lib.linux-x86_64-3.10/flash_attn/modules
copying flash_attn/modules/block.py -> build/lib.linux-x86_64-3.10/flash_attn/modules
copying flash_attn/modules/embedding.py -> build/lib.linux-x86_64-3.10/flash_attn/modules
copying flash_attn/modules/mha.py -> build/lib.linux-x86_64-3.10/flash_attn/modules
copying flash_attn/modules/mlp.py -> build/lib.linux-x86_64-3.10/flash_attn/modules
copying flash_attn/modules/__init__.py -> build/lib.linux-x86_64-3.10/flash_attn/modules
creating build/lib.linux-x86_64-3.10/flash_attn/ops
copying flash_attn/ops/activations.py -> build/lib.linux-x86_64-3.10/flash_attn/ops
copying flash_attn/ops/fused_dense.py -> build/lib.linux-x86_64-3.10/flash_attn/ops
copying flash_attn/ops/layer_norm.py -> build/lib.linux-x86_64-3.10/flash_attn/ops
copying flash_attn/ops/rms_norm.py -> build/lib.linux-x86_64-3.10/flash_attn/ops
copying flash_attn/ops/__init__.py -> build/lib.linux-x86_64-3.10/flash_attn/ops
creating build/lib.linux-x86_64-3.10/flash_attn/utils
copying flash_attn/utils/benchmark.py -> build/lib.linux-x86_64-3.10/flash_attn/utils
copying flash_attn/utils/distributed.py -> build/lib.linux-x86_64-3.10/flash_attn/utils
copying flash_attn/utils/generation.py -> build/lib.linux-x86_64-3.10/flash_attn/utils
copying flash_attn/utils/pretrained.py -> build/lib.linux-x86_64-3.10/flash_attn/utils
copying flash_attn/utils/__init__.py -> build/lib.linux-x86_64-3.10/flash_attn/utils
running build_ext
building 'flash_attn_cuda' extension
creating build/temp.linux-x86_64-3.10
creating build/temp.linux-x86_64-3.10/csrc
creating build/temp.linux-x86_64-3.10/csrc/flash_attn
creating build/temp.linux-x86_64-3.10/csrc/flash_attn/src
x86_64-linux-gnu-gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/data/llm/code/Qwen-7B/flash-attention/csrc/flash_attn -I/data/llm/code/Qwen-7B/flash-attention/csrc/flash_attn/src -I/data/llm/code/Qwen-7B/flash-attention/csrc/flash_attn/cutlass/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.10 -c csrc/flash_attn/fmha_api.cpp -o build/temp.linux-x86_64-3.10/csrc/flash_attn/fmha_api.o -O3 -std=c++17 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -DTORCH_EXTENSION_NAME=flash_attn_cuda -D_GLIBCXX_USE_CXX11_ABI=0
In file included from /data/llm/code/Qwen-7B/flash-attention/csrc/flash_attn/src/fmha.h:42,
from csrc/flash_attn/fmha_api.cpp:33:
/data/llm/code/Qwen-7B/flash-attention/csrc/flash_attn/src/fmha_utils.h: In function ‘void set_alpha(uint32_t&, float, Data_type)’:
/data/llm/code/Qwen-7B/flash-attention/csrc/flash_attn/src/fmha_utils.h:63:53: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
63 | alpha = reinterpret_cast<const uint32_t &>( h2 );
| ^~
/data/llm/code/Qwen-7B/flash-attention/csrc/flash_attn/src/fmha_utils.h:68:53: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
68 | alpha = reinterpret_cast<const uint32_t &>( h2 );
| ^~
/data/llm/code/Qwen-7B/flash-attention/csrc/flash_attn/src/fmha_utils.h:70:53: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
70 | alpha = reinterpret_cast<const uint32_t &>( norm );
| ^~~~
csrc/flash_attn/fmha_api.cpp: In function ‘void set_params_fprop(FMHA_fprop_params&, size_t, size_t, size_t, size_t, size_t, at::Tensor, at::Tensor, at::Tensor, at::Tensor, void*, void*, void*, void*, void*, float, float, bool, int)’:
csrc/flash_attn/fmha_api.cpp:64:11: warning: ‘void* memset(void*, int, size_t)’ clearing an object of non-trivial type ‘struct FMHA_fprop_params’; use assignment or value-initialization instead [-Wclass-memaccess]
64 | memset(&params, 0, sizeof(params));
| ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from csrc/flash_attn/fmha_api.cpp:33:
/data/llm/code/Qwen-7B/flash-attention/csrc/flash_attn/src/fmha.h:75:8: note: ‘struct FMHA_fprop_params’ declared here
75 | struct FMHA_fprop_params : public Qkv_params {
| ^~~~~~~~~~~~~~~~~
csrc/flash_attn/fmha_api.cpp:60:15: warning: unused variable ‘acc_type’ [-Wunused-variable]
60 | Data_type acc_type = DATA_TYPE_FP32;
| ^~~~~~~~
csrc/flash_attn/fmha_api.cpp: In function ‘std::vector<at::Tensor> mha_fwd(const at::Tensor&, const at::Tensor&, const at::Tensor&, at::Tensor&, const at::Tensor&, const at::Tensor&, int, int, float, float, bool, bool, bool, int, c10::optional<at::Generator>)’:
csrc/flash_attn/fmha_api.cpp:208:10: warning: unused variable ‘is_sm80’ [-Wunused-variable]
208 | bool is_sm80 = dprops->major == 8 && dprops->minor == 0;
| ^~~~~~~
csrc/flash_attn/fmha_api.cpp: In function ‘std::vector<at::Tensor> mha_fwd_block(const at::Tensor&, const at::Tensor&, const at::Tensor&, const at::Tensor&, const at::Tensor&, const at::Tensor&, int, int, float, float, bool, bool, c10::optional<at::Generator>)’:
csrc/flash_attn/fmha_api.cpp:533:10: warning: unused variable ‘is_sm80’ [-Wunused-variable]
533 | bool is_sm80 = dprops->major == 8 && dprops->minor == 0;
| ^~~~~~~
/usr/local/cuda/bin/nvcc -I/data/llm/code/Qwen-7B/flash-attention/csrc/flash_attn -I/data/llm/code/Qwen-7B/flash-attention/csrc/flash_attn/src -I/data/llm/code/Qwen-7B/flash-attention/csrc/flash_attn/cutlass/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.10 -c csrc/flash_attn/src/fmha_block_dgrad_fp16_kernel_loop.sm80.cu -o build/temp.linux-x86_64-3.10/csrc/flash_attn/src/fmha_block_dgrad_fp16_kernel_loop.sm80.o -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -O3 -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math --ptxas-options=-v -lineinfo -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 --threads 4 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -DTORCH_EXTENSION_NAME=flash_attn_cuda -D_GLIBCXX_USE_CXX11_ABI=0
In file included from /data/llm/code/Qwen-7B/flash-attention/csrc/flash_attn/src/fmha/smem_tile.h:32,
from csrc/flash_attn/src/fmha_kernel.h:34,
from csrc/flash_attn/src/fmha_fprop_kernel_1xN.h:31,
from csrc/flash_attn/src/fmha_block_dgrad_kernel_1xN_loop.h:6,
from csrc/flash_attn/src/fmha_block_dgrad_fp16_kernel_loop.sm80.cu:5:
/data/llm/code/Qwen-7B/flash-attention/csrc/flash_attn/src/fmha/gemm.h:32:10: fatal error: cutlass/cutlass.h: No such file or directory
32 | #include "cutlass/cutlass.h"
| ^~~~~~~~~~~~~~~~~~~
compilation terminated.
In file included from /data/llm/code/Qwen-7B/flash-attention/csrc/flash_attn/src/fmha/smem_tile.h:32,
from csrc/flash_attn/src/fmha_kernel.h:34,
from csrc/flash_attn/src/fmha_fprop_kernel_1xN.h:31,
from csrc/flash_attn/src/fmha_block_dgrad_kernel_1xN_loop.h:6,
from csrc/flash_attn/src/fmha_block_dgrad_fp16_kernel_loop.sm80.cu:5:
/data/llm/code/Qwen-7B/flash-attention/csrc/flash_attn/src/fmha/gemm.h:32:10: fatal error: cutlass/cutlass.h: No such file or directory
32 | #include "cutlass/cutlass.h"
| ^~~~~~~~~~~~~~~~~~~
compilation terminated.
In file included from /data/llm/code/Qwen-7B/flash-attention/csrc/flash_attn/src/fmha/smem_tile.h:32,
from csrc/flash_attn/src/fmha_kernel.h:34,
from csrc/flash_attn/src/fmha_fprop_kernel_1xN.h:31,
from csrc/flash_attn/src/fmha_block_dgrad_kernel_1xN_loop.h:6,
from csrc/flash_attn/src/fmha_block_dgrad_fp16_kernel_loop.sm80.cu:5:
/data/llm/code/Qwen-7B/flash-attention/csrc/flash_attn/src/fmha/gemm.h:32:10: fatal error: cutlass/cutlass.h: No such file or directory
32 | #include "cutlass/cutlass.h"
| ^~~~~~~~~~~~~~~~~~~
compilation terminated.
error: command '/usr/local/cuda/bin/nvcc' failed with exit code 255
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for flash-attn
Running setup.py clean for flash-attn
Failed to build flash-attn
ERROR: Could not build wheels for flash-attn, which is required to install pyproject.toml-based projects
@logicwong
from qwen.
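The fatal error in this log is the missing cutlass/cutlass.h header, which usually means the cutlass git submodule was never fetched; the "dubious ownership" warning near the top also blocks git from reading the repo at all. A guarded sketch of a likely fix, assuming the repo path shown in the log (this is an inference from the log, not a verified recipe):

```shell
repo=/data/llm/code/Qwen-7B/flash-attention   # path taken from the log above
if [ -e "$repo/.git" ]; then
  # Clear the "dubious ownership" complaint, then fetch the cutlass submodule.
  git config --global --add safe.directory "$repo"
  git -C "$repo" submodule update --init --recursive
  pip install "$repo" --no-build-isolation
else
  echo "repo not found at $repo -- adjust the path for your machine"
fi
```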
@jackaihfia2334 @DingSiuyo There should be a long log before this, and the key error message is probably in it. You can refer to this issue to troubleshoot.

Hi, per https://github.com/Dao-AILab/flash-attention#upgrading-from-flashattention-1x-to-flashattention-2, FlashAttention has been updated. At the point where the model imports it, should the method be renamed according to the installed version?

The README's installation section currently still points at v1.0.8, so if you install following the README nothing needs to change. We will update the code later so it supports both 1.0 and 2.0.
from qwen.
Hi, per https://github.com/Dao-AILab/flash-attention#upgrading-from-flashattention-1x-to-flashattention-2, FlashAttention has been updated. At the point where the model imports it, should the method be renamed according to the installed version? The README's installation section currently still points at v1.0.8, so if you install following the README nothing needs to change. We will update the code later so it supports both 1.0 and 2.0.

QWen_PRETRAINED_MODEL_ARCHIVE_LIST = ["qwen-7b"]
try:
    # from flash_attn.flash_attn_interface import flash_attn_unpadded_func
    import flash_attn
    if int(flash_attn.__version__.split(".")[0]) == 1:
        from flash_attn.flash_attn_interface import flash_attn_unpadded_func
    elif int(flash_attn.__version__.split(".")[0]) == 2:
        from flash_attn.flash_attn_interface import flash_attn_varlen_func as flash_attn_unpadded_func
except ImportError:
    flash_attn_unpadded_func = None
    print("import flash_attn qkv fail")

Actually it only takes a small tweak.
from qwen.
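The version gate above can be exercised without a GPU by stubbing the package; a self-contained sketch (`resolve_unpadded_name` is a made-up helper, and the stub module stands in for a real flash_attn install):

```python
import sys
import types

def resolve_unpadded_name():
    """Return which interface name a given flash_attn major version exposes,
    mirroring the try/except gate in the snippet above (hypothetical helper)."""
    try:
        import flash_attn
        major = int(flash_attn.__version__.split(".")[0])
    except ImportError:
        return None
    return "flash_attn_unpadded_func" if major == 1 else "flash_attn_varlen_func"

# Stub flash_attn so the gate can be tested on any machine, GPU or not.
stub = types.ModuleType("flash_attn")
stub.__version__ = "2.0.4"
sys.modules["flash_attn"] = stub
print(resolve_unpadded_name())  # flash_attn_varlen_func

stub.__version__ = "1.0.8"
print(resolve_unpadded_name())  # flash_attn_unpadded_func
```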
Feel free to submit a PR for that directly.
from qwen.
I installed it inside the NGC PyTorch container following the official flash-attention README,
but it still errors out:
ptxas info : Function properties for _Z25flash_bwd_dot_do_o_kernelILb1E23Flash_bwd_kernel_traitsILi64ELi128ELi128ELi8ELi4ELi4ELi4ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi64ELi128ELi128ELi8ES2_EEEv16Flash_bwd_params
0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info : Used 34 registers
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 1902, in _run_ninja_build
subprocess.run(
File "/usr/lib/python3.10/subprocess.py", line 524, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/tmp/pip-install-0em76put/flash-attn_82b7e874dae44f0f854165b5859a6df5/setup.py", line 202, in <module>
setup(
File "/usr/local/lib/python3.10/dist-packages/setuptools/__init__.py", line 107, in setup
return distutils.core.setup(**attrs)
File "/usr/lib/python3.10/distutils/core.py", line 148, in setup
dist.run_commands()
File "/usr/lib/python3.10/distutils/dist.py", line 966, in run_commands
self.run_command(cmd)
File "/usr/local/lib/python3.10/dist-packages/setuptools/dist.py", line 1234, in run_command
super().run_command(command)
File "/usr/lib/python3.10/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/usr/local/lib/python3.10/dist-packages/wheel/bdist_wheel.py", line 343, in run
self.run_command("build")
File "/usr/lib/python3.10/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/usr/local/lib/python3.10/dist-packages/setuptools/dist.py", line 1234, in run_command
super().run_command(command)
File "/usr/lib/python3.10/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/usr/lib/python3.10/distutils/command/build.py", line 135, in run
self.run_command(cmd_name)
File "/usr/lib/python3.10/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/usr/local/lib/python3.10/dist-packages/setuptools/dist.py", line 1234, in run_command
super().run_command(command)
File "/usr/lib/python3.10/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/usr/local/lib/python3.10/dist-packages/setuptools/command/build_ext.py", line 84, in run
_build_ext.run(self)
File "/usr/local/lib/python3.10/dist-packages/Cython/Distutils/old_build_ext.py", line 186, in run
_build_ext.build_ext.run(self)
File "/usr/lib/python3.10/distutils/command/build_ext.py", line 340, in run
self.build_extensions()
File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 848, in build_extensions
build_ext.build_extensions(self)
File "/usr/local/lib/python3.10/dist-packages/Cython/Distutils/old_build_ext.py", line 195, in build_extensions
_build_ext.build_ext.build_extensions(self)
File "/usr/lib/python3.10/distutils/command/build_ext.py", line 449, in build_extensions
self._build_extensions_serial()
File "/usr/lib/python3.10/distutils/command/build_ext.py", line 474, in _build_extensions_serial
self.build_extension(ext)
File "/usr/local/lib/python3.10/dist-packages/setuptools/command/build_ext.py", line 246, in build_extension
_build_ext.build_extension(self, ext)
File "/usr/lib/python3.10/distutils/command/build_ext.py", line 529, in build_extension
objects = self.compiler.compile(sources,
File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 661, in unix_wrap_ninja_compile
_write_ninja_file_and_compile_objects(
File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 1575, in _write_ninja_file_and_compile_objects
_run_ninja_build(
File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 1918, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for flash-attn
Running setup.py clean for flash-attn
Failed to build flash-attn
ERROR: Could not build wheels for flash-attn, which is required to install pyproject.toml-based projects
from qwen.
You could open an issue in the official FlashAttention repo for this one.
from qwen.
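Two things in the log above are worth noting: ninja died mid-build ("subcommand failed"), which on large CUDA extensions is often a compiler process being killed for lack of memory, and the flash-attn README suggests capping parallelism via the MAX_JOBS environment variable. A guarded sketch of the retry (the RUN_FLASH_ATTN_BUILD gate is made up here, purely so the snippet can't kick off a long build by accident; v1.0.8 is the version the README pins per the earlier reply):

```shell
if [ "${RUN_FLASH_ATTN_BUILD:-0}" = "1" ]; then
  # MAX_JOBS caps parallel nvcc jobs; OOM-killed compilers are a common cause
  # of "ninja: build stopped: subcommand failed" on big CUDA extensions.
  MAX_JOBS=4 pip install flash-attn==1.0.8 --no-build-isolation
else
  echo "set RUN_FLASH_ATTN_BUILD=1 to actually run the build"
fi
```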
Has anyone else hit a gcc version error installing this?
The error is: RuntimeError: The current installed version of g++ (4.8.5) is less than the minimum required version by CUDA 11.4 (6.0.0). Please make sure to use an adequate version of g++ (>=6.0.0, <12.0).
I don't dare upgrade GCC on this machine casually.
from qwen.
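The error message itself spells out the accepted range: CUDA 11.4's nvcc wants g++ >= 6.0.0 and < 12.0. A small sketch for checking a candidate version before switching compilers (`gxx_in_range` is a made-up name):

```python
def gxx_in_range(version, lo=(6, 0, 0), hi=(12, 0, 0)):
    """True when a g++ version string falls inside CUDA 11.4's accepted
    range, per the RuntimeError quoted above (hypothetical helper)."""
    parts = tuple(int(p) for p in version.split(".")[:3])
    parts += (0,) * (3 - len(parts))  # pad "11" or "11.2" to three components
    return lo <= parts < hi

print(gxx_in_range("4.8.5"))   # False: the version from the error above
print(gxx_in_range("9.4.0"))   # True
print(gxx_in_range("12.1.0"))  # False: too new for nvcc 11.4
```

Note that you don't have to replace the system gcc: the conda gcc_linux-64 route earlier in this thread installs a compiler inside the env, and nvcc's -ccbin flag can point at any side-installed g++.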