Comments (42)

hrbigelow commented on July 24, 2024 18

I'm not familiar with ninja, but I was able to build causal_conv1d from source.

First, note that these two commands should produce matching CUDA versions:

python3 -c 'import torch; print(torch.version.cuda)'
nvcc --version

since nvcc will be used during the build of causal_conv1d. If they don't, you might need to do:

sudo update-alternatives --config cuda

and then set the CUDA alternatives version to the version reported by torch.version.cuda.
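If you would rather script the comparison than eyeball it, here is a minimal sketch. It assumes `nvcc --version` prints a line containing `release <major>.<minor>` (e.g. `release 12.1`), and that `torch.version.cuda` is a string such as `12.1`, or `None` on a CPU-only build. The helper names `parse_nvcc_release`, `cuda_versions_match` and `main` are made up for this example.

```python
import re
import subprocess
from typing import Optional


def parse_nvcc_release(nvcc_output: str) -> Optional[str]:
    """Extract the 'major.minor' release from the text printed by `nvcc --version`."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def cuda_versions_match(torch_cuda: Optional[str], nvcc_output: str) -> bool:
    """True when torch's CUDA version and the nvcc release agree on major.minor."""
    nvcc_release = parse_nvcc_release(nvcc_output)
    if torch_cuda is None or nvcc_release is None:
        return False  # CPU-only torch build, or unrecognised nvcc output
    return torch_cuda.split(".")[:2] == nvcc_release.split(".")[:2]


def main() -> None:
    """Run the same two checks as the commands above and print the result."""
    import torch  # imported lazily so the helpers above work without torch

    nvcc_output = subprocess.run(
        ["nvcc", "--version"], capture_output=True, text=True, check=True
    ).stdout
    print("torch CUDA:  ", torch.version.cuda)
    print("nvcc release:", parse_nvcc_release(nvcc_output))
    print("match:       ", cuda_versions_match(torch.version.cuda, nvcc_output))
```

Calling `main()` inside the environment you plan to build in should tell you whether the `update-alternatives` step is needed.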

Then:

git clone https://github.com/Dao-AILab/causal-conv1d.git
cd causal-conv1d
git checkout v1.0.2  # the highest version compatible with Mamba at the time of writing
CAUSAL_CONV1D_FORCE_BUILD=TRUE pip install .

More detail is in this other issue, although that issue doesn't deal with this problem specifically.

Please let me know if this works!

from mamba.

duncanriach commented on July 24, 2024 5

Building on @hrbigelow's instructions above, in order to get the mamba-ssm package pip installed, I did the following inside an instance of container image nvcr.io/nvidia/pytorch:23.12-py3. I also confirmed that it worked using docker.io/pytorch/pytorch:2.1.2-cuda12.1-cudnn8-devel.

$ git clone https://github.com/Dao-AILab/causal-conv1d.git
$ cd causal-conv1d
$ git checkout v1.1.1 # current latest version tag
$ CAUSAL_CONV1D_FORCE_BUILD=TRUE pip install .
$ cd ..
$ git clone https://github.com/state-spaces/mamba.git
$ cd mamba
$ git checkout v1.1.1 # current latest version tag
$ MAMBA_FORCE_BUILD=TRUE pip install .

[ Setting the *_FORCE_BUILD=TRUE environment variables, as shown above, may avoid the need for the following purge. If the cloned directories live on a disk outside your container, you may need to clean those directories before building inside a different container version. Otherwise, stale build output and settings can break the build in the new container, which tends to show up as dynamic linking errors when importing the mamba_ssm package into Python. To purge, run rm -rf *.egg-info build in both the causal-conv1d clone directory and the mamba clone directory. ]
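For anyone who wants the purge as a reusable helper, here is a rough Python equivalent. It is only a sketch: it assumes the stale output sits in a `build` directory and in setuptools' `*.egg-info` directories directly under the clone directory, and the function name `purge_build_artifacts` is made up for this example.

```python
import shutil
from pathlib import Path
from typing import List


def purge_build_artifacts(clone_dir: str) -> List[str]:
    """Delete `build` and any `*.egg-info` directory directly under clone_dir.

    Returns the sorted names of the directories that were removed.
    """
    root = Path(clone_dir)
    removed = []
    # Collect candidates first, then delete; only top-level directories are touched.
    for path in [root / "build", *root.glob("*.egg-info")]:
        if path.is_dir():
            shutil.rmtree(path)
            removed.append(path.name)
    return sorted(removed)
```

You would call it once per clone, e.g. `purge_build_artifacts("causal-conv1d")` and `purge_build_artifacts("mamba")`, before rebuilding in the new container.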

Checking the installation:

$ pip show causal-conv1d
Name: causal-conv1d
Version: 1.1.1
Summary: Causal depthwise conv1d in CUDA, with a PyTorch interface
Home-page: https://github.com/Dao-AILab/causal-conv1d
Author: Tri Dao
Author-email: [email protected]
License: UNKNOWN
Location: /usr/local/lib/python3.10/dist-packages
Requires: packaging, torch, buildtools, ninja
Required-by: mamba-ssm

$ pip show mamba-ssm
Name: mamba-ssm
Version: 1.1.1
Summary: Mamba state-space model
Home-page: https://github.com/state-spaces/mamba
Author: Tri Dao, Albert Gu
Author-email: [email protected], [email protected]
License: UNKNOWN
Location: /usr/local/lib/python3.10/dist-packages
Requires: einops, causal-conv1d, transformers, torch, packaging, ninja, triton
Required-by:

$ python
>>> import torch
>>> from mamba_ssm import Mamba

# no errors
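The same check can be scripted, for example in a container smoke test. This sketch uses only the standard library's `importlib.metadata` and reports versions the way `pip show` does; the helper names `installed_version` and `report` are made up here. Note that it only confirms the packages are installed. It does not replace the `from mamba_ssm import Mamba` import test above, which is what catches dynamic linking errors.

```python
from importlib import metadata
from typing import Dict, Optional, Sequence


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report(
    dist_names: Sequence[str] = ("causal-conv1d", "mamba-ssm"),
) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version (None when absent)."""
    return {name: installed_version(name) for name in dist_names}
```

After the build steps above, `report()` would be expected to show a version string for both packages; a `None` value means pip does not see that package in the current environment.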


takeraparterer commented on July 24, 2024 4

[
T=c10::AliasInfo
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to alias template instantiation 'c10::OptionalBase' being compiled
with
[
T=c10::AliasInfo
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\ATen/core/function_schema.h(28): note: see reference to class template instantiation 'c10::optional<c10::AliasInfo>' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446):
warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with
[
T=c10::AliasInfo
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212):
warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted
with
[
T=std::vector<c10::SymInt,std::allocator<c10::SymInt>>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411):
note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled
with
[
T=std::vector<c10::SymInt,std::allocator<c10::SymInt>>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled
with
[
T=std::vector<c10::SymInt,std::allocator<c10::SymInt>>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to alias template instantiation 'c10::OptionalBase' being compiled
with
[
T=std::vector<c10::SymInt,std::allocator<c10::SymInt>>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\ATen/core/ivalue.h(96): note: see reference to class template instantiation 'c10::optional<std::vector<T,std::allocator<T>>>' being compiled
with
[
T=c10::SymInt
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h(378): note: see reference to class template instantiation 'c10::OptionalArray<c10::SymInt>' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h(388): note: see reference to class template instantiation 'c10::impl::ivalue_to_arg<c10::OptionalArray<c10::SymInt>,AllowDeprecatedTypes>' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446):
warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with
[
T=std::vector<c10::SymInt,std::allocator<c10::SymInt>>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212):
warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted
with
[
T=c10::either<c10::OperatorName,c10::FunctionSchema>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411):
note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled
with
[
T=c10::either<c10::OperatorName,c10::FunctionSchema>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled
with
[
T=c10::either<c10::OperatorName,c10::FunctionSchema>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to alias template instantiation 'c10::OptionalBase' being compiled
with
[
T=c10::either<c10::OperatorName,c10::FunctionSchema>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\ATen/core/op_registration/op_registration.h(434): note: see reference to class template instantiation 'c10::optional<c10::either<c10::OperatorName,c10::FunctionSchema>>' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446):
warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with
[
T=c10::either<c10::OperatorName,c10::FunctionSchema>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212):
warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted
with
[
T=at::StepCallbacks
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411):
note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled
with
[
T=at::StepCallbacks
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled
with
[
T=at::StepCallbacks
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to alias template instantiation 'c10::OptionalBase' being compiled
with
[
T=at::StepCallbacks
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch/csrc/autograd/function.h(166): note: see reference to class template instantiation 'c10::optional<at::StepCallbacks>' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446):
warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with
[
T=at::StepCallbacks
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212):
warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted
with
[
T=c10::DimVector
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411):
note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled
with
[
T=c10::DimVector
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled
with
[
T=c10::DimVector
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to alias template instantiation 'c10::OptionalBase' being compiled
with
[
T=c10::DimVector
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\ATen/TensorIterator.h(918): note: see reference to class template instantiation 'c10::optional<c10::DimVector>' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446):
warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with
[
T=c10::DimVector
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212):
warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted
with
[
T=c10::impl::AnnotatedSchema
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411):
note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled
with
[
T=c10::impl::AnnotatedSchema
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled
with
[
T=c10::impl::AnnotatedSchema
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to alias template instantiation 'c10::OptionalBase' being compiled
with
[
T=c10::impl::AnnotatedSchema
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\ATen/core/dispatch/OperatorEntry.h(223): note: see reference to class template instantiation 'c10::optional<c10::impl::AnnotatedSchema>' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446):
warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with
[
T=c10::impl::AnnotatedSchema
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212):
warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted
with
[
T=c10::impl::OperatorEntry::CppSignatureWithDebug
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411):
note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled
with
[
T=c10::impl::OperatorEntry::CppSignatureWithDebug
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled
with
[
T=c10::impl::OperatorEntry::CppSignatureWithDebug
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to alias template instantiation 'c10::OptionalBase' being compiled
with
[
T=c10::impl::OperatorEntry::CppSignatureWithDebug
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\ATen/core/dispatch/OperatorEntry.h(286): note: see reference to class template instantiation 'c10::optional<c10::impl::OperatorEntry::CppSignatureWithDebug>' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446):
warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with
[
T=c10::impl::OperatorEntry::CppSignatureWithDebug
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212):
warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted
with
[
T=std::tuple<std::string,size_t,size_t>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411):
note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled
with
[
T=std::tuple<std::string,size_t,size_t>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled
with
[
T=std::tuple<std::string,size_t,size_t>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to alias template instantiation 'c10::OptionalBase' being compiled
with
[
T=std::tuple<std::string,size_t,size_t>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch/csrc/jit/frontend/source_range.h(357): note: see reference to class template instantiation 'c10::optional<std::tuple<std::string,size_t,size_t>>' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446):
warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with
[
T=std::tuple<std::string,size_t,size_t>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212):
warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted
with
[
T=torch::jit::SourceRange
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411):
note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled
with
[
T=torch::jit::SourceRange
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled
with
[
T=torch::jit::SourceRange
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to alias template instantiation 'c10::OptionalBase' being compiled
with
[
T=torch::jit::SourceRange
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch/csrc/jit/frontend/source_range.h(380): note: see reference to class template instantiation 'c10::optional<torch::jit::SourceRange>' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446):
warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with
[
T=torch::jit::SourceRange
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212):
warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted
with
[
T=torch::jit::InlinedCallStackPtr
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411):
note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled
with
[
T=torch::jit::InlinedCallStackPtr
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled
with
[
T=torch::jit::InlinedCallStackPtr
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to alias template instantiation 'c10::OptionalBase' being compiled
with
[
T=torch::jit::InlinedCallStackPtr
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch/csrc/jit/ir/scope.h(127): note: see reference to class template instantiation 'c10::optional<torch::jit::InlinedCallStackPtr>' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446):
warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with
[
T=torch::jit::InlinedCallStackPtr
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212):
warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted
with
[
T=torch::jit::ModuleInstanceInfo
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411):
note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled
with
[
T=torch::jit::ModuleInstanceInfo
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled
with
[
T=torch::jit::ModuleInstanceInfo
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to alias template instantiation 'c10::OptionalBase' being compiled
with
[
T=torch::jit::ModuleInstanceInfo
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch/csrc/jit/ir/scope.h(140): note: see reference to class template instantiation 'c10::optional<torch::jit::ModuleInstanceInfo>' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446):
warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with
[
T=torch::jit::ModuleInstanceInfo
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212):
warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted
with
[
T=torch::jit::ScopePtr
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411):
note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled
with
[
T=torch::jit::ScopePtr
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled
with
[
T=torch::jit::ScopePtr
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to alias template instantiation 'c10::OptionalBase' being compiled
with
[
T=torch::jit::ScopePtr
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch/csrc/jit/ir/constants.h(29): note: see reference to class template instantiation 'c10::optional<torch::jit::ScopePtr>' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446):
warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with
[
T=torch::jit::ScopePtr
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212):
warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted
with
[
T=at::ThreadLocalState
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411):
note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled
with
[
T=at::ThreadLocalState
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled
with
[
T=at::ThreadLocalState
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to alias template instantiation 'c10::OptionalBase' being compiled
with
[
T=at::ThreadLocalState
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch/csrc/jit/runtime/interpreter.h(150): note: see reference to class template instantiation 'c10::optional<at::ThreadLocalState>' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446):
warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with
[
T=at::ThreadLocalState
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212):
warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted
with
[
T=std::shared_ptr<torch::jit::Graph>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411):
note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled
with
[
T=std::shared_ptr<torch::jit::Graph>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled
with
[
T=std::shared_ptr<torch::jit::Graph>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to alias template instantiation 'c10::OptionalBase' being compiled
with
[
T=std::shared_ptr<torch::jit::Graph>
]
C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.37.32822\include\array(577): note: see reference to class template instantiation 'c10::optional<std::shared_ptr<torch::jit::Graph>>' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch/csrc/jit/api/function_impl.h(164): note: see reference to class template instantiation 'std::array<c10::optional<std::shared_ptr<torch::jit::Graph>>,4>' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446):
warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with
[
T=std::shared_ptr<torch::jit::Graph>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212):
warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted
with
[
T=torch::jit::GraphExecutor
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411):
note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled
with
[
T=torch::jit::GraphExecutor
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled
with
[
T=torch::jit::GraphExecutor
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to alias template instantiation 'c10::OptionalBase' being compiled
with
[
T=torch::jit::GraphExecutor
]
C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.37.32822\include\array(577): note: see reference to class template instantiation 'c10::optional<torch::jit::GraphExecutor>' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch/csrc/jit/api/function_impl.h(178): note: see reference to class template instantiation 'std::array<c10::optional<torch::jit::GraphExecutor>,4>' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446):
warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with
[
T=torch::jit::GraphExecutor
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212):
warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted
with
[
T=torch::jit::Method
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411):
note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled
with
[
T=torch::jit::Method
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled
with
[
T=torch::jit::Method
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to alias template instantiation 'c10::OptionalBase' being compiled
with
[
T=torch::jit::Method
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch/csrc/jit/api/object.h(48): note: see reference to class template instantiation 'c10::optional<torch::jit::Method>' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446):
warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with
[
T=torch::jit::Method
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212):
warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted
with
[
T=std::vector<std::string,std::allocator<std::string>>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411):
note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled
with
[
T=std::vector<std::string,std::allocator<std::string>>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled
with
[
T=std::vector<std::string,std::allocator<std::string>>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to alias template instantiation 'c10::OptionalBase' being compiled
with
[
T=std::vector<std::string,std::allocator<std::string>>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch/csrc/jit/api/module.h(329): note: see reference to class template instantiation 'c10::optional<std::vector<std::string,std::allocator<std::string>>>' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446):
warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with
[
T=std::vector<std::string,std::allocator<std::string>>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212):
warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted
with
[
T=std::function<void (const torch::autograd::profiler::thread_event_lists &)>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411):
note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled
with
[
T=std::function<void (const torch::autograd::profiler::thread_event_lists &)>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled
with
[
T=std::function<void (const torch::autograd::profiler::thread_event_lists &)>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to alias template instantiation 'c10::OptionalBase' being compiled
with
[
T=std::function<void (const torch::autograd::profiler::thread_event_lists &)>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch/csrc/autograd/profiler_legacy.h(411): note: see reference to class template instantiation 'c10::optional<std::function<void (const torch::autograd::profiler::thread_event_lists &)>>' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446):
warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with
[
T=std::function<void (const torch::autograd::profiler::thread_event_lists &)>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212):
warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted
with
[
T=torch::nn::TripletMarginWithDistanceLossOptions::distance_function_t
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411):
note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled
with
[
T=torch::nn::TripletMarginWithDistanceLossOptions::distance_function_t
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled
with
[
T=torch::nn::TripletMarginWithDistanceLossOptions::distance_function_t
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to alias template instantiation 'c10::OptionalBase' being compiled
with
[
T=torch::nn::TripletMarginWithDistanceLossOptions::distance_function_t
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch\csrc\api\include\torch/nn/options/loss.h(453): note: see reference to class template instantiation 'c10::optionaltorch::nn::TripletMarginWithDistanceLossOptions::distance_function_t' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446):
warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with
[
T=torch::nn::TripletMarginWithDistanceLossOptions::distance_function_t
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212):
warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted
with
[
T=std::vector<double,std::allocator>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411):
note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled
with
[
T=std::vector<double,std::allocator>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled
with
[
T=std::vector<double,std::allocator>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to alias template instantiation 'c10::OptionalBase' being compiled
with
[
T=std::vector<double,std::allocator>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch\csrc\api\include\torch/nn/options/upsampling.h(27): note: see reference to class template instantiation 'c10::optional<std::vector<T,std::allocator>>' being compiled
with
[
T=double
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446):
warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with
[
T=std::vector<double,std::allocator>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212):
warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted
with
[
T=std::tupleat::Tensor,at::Tensor
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411):
note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled
with
[
T=std::tupleat::Tensor,at::Tensor
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled
with
[
T=std::tupleat::Tensor,at::Tensor
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to alias template instantiation 'c10::OptionalBase' being compiled
with
[
T=std::tupleat::Tensor,at::Tensor
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch\csrc\api\include\torch/nn/modules/rnn.h(162): note: see reference to class template instantiation 'c10::optional<std::tupleat::Tensor,at::Tensor>' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446):
warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with
[
T=std::tupleat::Tensor,at::Tensor
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212):
warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted
with
[
T=std::vector<at::Tensor,std::allocatorat::Tensor>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411):
note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled
with
[
T=std::vector<at::Tensor,std::allocatorat::Tensor>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled
with
[
T=std::vector<at::Tensor,std::allocatorat::Tensor>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549):
note: see reference to alias template instantiation 'c10::OptionalBase' being compiled
with
[
T=std::vector<at::Tensor,std::allocatorat::Tensor>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch\csrc\api\include\torch/optim/lbfgs.h(49): note: see reference to class template instantiation 'c10::optional<std::vector<at::Tensor,std::allocatorat::Tensor>>' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446):
warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with
[
T=std::vector<at::Tensor,std::allocatorat::Tensor>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/flat_hash_map.h(171): warning C4624: 'ska::detailv3::sherwood_v3_entry<std::pair<K,V>>': destructor was implicitly defined as deleted
with
[
K=std::string,
V=std::unique_ptr<torch::optim::OptimizerParamState,std::default_deletetorch::optim::OptimizerParamState>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/flat_hash_map.h(779): note: see reference to class template instantiation 'ska::detailv3::sherwood_v3_entry<std::pair<K,V>>' being compiled
with
[
K=std::string,
V=std::unique_ptr<torch::optim::OptimizerParamState,std::default_deletetorch::optim::OptimizerParamState>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/flat_hash_map.h(775): note: while compiling class template member function 'ska::detailv3::sherwood_v3_entry<std::pair<K,V>> *ska::detailv3::sherwood_v3_table<std::pair<K,V>,K,H,ska::detailv3::KeyOrValueHasher<K,std::pair<K,V>,H>,E,ska::detailv3::KeyOrValueEquality<K,std::pair<K,V>,E>,A,std::allocator<ska::detailv3::sherwood_v3_entry<std::pair<K,V>>>>::empty_default_table(void)'
with
[
K=std::string,
V=std::unique_ptr<torch::optim::OptimizerParamState,std::default_deletetorch::optim::OptimizerParamState>,
H=std::hashstd::string,
E=std::equal_tostd::string,
A=std::allocator<std::pair<std::string,std::unique_ptr<torch::optim::OptimizerParamState,std::default_deletetorch::optim::OptimizerParamState>>>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/flat_hash_map.h(768): note: see the first reference to 'ska::detailv3::sherwood_v3_table<std::pair<K,V>,K,H,ska::detailv3::KeyOrValueHasher<K,std::pair<K,V>,H>,E,ska::detailv3::KeyOrValueEquality<K,std::pair<K,V>,E>,A,std::allocator<ska::detailv3::sherwood_v3_entry<std::pair<K,V>>>>::empty_default_table' in 'ska::detailv3::sherwood_v3_table<std::pair<K,V>,K,H,ska::detailv3::KeyOrValueHasher<K,std::pair<K,V>,H>,E,ska::detailv3::KeyOrValueEquality<K,std::pair<K,V>,E>,A,std::allocator<ska::detailv3::sherwood_v3_entry<std::pair<K,V>>>>::sherwood_v3_table'
with
[
K=std::string,
V=std::unique_ptr<torch::optim::OptimizerParamState,std::default_deletetorch::optim::OptimizerParamState>,
H=std::hashstd::string,
E=std::equal_tostd::string,
A=std::allocator<std::pair<std::string,std::unique_ptr<torch::optim::OptimizerParamState,std::default_deletetorch::optim::OptimizerParamState>>>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/flat_hash_map.h(1929): note: see reference to class template instantiation 'ska::detailv3::sherwood_v3_table<std::pair<K,V>,K,H,ska::detailv3::KeyOrValueHasher<K,std::pair<K,V>,H>,E,ska::detailv3::KeyOrValueEquality<K,std::pair<K,V>,E>,A,std::allocator<ska::detailv3::sherwood_v3_entry<std::pair<K,V>>>>' being compiled
with
[
K=std::string,
V=std::unique_ptr<torch::optim::OptimizerParamState,std::default_deletetorch::optim::OptimizerParamState>,
H=std::hashstd::string,
E=std::equal_tostd::string,
A=std::allocator<std::pair<std::string,std::unique_ptr<torch::optim::OptimizerParamState,std::default_deletetorch::optim::OptimizerParamState>>>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch\csrc\api\include\torch/optim/optimizer.h(174): note: see reference to class template instantiation 'ska::flat_hash_map<std::string,std::unique_ptr<torch::optim::OptimizerParamState,std::default_deletetorch::optim::OptimizerParamState>,std::hashstd::string,std::equal_tostd::string,std::allocator<std::pair<K,V>>>' being compiled
with
[
K=std::string,
V=std::unique_ptr<torch::optim::OptimizerParamState,std::default_deletetorch::optim::OptimizerParamState>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/flat_hash_map.h(171): warning C4624: 'ska::detailv3::sherwood_v3_entry<std::pair<K,V>>': destructor was implicitly defined as deleted
with
[
K=c10::DispatchKey,
V=std::list<c10::impl::AnnotatedKernel,std::allocatorc10::impl::AnnotatedKernel>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/flat_hash_map.h(711): note: see reference to class template instantiation 'ska::detailv3::sherwood_v3_entry<std::pair<K,V>>' being compiled
with
[
K=c10::DispatchKey,
V=std::list<c10::impl::AnnotatedKernel,std::allocatorc10::impl::AnnotatedKernel>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/flat_hash_map.h(708): note: while compiling class template member function 'void ska::detailv3::sherwood_v3_table<std::pair<K,V>,K,H,ska::detailv3::KeyOrValueHasher<K,std::pair<K,V>,H>,E,ska::detailv3::KeyOrValueEquality<K,std::pair<K,V>,E>,A,std::allocator<ska::detailv3::sherwood_v3_entry<std::pair<K,V>>>>::clear(void)'
with
[
K=c10::DispatchKey,
V=std::list<c10::impl::AnnotatedKernel,std::allocatorc10::impl::AnnotatedKernel>,
H=std::hashc10::DispatchKey,
E=std::equal_toc10::DispatchKey,
A=std::allocator<std::pair<c10::DispatchKey,std::list<c10::impl::AnnotatedKernel,std::allocatorc10::impl::AnnotatedKernel>>>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/flat_hash_map.h(431): note: see the first reference to 'ska::detailv3::sherwood_v3_table<std::pair<K,V>,K,H,ska::detailv3::KeyOrValueHasher<K,std::pair<K,V>,H>,E,ska::detailv3::KeyOrValueEquality<K,std::pair<K,V>,E>,A,std::allocator<ska::detailv3::sherwood_v3_entry<std::pair<K,V>>>>::clear' in 'ska::detailv3::sherwood_v3_table<std::pair<K,V>,K,H,ska::detailv3::KeyOrValueHasher<K,std::pair<K,V>,H>,E,ska::detailv3::KeyOrValueEquality<K,std::pair<K,V>,E>,A,std::allocator<ska::detailv3::sherwood_v3_entry<std::pair<K,V>>>>::~sherwood_v3_table'
with
[
K=c10::DispatchKey,
V=std::list<c10::impl::AnnotatedKernel,std::allocatorc10::impl::AnnotatedKernel>,
H=std::hashc10::DispatchKey,
E=std::equal_toc10::DispatchKey,
A=std::allocator<std::pair<c10::DispatchKey,std::list<c10::impl::AnnotatedKernel,std::allocatorc10::impl::AnnotatedKernel>>>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/flat_hash_map.h(2035): note: see the first reference to 'ska::detailv3::sherwood_v3_table<std::pair<K,V>,K,H,ska::detailv3::KeyOrValueHasher<K,std::pair<K,V>,H>,E,ska::detailv3::KeyOrValueEquality<K,std::pair<K,V>,E>,A,std::allocator<ska::detailv3::sherwood_v3_entry<std::pair<K,V>>>>::~sherwood_v3_table' in 'ska::flat_hash_map<c10::DispatchKey,std::list<c10::impl::AnnotatedKernel,std::allocatorc10::impl::AnnotatedKernel>,std::hashc10::DispatchKey,std::equal_to,std::allocator<std::pair<K,V>>>::~flat_hash_map'
with
[
K=c10::DispatchKey,
V=std::list<c10::impl::AnnotatedKernel,std::allocatorc10::impl::AnnotatedKernel>,
H=std::hashc10::DispatchKey,
E=std::equal_toc10::DispatchKey,
A=std::allocator<std::pair<c10::DispatchKey,std::list<c10::impl::AnnotatedKernel,std::allocatorc10::impl::AnnotatedKernel>>>
]
and
[
K=c10::DispatchKey,
V=std::list<c10::impl::AnnotatedKernel,std::allocatorc10::impl::AnnotatedKernel>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/flat_hash_map.h(1929): note: see reference to class template instantiation 'ska::detailv3::sherwood_v3_table<std::pair<K,V>,K,H,ska::detailv3::KeyOrValueHasher<K,std::pair<K,V>,H>,E,ska::detailv3::KeyOrValueEquality<K,std::pair<K,V>,E>,A,std::allocator<ska::detailv3::sherwood_v3_entry<std::pair<K,V>>>>' being compiled
with
[
K=c10::DispatchKey,
V=std::list<c10::impl::AnnotatedKernel,std::allocatorc10::impl::AnnotatedKernel>,
H=std::hashc10::DispatchKey,
E=std::equal_toc10::DispatchKey,
A=std::allocator<std::pair<c10::DispatchKey,std::list<c10::impl::AnnotatedKernel,std::allocatorc10::impl::AnnotatedKernel>>>
]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\ATen/core/dispatch/OperatorEntry.h(270): note: see reference to class template instantiation 'ska::flat_hash_map<c10::DispatchKey,std::list<c10::impl::AnnotatedKernel,std::allocatorc10::impl::AnnotatedKernel>,std::hashc10::DispatchKey,std::equal_to,std::allocator<std::pair<K,V>>>' being compiled
with
[
K=c10::DispatchKey,
V=std::list<c10::impl::AnnotatedKernel,std::allocatorc10::impl::AnnotatedKernel>
]
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "C:\Users\xande\causal-conv1d\setup.py", line 207, in run
urllib.request.urlretrieve(wheel_url, wheel_filename)
File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\urllib\request.py", line 241, in urlretrieve
with contextlib.closing(urlopen(url, data)) as fp:
File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\urllib\request.py", line 216, in urlopen
return opener.open(url, data, timeout)
File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\urllib\request.py", line 525, in open
response = meth(req, response)
File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\urllib\request.py", line 634, in http_response
response = self.parent.error(
File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\urllib\request.py", line 563, in error
return self._call_chain(*args)
File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\urllib\request.py", line 496, in _call_chain
result = func(*args)
File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\urllib\request.py", line 643, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

  During handling of the above exception, another exception occurred:
 
  Traceback (most recent call last):
    File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\cpp_extension.py", line 1893, in _run_ninja_build
      subprocess.run(
    File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\subprocess.py", line 524, in run
      raise CalledProcessError(retcode, process.args,
  subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
 
  The above exception was the direct cause of the following exception:
 
  Traceback (most recent call last):
    File "<string>", line 2, in <module>
    File "<pip-setuptools-caller>", line 34, in <module>
    File "C:\Users\xande\causal-conv1d\setup.py", line 227, in <module>
      setup(
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\__init__.py", line 87, in setup 
      return distutils.core.setup(**attrs)
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\_distutils\core.py", line 185, in setup
      return run_commands(dist)
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\_distutils\core.py", line 201, in run_commands
      dist.run_commands()
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\_distutils\dist.py", line 968, in run_commands
      self.run_command(cmd)
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\dist.py", line 1217, in run_command
      super().run_command(command)
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\_distutils\dist.py", line 987, in run_command
      cmd_obj.run()
    File "C:\Users\xande\causal-conv1d\setup.py", line 224, in run
      super().run()
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\wheel\bdist_wheel.py", line 321, in run    
      self.run_command("build")
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\_distutils\cmd.py", line 319, in run_command
      self.distribution.run_command(command)
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\dist.py", line 1217, in run_command
      super().run_command(command)
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\_distutils\dist.py", line 987, in run_command
      cmd_obj.run()
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\_distutils\command\build.py", line 132, in run
      self.run_command(cmd_name)
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\_distutils\cmd.py", line 319, in run_command
      self.distribution.run_command(command)
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\dist.py", line 1217, in run_command
      super().run_command(command)
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\_distutils\dist.py", line 987, in run_command
      cmd_obj.run()
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\command\build_ext.py", line 84, in run
_build_ext.run(self)
File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\Cython\Distutils\old_build_ext.py", line 186, in run
_build_ext.build_ext.run(self)
File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools_distutils\command\build_ext.py", line 346, in run
self.build_extensions()
File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\cpp_extension.py", line 843, in build_extensions
build_ext.build_extensions(self)
File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\Cython\Distutils\old_build_ext.py", line 195, in build_extensions
_build_ext.build_ext.build_extensions(self)
File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools_distutils\command\build_ext.py", line 466, in build_extensions
self._build_extensions_serial()
File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools_distutils\command\build_ext.py", line 492, in _build_extensions_serial
self.build_extension(ext)
File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\command\build_ext.py", line 246, in build_extension
_build_ext.build_extension(self, ext)
File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools_distutils\command\build_ext.py", line 547, in build_extension
objects = self.compiler.compile(
File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\cpp_extension.py", line 815, in win_wrap_ninja_compile
_write_ninja_file_and_compile_objects(
File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\cpp_extension.py", line 1574, in _write_ninja_file_and_compile_objects
_run_ninja_build(
File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\cpp_extension.py", line 1909, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for causal-conv1d
Running setup.py clean for causal-conv1d
Failed to build causal-conv1d
ERROR: Could not build wheels for causal-conv1d, which is required to install pyproject.toml-based projects

ankhzet avatar ankhzet commented on July 24, 2024 4

@sonsus try to manually set CUDA_HOME env variable to the local cuda installation folder (currently it points to /usr/local/cuda for you)
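
A minimal sketch of one way to find a candidate value for CUDA_HOME, assuming nvcc lives at `<CUDA_HOME>/bin/nvcc` (which is the layout implied by the `/usr/local/cuda/bin/nvcc` path in the error). The helper name `guess_cuda_home` is hypothetical and not part of any package discussed here.

```python
import os
import shutil
from pathlib import Path
from typing import Mapping, Optional


def guess_cuda_home(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return a directory that contains bin/nvcc, or None if none is found.

    Hypothetical helper: it checks CUDA_HOME first, then falls back to
    whichever nvcc is on PATH.
    """
    env = os.environ if env is None else env
    exe = "nvcc.exe" if os.name == "nt" else "nvcc"

    # 1. Keep an existing CUDA_HOME if it really contains nvcc.
    cuda_home = env.get("CUDA_HOME")
    if cuda_home and (Path(cuda_home) / "bin" / exe).is_file():
        return cuda_home

    # 2. Otherwise derive it from the nvcc found on PATH: <home>/bin/nvcc.
    nvcc = shutil.which("nvcc", path=env.get("PATH"))
    if nvcc:
        return str(Path(nvcc).resolve().parent.parent)
    return None


if __name__ == "__main__":
    home = guess_cuda_home()
    print(f"CUDA_HOME candidate: {home}" if home else "nvcc not found; is the CUDA toolkit installed?")
```

If this prints a directory, exporting it as CUDA_HOME before running pip install is worth a try; if it prints nothing useful, the CUDA toolkit (not just the driver) is probably missing.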

Gyu1291 avatar Gyu1291 commented on July 24, 2024 3

I also have this problem 😭

CYYJL avatar CYYJL commented on July 24, 2024 3

In my experience (Windows 11 with Nvidia GPU) I didn't have CUDA installed so had to install that from here as well as getting Pytorch CUDA versions and this basically fixed it

OK, thank you.
I created a new environment in Ubuntu and installed the packages as follows:
conda install cudatoolkit==11.8 -c nvidia
pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118
conda install -c "nvidia/label/cuda-11.8.0" cuda-nvcc
conda install packaging
pip install causal-conv1d==1.1.1
pip install mamba-ssm
After that, causal-conv1d and mamba-ssm both installed successfully.

Marxist-Leninist avatar Marxist-Leninist commented on July 24, 2024 1

I'm not familiar with ninja, but I was able to build causal_conv1d from source.

First, note that these two commands should produce matching CUDA versions:

python3 -c 'import torch; print(torch.version.cuda)'
nvcc --version

since nvcc will be used during the build of causal_conv1d. If they don't, you might need to do:

sudo update-alternatives --config cuda

and then set the CUDA alternatives version to the version reported by torch.version.cuda

Then:

git clone https://github.com/Dao-AILab/causal-conv1d.git
cd causal-conv1d
git checkout v1.0.2  # this is the highest compatible version allowed by Mamba
CAUSAL_CONV1D_FORCE_BUILD=TRUE pip install .

More detail is in this other issue although that issue doesn't particularly deal with this one.

Please let me know if this works!

Could you share a copy of the compiled module via cloud storage, to save everyone the hassle of doing all that?

ankhzet avatar ankhzet commented on July 24, 2024 1

IIRC, the csrc directory is either absent or referenced with the wrong path, so installing causal-conv1d when there is no prebuilt wheel for your setup causes this error. I ran into the same issue when installing the mamba-chat repo locally on a Windows machine, and ended up building causal-conv1d manually and copying csrc from the repo into the correct location. Also, make sure nvcc and the compiler binaries are on the PATH before building.
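
As a rough sketch of that PATH check: the snippet below reports whether nvcc, ninja, and a C++ compiler can be resolved. The helper name `find_build_tools` is hypothetical, and the compiler names (`cl` on Windows, `g++` elsewhere) are assumptions to adjust for your toolchain.

```python
import os
import shutil
from typing import Dict, Optional


def find_build_tools(path: Optional[str] = None) -> Dict[str, Optional[str]]:
    """Map each tool name to its resolved location, or None if it is not found.

    `path` defaults to the current PATH when left as None.
    """
    # Assumption: MSVC's cl on Windows, g++ elsewhere.
    compiler = "cl" if os.name == "nt" else "g++"
    return {tool: shutil.which(tool, path=path) for tool in ("nvcc", "ninja", compiler)}


if __name__ == "__main__":
    for tool, location in find_build_tools().items():
        print(f"{tool:6} {location or 'NOT FOUND on PATH'}")
```

Any line that reports NOT FOUND points at a tool the source build is likely to need.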

Side note: I ultimately failed to run it, due to the absence of prebuilt triton bindings for my setup and a lack of free time, though %).

hannn0403 avatar hannn0403 commented on July 24, 2024 1

I found a list of packages that need to be installed before causal-conv1d on the following page: havietisov/causal-conv1d@84c68a2. After installing these packages ("torch", "packaging", "buildtools", "ninja") via pip, I ran pip install "causal-conv1d>=1.1.0" (quoted, so that the shell does not treat > as an output redirection) and confirmed that it installed successfully. If you're still having trouble with the installation, this method might be worth trying.
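
To check quickly which of those prerequisites are already present, something like the following sketch can help. The helper name `missing_modules` is hypothetical, and it assumes the import names match the pip package names, which I am fairly sure holds for torch, packaging, and ninja; buildtools is left out because I am not certain of its import name.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumption: import names equal the pip package names for these three.
    missing = missing_modules(["torch", "packaging", "ninja"])
    print("missing:", ", ".join(missing) if missing else "none")
```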

paaKways avatar paaKways commented on July 24, 2024 1

You'd need this as well if you're on Windows triton-lang/triton#1057 (comment)

ajie220209 avatar ajie220209 commented on July 24, 2024 1

In my experience (Windows 11 with Nvidia GPU) I didn't have CUDA installed, so I had to install it from here and get the PyTorch CUDA builds, and this basically fixed it.

OK, thank you. I created a new environment in Ubuntu and installed the packages as follows, after which causal-conv1d and mamba-ssm installed successfully:
conda install cudatoolkit==11.8 -c nvidia
pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118
conda install -c "nvidia/label/cuda-11.8.0" cuda-nvcc
conda install packaging
pip install causal-conv1d==1.1.1
pip install mamba-ssm

I followed your steps and still get the same error when installing causal-conv1d now:
ERROR: Could not build wheels for causal-conv1d, which is required to install pyproject.toml-based projects

wuliwuxin avatar wuliwuxin commented on July 24, 2024 1
CAUSAL_CONV1D_FORCE_BUILD=TRUE CAUSAL_CONV1D_SKIP_CUDA_BUILD=TRUE CAUSAL_CONV1D_FORCE_CXX11_ABI=TRUE pip install .

Success!

evelynmitchell avatar evelynmitchell commented on July 24, 2024

You are using pip with an anaconda installation of python. You may want to attempt installing causal-conv1d with conda, but I don't know if it will work.

ZiQi-Jiang avatar ZiQi-Jiang commented on July 24, 2024

I have this problem, too. :)

signalprime avatar signalprime commented on July 24, 2024

I just realized it requires CUDA even building without

fivejjs avatar fivejjs commented on July 24, 2024

I'm not familiar with ninja, but I was able to build causal_conv1d from source.
First, note that these two commands should produce matching CUDA versions:

python3 -c 'import torch; print(torch.version.cuda)'
nvcc --version

since nvcc will be used during the build of causal_conv1d. If they don't, you might need to do:
sudo update-alternatives --config cuda
and then set the CUDA alternatives version to the version reported by torch.version.cuda
Then:

git clone https://github.com/Dao-AILab/causal-conv1d.git
cd causal-conv1d
git checkout v1.0.2  # this is the highest compatible version allowed by Mamba
CAUSAL_CONV1D_FORCE_BUILD=TRUE pip install .

More detail is in this other issue although that issue doesn't particularly deal with this one.
Please let me know if this works!

Could you share a copy of the compiled module via cloud storage, to save everyone the hassle of doing all that?

I found that it works once the system CUDA version is aligned with the CUDA version PyTorch was built with.
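
A small sketch of how that comparison could be automated. The helper names are hypothetical; the parser assumes the `release X.Y` wording that nvcc prints in its version banner, and only major.minor is compared.

```python
import re
import subprocess
from typing import Optional


def parse_nvcc_release(nvcc_output: str) -> Optional[str]:
    """Extract the 'major.minor' release from `nvcc --version` output."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def versions_match(torch_cuda: Optional[str], nvcc_release: Optional[str]) -> bool:
    """Compare major.minor only; torch.version.cuda is None for CPU-only builds."""
    if not torch_cuda or not nvcc_release:
        return False
    return torch_cuda.split(".")[:2] == nvcc_release.split(".")[:2]


def check_environment() -> bool:
    """Run the real check; needs PyTorch installed and nvcc on PATH."""
    import torch  # imported lazily so the helpers above work without PyTorch

    output = subprocess.run(
        ["nvcc", "--version"], capture_output=True, text=True, check=True
    ).stdout
    release = parse_nvcc_release(output)
    print("torch:", torch.version.cuda, "nvcc:", release)
    return versions_match(torch.version.cuda, release)
```

Calling `check_environment()` in the environment you plan to build in returns False when the two versions differ, which is the situation described in this thread.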

sonsus avatar sonsus commented on July 24, 2024

I'm running into a similar issue; the output is below. I hope to find a useful hint in this thread.

pip install mamba-ssm
Collecting mamba-ssm
  Using cached mamba_ssm-1.1.1.tar.gz (34 kB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [25 lines of output]
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-g51qu3z4/mamba-ssm_4988cde7cf824517a06f59b486aea78c/setup.py", line 101, in <module>
          _, bare_metal_version = get_cuda_bare_metal_version(CUDA_HOME)
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-install-g51qu3z4/mamba-ssm_4988cde7cf824517a06f59b486aea78c/setup.py", line 63, in get_cuda_bare_metal_version
          raw_output = subprocess.check_output(
                       ^^^^^^^^^^^^^^^^^^^^^^^^
        File "/root/miniconda3/envs/openai/lib/python3.11/subprocess.py", line 466, in check_output
          return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/root/miniconda3/envs/openai/lib/python3.11/subprocess.py", line 548, in run
          with Popen(*popenargs, **kwargs) as process:
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/root/miniconda3/envs/openai/lib/python3.11/subprocess.py", line 1026, in __init__
          self._execute_child(args, executable, preexec_fn, close_fds,
        File "/root/miniconda3/envs/openai/lib/python3.11/subprocess.py", line 1950, in _execute_child
          raise child_exception_type(errno_num, err_msg, err_filename)
      FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/cuda/bin/nvcc'
      
      
      torch.__version__  = 2.0.1+cu117
      
      
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

Driver/CUDA versions from nvidia-smi
NVIDIA-SMI 525.105.17 Driver Version: 525.105.17 CUDA Version: 12.0
torch==2.0.1, A100 machine.

invokeG avatar invokeG commented on July 24, 2024

@duncanriach Thank you! This solution is effective.

Building on @hrbigelow's instructions above, in order to get the mamba-ssm package pip installed, I did the following inside an instance of container image nvcr.io/nvidia/pytorch:23.12-py3. I also confirmed that it worked using docker.io/pytorch/pytorch:2.1.2-cuda12.1-cudnn8-devel.

$ git clone https://github.com/Dao-AILab/causal-conv1d.git
$ cd causal-conv1d
$ git checkout v1.1.1 # current latest version tag
$ CAUSAL_CONV1D_FORCE_BUILD=TRUE pip install .
$ cd ..
$ git clone https://github.com/state-spaces/mamba.git
$ cd mamba
$ git checkout v1.1.1 # current latest version tag
$ pip install .

Note that if you're accessing the cloned directories on a disk outside your container, you will need to clean/purge those directories before building inside different container versions. Otherwise, stale code and settings will cause the build to not work properly in your new container. This tends to show up as dynamic linking errors when importing the mamba_ssm package into Python. To clean/purge, run rm -rf *.egg-info build in both the causal-conv1d clone directory and the mamba clone directory.
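
The purge step can also be scripted. This is only a sketch: the helper name `purge_build_artifacts` is hypothetical, and it assumes setuptools' usual `<name>.egg-info` directory naming next to a `build` directory at the top of each clone.

```python
import shutil
from pathlib import Path
from typing import List


def purge_build_artifacts(repo_dir: str) -> List[str]:
    """Delete build/ and any *.egg-info directories directly under repo_dir.

    Returns the sorted names of the directories that were removed.
    """
    root = Path(repo_dir)
    targets = [root / "build", *root.glob("*.egg-info")]
    removed = []
    for target in targets:
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(target.name)
    return sorted(removed)
```

Run it once per clone, for example `purge_build_artifacts("causal-conv1d")` and `purge_build_artifacts("mamba")`, before rebuilding inside a different container.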

Checking the installation:

$ pip show causal-conv1d
Name: causal-conv1d
Version: 1.1.1
Summary: Causal depthwise conv1d in CUDA, with a PyTorch interface
Home-page: https://github.com/Dao-AILab/causal-conv1d
Author: Tri Dao
Author-email: [email protected]
License: UNKNOWN
Location: /usr/local/lib/python3.10/dist-packages
Requires: packaging, torch, buildtools, ninja
Required-by: mamba-ssm

$ pip show mamba-ssm
Name: mamba-ssm
Version: 1.1.1
Summary: Mamba state-space model
Home-page: https://github.com/state-spaces/mamba
Author: Tri Dao, Albert Gu
Author-email: [email protected], [email protected]
License: UNKNOWN
Location: /usr/local/lib/python3.10/dist-packages
Requires: einops, causal-conv1d, transformers, torch, packaging, ninja, triton
Required-by:

$ python
>>> import torch
>>> from mamba_ssm import Mamba

# no errors

shigen-StoneRoot avatar shigen-StoneRoot commented on July 24, 2024

It works. I don't know why setting CAUSAL_CONV1D_FORCE_BUILD=TRUE matters, but in fact I had to run both of these commands:
$ CAUSAL_CONV1D_FORCE_BUILD=TRUE pip install .
$ MAMBA_FORCE_BUILD=TRUE pip install .

Without *_FORCE_BUILD=TRUE, the issue still occurs.
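
One plausible reading, based on the traceback earlier in this thread where setup.py calls urllib.request.urlretrieve(wheel_url, wheel_filename) and gets a 404 before compiling: the installer first tries to download a prebuilt wheel and only compiles when that fails, and *_FORCE_BUILD=TRUE skips the download attempt. The sketch below illustrates that kind of decision logic; it is not the actual setup.py, the function names are hypothetical, and the exact-string comparison with "TRUE" is an assumption.

```python
import os
from typing import Mapping, Optional


def force_build_enabled(prefix: str, env: Optional[Mapping[str, str]] = None) -> bool:
    """True when <prefix>_FORCE_BUILD is set to the exact string "TRUE" (assumed)."""
    env = os.environ if env is None else env
    return env.get(f"{prefix}_FORCE_BUILD", "FALSE") == "TRUE"


def install_plan(prefix: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Describe which path an installer of this shape would take."""
    if force_build_enabled(prefix, env):
        return "compile from source"
    return "try prebuilt wheel, compile from source if the download fails"


if __name__ == "__main__":
    for prefix in ("CAUSAL_CONV1D", "MAMBA"):
        print(prefix, "->", install_plan(prefix))
```

Under this reading, a prebuilt wheel that does not match the local PyTorch or CUDA setup could install without error and still fail at import time, which would be consistent with forcing a local build fixing things.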

duncanriach commented on July 24, 2024

@shigen-StoneRoot, *_FORCE_BUILD=TRUE forces a fresh build locally, rather than attempting to use a prebuilt wheel. I believe this may make my instructions about deleting the *.egg-info and build directories redundant. Thanks for pointing this out; I'll update my comment above.

Here are the places in the relevant setup.py files where *_FORCE_BUILD=TRUE is defined and documented: causal_conv1d, mamba.
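
As a rough illustration of how a build script might read such a flag, here is a minimal sketch. It is an assumption, not a copy of either setup.py: it treats the flag as enabled only when the variable holds the exact string TRUE and defaults to FALSE when unset. The function name is hypothetical; check the linked files for the real logic.

```python
import os


def force_build_requested(var_name, environ=None):
    """Return True only when the variable is set to the exact string TRUE."""
    env = os.environ if environ is None else environ
    # Assumption: an unset variable counts as "FALSE" and the match is case-sensitive.
    return env.get(var_name, "FALSE") == "TRUE"
```

Under this assumption, values such as true or 1 would not enable the flag, which is one reason to copy the variable assignment exactly as written in the commands above.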

HelloWorldLTY commented on July 24, 2024

Hi, my suggestion for addressing this error is to install mamba first and then reinstall PyTorch from the default link; after that, everything works.

Wave2689 commented on July 24, 2024

When I run the command
CAUSAL_CONV1D_FORCE_BUILD=TRUE pip install .
I get the error
'CAUSAL_CONV1D_FORCE_BUILD' is not recognized as an internal or external command, operable program or batch file.
How can I fix this error?

duncanriach commented on July 24, 2024

Are you not using bash as your shell? You might need to translate the commands into a format that is compatible with the shell that you're using.

ankhzet commented on July 24, 2024

@Wave2689

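If you run the command in the Windows command prompt, the inline VAR=value prefix does not work, and there is no export keyword either (export is a Unix shell builtin). Set the environment variable with the set keyword instead:

drive:path\to\project> set "CAUSAL_CONV1D_FORCE_BUILD=TRUE" && pip install .

The quotes keep the space before && from becoming part of the value. You can also execute the commands one after another:

drive:path\to\project> set CAUSAL_CONV1D_FORCE_BUILD=TRUE
drive:path\to\project> pip install .

In PowerShell, the equivalent assignment is $env:CAUSAL_CONV1D_FORCE_BUILD = "TRUE".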
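
When the shell syntax for environment variables is in doubt, one shell-independent option is to set the variable from Python and launch pip as a subprocess. The following is a minimal sketch under that assumption; the helper names are hypothetical, and it relies only on the standard library.

```python
import os
import subprocess
import sys


def env_with_flag(flag, value="TRUE", base=None):
    """Return a copy of the environment with one extra variable set."""
    env = dict(os.environ if base is None else base)
    env[flag] = value
    return env


def pip_install_command(path="."):
    """Build the pip command for the interpreter that runs this script."""
    return [sys.executable, "-m", "pip", "install", path]


def forced_install(flag, path="."):
    """Run pip install with the flag set; raises CalledProcessError on failure."""
    subprocess.run(pip_install_command(path), env=env_with_flag(flag), check=True)
```

Calling forced_install("CAUSAL_CONV1D_FORCE_BUILD") from inside the causal-conv1d clone would then behave the same under cmd.exe, PowerShell, and bash, because the variable is passed directly to the child process rather than through shell syntax.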

Wave2689 commented on July 24, 2024

Thanks a lot! I understand now. @ankhzet @duncanriach

invokeG commented on July 24, 2024

I created a Docker image to work around the installation errors: https://hub.docker.com/repository/docker/kom4cr0/cuda11.7-pytorch1.13-mamba1.1.1/general

steve-zeyu-zhang commented on July 24, 2024

@duncanriach Thank you! This solution is effective.

Sometimes this solution does not work at all: even with CAUSAL_CONV1D_FORCE_BUILD=TRUE, the error shown below still appears.

That is because, in this case, the problem has nothing to do with CAUSAL_CONV1D_FORCE_BUILD=TRUE.

If so, consider reloading your conda environment with conda deactivate and then starting again with a plain pip install -e . in the reactivated environment. That should solve the problem.

          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 227 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16ES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16ES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 92 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16ES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16ES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 151 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16ES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16ES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16ES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16ES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 163 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16ES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16ES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 92 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16ES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16ES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16ES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16ES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16ES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16ES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 122 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 86 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 190 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 128 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 228 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 87 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 152 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 168 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 94 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 119 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 104 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 82 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 140 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 87 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 77 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 120 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 70 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 70 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 80 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 87 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 88 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 184 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 126 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 227 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 89 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 151 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 163 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 92 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 122 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 86 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 190 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 128 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 228 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 87 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 152 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 168 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 94 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 119 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 104 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 82 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 140 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 87 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 77 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 120 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 70 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 70 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 80 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 87 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 88 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 185 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 126 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 226 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 89 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 151 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 163 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 92 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 121 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 82 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 190 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 128 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 228 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 89 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 152 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 168 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 94 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 119 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EffELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EffELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 66 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EffELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EffELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 104 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EffELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EffELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 82 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EffELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EffELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 140 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EffELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EffELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EffELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EffELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 87 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EffELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EffELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 77 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EffELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EffELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 123 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EffELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EffELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 82 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EffELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EffELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 70 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EffELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EffELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 77 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EffELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EffELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 86 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 80 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 45 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 55 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 94 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 38 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 71 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 93 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 39 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 72 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 72 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 71 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 32 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 32 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 32 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 79 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 45 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 55 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 94 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 38 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 71 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 87 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 39 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 72 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 72 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 62 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 72 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 32 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 32 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 32 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 80 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 94 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 71 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 93 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 39 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 72 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 72 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 72 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 32 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 32 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 32 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      /gpfs/users/a1796450/default_studio/motion-latent-diffusion/causal-conv1d/csrc/causal_conv1d_bwd.cu(84): warning #2912-D: constexpr if statements are a C++17 feature
      
      ninja: build stopped: subcommand failed.
      Traceback (most recent call last):
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1900, in _run_ninja_build
          subprocess.run(
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/subprocess.py", line 526, in run
          raise CalledProcessError(retcode, process.args,
      subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
      
      The above exception was the direct cause of the following exception:
      
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/gpfs/users/a1796450/default_studio/motion-latent-diffusion/causal-conv1d/setup.py", line 223, in <module>
          setup(
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/__init__.py", line 103, in setup
          return distutils.core.setup(**attrs)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 185, in setup
          return run_commands(dist)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
          dist.run_commands()
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
          self.run_command(cmd)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/dist.py", line 989, in run_command
          super().run_command(command)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/gpfs/users/a1796450/default_studio/motion-latent-diffusion/causal-conv1d/setup.py", line 198, in run
          return super().run()
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/wheel/bdist_wheel.py", line 364, in run
          self.run_command("build")
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
          self.distribution.run_command(command)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/dist.py", line 989, in run_command
          super().run_command(command)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/command/build.py", line 131, in run
          self.run_command(cmd_name)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
          self.distribution.run_command(command)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/dist.py", line 989, in run_command
          super().run_command(command)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 88, in run
          _build_ext.run(self)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
          self.build_extensions()
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 843, in build_extensions
          build_ext.build_extensions(self)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 467, in build_extensions
          self._build_extensions_serial()
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 493, in _build_extensions_serial
          self.build_extension(ext)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 249, in build_extension
          _build_ext.build_extension(self, ext)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 548, in build_extension
          objects = self.compiler.compile(
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 658, in unix_wrap_ninja_compile
          _write_ninja_file_and_compile_objects(
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1573, in _write_ninja_file_and_compile_objects
          _run_ninja_build(
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1916, in _run_ninja_build
          raise RuntimeError(message) from e
      RuntimeError: Error compiling objects for extension
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for causal-conv1d
  Running setup.py clean for causal-conv1d
Failed to build causal-conv1d
ERROR: Could not build wheels for causal-conv1d, which is required to install pyproject.toml-based projects

from mamba.

CYYJL avatar CYYJL commented on July 24, 2024

@duncanriach Hi, I used your approach to install causal-conv1d, but I ran into a new issue:
`Building wheels for collected packages: causal-conv1d
Building wheel for causal-conv1d (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [60 lines of output]

  torch.__version__  = 1.13.1+cu117


  running bdist_wheel
  /home/yjl/anaconda3/envs/VMUnet/lib/python3.8/site-packages/torch/utils/cpp_extension.py:476: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
    warnings.warn(msg.format('we could not find ninja.'))
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-3.8
  creating build/lib.linux-x86_64-3.8/causal_conv1d
  copying causal_conv1d/causal_conv1d_interface.py -> build/lib.linux-x86_64-3.8/causal_conv1d
  copying causal_conv1d/__init__.py -> build/lib.linux-x86_64-3.8/causal_conv1d
  running build_ext
  Traceback (most recent call last):
    File "<string>", line 2, in <module>
    File "<pip-setuptools-caller>", line 34, in <module>
    File "/home/yjl/causal-conv1d/setup.py", line 227, in <module>
      setup(
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/site-packages/setuptools/__init__.py", line 153, in setup
      return distutils.core.setup(**attrs)
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/distutils/core.py", line 148, in setup
      dist.run_commands()
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/distutils/dist.py", line 966, in run_commands
      self.run_command(cmd)
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/distutils/dist.py", line 985, in run_command
      cmd_obj.run()
    File "/home/yjl/causal-conv1d/setup.py", line 202, in run
      return super().run()
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/site-packages/wheel/bdist_wheel.py", line 364, in run
      self.run_command("build")
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/distutils/cmd.py", line 313, in run_command
      self.distribution.run_command(command)
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/distutils/dist.py", line 985, in run_command
      cmd_obj.run()
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/distutils/command/build.py", line 135, in run
      self.run_command(cmd_name)
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/distutils/cmd.py", line 313, in run_command
      self.distribution.run_command(command)
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/distutils/dist.py", line 985, in run_command
      cmd_obj.run()
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 79, in run
      _build_ext.run(self)
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/distutils/command/build_ext.py", line 340, in run
      self.build_extensions()
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 485, in build_extensions
      compiler_name, compiler_version = self._check_abi()
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 869, in _check_abi
      _, version = get_compiler_abi_compatibility_and_version(compiler)
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 336, in get_compiler_abi_compatibility_and_version
      if not check_compiler_ok_for_platform(compiler):
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 290, in check_compiler_ok_for_platform
      which = subprocess.check_output(['which', compiler], stderr=subprocess.STDOUT)
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/subprocess.py", line 415, in check_output
      return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/subprocess.py", line 516, in run
      raise CalledProcessError(retcode, process.args,
  subprocess.CalledProcessError: Command '['which', 'g++']' returned non-zero exit status 1.
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for causal-conv1d
Running setup.py clean for causal-conv1d
Failed to build causal-conv1d
ERROR: Could not build wheels for causal-conv1d, which is required to install pyproject.toml-based projects`
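
The last line of the traceback above shows `which g++` returning a non-zero exit status, meaning no C++ compiler was found on PATH. The earlier warning shows ninja is missing too, but that only causes a fallback to the slower distutils backend. A minimal sketch for checking this before retrying the build is below. The tool list and the `find_missing_tools` helper are illustrative assumptions, not part of causal-conv1d or PyTorch, and `nvcc` is included on the assumption that the extension's CUDA sources need it.

```python
import shutil

# Tools looked up in the log above (g++, ninja); nvcc is added as an assumption
# because the extension contains CUDA sources.
REQUIRED_TOOLS = ("g++", "nvcc", "ninja")


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names in `tools` that `which` cannot locate on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All build tools found.")
```

If `g++` is reported missing on Ubuntu, installing the `build-essential` package is the usual fix; ninja can typically be installed with `pip install ninja`.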

from mamba.

paaKways avatar paaKways commented on July 24, 2024

In my experience (Windows 11 with an Nvidia GPU), I didn't have CUDA installed, so I had to install it from here and also get the CUDA build of PyTorch. That basically fixed it.

from mamba.

CYYJL avatar CYYJL commented on July 24, 2024

You'd need this as well if you're on Windows openai/triton#1057 (comment)

It seems the triton package needs to be installed on Ubuntu, not Windows.
I tried to install it on Windows, but it failed.

from mamba.

ajie220209 avatar ajie220209 commented on July 24, 2024

CAUSAL_CONV1D_FORCE_BUILD=TRUE pip install .

I got an error when running the command above. The error was: ERROR: Could not build wheels for causal_conv1d, which is required to install pyproject.toml-based projects. What can I do to solve this problem?

from mamba.

CYYJL avatar CYYJL commented on July 24, 2024

In my experience (Windows 11 with an Nvidia GPU), I didn't have CUDA installed, so I had to install it from here and also get the CUDA build of PyTorch. That basically fixed it.

OK, thanks. I tried creating a new environment in Ubuntu and installing the packages as follows, after which causal-conv1d and mamba-ssm installed successfully:

conda install cudatoolkit==11.8 -c nvidia
pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118
conda install -c "nvidia/label/cuda-11.8.0" cuda-nvcc
conda install packaging
pip install causal-conv1d==1.1.1
pip install mamba-ssm

I followed your steps and still get the same error when installing causal-conv1d: ERROR: Could not build wheels for causal-conv1d, which is required to install pyproject.toml-based projects

Could you show the details of the error?
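
The recipe quoted above pins CUDA 11.8 both for PyTorch (the cu118 wheels) and for `cuda-nvcc`, which fits the advice earlier in this thread that `torch.version.cuda` and `nvcc --version` should report matching versions. A rough sketch of that check is below. The helper names `parse_nvcc_release` and `cuda_versions_match` are hypothetical, and comparing only major.minor is an assumption about what counts as "matching".

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Extract the 'major.minor' CUDA release from `nvcc --version` text."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def cuda_versions_match(torch_cuda, nvcc_release):
    """Compare major.minor only (assumption: patch level does not matter)."""
    if torch_cuda is None or nvcc_release is None:
        return False
    return torch_cuda.split(".")[:2] == nvcc_release.split(".")[:2]


if __name__ == "__main__":
    try:
        import torch
        torch_cuda = torch.version.cuda  # None for CPU-only builds
    except ImportError:
        torch_cuda = None

    nvcc_release = None
    if shutil.which("nvcc"):
        result = subprocess.run(["nvcc", "--version"], capture_output=True, text=True)
        nvcc_release = parse_nvcc_release(result.stdout)

    print("torch CUDA:", torch_cuda,
          "| nvcc:", nvcc_release,
          "| match:", cuda_versions_match(torch_cuda, nvcc_release))
```

If the two versions differ, or either one prints as `None`, the environment is probably worth fixing before retrying the build.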

from mamba.

ajie220209 avatar ajie220209 commented on July 24, 2024

In my experience (Windows 11 with an Nvidia GPU), I didn't have CUDA installed, so I had to install it from here and also get the CUDA build of PyTorch. That basically fixed it.

OK, thanks. I tried creating a new environment in Ubuntu and installing the packages as follows, after which causal-conv1d and mamba-ssm installed successfully:

conda install cudatoolkit==11.8 -c nvidia
pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118
conda install -c "nvidia/label/cuda-11.8.0" cuda-nvcc
conda install packaging
pip install causal-conv1d==1.1.1
pip install mamba-ssm

I followed your steps and still get the same error when installing causal-conv1d: ERROR: Could not build wheels for causal-conv1d, which is required to install pyproject.toml-based projects

Could you show the details of the error?

Building wheels for collected packages: mamba-ssm
Building wheel for mamba-ssm (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [132 lines of output]

  torch.__version__  = 2.1.1+cu118


  running bdist_wheel
  Guessing wheel URL:  https://github.com/state-spaces/mamba/releases/download/v1.2.0.post1/mamba_ssm-1.2.0.post1+cu118torch2.1cxx11abiFALSE-cp310-cp310-win_amd64.whl
  Precompiled wheel not found. Building from source...
  running build
  running build_py
  creating build
  creating build\lib.win-amd64-cpython-310
  creating build\lib.win-amd64-cpython-310\mamba_ssm
  copying mamba_ssm\__init__.py -> build\lib.win-amd64-cpython-310\mamba_ssm
  creating build\lib.win-amd64-cpython-310\mamba_ssm\models
  copying mamba_ssm\models\config_mamba.py -> build\lib.win-amd64-cpython-310\mamba_ssm\models
  copying mamba_ssm\models\mixer_seq_simple.py -> build\lib.win-amd64-cpython-310\mamba_ssm\models
  copying mamba_ssm\models\__init__.py -> build\lib.win-amd64-cpython-310\mamba_ssm\models
  creating build\lib.win-amd64-cpython-310\mamba_ssm\modules
  copying mamba_ssm\modules\mamba_simple.py -> build\lib.win-amd64-cpython-310\mamba_ssm\modules
  copying mamba_ssm\modules\__init__.py -> build\lib.win-amd64-cpython-310\mamba_ssm\modules
  creating build\lib.win-amd64-cpython-310\mamba_ssm\ops
  copying mamba_ssm\ops\selective_scan_interface.py -> build\lib.win-amd64-cpython-310\mamba_ssm\ops
  copying mamba_ssm\ops\__init__.py -> build\lib.win-amd64-cpython-310\mamba_ssm\ops
  creating build\lib.win-amd64-cpython-310\mamba_ssm\utils
  copying mamba_ssm\utils\generation.py -> build\lib.win-amd64-cpython-310\mamba_ssm\utils
  copying mamba_ssm\utils\hf.py -> build\lib.win-amd64-cpython-310\mamba_ssm\utils
  copying mamba_ssm\utils\__init__.py -> build\lib.win-amd64-cpython-310\mamba_ssm\utils
  creating build\lib.win-amd64-cpython-310\mamba_ssm\ops\triton
  copying mamba_ssm\ops\triton\layernorm.py -> build\lib.win-amd64-cpython-310\mamba_ssm\ops\triton
  copying mamba_ssm\ops\triton\selective_state_update.py -> build\lib.win-amd64-cpython-310\mamba_ssm\ops\triton
  copying mamba_ssm\ops\triton\__init__.py -> build\lib.win-amd64-cpython-310\mamba_ssm\ops\triton
  running build_ext
  D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\torch\utils\cpp_extension.py:383: UserWarning: Error checking compiler version for cl: [WinError 2] The system cannot find the file specified.
    warnings.warn(f'Error checking compiler version for {compiler}: {error}')
  D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\torch\utils\cpp_extension.py:414: UserWarning: The detected CUDA version (11.6) has a minor version mismatch with the version that was used to compile PyTorch (11.8). Most likely this shouldn't be a problem.
    warnings.warn(CUDA_MISMATCH_WARN.format(cuda_str_version, torch.version.cuda))
  building 'selective_scan_cuda' extension
  creating C:\Users\cwj\AppData\Local\Temp\pip-install-dzppbsyx\mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a\build\temp.win-amd64-cpython-310
  creating C:\Users\cwj\AppData\Local\Temp\pip-install-dzppbsyx\mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a\build\temp.win-amd64-cpython-310\Release
  creating C:\Users\cwj\AppData\Local\Temp\pip-install-dzppbsyx\mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a\build\temp.win-amd64-cpython-310\Release\csrc
  creating C:\Users\cwj\AppData\Local\Temp\pip-install-dzppbsyx\mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a\build\temp.win-amd64-cpython-310\Release\csrc\selective_scan
  Emitting ninja build file C:\Users\cwj\AppData\Local\Temp\pip-install-dzppbsyx\mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a\build\temp.win-amd64-cpython-310\Release\build.ninja...
  Compiling objects...
  Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
  ninja: error: 'C:/Users/cwj/AppData/Local/Temp/pip-install-dzppbsyx/mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a/csrc/selective_scan/selective_scan.cpp', needed by 'C:/Users/cwj/AppData/Local/Temp/pip-install-dzppbsyx/mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a/build/temp.win-amd64-cpython-310/Release/csrc/selective_scan/selective_scan.obj', missing and no known rule to make it
  Traceback (most recent call last):
    File "C:\Users\cwj\AppData\Local\Temp\pip-install-dzppbsyx\mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a\setup.py", line 214, in run
      urllib.request.urlretrieve(wheel_url, wheel_filename)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\urllib\request.py", line 241, in urlretrieve
      with contextlib.closing(urlopen(url, data)) as fp:
    File "D:\pythonVE\anaconda3\envs\dltf\lib\urllib\request.py", line 216, in urlopen
      return opener.open(url, data, timeout)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\urllib\request.py", line 525, in open
      response = meth(req, response)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\urllib\request.py", line 634, in http_response
      response = self.parent.error(
    File "D:\pythonVE\anaconda3\envs\dltf\lib\urllib\request.py", line 563, in error
      return self._call_chain(*args)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\urllib\request.py", line 496, in _call_chain
      result = func(*args)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\urllib\request.py", line 643, in http_error_default
      raise HTTPError(req.full_url, code, msg, hdrs, fp)
  urllib.error.HTTPError: HTTP Error 404: Not Found

  During handling of the above exception, another exception occurred:

  Traceback (most recent call last):
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\torch\utils\cpp_extension.py", line 2100, in _run_ninja_build
      subprocess.run(
    File "D:\pythonVE\anaconda3\envs\dltf\lib\subprocess.py", line 524, in run
      raise CalledProcessError(retcode, process.args,
  subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

  The above exception was the direct cause of the following exception:

  Traceback (most recent call last):
    File "<string>", line 2, in <module>
    File "<pip-setuptools-caller>", line 34, in <module>
    File "C:\Users\cwj\AppData\Local\Temp\pip-install-dzppbsyx\mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a\setup.py", line 234, in <module>
      setup(
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\__init__.py", line 103, in setup
      return distutils.core.setup(**attrs)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\core.py", line 185, in setup
      return run_commands(dist)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\core.py", line 201, in run_commands
      dist.run_commands()
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\dist.py", line 969, in run_commands
      self.run_command(cmd)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\dist.py", line 989, in run_command
      super().run_command(command)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\dist.py", line 988, in run_command
      cmd_obj.run()
    File "C:\Users\cwj\AppData\Local\Temp\pip-install-dzppbsyx\mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a\setup.py", line 231, in run
      super().run()
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\wheel\bdist_wheel.py", line 364, in run
      self.run_command("build")
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\cmd.py", line 318, in run_command
      self.distribution.run_command(command)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\dist.py", line 989, in run_command
      super().run_command(command)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\dist.py", line 988, in run_command
      cmd_obj.run()
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\command\build.py", line 131, in run
      self.run_command(cmd_name)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\cmd.py", line 318, in run_command
      self.distribution.run_command(command)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\dist.py", line 989, in run_command
      super().run_command(command)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\dist.py", line 988, in run_command
      cmd_obj.run()
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\command\build_ext.py", line 88, in run
      _build_ext.run(self)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 345, in run
      self.build_extensions()
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\torch\utils\cpp_extension.py", line 873, in build_extensions
      build_ext.build_extensions(self)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 467, in build_extensions
      self._build_extensions_serial()
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 493, in _build_extensions_serial
      self.build_extension(ext)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\command\build_ext.py", line 249, in build_extension
      _build_ext.build_extension(self, ext)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 548, in build_extension
      objects = self.compiler.compile(
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\torch\utils\cpp_extension.py", line 845, in win_wrap_ninja_compile
      _write_ninja_file_and_compile_objects(
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\torch\utils\cpp_extension.py", line 1774, in _write_ninja_file_and_compile_objects
      _run_ninja_build(
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\torch\utils\cpp_extension.py", line 2116, in _run_ninja_build
      raise RuntimeError(message) from e
  RuntimeError: Error compiling objects for extension
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for mamba-ssm
Running setup.py clean for mamba-ssm
Failed to build mamba-ssm
ERROR: Could not build wheels for mamba-ssm, which is required to install pyproject.toml-based projects

from mamba.
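
The 404 in the log above comes from the step where setup.py guesses a prebuilt wheel URL before falling back to a source build ("Precompiled wheel not found. Building from source..."). As a rough sketch, the filename at the end of that URL can be split into its tags to see exactly which combination was requested. The pattern below is inferred only from the two URLs quoted in this thread, and `parse_wheel_name` is a hypothetical helper, not part of mamba-ssm or causal-conv1d.

```python
import re

# Pattern inferred from the wheel URLs quoted in this thread (an assumption,
# not an official naming specification of either project).
WHEEL_RE = re.compile(
    r"(?P<package>[A-Za-z0-9_]+)-(?P<version>[^+]+)"
    r"\+cu(?P<cuda>\d+)torch(?P<torch>[\d.]+)cxx11abi(?P<cxx11abi>TRUE|FALSE)"
    r"-(?P<python>cp\d+)-(?P<abi>cp\d+)-(?P<platform>\w+)\.whl"
)


def parse_wheel_name(url_or_filename):
    """Split a guessed wheel URL or filename into its tags, or return None."""
    filename = url_or_filename.rsplit("/", 1)[-1]  # drop any URL prefix
    match = WHEEL_RE.fullmatch(filename)
    return match.groupdict() if match else None


url = (
    "https://github.com/state-spaces/mamba/releases/download/v1.2.0.post1/"
    "mamba_ssm-1.2.0.post1+cu118torch2.1cxx11abiFALSE-cp310-cp310-win_amd64.whl"
)
info = parse_wheel_name(url)
print(info["platform"])  # win_amd64
```

In the log above the platform tag is win_amd64, and the 404 means no file with that exact name was available at the guessed release URL, which is why the build continued from source.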

CYYJL avatar CYYJL commented on July 24, 2024

Did you install mamba on Windows? It is better to install mamba on a Linux system. You can try setting up an Ubuntu virtual machine on Windows and installing the mamba environment inside it.

from mamba.
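
The warning "Error checking compiler version for cl: [WinError 2]" in the log suggests that the `cl` compiler could not be found on PATH. A minimal sketch for checking which build tools are visible from the current environment is shown below; the tool list is an assumption based on the names that appear in this thread (nvcc, ninja, cl) plus gcc for Linux, and `check_build_tools` is a hypothetical helper.

```python
import platform
import shutil


def check_build_tools(tools=("nvcc", "ninja", "cl", "gcc")):
    """Return a mapping of tool name -> resolved path, or None if not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    print("OS:", platform.system())
    for tool, path in check_build_tools().items():
        print(f"{tool:6s} {path or 'NOT FOUND'}")
```

A tool reported as NOT FOUND here is not necessarily missing from the machine; it may only be absent from the PATH of the shell or conda environment that pip runs in.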

azxzxx avatar azxzxx commented on July 24, 2024

In my experience (Windows 11 with Nvidia GPU) I didn't have CUDA installed so had to install that from here as well as getting Pytorch CUDA versions and this basically fixed it

OK, thank you. I tried creating a new env in Ubuntu and installing the packages as follows:

conda install cudatoolkit==11.8 -c nvidia
pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118
conda install -c "nvidia/label/cuda-11.8.0" cuda-nvcc
conda install packaging
pip install causal-conv1d==1.1.1
pip install mamba-ssm

After that, causal-conv1d and mamba-ssm installed successfully.

This solved my problem perfectly, thanks!

from mamba.

zhixuanli avatar zhixuanli commented on July 24, 2024

In my experience (Windows 11 with Nvidia GPU) I didn't have CUDA installed so had to install that from here as well as getting Pytorch CUDA versions and this basically fixed it

OK, thank you. I tried creating a new env in Ubuntu and installing the packages as follows:

conda install cudatoolkit==11.8 -c nvidia
pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118
conda install -c "nvidia/label/cuda-11.8.0" cuda-nvcc
conda install packaging
pip install causal-conv1d==1.1.1
pip install mamba-ssm

After that, causal-conv1d and mamba-ssm installed successfully.

Thank you so much for your commands!

Based on yours, I used the following commands and they succeeded with no errors reported:

pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118
conda install packaging
pip install causal-conv1d==1.1.1
pip install mamba-ssm

from mamba.
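
After a sequence of installs like the one above, a quick check that the packages are actually visible to the interpreter can save a round of debugging. The sketch below only tests whether each module can be found, using the module names `causal_conv1d` and `mamba_ssm` that appear in the logs in this thread; `missing_modules` is a hypothetical helper.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found by the interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["torch", "causal_conv1d", "mamba_ssm"])
    if missing:
        print("Not installed in this environment:", ", ".join(missing))
    else:
        print("All three packages were found.")
```

Finding a module does not prove that its compiled extension loads. Problems such as the dynamic linking errors mentioned earlier in the thread only show up when running an actual `import mamba_ssm`.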

Lbaiall avatar Lbaiall commented on July 24, 2024

@evelynmitchell Just try installing in WSL. I encountered the same issue as you, but after switching to Linux it worked!

from mamba.

xiakexing-lmc avatar xiakexing-lmc commented on July 24, 2024

I also have this problem 😭
Have you solved the problem now?

from mamba.

Lbaiall avatar Lbaiall commented on July 24, 2024

@xiakexing-lmc Yes, I have, but I think the issue only occurs on Windows. Try an Ubuntu system.

from mamba.

BNUWUU avatar BNUWUU commented on July 24, 2024

Please help me solve this. I ran into the same problem and tried several things, but none of them worked.

"Building wheels for collected packages: causal-conv1d
Building wheel for causal-conv1d (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [8 lines of output]

  torch.__version__  = 2.1.1+cu121
  
  
  running bdist_wheel
  Guessing wheel URL:  https://github.com/Dao-AILab/causal-conv1d/releases/download/v1.1.1/causal_conv1d-1.1.1+cu122torch2.1cxx11abiFALSE-cp38-cp38-linux_x86_64.whl
  error: <urlopen error [Errno 110] Connection timed out>
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for causal-conv1d
Running setup.py clean for causal-conv1d
Failed to build causal-conv1d
ERROR: Could not build wheels for causal-conv1d, which is required to install pyproject.toml-based projects"

Here are the relevant versions in my environment:
python3 -c 'import torch; print(torch.version.cuda)' ---> 12.1
nvcc --version ---> 11.8
nvidia-smi ---> 12.0

Please give me some advice on how to solve this so I can run the code correctly. Thanks a lot!

from mamba.
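
The three numbers listed above come from different places: `torch.version.cuda` is the CUDA version PyTorch was built with, `nvcc --version` is the installed toolkit that compiles the extension, and the version shown by nvidia-smi is the highest CUDA version the driver supports rather than an installed toolkit. As a sketch of the comparison recommended at the top of the thread, the helpers below parse the `nvcc --version` text and compare it with the PyTorch value; both function names are hypothetical, and the sample text assumes the usual "release X.Y" line in nvcc's output.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Extract 'major.minor' from the text printed by `nvcc --version`, or None."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def cuda_versions_match(torch_cuda, nvcc_cuda, compare_minor=True):
    """Compare two 'major.minor' strings, optionally ignoring the minor part."""
    torch_parts = torch_cuda.split(".")[:2]
    nvcc_parts = nvcc_cuda.split(".")[:2]
    if compare_minor:
        return torch_parts == nvcc_parts
    return torch_parts[0] == nvcc_parts[0]


sample = "Cuda compilation tools, release 11.8, V11.8.89"
nvcc_cuda = parse_nvcc_release(sample)

# Values reported in the comment above: torch 12.1, nvcc 11.8.
print(cuda_versions_match("12.1", nvcc_cuda))  # False
```

With `compare_minor=False` the check only requires the major versions to agree, which corresponds to the "minor version mismatch" case that PyTorch merely warns about in the Windows log earlier in the thread.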

CYYJL avatar CYYJL commented on July 24, 2024

Hi, your download timed out. You can use the wheel URL to download the causal-conv1d package and then install it offline.

from mamba.
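
The offline route suggested above can be sketched as follows, assuming the wheel URL printed in the log ("Guessing wheel URL: ...") is reachable from some machine. `wheel_filename` and `download_wheel` are hypothetical helpers, and the 60 second timeout is an arbitrary choice.

```python
import os
import shutil
import urllib.request
from urllib.parse import urlparse


def wheel_filename(url):
    """Return the file name at the end of a wheel URL, unchanged."""
    return os.path.basename(urlparse(url).path)


def download_wheel(url, dest_dir=".", timeout=60):
    """Download the wheel at `url` into `dest_dir` and return the local path."""
    path = os.path.join(dest_dir, wheel_filename(url))
    with urllib.request.urlopen(url, timeout=timeout) as response, open(path, "wb") as out:
        shutil.copyfileobj(response, out)
    return path


# URL copied from the log quoted in this thread.
URL = (
    "https://github.com/Dao-AILab/causal-conv1d/releases/download/v1.1.1/"
    "causal_conv1d-1.1.1+cu122torch2.1cxx11abiFALSE-cp38-cp38-linux_x86_64.whl"
)
print(wheel_filename(URL))
```

Once the file is on the target machine, `pip install` followed by the path to the downloaded .whl file installs it without contacting GitHub. The file name should be kept exactly as downloaded, since pip reads the version and platform tags from it.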

jiaoaoshirenjinbu avatar jiaoaoshirenjinbu commented on July 24, 2024

CAUSAL_CONV1D_FORCE_BUILD=TRUE CAUSAL_CONV1D_SKIP_CUDA_BUILD=TRUE CAUSAL_CONV1D_FORCE_CXX11_ABI=TRUE pip install .

Success!

it works!

from mamba.
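
The `VAR=value command` prefix used in the commands in this thread is POSIX shell syntax and is not accepted by cmd.exe or PowerShell. As a shell-independent sketch, the same environment variables can be passed to pip from Python. The variable names are the ones quoted in this thread; `build_env`, `pip_install_cmd` and `run_forced_build` are hypothetical helpers.

```python
import os
import subprocess
import sys


def build_env(overrides, base=None):
    """Return a copy of the environment with the given variables set."""
    env = dict(os.environ if base is None else base)
    env.update(overrides)
    return env


def pip_install_cmd(target="."):
    """Build a pip command that uses the current interpreter."""
    return [sys.executable, "-m", "pip", "install", target]


def run_forced_build(source_dir="."):
    """Run `pip install .` in source_dir with the force-build variable set."""
    env = build_env({"CAUSAL_CONV1D_FORCE_BUILD": "TRUE"})
    subprocess.run(pip_install_cmd("."), cwd=source_dir, env=env, check=True)


# Example (not executed here): run_forced_build("causal-conv1d")
```

Using `sys.executable -m pip` keeps the install tied to the interpreter that runs the script, which helps when several conda environments are present.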
