Comments (11)

hrbigelow avatar hrbigelow commented on August 27, 2024 13

@tridao (I am not sure if this is just a hack, but for us old guys with CCC < 7, can we do this?)

I see that the Quadro P5200 has CUDA Compute Capability 6.1. I saw the same error with my GeForce GTX 1070 (which is also Compute Capability 6.1).

I was able to fix it by compiling the causal-conv1d dependency from source, as follows:

git clone https://github.com/Dao-AILab/causal-conv1d.git
cd causal-conv1d
# this is the latest version that Mamba supports:
git checkout v1.0.2
# edit setup.py to add the lines here:
    cc_flag.append("-gencode")
    cc_flag.append("arch=compute_60,code=sm_60")

Here is where you need to add those lines.
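
For other cards the flag pair follows the same pattern. The following is only a minimal sketch: `gencode_flags` is a hypothetical helper (it is not part of either setup.py), and `cc_flag` is assumed to be the list that setup.py already builds.

```python
def gencode_flags(major: int, minor: int = 0) -> list[str]:
    """Return the nvcc flag pair for one compute capability.

    Hypothetical helper for illustration; in setup.py the two strings
    are simply appended by hand.
    """
    arch = f"{major}{minor}"
    return ["-gencode", f"arch=compute_{arch},code=sm_{arch}"]


# Reproduces the two lines added above (code for compute capability 6.0,
# which the thread reports working on 6.1 cards such as the GTX 1070).
cc_flag = []
cc_flag.extend(gencode_flags(6, 0))
print(cc_flag)
```

Calling `gencode_flags(6, 1)` would target 6.1 directly; whether that makes any practical difference for these kernels is not something this thread establishes.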

Then, compile it from source with:

CAUSAL_CONV1D_FORCE_BUILD=TRUE pip install .

You can use the following script to test whether it is working properly:

import torch
from causal_conv1d import causal_conv1d_fn

batch, dim, seq, width = 10, 5, 17, 4
x = torch.zeros((batch, dim, seq)).to('cuda')
weight = torch.zeros((dim, width)).to('cuda')
bias = torch.zeros((dim, )).to('cuda')

causal_conv1d_fn(x, weight, bias, None)

EDIT: Just realized the Mamba repo also assumes CCC >= 7. So, I did a similar edit to the mamba setup.py and compiled it with:

henry@henry-gs65:mamba$ MAMBA_FORCE_BUILD=TRUE pip install .

(This takes about 10 minutes to compile)

After doing this, the top-level Mamba demo works:

import torch

from mamba_ssm import Mamba

batch, length, dim = 2, 64, 16
x = torch.randn(batch, length, dim).to("cuda")
model = Mamba(
    # This module uses roughly 3 * expand * d_model^2 parameters
    d_model=dim, # Model dimension d_model
    d_state=16,  # SSM state expansion factor
    d_conv=4,    # Local convolution width
    expand=2,    # Block expansion factor
).to("cuda")
y = model(x)
assert y.shape == x.shape

from mamba.

hrbigelow avatar hrbigelow commented on August 27, 2024 3

Oops, sorry, I forgot a crucial thing: Mamba states that it requires causal_conv1d version <= 1.0.2. So you need to do a git checkout v1.0.2 before you do the pip install. From where you are now, I'd say it would be:

$ cd causal-conv1d
$ git checkout v1.0.2
# you've already edited the setup.py file I assume
$ pip uninstall causal-conv1d 
$ CAUSAL_CONV1D_FORCE_BUILD=TRUE pip install .

At this point, it may work ;) Since Mamba dynamically loads the causal-conv1d python module, no re-compilation of mamba is necessary. But I am not positive of that.
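
To see which causal-conv1d version actually ended up installed, something like the following might help. It is a rough sketch: the helper names are made up for illustration, the `<= 1.0.2` bound is taken from the comment above, and it assumes the package is registered under the distribution name `causal-conv1d` (the name used with pip uninstall above).

```python
from importlib.metadata import PackageNotFoundError, version

MAX_SUPPORTED = (1, 0, 2)  # bound quoted in this thread: Mamba wants <= 1.0.2


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '1.0.2' or '1.0.2.post1' into a tuple of its leading integers."""
    parts = []
    # Drop any local tag such as '+cu118' before splitting on dots.
    for piece in text.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_supported(text: str) -> bool:
    """True if the version is not newer than the bound quoted above."""
    return parse_version(text) <= MAX_SUPPORTED


try:
    installed = version("causal-conv1d")  # assumed distribution name
    print(installed, "ok" if is_supported(installed) else "too new")
except PackageNotFoundError:
    print("causal-conv1d is not installed in this environment")
```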


thistleknot avatar thistleknot commented on August 27, 2024 2
Processing /home/user/mamba/causal-conv1d
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [17 lines of output]
      Traceback (most recent call last):
        File "/home/user/lit-gpt/env/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/home/user/lit-gpt/env/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
        File "/home/user/lit-gpt/env/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
          return hook(config_settings)
        File "/tmp/pip-build-env-w4x0ekut/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 325, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=['wheel'])
        File "/tmp/pip-build-env-w4x0ekut/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 295, in _get_build_requires
          self.run_setup()
        File "/tmp/pip-build-env-w4x0ekut/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 480, in run_setup
          super(_BuildMetaLegacyBackend, self).run_setup(setup_script=setup_script)
        File "/tmp/pip-build-env-w4x0ekut/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 311, in run_setup
          exec(code, locals())
        File "<string>", line 9, in <module>
      ModuleNotFoundError: No module named 'packaging'
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a 

This happens despite installing python3-packaging and running pip install packaging (and I can confirm that import packaging works).
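
The `/tmp/pip-build-env-...` paths in the traceback suggest that setup.py ran inside pip's isolated build environment, which does not see packages installed in the active environment. One thing that may be worth trying is pip's `--no-build-isolation` flag. This is an untested sketch for this particular package: the helper names are hypothetical, and the environment variable is the one from the instructions earlier in the thread.

```python
import os
import subprocess
import sys


def build_command() -> list[str]:
    """pip invocation that reuses the current environment for the build."""
    return [sys.executable, "-m", "pip", "install", "--no-build-isolation", "."]


def build_env() -> dict[str, str]:
    """Copy of the current environment with the force-build switch set."""
    env = dict(os.environ)
    env["CAUSAL_CONV1D_FORCE_BUILD"] = "TRUE"
    return env


def run_build(checkout_dir: str) -> None:
    """Run the build from inside the causal-conv1d checkout."""
    subprocess.run(build_command(), env=build_env(), cwd=checkout_dir, check=True)


# Example (not executed here): run_build("causal-conv1d")
```

With build isolation turned off, everything setup.py imports (torch, packaging, wheel and so on) has to be installed in the active environment already.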


thistleknot avatar thistleknot commented on August 27, 2024 1

close

>>> y = model(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/user/miniconda3/envs/textgen/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/user/miniconda3/envs/textgen/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/miniconda3/envs/textgen/lib/python3.10/site-packages/mamba_ssm/modules/mamba_simple.py", line 149, in forward
    out = mamba_inner_fn(
  File "/home/user/miniconda3/envs/textgen/lib/python3.10/site-packages/mamba_ssm/ops/selective_scan_interface.py", line 306, in mamba_inner_fn
    return MambaInnerFn.apply(xz, conv1d_weight, conv1d_bias, x_proj_weight, delta_proj_weight,
  File "/home/user/miniconda3/envs/textgen/lib/python3.10/site-packages/torch/autograd/function.py", line 539, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/home/user/miniconda3/envs/textgen/lib/python3.10/site-packages/torch/cuda/amp/autocast_mode.py", line 113, in decorate_fwd
    return fwd(*args, **kwargs)
  File "/home/user/miniconda3/envs/textgen/lib/python3.10/site-packages/mamba_ssm/ops/selective_scan_interface.py", line 181, in forward
    conv1d_out = causal_conv1d_cuda.causal_conv1d_fwd(x, conv1d_weight, conv1d_bias, True)
TypeError: causal_conv1d_fwd(): incompatible function arguments. The following argument types are supported:
    1. (arg0: torch.Tensor, arg1: torch.Tensor, arg2: Optional[torch.Tensor], arg3: Optional[torch.Tensor], arg4: bool) -> torch.Tensor

Invoked with: tensor([[[-0.4806,  1.2685,  0.3929,  ...,  0.3327,  0.3938, -0.5350],
         [ 0.9421, -0.1715, -0.0481,  ..., -0.1955, -0.8604, -0.4096],
         [ 0.5454, -0.1034, -0.2881,  ...,  0.2157, -1.2089, -0.3394],
         ...,
         [ 0.3014,  0.2976, -0.3656,  ..., -0.4423, -0.8560, -0.3013],
         [-0.3690, -0.3119, -0.1994,  ..., -0.4742, -0.6223,  0.2423],
         [-0.7320,  1.4818,  0.6340,  ..., -0.4294,  0.2926, -0.0436]],

        [[ 0.4325, -0.4794,  0.4466,  ...,  0.1774,  0.8001, -0.0083],
         [-0.2831, -0.2780,  0.3027,  ...,  0.3467, -1.0696,  0.2190],
         [-0.7058,  0.7942, -0.5447,  ...,  0.5141, -0.9554, -0.0649],
         ...,
         [-0.7701,  0.9309, -0.6030,  ...,  0.2993, -0.0422, -0.1484],
         [ 0.5808,  0.4285, -0.5568,  ...,  1.3064, -1.0199, -0.3363],
         [ 0.0734,  0.0993,  0.6768,  ..., -0.1356,  0.9295, -0.1664]]],
       device='cuda:0', requires_grad=True), tensor([[-0.0555,  0.4169,  0.2594, -0.4943],
        [-0.0554,  0.0376,  0.1702,  0.4476],
        [-0.1875,  0.4470,  0.2299, -0.0788],
        [-0.2496,  0.4405, -0.0241,  0.0307],
        [ 0.2666, -0.2731, -0.1284, -0.3504],
        [ 0.2001,  0.1497,  0.2172,  0.1289],
        [ 0.3474,  0.3953,  0.2375,  0.0597],
        [ 0.0498,  0.1374, -0.0508, -0.1526],
        [-0.2388, -0.2890, -0.4515,  0.0008],
        [-0.2706, -0.4276, -0.4668,  0.4245],
        [ 0.0252,  0.0295, -0.4991,  0.2078],
        [ 0.2212,  0.3381, -0.3815,  0.1831],
        [-0.3029, -0.3729, -0.1333, -0.1371],
        [-0.3745,  0.0316, -0.1675,  0.0064],
        [ 0.4358,  0.4920, -0.4541, -0.0722],
        [ 0.2807, -0.1016, -0.4563, -0.3044],
        [ 0.1035,  0.0162,  0.4479,  0.3260],
        [-0.2877,  0.1106,  0.4981,  0.4084],
        [-0.3320, -0.3829, -0.1360,  0.3744],
        [-0.3771, -0.3639, -0.1163,  0.3709],
        [-0.2274, -0.4964, -0.0816,  0.4454],
        [ 0.1764, -0.0485,  0.3448, -0.4393],
        [-0.3905, -0.3605,  0.0623, -0.2038],
        [-0.2044, -0.1454, -0.1526, -0.4165],
        [-0.0414,  0.1940,  0.3441, -0.3418],
        [ 0.4200, -0.2309,  0.1998, -0.1196],
        [-0.4553,  0.1990,  0.4579,  0.1669],
        [-0.3292,  0.0408, -0.4167,  0.3332],
        [ 0.4237,  0.4848, -0.3006, -0.2292],
        [ 0.4939,  0.1801, -0.1294,  0.0011],
        [ 0.3516, -0.3912,  0.3251,  0.3016],
        [-0.0648, -0.0567, -0.3247,  0.4323]], device='cuda:0',
       requires_grad=True), Parameter containing:
tensor([-3.1444e-01,  4.3207e-02,  2.2112e-01, -3.4120e-01,  4.0195e-01,
        -1.4227e-01, -4.5976e-01, -3.6258e-04, -4.6205e-01,  1.7177e-01,
         4.6020e-01, -1.7618e-01,  2.0168e-01,  1.2738e-01,  2.8975e-01,
        -4.2130e-01, -2.3378e-01, -1.8998e-01, -9.5853e-02, -2.4321e-01,
        -1.0333e-02, -2.0879e-01,  1.2288e-01,  5.1831e-02, -4.9842e-02,
        -3.1233e-01,  1.4064e-01, -2.4546e-01,  3.0703e-01,  1.4846e-02,
         7.5587e-02, -3.6691e-01], device='cuda:0', requires_grad=True), True
>>>


tridao avatar tridao commented on August 27, 2024 1

Sorry I'm traveling this week but will have time to look into this next week.


thistleknot avatar thistleknot commented on August 27, 2024 1

yay, that did it
back in the game
=D


thistleknot avatar thistleknot commented on August 27, 2024

Oops, I did it out of order. Never mind: it still produced the same error after applying the same process to mamba's setup.py.

Fyi for us newbs

CCC stands for "CUDA Compute Capability," a numerical value that represents the features supported by a piece of CUDA (Compute Unified Device Architecture) hardware, typically a GPU. CUDA is a parallel computing platform and application programming interface (API) created by Nvidia. It allows software developers to use a CUDA-enabled graphics processing unit (GPU) for general-purpose processing (an approach known as GPGPU, General-Purpose computing on Graphics Processing Units).

The Compute Capability is a version number indicating the features supported by the GPU. Different versions of CUDA GPUs support different features and therefore have different Compute Capabilities. For example, the Quadro P5200 and GeForce GTX 1070 GPUs mentioned have a Compute Capability of 6.1. This version number is important for developers because they need to compile their programs for a specific Compute Capability to ensure compatibility and optimal performance on the target GPU.

When you modify a setup.py file of a Python package to include specific Compute Capability flags, you are instructing the compiler to generate code optimized for GPUs with that particular Compute Capability. This is often necessary when working with older GPUs or when the pre-compiled binaries of a library do not support the specific Compute Capability of your GPU.
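
Why code built for `sm_60` runs on a 6.1 card while builds that only target 7.0 and up do not can be sketched with the binary-compatibility rule from the CUDA documentation: compiled `sm_XY` code runs on devices with the same major version and an equal or higher minor version. The function name below is hypothetical, and the sketch deliberately ignores embedded PTX, which can be JIT-compiled for newer devices.

```python
def cubin_runs_on(built_for: tuple[int, int], device: tuple[int, int]) -> bool:
    """Binary-compatibility rule for sm_XY code (PTX JIT not considered)."""
    same_major = built_for[0] == device[0]
    return same_major and device[1] >= built_for[1]


gtx_1070 = (6, 1)  # compute capability mentioned in this thread
print(cubin_runs_on((6, 0), gtx_1070))  # sm_60 build on a 6.1 card
print(cubin_runs_on((7, 0), gtx_1070))  # sm_70 build on a 6.1 card
```

On a machine with PyTorch and a CUDA device, `torch.cuda.get_device_capability()` returns the `(major, minor)` pair that could be passed as `device`.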


thistleknot avatar thistleknot commented on August 27, 2024

btw, I had to do something similar to get ctransformers to work


hrbigelow avatar hrbigelow commented on August 27, 2024

(I edited the original instructions to reflect this just now)


thistleknot avatar thistleknot commented on August 27, 2024

nm

pip install wheel
python setup.py


StorywithLove avatar StorywithLove commented on August 27, 2024

Oh, God, I solved it! Love from P40 (CCC 6.1)!!!
