lingyeai / addernetcuda
An AdderNet CUDA version.
License: BSD 3-Clause "New" or "Revised" License
I tried to train ResNet-20 on the CIFAR-10 classification task, but when using adder_cuda the network has difficulty converging. I am therefore curious about the author's experimental results on CIFAR-10.
I encounter this problem: "NHoWo and Co should be the multiple of 16 or 12 at the same time."
This error occurs no matter which script I execute.
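The error message suggests the CUDA kernel tiles the output matrix, so both the row count N·Ho·Wo and the column count Co must share a common tile size of 16 or 12. A hypothetical helper (not part of the repo) that mirrors this constraint check:

```python
def check_adder_cuda_shapes(N, Ho, Wo, Co, tiles=(16, 12)):
    """Check the shape constraint reported by adder_cuda.

    The kernel appears to require that N*Ho*Wo (output rows) and Co
    (output channels) are both divisible by the same tile size, 16 or
    12. This is an illustrative re-expression of the error message,
    not the repo's actual validation code.
    """
    nhowo = N * Ho * Wo
    ok = any(nhowo % t == 0 and Co % t == 0 for t in tiles)
    if not ok:
        raise ValueError(
            "NHoWo and Co should be the multiple of 16 or 12 "
            "at the same time.")
    return ok
```

Under this reading, a batch/spatial configuration such as N=4, Ho=Wo=8 (rows = 256) with Co=32 passes, while N=1, Ho=Wo=5 (rows = 25) with Co=10 fails for both tile sizes, which would explain why the error appears regardless of which script is run.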
Hello author, thank you for open-sourcing this. It is an excellent piece of CUDA code and has helped me a lot.
I noticed the code exposes a depth_wise interface, but the CUDA code has no corresponding implementation, so if depth_wise=true is set, the computed results are wrong.
Do you plan to support depthwise adder convolution later?
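For reference, a depthwise adder "convolution" filters each input channel with its own kernel and, in the AdderNet formulation, replaces the dot product with a negative L1 distance. A naive NumPy sketch of the intended semantics (purely illustrative, not the repo's CUDA kernel, and far too slow for training):

```python
import numpy as np

def depthwise_adder2d(x, w, stride=1, padding=0):
    """Naive depthwise AdderNet layer.

    x: (C, H, W) input; w: (C, K, K), one kernel per channel.
    Each output value is -sum(|patch - kernel|), the AdderNet
    similarity, computed independently per channel.
    """
    C, H, W = x.shape
    _, K, _ = w.shape
    xp = np.pad(x, ((0, 0), (padding, padding), (padding, padding)))
    Ho = (H + 2 * padding - K) // stride + 1
    Wo = (W + 2 * padding - K) // stride + 1
    out = np.zeros((C, Ho, Wo))
    for c in range(C):
        for i in range(Ho):
            for j in range(Wo):
                patch = xp[c, i * stride:i * stride + K,
                              j * stride:j * stride + K]
                out[c, i, j] = -np.abs(patch - w[c]).sum()
    return out
```

A CPU fallback along these lines can at least be used to verify numerically what a future depthwise CUDA kernel should produce.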
This problem appears after every 8 batches. Changing the batch size does not help; it always fails after 8 batches.
The machine has 3 GPUs installed, with GPU_ID = 1 set.
Files already downloaded and verified
Files already downloaded and verified
Train - Epoch 1, Batch: 0, Loss: 2.296886, Time 5.307902
Train - Epoch 1, Batch: 1, Loss: 2.301040, Time 0.105161
Train - Epoch 1, Batch: 2, Loss: 2.300776, Time 0.110913
Train - Epoch 1, Batch: 3, Loss: 2.303986, Time 0.104652
Train - Epoch 1, Batch: 4, Loss: 2.289750, Time 0.100140
Train - Epoch 1, Batch: 5, Loss: 2.315252, Time 0.099318
Train - Epoch 1, Batch: 6, Loss: 2.298506, Time 0.106323
Train - Epoch 1, Batch: 7, Loss: 2.310294, Time 0.106855
Traceback (most recent call last):
File "/work/sunbiao/AdderNetCUDA-LingYeAI/main.py", line 146, in
main()
File "/work/sunbiao/AdderNetCUDA-LingYeAI/main.py", line 142, in main
train_and_test(e)
File "/work/sunbiao/AdderNetCUDA-LingYeAI/main.py", line 135, in train_and_test
train(epoch)
File "/work/sunbiao/AdderNetCUDA-LingYeAI/main.py", line 90, in train
output = net(images)
File "/home/nature/anaconda3/envs/addernet/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/work/sunbiao/AdderNetCUDA-LingYeAI/densenet.py", line 83, in forward
x = self.trans3(self.dense3(x))
File "/home/nature/anaconda3/envs/addernet/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/nature/anaconda3/envs/addernet/lib/python3.9/site-packages/torch/nn/modules/container.py", line 141, in forward
input = module(input)
File "/home/nature/anaconda3/envs/addernet/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/work/sunbiao/AdderNetCUDA-LingYeAI/densenet.py", line 17, in forward
y = self.conv1(func.relu(self.bn1(x)))
File "/home/nature/anaconda3/envs/addernet/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/work/sunbiao/AdderNetCUDA-LingYeAI/adder.py", line 104, in forward
output = adder2d_function(x, self.adder, self.stride, self.padding)
File "/work/sunbiao/AdderNetCUDA-LingYeAI/adder.py", line 39, in adder2d_function
out = out.permute(3, 0, 1, 2).contiguous()
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
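As the error message says, CUDA kernel launches are asynchronous, so the Python stack trace often points at a later, unrelated call (here the `permute(...).contiguous()` line) rather than the kernel that actually faulted. Forcing synchronous launches makes the failing kernel surface at its real call site; the variable must be set before CUDA is initialized, e.g. at the very top of `main.py`:

```python
import os

# Force synchronous CUDA kernel launches so the traceback points at the
# kernel that actually faulted. Must run before the first CUDA call
# (i.e. before importing/using torch.cuda).
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# Equivalently, from the shell:
#   CUDA_LAUNCH_BLOCKING=1 python main.py
```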
Hello, I ran your code and got a CUDA error: an illegal memory access was encountered. The detailed information is:
Traceback (most recent call last):
File "/home/new/classification-CNN/AdderNetCUDA-main/main.py", line 145, in
main()
File "/home/new/classification-CNN/AdderNetCUDA-main/main.py", line 141, in main
train_and_test(e)
File "/home/new/classification-CNN/AdderNetCUDA-main/main.py", line 134, in train_and_test
train(epoch)
File "/home/new/classification-CNN/AdderNetCUDA-main/main.py", line 101, in train
loss.backward()
File "/home/new/anaconda3/envs/pytorch38/lib/python3.8/site-packages/torch/tensor.py", line 245, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/home/new/anaconda3/envs/pytorch38/lib/python3.8/site-packages/torch/autograd/init.py", line 145, in backward
Variable._execution_engine.run_backward(
File "/home/new/anaconda3/envs/pytorch38/lib/python3.8/site-packages/torch/autograd/function.py", line 89, in apply
return self._forward_cls.backward(self, *args) # type: ignore
File "/home/new/classification-CNN/AdderNetCUDA-main/adder.py", line 78, in backward
grad_W_col = grad_W_col/grad_W_col.norm(p=2).clamp(min=1e-12)*math.sqrt(W_col.size(1)*W_col.size(0))/5
File "/home/new/anaconda3/envs/pytorch38/lib/python3.8/site-packages/torch/tensor.py", line 401, in norm
return torch.norm(self, p, dim, keepdim, dtype=dtype)
File "/home/new/anaconda3/envs/pytorch38/lib/python3.8/site-packages/torch/functional.py", line 1376, in norm
return _VF.norm(input, p, dim=_dim, keepdim=keepdim) # type: ignore
RuntimeError: CUDA error: an illegal memory access was encountered
Can you provide me with some solutions to this problem?
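For context, the line where this trace fails (`adder.py`, line 78) is AdderNet's adaptive gradient scaling: the weight gradient is normalized to unit L2 norm (clamped away from zero) and rescaled by sqrt(numel)/5. A NumPy re-expression of that one PyTorch line (illustrative only; the crash itself is an asynchronous CUDA fault surfacing here, not a bug in this formula):

```python
import math
import numpy as np

def rescale_weight_grad(grad_W_col):
    """NumPy version of AdderNet's weight-gradient rescaling:
    grad / max(||grad||_2, 1e-12) * sqrt(rows * cols) / 5.
    grad_W_col: 2-D array of shape (rows, cols)."""
    norm = max(np.linalg.norm(grad_W_col), 1e-12)
    rows, cols = grad_W_col.shape
    return grad_W_col / norm * math.sqrt(rows * cols) / 5
```

After this step the gradient's L2 norm is always sqrt(rows·cols)/5, regardless of its original magnitude, which is how AdderNet keeps adder-layer updates on a comparable scale to convolution layers.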