lingyeai / addernetcuda
An AdderNet CUDA version.
License: BSD 3-Clause "New" or "Revised" License
I tried to train ResNet-20 on the CIFAR-10 classification task, but when using adder_cuda the network has difficulty converging. I am therefore curious about the author's experimental results on CIFAR-10.
I encounter this problem: "NHoWo and Co should be the multiple of 16 or 12 at the same time."
This error occurs no matter which script I execute.
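The error message suggests the CUDA kernel tiles the output matrix, so both the row count N·Ho·Wo and the column count Co must share a common tile size of 16 or 12. A hypothetical helper (not part of the repo) that mirrors this constraint check:

```python
def check_adder_cuda_shapes(N, Ho, Wo, Co, tiles=(16, 12)):
    """Check the shape constraint reported by adder_cuda.

    The kernel appears to require that N*Ho*Wo (output rows) and Co
    (output channels) are both divisible by the same tile size, 16 or
    12. This is an illustrative re-expression of the error message,
    not the repo's actual validation code.
    """
    nhowo = N * Ho * Wo
    ok = any(nhowo % t == 0 and Co % t == 0 for t in tiles)
    if not ok:
        raise ValueError(
            "NHoWo and Co should be the multiple of 16 or 12 "
            "at the same time.")
    return ok
```

Under this reading, a batch/spatial configuration such as N=4, Ho=Wo=8 (rows = 256) with Co=32 passes, while N=1, Ho=Wo=5 (rows = 25) with Co=10 fails for both tile sizes, which would explain why the error appears regardless of which script is run.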
Hello author, thank you for open-sourcing this. It is an excellent piece of CUDA code and has helped me a lot.
I noticed the code exposes a depth_wise interface, but the CUDA code has no corresponding implementation, so if depth_wise=true is set, the computed results are wrong.
Do you plan to support depthwise adder convolution later?
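For reference, a depthwise adder "convolution" filters each input channel with its own kernel and, in the AdderNet formulation, replaces the dot product with a negative L1 distance. A naive NumPy sketch of the intended semantics (purely illustrative, not the repo's CUDA kernel, and far too slow for training):

```python
import numpy as np

def depthwise_adder2d(x, w, stride=1, padding=0):
    """Naive depthwise AdderNet layer.

    x: (C, H, W) input; w: (C, K, K), one kernel per channel.
    Each output value is -sum(|patch - kernel|), the AdderNet
    similarity, computed independently per channel.
    """
    C, H, W = x.shape
    _, K, _ = w.shape
    xp = np.pad(x, ((0, 0), (padding, padding), (padding, padding)))
    Ho = (H + 2 * padding - K) // stride + 1
    Wo = (W + 2 * padding - K) // stride + 1
    out = np.zeros((C, Ho, Wo))
    for c in range(C):
        for i in range(Ho):
            for j in range(Wo):
                patch = xp[c, i * stride:i * stride + K,
                              j * stride:j * stride + K]
                out[c, i, j] = -np.abs(patch - w[c]).sum()
    return out
```

A CPU fallback along these lines can at least be used to verify numerically what a future depthwise CUDA kernel should produce.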
This problem appears after every 8 batches. Changing the batch size does not help; it always fails after 8 batches.
The machine has 3 GPUs installed, with GPU_ID = 1 set.
Files already downloaded and verified
Files already downloaded and verified
Train - Epoch 1, Batch: 0, Loss: 2.296886, Time 5.307902
Train - Epoch 1, Batch: 1, Loss: 2.301040, Time 0.105161
Train - Epoch 1, Batch: 2, Loss: 2.300776, Time 0.110913
Train - Epoch 1, Batch: 3, Loss: 2.303986, Time 0.104652
Train - Epoch 1, Batch: 4, Loss: 2.289750, Time 0.100140
Train - Epoch 1, Batch: 5, Loss: 2.315252, Time 0.099318
Train - Epoch 1, Batch: 6, Loss: 2.298506, Time 0.106323
Train - Epoch 1, Batch: 7, Loss: 2.310294, Time 0.106855
Traceback (most recent call last):
File "/work/sunbiao/AdderNetCUDA-LingYeAI/main.py", line 146, in
main()
File "/work/sunbiao/AdderNetCUDA-LingYeAI/main.py", line 142, in main
train_and_test(e)
File "/work/sunbiao/AdderNetCUDA-LingYeAI/main.py", line 135, in train_and_test
train(epoch)
File "/work/sunbiao/AdderNetCUDA-LingYeAI/main.py", line 90, in train
output = net(images)
File "/home/nature/anaconda3/envs/addernet/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/work/sunbiao/AdderNetCUDA-LingYeAI/densenet.py", line 83, in forward
x = self.trans3(self.dense3(x))
File "/home/nature/anaconda3/envs/addernet/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/nature/anaconda3/envs/addernet/lib/python3.9/site-packages/torch/nn/modules/container.py", line 141, in forward
input = module(input)
File "/home/nature/anaconda3/envs/addernet/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/work/sunbiao/AdderNetCUDA-LingYeAI/densenet.py", line 17, in forward
y = self.conv1(func.relu(self.bn1(x)))
File "/home/nature/anaconda3/envs/addernet/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/work/sunbiao/AdderNetCUDA-LingYeAI/adder.py", line 104, in forward
output = adder2d_function(x, self.adder, self.stride, self.padding)
File "/work/sunbiao/AdderNetCUDA-LingYeAI/adder.py", line 39, in adder2d_function
out = out.permute(3, 0, 1, 2).contiguous()
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
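As the error message says, CUDA kernel launches are asynchronous, so the Python stack trace often points at a later, unrelated call (here the `permute(...).contiguous()` line) rather than the kernel that actually faulted. Forcing synchronous launches makes the failing kernel surface at its real call site; the variable must be set before CUDA is initialized, e.g. at the very top of `main.py`:

```python
import os

# Force synchronous CUDA kernel launches so the traceback points at the
# kernel that actually faulted. Must run before the first CUDA call
# (i.e. before importing/using torch.cuda).
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# Equivalently, from the shell:
#   CUDA_LAUNCH_BLOCKING=1 python main.py
```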
Hello, I ran your code and got a CUDA error: an illegal memory access was encountered. The detailed information is:
Traceback (most recent call last):
File "/home/new/classification-CNN/AdderNetCUDA-main/main.py", line 145, in
main()
File "/home/new/classification-CNN/AdderNetCUDA-main/main.py", line 141, in main
train_and_test(e)
File "/home/new/classification-CNN/AdderNetCUDA-main/main.py", line 134, in train_and_test
train(epoch)
File "/home/new/classification-CNN/AdderNetCUDA-main/main.py", line 101, in train
loss.backward()
File "/home/new/anaconda3/envs/pytorch38/lib/python3.8/site-packages/torch/tensor.py", line 245, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/home/new/anaconda3/envs/pytorch38/lib/python3.8/site-packages/torch/autograd/init.py", line 145, in backward
Variable._execution_engine.run_backward(
File "/home/new/anaconda3/envs/pytorch38/lib/python3.8/site-packages/torch/autograd/function.py", line 89, in apply
return self._forward_cls.backward(self, *args) # type: ignore
File "/home/new/classification-CNN/AdderNetCUDA-main/adder.py", line 78, in backward
grad_W_col = grad_W_col/grad_W_col.norm(p=2).clamp(min=1e-12)*math.sqrt(W_col.size(1)*W_col.size(0))/5
File "/home/new/anaconda3/envs/pytorch38/lib/python3.8/site-packages/torch/tensor.py", line 401, in norm
return torch.norm(self, p, dim, keepdim, dtype=dtype)
File "/home/new/anaconda3/envs/pytorch38/lib/python3.8/site-packages/torch/functional.py", line 1376, in norm
return _VF.norm(input, p, dim=_dim, keepdim=keepdim) # type: ignore
RuntimeError: CUDA error: an illegal memory access was encountered
Can you provide me with some solutions to this problem?
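For context, the line where this trace fails (`adder.py`, line 78) is AdderNet's adaptive gradient scaling: the weight gradient is normalized to unit L2 norm (clamped away from zero) and rescaled by sqrt(numel)/5. A NumPy re-expression of that one PyTorch line (illustrative only; the crash itself is an asynchronous CUDA fault surfacing here, not a bug in this formula):

```python
import math
import numpy as np

def rescale_weight_grad(grad_W_col):
    """NumPy version of AdderNet's weight-gradient rescaling:
    grad / max(||grad||_2, 1e-12) * sqrt(rows * cols) / 5.
    grad_W_col: 2-D array of shape (rows, cols)."""
    norm = max(np.linalg.norm(grad_W_col), 1e-12)
    rows, cols = grad_W_col.shape
    return grad_W_col / norm * math.sqrt(rows * cols) / 5
```

After this step the gradient's L2 norm is always sqrt(rows·cols)/5, regardless of its original magnitude, which is how AdderNet keeps adder-layer updates on a comparable scale to convolution layers.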