sczhou / stfan

[ICCV 2019] Spatio-Temporal Filter Adaptive Network for Video Deblurring

License: MIT License

Python 84.50% Shell 0.26% C++ 2.90% Cuda 12.34%

stfan's Issues

No module named 'kernelconv2d_cuda'

Hello, when I tried to run the code, I got the error No module named 'kernelconv2d_cuda'.
I guess the FAC code provided here had not been built and installed, so I tried running 'python setup.py install' in the FAC folder. However, that produced another error: 'cudaStream_t' has not been declared.
Is there a solution for these problems? Thanks!

Multi GPU

Did you train the network on a single GPU?
Best,

Hello, thank you for providing the code of the paper. When I tried to run it, I ran install.sh to install the module named 'kernelconv2d_cuda', but I got the following error. Can you give me some help? Thanks!

In file included from KernelConv2D_cuda.cpp:4:0:
KernelConv2D_kernel.h:10:2: error: 'cudaStream_t' has not been declared
cudaStream_t stream
^
KernelConv2D_kernel.h:20:5: error: 'cudaStream_t' has not been declared
cudaStream_t stream
^
KernelConv2D_cuda.cpp: In function 'int KernelConv2D_forward_cuda(at::Tensor&, at::Tensor&, int, at::Tensor&)':
KernelConv2D_cuda.cpp:18:29: error: 'class at::Context' has no member named 'getCurrentCUDAStream'
at::globalContext().getCurrentCUDAStream()
^
KernelConv2D_cuda.cpp: In function 'int KernelConv2D_backward_cuda(at::Tensor&, at::Tensor&, int, at::Tensor&, at::Tensor&, at::Tensor&)':
KernelConv2D_cuda.cpp:42:29: error: 'class at::Context' has no member named 'getCurrentCUDAStream'
at::globalContext().getCurrentCUDAStream()
^
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

Originally posted by @HCMSwang in #1 (comment)

No module named 'kernelconv2d_cuda'

onnx

Hi, when I export the model to ONNX format, I run into a problem:
File "/root/anaconda3/lib/python3.8/site-packages/torch/onnx/__init__.py", line 316, in export
return utils.export(model, args, f, export_params, verbose, training,
File "/root/anaconda3/lib/python3.8/site-packages/torch/onnx/utils.py", line 107, in export
_export(model, args, f, export_params, verbose, training, input_names, output_names,
File "/root/anaconda3/lib/python3.8/site-packages/torch/onnx/utils.py", line 737, in _export
proto, export_map, val_use_external_data_format = graph._export_onnx(
RuntimeError: ONNX export failed: Couldn't export Python operator KernelConv2DFunction

one problem in data_loaders.py

Hi,
in data_loaders.py line 83, the condition should probably be not sam_len % seq_len == 0 rather than not seq_len % seq_len == 0:

if not seq_len%seq_len == 0:
    sequence = self.get_files_of_taxonomy(phase, name, samples[-seq_len:])
    sequences.extend(sequence)
    seq_num += 1
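
To make the report concrete: seq_len % seq_len is always 0, so the quoted condition is always False and the leftover frames at the end of a video are never collected. The sketch below shows the intended chunking with the proposed condition. It is a standalone illustration, not the repository's loader; the function name split_into_sequences is hypothetical and sam_len is assumed to be len(samples).

```python
def split_into_sequences(samples, seq_len):
    """Split samples into windows of seq_len and cover any leftover tail."""
    sam_len = len(samples)  # assumed meaning of sam_len in the report
    # Full, non-overlapping windows.
    sequences = [samples[i:i + seq_len]
                 for i in range(0, sam_len - seq_len + 1, seq_len)]
    # Proposed fix: test the sample count, not seq_len against itself.
    if not sam_len % seq_len == 0:
        # One extra window taken from the end so the tail frames are included.
        sequences.append(samples[-seq_len:])
    return sequences


frames = list(range(7))
print(split_into_sequences(frames, 3))
```

With 7 frames and seq_len 3, the last window overlaps the previous one so that frame 6 is not dropped.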

runner.py: error:

runner.py: error: unrecognized arguments: test ./ckpt/best-ckpt.pth.tar

RuntimeError: CUDA call failed

Overview

I successfully installed all dependencies, but I get "RuntimeError: CUDA call failed" at the forward step when testing on the Deep Video Deblurring Dataset.

Expected behavior

No Traceback

Environment

Ubuntu 16.04
python 3.7
gcc 5.4.0
pytorch 1.0
CUDA 9.0
Device: TITAN RTX
conda list : as follows:

 Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main    defaults
argparse                  1.4.0                    pypi_0    pypi
blas                      1.0                         mkl    defaults
ca-certificates           2020.7.22                     0    defaults
certifi                   2020.6.20                py37_0    defaults
cffi                      1.14.2           py37he30daa8_0    defaults
cudatoolkit               9.0                  h13b8566_0    defaults
cycler                    0.10.0                   pypi_0    pypi
easydict                  1.9                      pypi_0    pypi
freetype                  2.10.2               h5ab3b9f_0    defaults
future                    0.18.2                   pypi_0    pypi
intel-openmp              2020.2                      254    defaults
jpeg                      9b                   h024ee3a_2    defaults
kiwisolver                1.2.0                    pypi_0    pypi
lcms2                     2.11                 h396b838_0    defaults
ld_impl_linux-64          2.33.1               h53a641e_7    defaults
libedit                   3.1.20191231         h14c3975_1    defaults
libffi                    3.3                  he6710b0_2    defaults
libgcc-ng                 9.1.0                hdf63c60_0    defaults
libpng                    1.6.37               hbc83047_0    defaults
libstdcxx-ng              9.1.0                hdf63c60_0    defaults
libtiff                   4.1.0                h2733197_1    defaults
lz4-c                     1.9.2                he6710b0_1    defaults
matplotlib                3.3.1                    pypi_0    pypi
mkl                       2020.2                      256    defaults
mkl-service               2.3.0            py37he904b0f_0    defaults
mkl_fft                   1.1.0            py37h23d657b_0    defaults
mkl_random                1.1.1            py37h0573a6f_0    defaults
ncurses                   6.2                  he6710b0_1    defaults
ninja                     1.10.1           py37hfd86e86_0    defaults
numpy                     1.19.1           py37hbc911f0_0    defaults
numpy-base                1.19.1           py37hfa32c7d_0    defaults
olefile                   0.46                     py37_0    defaults
opencv-python             4.4.0.42                 pypi_0    pypi
openexr                   1.3.2                    pypi_0    pypi
openssl                   1.1.1g               h7b6447c_0    defaults
pillow                    7.2.0            py37hb39fc2d_0    defaults
pip                       20.2.2                   py37_0    defaults
protobuf                  3.13.0                   pypi_0    pypi
pycparser                 2.20                       py_2    defaults
pyexr                     0.3.8                    pypi_0    pypi
pyparsing                 2.4.7                    pypi_0    pypi
python                    3.7.9                h7579374_0    defaults
python-dateutil           2.8.1                    pypi_0    pypi
pytorch                   1.0.1           py3.7_cuda9.0.176_cudnn7.4.2_2    pytorch
readline                  8.0                  h7b6447c_0    defaults
scipy                     1.5.2                    pypi_0    pypi
setuptools                49.6.0                   py37_0    defaults
six                       1.15.0                     py_0    defaults
sqlite                    3.33.0               h62c20be_0    defaults
tensorboardx              2.1                      pypi_0    pypi
tk                        8.6.10               hbc83047_0    defaults
torchvision               0.2.2                      py_3    pytorch
wheel                     0.35.1                     py_0    defaults
xz                        5.2.5                h7b6447c_0    defaults
zlib                      1.2.11               h7b6447c_3    defaults
zstd                      1.4.5                h9ceee32_0    defaults

Error in forward_cuda_kernel

Use config:
{'CONST': {'DEVICE': 'all',
           'NUM_WORKER': 1,
           'TEST_BATCH_SIZE': 1,
           'TRAIN_BATCH_SIZE': 1,
           'WEIGHTS': './ckpt/best-ckpt.pth.tar'},
 'DATA': {'COLOR_JITTER': [0.2, 0.15, 0.3, 0.1],
          'CROP_IMG_SIZE': [320, 448],
          'GAUSSIAN': [0, 0.0001],
          'MEAN': [0.0, 0.0, 0.0],
          'SEQ_LENGTH': 20,
          'STD': [255.0, 255.0, 255.0]},
 'DATASET': {'DATASET_NAME': 'VideoDeblur'},
 'DIR': {'DATASET_JSON_FILE_PATH': './datasets/VideoDeblur.json',
         'DATASET_ROOT': './datasets/DeepVideoDeblurring_Dataset/DeepVideoDeblurring',
         'IMAGE_BLUR_PATH': './datasets/DeepVideoDeblurring_Dataset/DeepVideoDeblurring/%s/%s/input/%s.jpg',
         'IMAGE_CLEAR_PATH': './datasets/DeepVideoDeblurring_Dataset/DeepVideoDeblurring/%s/%s/GT/%s.jpg',
         'OUT_PATH': './result'},
 'LOSS': {'MULTISCALE_WEIGHTS': [0.3, 0.3, 0.2, 0.1, 0.1]},
 'NETWORK': {'BATCHNORM': False,
             'DEBLURNETARCH': 'DeblurNet',
             'LEAKY_VALUE': 0.1,
             'PHASE': 'test'},
 'TEST': {'PRINT_FREQ': 5, 'VISUALIZATION_NUM': 10},
 'TRAIN': {'BETA': 0.999,
           'BIAS_DECAY': 0.0,
           'LEARNING_RATE': 0.0001,
           'LR_DECAY': 0.1,
           'LR_MILESTONES': [80, 160, 250],
           'MOMENTUM': 0.9,
           'NUM_EPOCHES': 400,
           'PRINT_FREQ': 10,
           'SAVE_FREQ': 10,
           'USE_PERCET_LOSS': True,
           'WEIGHT_DECAY': 0.0}}
CUDA DEVICES NUMBER: 8
[DEBUG] 2020-09-13 01:20:04.316014 Parameters in DeblurNet: 5372547.
[INFO] 2020-09-13 01:20:09.133808 Recovering from ./ckpt/best-ckpt.pth.tar ...
[INFO] 2020-09-13 01:20:09.171222 Recover complete. Current epoch #379, Best_Img_PSNR = 31.241976697921753 at epoch #378.
[INFO] Output_dir： ./result/2020-09-13T01:20:09.171343_DeblurNet/
[INFO] 2020-09-13 01:20:09.177601 Collecting files of Taxonomy [Name = 720p_240fps_2: 5]
[INFO] 2020-09-13 01:20:09.178387 Collecting files of Taxonomy [Name = IMG_0003: 5]
[INFO] 2020-09-13 01:20:09.179175 Collecting files of Taxonomy [Name = IMG_0021: 5]
[INFO] 2020-09-13 01:20:09.179935 Collecting files of Taxonomy [Name = IMG_0030: 5]
[INFO] 2020-09-13 01:20:09.181372 Collecting files of Taxonomy [Name = IMG_0031: 5]
[INFO] 2020-09-13 01:20:09.182820 Collecting files of Taxonomy [Name = IMG_0032: 5]
[INFO] 2020-09-13 01:20:09.184221 Collecting files of Taxonomy [Name = IMG_0033: 5]
[INFO] 2020-09-13 01:20:09.185616 Collecting files of Taxonomy [Name = IMG_0037: 5]
[INFO] 2020-09-13 01:20:09.186999 Collecting files of Taxonomy [Name = IMG_0039: 5]
[INFO] 2020-09-13 01:20:09.188434 Collecting files of Taxonomy [Name = IMG_0049: 5]
[INFO] 2020-09-13 01:20:09.188446 Complete collecting files of the dataset for TEST. Seq Number: 30.

error in forward_cuda_kernel: no kernel image is available for execution on the device
Traceback (most recent call last):
  File "runner.py", line 71, in <module>
    main()
  File "runner.py", line 67, in main
    bulid_net(cfg)
  File "/data1/wangpengxiao/STFAN/core/build.py", line 113, in bulid_net
    test(cfg, init_epoch, dataset_loader, test_transforms, deblurnet, test_writer)
  File "/data1/wangpengxiao/STFAN/core/test.py", line 84, in test
    output_img, output_fea = deblurnet(img_blur, last_img_blur, output_last_img, output_last_fea)
  File "/nvme/wangpengxiao/anaconda3/envs/STFAN/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/data1/wangpengxiao/STFAN/models/DeblurNet.py", line 109, in forward
    conv3_d_k = self.kconv_deblur(conv3_d, kernel_deblur)
  File "/nvme/wangpengxiao/anaconda3/envs/STFAN/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/data1/wangpengxiao/STFAN/models/FAC/kernelconv2d/KernelConv2D.py", line 87, in forward
    return KernelConv2DFunction.apply(input_pad, kernel, self.kernel_size)
  File "/data1/wangpengxiao/STFAN/models/FAC/kernelconv2d/KernelConv2D.py", line 37, in forward
    kernelconv2d_cuda.forward(input, kernel, intKernelSize, output)
RuntimeError: CUDA call failed (KernelConv2D_forward_cuda at KernelConv2D_cuda.cpp:23)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x45 (0x7fc6527a6cf5 in /nvme/wangpengxiao/anaconda3/envs/STFAN/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: KernelConv2D_forward_cuda(at::Tensor&, at::Tensor&, int, at::Tensor&) + 0xe8 (0x7fc62d22b428 in /nvme/wangpengxiao/.local/lib/python3.7/site-packages/kernelconv2d_cuda-1.0.0-py3.7-linux-x86_64.egg/kernelconv2d_cuda.cpython-37m-x86_64-linux-gnu.so)
frame #2: <unknown function> + 0x1443a (0x7fc62d23643a in /nvme/wangpengxiao/.local/lib/python3.7/site-packages/kernelconv2d_cuda-1.0.0-py3.7-linux-x86_64.egg/kernelconv2d_cuda.cpython-37m-x86_64-linux-gnu.so)
frame #3: <unknown function> + 0x1454e (0x7fc62d23654e in /nvme/wangpengxiao/.local/lib/python3.7/site-packages/kernelconv2d_cuda-1.0.0-py3.7-linux-x86_64.egg/kernelconv2d_cuda.cpython-37m-x86_64-linux-gnu.so)
frame #4: <unknown function> + 0x117d3 (0x7fc62d2337d3 in /nvme/wangpengxiao/.local/lib/python3.7/site-packages/kernelconv2d_cuda-1.0.0-py3.7-linux-x86_64.egg/kernelconv2d_cuda.cpython-37m-x86_64-linux-gnu.so)
<omitting python frames>
frame #9: THPFunction_apply(_object*, _object*) + 0x5a1 (0x7fc673bc6061 in /nvme/wangpengxiao/anaconda3/envs/STFAN/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #47: __libc_start_main + 0xf0 (0x7fc686a9c830 in /lib/x86_64-linux-gnu/libc.so.6)
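
One possible reading of this log, not confirmed in the thread: the TITAN RTX has compute capability 7.5 (Turing), and nvcc can only target that architecture from CUDA 10.0 onward, so an extension built with the CUDA 9.0 toolkit listed above cannot contain a native kernel image for this GPU. The sketch below encodes that check. The table and the helper name toolkit_supports are illustrative and not part of the STFAN code.

```python
# Minimum CUDA toolkit release (major, minor) whose nvcc can target
# each compute capability (major, minor). Illustrative subset only.
MIN_CUDA_FOR_CAPABILITY = {
    (6, 0): (8, 0),   # Pascal
    (6, 1): (8, 0),   # Pascal
    (7, 0): (9, 0),   # Volta
    (7, 5): (10, 0),  # Turing, e.g. TITAN RTX
    (8, 0): (11, 0),  # Ampere
    (8, 6): (11, 1),  # Ampere, e.g. RTX 3090
}


def toolkit_supports(capability, cuda_version):
    """Return True if the toolkit version can compile native code for the GPU."""
    required = MIN_CUDA_FOR_CAPABILITY.get(capability)
    if required is None:
        raise KeyError(f"capability {capability} is not in the illustrative table")
    return cuda_version >= required


# The environment reported above: TITAN RTX (7.5) with CUDA 9.0.
print(toolkit_supports((7, 5), (9, 0)))
```

If the check fails, rebuilding the extension with a newer toolkit and a matching PyTorch build would be the first thing to try.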

An error in DeblurNet.py

Hi, thank you for sharing your code! I think I found an error in DeblurNet.py.
I think line 174 in DeblurNet.py

STFAN/models/DeblurNet.py

Line 174 in 5429b86

conv_a_k = self.kconv_deblur(output_last_fea, kernel_warp)

should be changed to
conv_a_k = self.kconv_warp(output_last_fea, kernel_warp)

undefined symbol: _ZN2at19UndefinedTensorImpl10_singletonE

Hi, nice work!
I installed all dependencies and the build succeeded, but I get the error below:

 Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/qaz/newdisk/deblur_vsr/upload_1229/peer_model/LEDVDI/CODES/networks/Ours_DeblurOnly.py", line 3, in <module>
    from networks.FAC.kernelconv2d import KernelConv2D
  File "/home/qaz/newdisk/deblur_vsr/upload_1229/peer_model/LEDVDI/CODES/networks/FAC/kernelconv2d/KernelConv2D.py", line 8, in <module>
    import kernelconv2d_cuda
ImportError: /home/qaz/.local/lib/python3.7/site-packages/kernelconv2d_cuda-1.0.0-py3.7-linux-x86_64.egg/kernelconv2d_cuda.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN2at19UndefinedTensorImpl10_singletonE

My CUDA is 10.1, pytorch==1.1.0, gcc==5.0. I have also tried pytorch==1.0.1. Both give the same error. I hope you can assist me, thanks!
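
For readers decoding the message: the missing symbol is a mangled C++ name that reads as at::UndefinedTensorImpl::_singleton. A missing libtorch symbol at import time would be consistent with the extension having been compiled against a different PyTorch build than the one being imported, but that is an assumption and not something the thread confirms. The sketch below is a deliberately tiny demangler for plain nested names only; demangle_nested_name is a hypothetical helper, and a real tool such as c++filt handles the full mangling scheme.

```python
def demangle_nested_name(symbol):
    """Demangle a simple Itanium-ABI nested name of the form _ZN<len><id>...E."""
    if not (symbol.startswith("_ZN") and symbol.endswith("E")):
        raise ValueError("only plain _ZN...E nested names are handled")
    body = symbol[3:-1]
    parts = []
    pos = 0
    while pos < len(body):
        # Each component is a decimal length followed by that many characters.
        digits_end = pos
        while digits_end < len(body) and body[digits_end].isdigit():
            digits_end += 1
        if digits_end == pos:
            raise ValueError("expected a length prefix")
        length = int(body[pos:digits_end])
        parts.append(body[digits_end:digits_end + length])
        pos = digits_end + length
    return "::".join(parts)


print(demangle_nested_name("_ZN2at19UndefinedTensorImpl10_singletonE"))
```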

about SEQ_LENGTH

What is the function of "SEQ_LENGTH"? I set it to 2 and to 20, and it does not seem to make much difference.

RuntimeError: function KernelConv2DFunctionBackward, bug in KernelConv2D.py

In class KernelConv2DFunction, the number of values returned by backward() does not match the number of inputs to forward(). Changing the return statement of backward() from return grad_input, grad_kernel to return grad_input, grad_kernel, None fixes it.
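
The rule behind this report is that a torch.autograd.Function must return from backward() one value per argument of forward() (not counting ctx), using None for arguments that are not differentiable, such as an integer kernel size. The toy op below illustrates the rule. ScaledMul is a made-up example and is not the repository's KernelConv2DFunction.

```python
import torch


class ScaledMul(torch.autograd.Function):
    """Toy op whose forward takes two tensors and one plain int."""

    @staticmethod
    def forward(ctx, x, k, scale):
        ctx.save_for_backward(x, k)
        ctx.scale = scale
        return x * k * scale

    @staticmethod
    def backward(ctx, grad_output):
        x, k = ctx.saved_tensors
        grad_x = grad_output * k * ctx.scale
        grad_k = grad_output * x * ctx.scale
        # One return value per forward() argument after ctx:
        # x -> grad_x, k -> grad_k, scale (an int) -> None.
        return grad_x, grad_k, None


x = torch.ones(3, requires_grad=True)
k = torch.full((3,), 2.0, requires_grad=True)
ScaledMul.apply(x, k, 5).sum().backward()
print(x.grad, k.grad)
```

Returning only grad_x and grad_k here would make autograd complain about the number of gradients, which matches the error described above.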

visualization results

When I run python runner.py, I get events.out.tfevents.1666621441.smithyang-ThinkStation-P720. Could you tell me how to get the visualization results?

Question about the batch size.

Hi, thanks for sharing the fabulous work!

I wonder what batch size you used for training; the paper does not mention it.
Did you really use a batch size of 1 for training, as the config file suggests?

Thanks!

error in forward_cuda_kernel: no kernel image is available for execution on the device

Environment:

  Ubuntu 18.04
  python 3.7
  gcc 7.5.0
  pytorch 1.11.0
  CUDA 11.3
  Device: 3090

The FAC layer built successfully.
My code:

import os
import torch
import numpy as np
from models import DeblurNet

c_net = DeblurNet.DeblurNet()
c_net.to(torch.device("cuda"))

img_blur = torch.randn((1, 3, 256, 256)).cuda()
last_img_blur = torch.randn((1, 3, 256, 256)).cuda()
output_last_img = torch.randn((1, 3, 256, 256)).cuda()

aa = c_net.forward(img_blur, last_img_blur, output_last_img, None)
print(aa.shape)

ERROR info:

error in forward_cuda_kernel: no kernel image is available for execution on the device
Traceback (most recent call last):
  File "temp.py", line 15, in <module>
    aa = c_net.forward(img_blur, last_img_blur, output_last_img, output_last_fea)
  File "/home/users/zxzhao/projects/STFAN/models/DeblurNet.py", line 109, in forward
    conv3_d_k = self.kconv_deblur(conv3_d, kernel_deblur)
  File "/home/users/zxzhao/app/anaconda3/envs/LEDVDI_3090/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/users/zxzhao/projects/STFAN/models/FAC/kernelconv2d/KernelConv2D.py", line 87, in forward
    return KernelConv2DFunction.apply(input_pad, kernel, self.kernel_size)
  File "/home/users/zxzhao/projects/STFAN/models/FAC/kernelconv2d/KernelConv2D.py", line 37, in forward
    kernelconv2d_cuda.forward(input, kernel, intKernelSize, output)
RuntimeError: CUDA call failed

However, the same code runs normally on another machine (Device: 1660s, CUDA 10.2, pytorch 1.11.0). Could you help me?
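
Here the CUDA 11.3 toolkit can target the RTX 3090 (compute capability 8.6), so a plausible cause, stated as an assumption, is that the nvcc arguments used to build the FAC extension do not list that architecture while they do cover the 1660 Super (7.5). A minimal sketch of producing nvcc -gencode flags for a set of devices follows. gencode_flags is a hypothetical helper, and where such flags belong in the FAC setup.py is assumed, not checked.

```python
def gencode_flags(capabilities, with_ptx=True):
    """Build nvcc -gencode flags for (major, minor) compute capabilities."""
    flags = []
    newest = None
    for major, minor in sorted(set(capabilities)):
        newest = f"{major}{minor}"
        # Native machine code (SASS) for this exact architecture.
        flags += ["-gencode", f"arch=compute_{newest},code=sm_{newest}"]
    if with_ptx and newest is not None:
        # PTX for the newest architecture, so later GPUs can JIT-compile it.
        flags += ["-gencode", f"arch=compute_{newest},code=compute_{newest}"]
    return flags


# Cover both machines mentioned above: 1660 Super (7.5) and RTX 3090 (8.6).
print(gencode_flags([(7, 5), (8, 6)]))
```

On a machine with PyTorch and a GPU, torch.cuda.get_device_capability() returns the (major, minor) pair for the current device, which can be passed to this helper.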
