harukiyqm / non-local-sparse-attention

PyTorch code for our paper "Image Super-Resolution with Non-Local Sparse Attention" (CVPR2021).


non-local-sparse-attention's Introduction

Image Super-Resolution with Non-Local Sparse Attention

This repository is for NLSN introduced in the following paper "Image Super-Resolution with Non-Local Sparse Attention", CVPR2021, [Link]

The code is built on EDSR (PyTorch) and tested in an Ubuntu 18.04 environment (Python3.6, PyTorch >= 1.1.0) with V100 GPUs.

Introduction
Train
Test
Citation
Acknowledgements

Introduction

Both Non-Local (NL) operation and sparse representation are crucial for Single Image Super-Resolution (SISR). In this paper, we investigate their combinations and propose a novel Non-Local Sparse Attention (NLSA) with dynamic sparse attention pattern. NLSA is designed to retain long-range modeling capability from NL operation while enjoying robustness and high-efficiency of sparse representation. Specifically, NLSA rectifies non-local attention with spherical locality sensitive hashing (LSH) that partitions the input space into hash buckets of related features. For every query signal, NLSA assigns a bucket to it and only computes attention within the bucket. The resulting sparse attention prevents the model from attending to locations that are noisy and less-informative, while reducing the computational cost from quadratic to asymptotic linear with respect to the spatial size. Extensive experiments validate the effectiveness and efficiency of NLSA. With a few non-local sparse attention modules, our architecture, called non-local sparse network (NLSN), reaches state-of-the-art performance for SISR quantitatively and qualitatively.
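The bucketing step described above can be illustrated with a small sketch. It is not the repository's implementation: it uses plain Python instead of PyTorch, and the helper names (`make_projection`, `hash_bucket`) as well as the choice of a Gaussian random projection followed by an argmax are assumptions made for illustration.

```python
import math
import random

def make_projection(dim, n_buckets, seed=0):
    """Random Gaussian projection of shape (n_buckets, dim); hypothetical helper."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_buckets)]

def hash_bucket(vec, projection):
    """Project the unit-normalized feature and return the index of the largest response."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    unit = [v / norm for v in vec]
    scores = [sum(w * u for w, u in zip(row, unit)) for row in projection]
    return max(range(len(scores)), key=scores.__getitem__)

projection = make_projection(dim=4, n_buckets=8, seed=0)
features = [
    [1.0, 0.2, 0.0, 0.1],
    [2.0, 0.4, 0.0, 0.2],   # same direction as the first feature, different magnitude
    [-1.0, 0.0, 0.3, 0.9],
]
buckets = [hash_bucket(f, projection) for f in features]
print(buckets)
```

Because each feature is normalized onto the unit sphere before hashing, features that point in the same direction share a bucket regardless of magnitude. Attention is then computed only among features in the same bucket.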

Non-Local Sparse Attention.

Non-Local Sparse Network.

Train

Prepare training data

Download DIV2K training data (800 training + 100 validation images) from DIV2K dataset or SNU_CVLab.
Specify '--dir_data' based on the HR and LR images path.

For more information, please refer to EDSR (PyTorch).

Begin to train

(optional) Download pretrained models for our paper.

Pre-trained models can be downloaded from Google Drive

cd to 'src' and run the following script to train models.

Example command is in the file 'demo.sh'.

# Example X2 SR
python main.py --dir_data ../../ --n_GPUs 4 --rgb_range 1 --chunk_size 144 --n_hashes 4 --save_models --lr 1e-4 --decay 200-400-600-800 --epochs 1000 --chop --save_results --n_resblocks 32 --n_feats 256 --res_scale 0.1 --batch_size 16 --model NLSN --scale 2 --patch_size 96 --save NLSN_x2 --data_train DIV2K
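As a rough guide to what `--lr 1e-4 --decay 200-400-600-800` means over `--epochs 1000`, the sketch below computes a step schedule in plain Python. The decay factor `gamma=0.5` and the convention that the rate drops at the start of each milestone epoch are assumptions (they are not stated in this README); `parse_decay` and `lr_at_epoch` are hypothetical helper names.

```python
from bisect import bisect_right

def parse_decay(decay):
    """Turn a string such as '200-400-600-800' into a list of milestone epochs."""
    return [int(m) for m in decay.split('-')]

def lr_at_epoch(epoch, base_lr=1e-4, decay='200-400-600-800', gamma=0.5):
    """Step schedule: multiply base_lr by gamma once per milestone already reached.

    gamma=0.5 is an assumed default, not a value taken from this README.
    """
    milestones = parse_decay(decay)
    return base_lr * gamma ** bisect_right(milestones, epoch)

for epoch in (0, 199, 200, 400, 600, 800, 999):
    print(epoch, lr_at_epoch(epoch))
```

Under these assumptions the learning rate is halved four times during training, so the final 200 epochs run at one sixteenth of the initial rate.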

Test

Quick start

Download benchmark datasets from SNU_CVLab
(optional) Download pretrained models for our paper.

All the models can be downloaded from Google Drive

cd to 'src' and run the following scripts.

Example command is in the file 'demo.sh'.

# No self-ensemble: NLSN
# Example X2 SR
python main.py --dir_data ../../ --model NLSN  --chunk_size 144 --data_test Set5+Set14+B100+Urban100 --n_hashes 4 --chop --save_results --rgb_range 1 --data_range 801-900 --scale 2 --n_feats 256 --n_resblocks 32 --res_scale 0.1  --pre_train model_x2.pt --test_only

Citation

If you find the code helpful in your research or work, please cite the following papers.

@InProceedings{Mei_2021_CVPR,
    author    = {Mei, Yiqun and Fan, Yuchen and Zhou, Yuqian},
    title     = {Image Super-Resolution With Non-Local Sparse Attention},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {3517-3526}
}
@InProceedings{Lim_2017_CVPR_Workshops,
  author = {Lim, Bee and Son, Sanghyun and Kim, Heewon and Nah, Seungjun and Lee, Kyoung Mu},
  title = {Enhanced Deep Residual Networks for Single Image Super-Resolution},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
  month = {July},
  year = {2017}
}

Acknowledgements

This code is built on EDSR (PyTorch) and reformer-pytorch. We thank the authors for sharing their codes.

non-local-sparse-attention's Issues

model code error

In src/model/common.py, the Upsampler should be changed as shown below.
When calling conv, bias must be passed as a keyword argument (bias=bias); otherwise an error is raised.

m = []
if (scale & (scale - 1)) == 0:    # Is scale = 2^n?
    for _ in range(int(math.log(scale, 2))):
        m.append(conv(n_feats, 4 * n_feats, 3, bias=bias))  # should assign bias!
        m.append(nn.PixelShuffle(2))
        if bn:
            m.append(nn.BatchNorm2d(n_feats))
        if act == 'relu':
            m.append(nn.ReLU(True))
        elif act == 'prelu':
            m.append(nn.PReLU(n_feats))

elif scale == 3:
    m.append(conv(n_feats, 9 * n_feats, 3, bias=bias))  # should assign bias!
    m.append(nn.PixelShuffle(3))
    if bn:
        m.append(nn.BatchNorm2d(n_feats))
    if act == 'relu':
        m.append(nn.ReLU(True))
    elif act == 'prelu':
        m.append(nn.PReLU(n_feats))
else:
    raise NotImplementedError
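Why the keyword matters can be shown without PyTorch. The sketch below assumes the conv factory takes `stride` before `bias` in its signature; that ordering is an assumption about the repository, and `default_conv` here is a stand-in that only reports how its arguments were bound.

```python
def default_conv(in_channels, out_channels, kernel_size, stride=1, bias=True):
    """Stand-in for the conv factory (assumed signature); returns the bound arguments."""
    return {
        'in_channels': in_channels,
        'out_channels': out_channels,
        'kernel_size': kernel_size,
        'stride': stride,
        'bias': bias,
    }

n_feats, bias = 256, True

# Fourth positional argument is consumed by `stride`, not `bias`.
positional = default_conv(n_feats, 4 * n_feats, 3, bias)

# Passing by keyword binds the value to the intended parameter.
keyword = default_conv(n_feats, 4 * n_feats, 3, bias=bias)

print(positional['stride'], keyword['stride'])
```

With the positional call, a boolean ends up as the stride. That would be one plausible source of a type error once the layer is actually executed.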

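x3 model

Hello, Professor.
To train the x3 and x4 models, do I only need to change the three parameters --scale, --patch_size, and --pre_train? Why does this problem occur when I set --pre_train to the trained x2 model, as shown in the figure?

Is the paper essentially NLA from Non-local Neural Networks plus LSH attention from Reformer?

After reading the paper, the main contribution appears to be combining the NLA idea (Non-local Neural Networks) with the LSH attention (locality sensitive hashing attention) from Reformer. So it is really an "A+B" piece of work applied to the super-resolution field. I am not sure whether my understanding is correct.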

How to test my own images?

Hi, I want to test my own images, but it always fails. Can you tell me how to set the parameters? Thanks~

Issue about the Evaluation Metrics

First of all, thank the author for providing the code.
I noticed that PSNR and SSIM were used as metrics in the experiments of the paper. However, only the calculation function of PSNR is offered in the code (calc_psnr func in utility.py). Calculating SSIM involves setting some parameters, which are not explained in detail in the paper or this git repository. In order to be consistent with the calculation process of your paper, can you provide the function to calculate SSIM (the cal_ssim function)?
Thanks for your kind attention. I look forward to your prompt reply.

I get an image that is entirely white

log

Hi, do you have a log of your training? I'd like to refer to it. Thanks.

Reason behind using optimizer.get_lr() over optimizer.get_last_lr()?

Hi there,

I have been able to reproduce both your experiments and have read your paper as well and I do not understand why you do not use the last learning rate from the previous epoch, ie through get_last_lr(), in the start of the train step, but instead you opt for get_lr(), which returns a value that is only scaled by some gamma factor (implementation of this function is here for your reference: https://github.com/pytorch/pytorch/blob/fde94e75568b527b424b108c272793e096e8e471/torch/optim/lr_scheduler.py#L344-L352).

Associated Pytorch warning is the following: lib/python3.8/site-packages/torch/optim/lr_scheduler.py:416: UserWarning: To get the last learning rate computed by the scheduler, please use get_last_lr().
warnings.warn("To get the last learning rate computed by the scheduler, "

Looking forward to hearing back from you soon and all the best,

Parsa Riahi

The specific version of torch in the training phase

Thank you for your code. I have trouble in testing. Can you provide the specific version of torch? Thank you very much!

My GPU has a small memory

How can I free GPU memory during training without affecting the training run?

Cannot train

RuntimeError: CUDA out of memory. Tried to allocate 348.00 MiB (GPU 1; 31.72 GiB total capacity; 1008.44 MiB already allocated; 349.62 MiB free; 59.56 MiB cached)

train

flops

Hi, how can I calculate the FLOPs of NLSA? Is there any code for this in the repository?

Cannot load dataset

run the code:
python main.py --dir_data ../ --model NLSN --chunk_size 144 --data_test Set5 --n_hashes 4 --chop --save_results --rgb_range 1 --data_range 0-6 --scale 2 --n_feats 256 --n_resblocks 32 --res_scale 0.1 --pre_train ./pre_model/model_x2.pt --test_only

Some problem about testing

Hi
I changed all parameters to the default parameters. When running main.py, there are some problems:

Making model...
Loading model from ../experiment/test/model/model_x2.pt
Total params: 44.16M
Evaluation:
0it [00:00, ?it/s] [Set5 x4] PSNR: nan (Best: nan @epoch 1)
0it [00:00, ?it/s] [Set14 x4] PSNR: nan (Best: nan @epoch 1)
0it [00:00, ?it/s] [B100 x4] PSNR: nan (Best: nan @epoch 1)
0it [00:00, ?it/s] [Urban100 x4] PSNR: nan (Best: nan @epoch 1)

How to solve this problem?
Thank you.

parameter

I want to set --batch_size to 8 because of my GPU's limited memory. How should I set --test_every and --epochs accordingly?
Thank you.

Issue about the args.test_every

I noticed that the args.test_every in the code is used to indirectly control the number of times the dataset is reused in each epoch (the repeat in SRData class). It affects the total number of iterations. In order to reproduce the result of your paper, I want to know the value set for args.test_every in your experiments.
Thank you again and look forward to your reply.
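The relationship this issue describes can be sketched as follows. The formula mirrors how EDSR-style data loaders are commonly written (repeat the image list until one epoch yields roughly `test_every` batches); treat it as an assumption about this code base rather than a confirmed description. The value `test_every=1000` in the usage lines is likewise an assumed default, not the setting used in the paper.

```python
def iterations_per_epoch(n_images, batch_size, test_every):
    """Estimate batches per epoch when the image list is repeated to fill test_every batches.

    Assumed logic: repeat = max((batch_size * test_every) // n_images, 1).
    """
    n_patches = batch_size * test_every
    repeat = max(n_patches // n_images, 1)
    samples = n_images * repeat
    return -(-samples // batch_size)  # ceiling division: last partial batch is kept

# 800 DIV2K training images, as stated in the training section.
print(iterations_per_epoch(800, 16, 1000))
print(iterations_per_epoch(800, 8, 1000))
```

Under this assumption the number of iterations per epoch tracks `test_every` rather than the batch size, so lowering `--batch_size` alone changes how many images are seen per epoch, not how many updates are made.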

How to test?

I have the pre-trained models and followed the test command in demo.sh, but I got the following output.
D:\softwarezijianzhuangde\anaconda\envs\pytorch-1.9\python.exe "D:/data/experiments code/code/1/Non-Local-Sparse-Attention/src/main.py" --dir_data ../benchmarkdata/benchmark/benchmark --model NLSN --chunk_size 144 --data_test Set5+Set14+B100+Urban100 --n_hashes 4 --chop --save_results --rgb_range 1 --data_range 801-900 --scale 2 --n_feats 256 --n_resblocks 32 --res_scale 0.1 --pre_train ../experiment/test/model/model_x2.pt --test_only
Making model...
Loading model from ../experiment/test/model/model_x2.pt
Total params: 41.80M

Evaluation:
[Set5 x2] PSNR: nan (Best: nan @epoch 1)
[Set14 x2] PSNR: nan (Best: nan @epoch 1)
[B100 x2] PSNR: nan (Best: nan @epoch 1)
[Urban100 x2] PSNR: nan (Best: nan @epoch 1)
Forward: 0.00s

Saving...
Total: 0.00s

0it [00:00, ?it/s]
0it [00:00, ?it/s]
0it [00:00, ?it/s]
0it [00:00, ?it/s]

Process finished with exit code 0
please help me.
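An evaluation that reports `0it` and `PSNR: nan` usually means no test images were found. A quick way to check is to count the files the loader would look for. The directory layout used below (`<dir_data>/benchmark/<name>/HR` and `<dir_data>/benchmark/<name>/LR_bicubic/X<scale>/<stem>x<scale>.png`) is an assumption based on the EDSR-style benchmark convention, and `count_benchmark_images` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path

def count_benchmark_images(dir_data, name, scale, ext='.png'):
    """Return (number of HR images, list of expected LR files that are missing).

    The folder layout is an assumed EDSR-style convention.
    """
    root = Path(dir_data) / 'benchmark' / name
    hr_files = sorted((root / 'HR').glob('*' + ext))
    lr_dir = root / 'LR_bicubic' / 'X{}'.format(scale)
    expected_lr = [lr_dir / '{}x{}{}'.format(p.stem, scale, ext) for p in hr_files]
    missing = [p for p in expected_lr if not p.exists()]
    return len(hr_files), missing

if __name__ == '__main__':
    for name in ('Set5', 'Set14', 'B100', 'Urban100'):
        n_hr, missing = count_benchmark_images('../../', name, scale=2)
        print(name, 'HR images:', n_hr, 'missing LR files:', len(missing))
```

If every dataset reports zero HR images, `--dir_data` most likely points one level too deep or too shallow relative to the `benchmark` folder.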

Computational complexity

I have some questions about the computational complexity.
1. For the input feature X ∈ R^{n×c}, do n and c refer to the length and width of the input image?
2. "The sorting operation of a sequence with length n and m distinct numbers (bucket number) adds an additional O(nm) with quick sort." What is the length n in this sentence?
Thank you, @HarukiYqM.
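One common reading, offered here as an assumption rather than an author reply: n is the number of spatial positions of the feature map (height × width) and c is the number of channels, so the sequence being sorted is the list of n bucket indices. Under that reading, the sketch below compares how many query-key scores full non-local attention computes with how many a chunked scheme computes. The defaults `chunk_size=144` and `n_hashes=4` come from the commands in this README; the factor of 3 neighboring chunks follows the `chunk_size*3` shape comment in the code quoted in a later issue. The function names are hypothetical.

```python
def full_attention_pairs(height, width):
    """Full non-local attention scores every position against every position: n * n."""
    n = height * width
    return n * n

def chunked_attention_pairs(height, width, chunk_size=144, n_hashes=4, neighbors=3):
    """Each position attends to `neighbors` chunks of `chunk_size` keys, once per hash round."""
    n = height * width
    n_padded = -(-n // chunk_size) * chunk_size  # pad n up to a multiple of chunk_size
    return n_hashes * n_padded * chunk_size * neighbors

for size in (48, 96):
    print(size, full_attention_pairs(size, size), chunked_attention_pairs(size, size))
```

Doubling the side length multiplies the full count by 16 but the chunked count by only 4, which is the quadratic-versus-linear behavior the abstract refers to.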

Help with codes

Hi! Thank you for sharing this interesting work. Would you share the implementation of the non-local block using the local window? (i.e. the local window strategy mentioned in the ablation study)

How much memory of the GPUS was used for the test?

Hi! Thank you for sharing. Could you tell me how much GPU memory was used? I always get "RuntimeError: CUDA out of memory" when I test on a V100 or a 3090.

Test results on Urban100 deviate significantly

Due to GPU memory limits, batch_size was set to 8 for training. Below are the test results at epoch 158:
Evaluation:
100%|██████████| 5/5 [00:04<00:00, 1.04it/s]
[Set5 x2] PSNR: 38.051 (Best: 38.051 @epoch 1)
100%|██████████| 14/14 [00:18<00:00, 1.30s/it]
[Set14 x2] PSNR: 33.845 (Best: 33.845 @epoch 1)
100%|██████████| 100/100 [01:33<00:00, 1.07it/s]
[B100 x2] PSNR: 32.246 (Best: 32.246 @epoch 1)
100%|██████████| 100/100 [06:51<00:00, 4.11s/it]
[Urban100 x2] PSNR: 32.461 (Best: 32.461 @epoch 1)

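The results on Set5, Set14, and B100 are as expected, but the result on Urban100 is almost 1 dB lower than the paper's (33.42). All training parameters were left at their defaults except batch_size, which was set to 8. Why does the result drop so much on Urban100?

Number of training iterations

Hello!
Under the experimental settings in the paper, what is the total number of iterations once the model has been trained for 1000 epochs?
Looking forward to your reply!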

What does "common.MeanShift(args.rgb_range, rgb_mean, rgb_std, 1)" do?

Hi, can you explain the function of "common.MeanShift(args.rgb_range, rgb_mean, rgb_std, 1)"?
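A hedged sketch of what such a layer typically does in EDSR-style code: it is a fixed per-channel affine transform that subtracts the dataset's mean color before the network body (sign = -1) and adds it back afterwards (sign = 1, as in the call above). The mean values below and the formula `x / std + sign * rgb_range * mean / std` are assumptions based on that convention; `mean_shift` is a hypothetical per-pixel stand-in for the fixed 1×1 convolution.

```python
RGB_MEAN = (0.4488, 0.4371, 0.4040)  # assumed dataset mean in [0, 1]
RGB_STD = (1.0, 1.0, 1.0)

def mean_shift(pixel, rgb_range, rgb_mean=RGB_MEAN, rgb_std=RGB_STD, sign=-1):
    """Apply x / std + sign * rgb_range * mean / std to one RGB pixel."""
    return tuple(
        p / s + sign * rgb_range * m / s
        for p, m, s in zip(pixel, rgb_mean, rgb_std)
    )

pixel = (0.5, 0.5, 0.5)
centered = mean_shift(pixel, rgb_range=1, sign=-1)    # subtract the mean before the body
restored = mean_shift(centered, rgb_range=1, sign=1)  # add it back at the output
print(centered, restored)
```

With unit standard deviation the two calls are exact inverses, so the network body operates on roughly zero-centered inputs while the output stays in the original color range.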

The number of NLSN operations?

Hi,
I set res_block=32, so the model should contain 5 NLSA modules. When I send an image through the model, NLSA should run 5 times, but when I counted the calls I found that it runs 20 times. Why?
Is the model run 4 times per image to calculate the average PSNR?

Thank you.

TypeError: conv2d() received an invalid combination of arguments

Traceback (most recent call last):
File "/home/sypark/mlvu_project/Non-Local-Sparse-Attention/src/main.py", line 36, in <module>
main()
File "/home/sypark/mlvu_project/Non-Local-Sparse-Attention/src/main.py", line 30, in main
t.train()
File "/home/sypark/mlvu_project/Non-Local-Sparse-Attention/src/trainer.py", line 48, in train
sr = self.model(lr, 0)
File "/home/sypark/anaconda/envs/mlvu_torch/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/sypark/mlvu_project/Non-Local-Sparse-Attention/src/model/__init__.py", line 54, in forward
return self.model(x)
File "/home/sypark/anaconda/envs/mlvu_torch/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/sypark/mlvu_project/Non-Local-Sparse-Attention/src/model/nlsn.py", line 63, in forward
x = self.tail(res)
File "/home/sypark/anaconda/envs/mlvu_torch/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/sypark/anaconda/envs/mlvu_torch/lib/python3.9/site-packages/torch/nn/modules/container.py", line 141, in forward
input = module(input)
File "/home/sypark/anaconda/envs/mlvu_torch/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/sypark/anaconda/envs/mlvu_torch/lib/python3.9/site-packages/torch/nn/modules/container.py", line 141, in forward
input = module(input)
File "/home/sypark/anaconda/envs/mlvu_torch/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/sypark/anaconda/envs/mlvu_torch/lib/python3.9/site-packages/torch/nn/modules/conv.py", line 446, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/home/sypark/anaconda/envs/mlvu_torch/lib/python3.9/site-packages/torch/nn/modules/conv.py", line 442, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
TypeError: conv2d() received an invalid combination of arguments - got (Tensor, Parameter, Parameter, tuple, tuple, tuple, int), but expected one of:

(Tensor input, Tensor weight, Tensor bias, tuple of ints stride, tuple of ints padding, tuple of ints dilation, int groups)
didn't match because some of the arguments have invalid types: (Tensor, !Parameter!, !Parameter!, !tuple!, !tuple!, !tuple!, int)
(Tensor input, Tensor weight, Tensor bias, tuple of ints stride, str padding, tuple of ints dilation, int groups)
didn't match because some of the arguments have invalid types: (Tensor, !Parameter!, !Parameter!, !tuple!, !tuple!, !tuple!, int)

I got this conv2d() invalid combination of arguments issue while training DIV2K dataset. Can you help me with this?:'(

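A question about the implementation details of the bucket_score variable

Following the paper's description and figures, I carefully stepped through your code. It is very well written!
I have a question about the bucket_score variable (your code is pasted below). It holds the correlation weights between different buckets; after softmax normalization these are represented by score and multiplied with the y_att_buckets matrix. I understand this step.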
# unnormalized attention score
raw_score = torch.einsum('bhkie,bhkje->bhkij', x_att_buckets, x_match)  # [N, n_hashes, num_chunks, chunk_size, chunk_size*3]

# softmax
bucket_score = torch.logsumexp(raw_score, dim=-1, keepdim=True)
score = torch.exp(raw_score - bucket_score)  # (after softmax)
bucket_score = torch.reshape(bucket_score, [N, self.n_hashes, -1])

# attention
ret = torch.einsum('bukij,bukje->bukie', score, y_att_buckets)  # [N, n_hashes, num_chunks, chunk_size, C]
ret = torch.reshape(ret, (N, self.n_hashes, -1, C*self.reduction))

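What I mainly do not understand is the code that follows. The ret computed above is multi-round, and the multi-round dimension has to be merged to obtain the final output feature of size NCHW. Why is bucket_score softmax-normalized and then used for a weighted sum? bucket_score is the "correlation weight between different buckets", so using it again for the weighted sum over the multi-round dimension (code pasted below) feels odd.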
# recover the original order
ret = torch.reshape(ret, (N, -1, C*self.reduction))  # [N, n_hashes*H*W, C]
bucket_score = torch.reshape(bucket_score, (N, -1,))  # [N, n_hashes*H*W]
ret = batched_index_select(ret, undo_sort)  # [N, n_hashes*H*W, C]
bucket_score = bucket_score.gather(1, undo_sort)  # [N, n_hashes*H*W]

# weighted sum multi-round attention
ret = torch.reshape(ret, (N, self.n_hashes, L, C*self.reduction))  # [N, n_hashes, L, C]
bucket_score = torch.reshape(bucket_score, (N, self.n_hashes, L, 1))
probs = nn.functional.softmax(bucket_score, dim=1)
ret = torch.sum(ret * probs, dim=1)

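My own view is that multi-round here is somewhat similar to multi-head in the Transformer. Following that design, wouldn't it be more reasonable to merge the multi-round and channel dimensions into multi-round*channel and then map back to channel with a 1×1 Conv?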
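One way to read the logsumexp weighting in the quoted code, offered as an interpretation and not as an author reply: weighting each round's output by a softmax over the per-round logsumexp values gives the same result as a single softmax over the scores of all rounds taken together. The scalar sketch below checks this numerically in plain Python; the function names and the toy numbers are hypothetical.

```python
import math

def logsumexp(xs):
    """Numerically stable log(sum(exp(x)))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def softmax(xs):
    lse = logsumexp(xs)
    return [math.exp(x - lse) for x in xs]

def fuse_rounds(scores_per_round, values_per_round):
    """Per-round softmax attention, then a softmax over per-round logsumexp as weights."""
    outputs, lses = [], []
    for scores, values in zip(scores_per_round, values_per_round):
        lse = logsumexp(scores)
        outputs.append(sum(math.exp(s - lse) * v for s, v in zip(scores, values)))
        lses.append(lse)
    probs = softmax(lses)
    return sum(p * o for p, o in zip(probs, outputs))

def joint_attention(scores_per_round, values_per_round):
    """One softmax over the concatenated scores of every round."""
    scores = [s for r in scores_per_round for s in r]
    values = [v for r in values_per_round for v in r]
    return sum(w * v for w, v in zip(softmax(scores), values))

scores = [[0.3, -1.2, 2.0], [1.5, 0.1, -0.7]]   # two hash rounds, three keys each
values = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
print(fuse_rounds(scores, values), joint_attention(scores, values))
```

Under this reading, a round whose bucket contains strongly matching keys gets a larger weight. A key that appears in several rounds is counted once per round, which is one way this differs from concatenating rounds and projecting with a 1×1 convolution.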

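Computational complexity

Hello, Professor. I have a question about the computational complexity of NLSA.
1. For the input feature X ∈ R^{n×c}, do n and c refer to the length and width of the input image?
2. "The sorting operation of a sequence with length n and m distinct numbers (bucket number) adds an additional O(nm) with quick sort." What does the length n refer to in this sentence?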

How do I run this .sh script in PyCharm?

How do I run this .sh script in PyCharm?
python main.py --dir_data ../../ --n_GPUs 4 --rgb_range 1 --chunk_size 144 --n_hashes 4 --save_models --lr 1e-4 --decay 200-400-600-800 --epochs 1000 --chop --save_results --n_resblocks 32 --n_feats 256 --res_scale 0.1 --batch_size 16 --model NLSN --scale 2 --patch_size 96 --save NLSN_x2 --data_train DIV2K

Which parameter should be set to continue training after an interruption?

I stopped training. Which parameter should I set to resume training from where I left off?

Do you need to use the x2 model, as in EDSR, to train the x3 and x4 models?

Hello, do you need to use the x2 model, as in EDSR, to train the x3 and x4 models?

Meaning of "query" in the paper

Hello. Thanks for your great work.

I have a question on your paper.

I am not sure about the exact meaning of "query pixel", "query signal", "query location", etc.

I know that the "query" expression is indirectly derived from self-attention, but I'm not sure exactly what it means in the paper.

When I googled the query, the result is "a set of vectors you want to calculate attention for."

Is the expression used in this and the paper similar? Then how should this be interpreted for the paper?

Thank you for reading.