non-local-sparse-attention's Introduction

About Me👋

🎵 Focusing on computer vision.

💬 Currently a PhD student at JHU. I obtained my bachelor's degree from UIUC.

💌 Alumni of UIUC-IFP and SHI Lab

Research

I have developed a series of self-attention operations for image restoration:

Cross-Scale Non-Local Attention (CVPR20) [Paper] [Code]

Non-local Sparse Attention (CVPR21) [Paper] [Code]

Pyramid Attention Networks (IJCV23) [Paper] [Code]

Please feel free to check them out.

non-local-sparse-attention's People

Contributors

harukiyqm, jeremyiv

non-local-sparse-attention's Issues

Meaning of "query" in the paper

Hello. Thanks for your great work.

I have a question on your paper.

I am not sure about the exact meaning of "query pixel", "query signal", "query location", etc.

I know that the term "query" comes indirectly from self-attention, but I'm not sure exactly what it means in the paper.

When I googled "query", the result was "a set of vectors you want to calculate attention for."

Is the term used in the same way in the paper? If so, how should it be interpreted there?

Thank you for reading.
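
As a rough illustration of the generic self-attention usage of the term (a sketch, not the paper's exact formulation): the query is the location whose output is being computed, and it is compared against every key location to obtain the weights used to average the values. The function name `attend` and the toy features are hypothetical, and raw feature vectors stand in for the learned query, key, and value embeddings.

```python
import math


def attend(query_index, features):
    """Attention output for one query location over all key locations.

    features: list of per-pixel feature vectors (lists of floats).
    Returns (weights, output), where weights[j] says how much the query
    pixel attends to pixel j.
    """
    q = features[query_index]  # the "query pixel" feature
    # Dot-product similarity between the query and every key location.
    scores = [sum(a * b for a, b in zip(q, k)) for k in features]
    # Numerically stable softmax over the key locations.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    # Weighted average of the value vectors (here: the same features).
    dim = len(q)
    output = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return weights, output


# Three toy "pixels"; pixel 0 is the query.
weights, output = attend(0, [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
```

In this toy case the query attends most strongly to itself and to the similar pixel 1, and least to the dissimilar pixel 2.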

How to test?

I have got the pre-trained models and followed the test command in demo.sh, but I got the following output:
D:\softwarezijianzhuangde\anaconda\envs\pytorch-1.9\python.exe "D:/data/experiments code/code/1/Non-Local-Sparse-Attention/src/main.py" --dir_data ../benchmarkdata/benchmark/benchmark --model NLSN --chunk_size 144 --data_test Set5+Set14+B100+Urban100 --n_hashes 4 --chop --save_results --rgb_range 1 --data_range 801-900 --scale 2 --n_feats 256 --n_resblocks 32 --res_scale 0.1 --pre_train ../experiment/test/model/model_x2.pt --test_only
Making model...
Loading model from ../experiment/test/model/model_x2.pt
Total params: 41.80M

Evaluation:
[Set5 x2] PSNR: nan (Best: nan @epoch 1)
[Set14 x2] PSNR: nan (Best: nan @epoch 1)
[B100 x2] PSNR: nan (Best: nan @epoch 1)
[Urban100 x2] PSNR: nan (Best: nan @epoch 1)
Forward: 0.00s

Saving...
Total: 0.00s

0it [00:00, ?it/s]
0it [00:00, ?it/s]
0it [00:00, ?it/s]
0it [00:00, ?it/s]

Process finished with exit code 0
Please help me.
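
The `0it` progress bars suggest that the test loaders found no images, in which case averaging PSNR over zero images would give nan. A minimal check, assuming the EDSR-style layout `<dir_data>/benchmark/<name>/HR` and `<dir_data>/benchmark/<name>/LR_bicubic/X<scale>` (an assumption; the repository's data loader is the authority), counts the files that would be found. The helper name `count_benchmark_images` is hypothetical.

```python
import os

IMAGE_EXTS = (".png", ".jpg", ".jpeg", ".bmp")


def count_benchmark_images(dir_data, name, scale):
    """Count HR and LR images under an assumed EDSR-style benchmark layout."""
    root = os.path.join(dir_data, "benchmark", name)
    hr_dir = os.path.join(root, "HR")
    lr_dir = os.path.join(root, "LR_bicubic", "X{}".format(scale))

    def count(path):
        # A missing directory counts as zero images instead of raising.
        if not os.path.isdir(path):
            return 0
        return sum(1 for f in os.listdir(path) if f.lower().endswith(IMAGE_EXTS))

    return count(hr_dir), count(lr_dir)
```

If both counts are 0 for the `--dir_data` value used in the command, the path may point one level too deep or too shallow relative to the layout the loader expects.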

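A question about the details of the bucket_score variable in the implementation

Following the paper's description and figures, I carefully debugged your code step by step. Your code is very well written!
I have a question about the bucket_score variable (your code is pasted below). It holds the correlation weights between different buckets, which after softmax normalization are represented by score and multiplied with the y_att_buckets matrix. I understand this step well.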
# unnormalized attention score
raw_score = torch.einsum('bhkie,bhkje->bhkij', x_att_buckets, x_match)  # [N, n_hashes, num_chunks, chunk_size, chunk_size*3]

# softmax
bucket_score = torch.logsumexp(raw_score, dim=-1, keepdim=True)
score = torch.exp(raw_score - bucket_score)  # (after softmax)
bucket_score = torch.reshape(bucket_score, [N, self.n_hashes, -1])

# attention
ret = torch.einsum('bukij,bukje->bukie', score, y_att_buckets)  # [N, n_hashes, num_chunks, chunk_size, C]
ret = torch.reshape(ret, (N, self.n_hashes, -1, C*self.reduction))

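What I mainly don't understand is the code that follows. The ret computed above is multi-round, and the multi-round dimension has to be merged to obtain the final output feature of size NCHW. Why is bucket_score softmax-normalized and then used for a weighted sum here? bucket_score is the "correlation weight between different buckets", yet it is reused for the weighted sum over the multi-round dimension (your code is pasted below), which feels odd to me.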
# recover the original order
ret = torch.reshape(ret, (N, -1, C*self.reduction))  # [N, n_hashes*H*W, C]
bucket_score = torch.reshape(bucket_score, (N, -1,))  # [N, n_hashes*H*W]
ret = batched_index_select(ret, undo_sort)  # [N, n_hashes*H*W, C]
bucket_score = bucket_score.gather(1, undo_sort)  # [N, n_hashes*H*W]

# weighted sum multi-round attention
ret = torch.reshape(ret, (N, self.n_hashes, L, C*self.reduction))  # [N, n_hashes, H*W, C]
bucket_score = torch.reshape(bucket_score, (N, self.n_hashes, L, 1))
probs = nn.functional.softmax(bucket_score, dim=1)
ret = torch.sum(ret * probs, dim=1)

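My personal view is that multi-round here is, to some extent, similar to multi-head in the Transformer. Following that design, wouldn't it be more reasonable to merge the multi-round and channel dimensions into multi-round*channel and then map back to channel with a 1*1 Conv?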
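
One way to read the weighted sum quoted in this issue, offered as a sketch and not as the author's stated rationale: because bucket_score is the logsumexp of each round's raw scores, taking a softmax over it and summing the per-round outputs is algebraically the same as a single softmax over the keys of all rounds together. The pure-Python check below uses scalar values and hypothetical function names to show the two computations agree.

```python
import math


def logsumexp(xs):
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def softmax(xs):
    lse = logsumexp(xs)
    return [math.exp(x - lse) for x in xs]


def per_round_then_merge(scores, values):
    """Softmax inside each round, then merge rounds with softmax(logsumexp)."""
    outs, lses = [], []
    for s, v in zip(scores, values):
        lse = logsumexp(s)                       # plays the role of bucket_score
        attn = [math.exp(x - lse) for x in s]    # plays the role of score
        outs.append(sum(a * y for a, y in zip(attn, v)))
        lses.append(lse)
    probs = softmax(lses)                        # weights across rounds
    return sum(p * o for p, o in zip(probs, outs))


def joint_softmax(scores, values):
    """One softmax over the keys of all rounds concatenated."""
    flat_s = [x for s in scores for x in s]
    flat_v = [y for v in values for y in v]
    attn = softmax(flat_s)
    return sum(a * y for a, y in zip(attn, flat_v))


# Two hash rounds, three keys each, for a single query (toy numbers).
scores = [[0.2, 1.5, -0.3], [2.0, 0.1, 0.4]]
values = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
merged = per_round_then_merge(scores, values)
joint = joint_softmax(scores, values)
```

The two results match up to floating-point error. Merging rounds through a 1*1 convolution, as proposed in the question, would instead learn the mixing weights, which is a different design choice.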

FLOPs

Hi, how can I calculate the FLOPs of NLSA? Is there any code for this in the repository?

Issue about the args.test_every

I noticed that the args.test_every in the code is used to indirectly control the number of times the dataset is reused in each epoch (the repeat in SRData class). It affects the total number of iterations. In order to reproduce the result of your paper, I want to know the value set for args.test_every in your experiments.
Thank you again and look forward to your reply.
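
A minimal sketch of the relationship described above, assuming the EDSR-style rule `repeat = max((batch_size * test_every) // n_images, 1)` and a data loader that keeps the last partial batch (both are assumptions about the SRData class and should be checked against the repository). The function name and the example values (800 training images, batch size 16, test_every 1000) are hypothetical.

```python
import math


def iterations_per_epoch(n_images, batch_size, test_every, n_train_sets=1):
    """Return (repeat, iterations per epoch) under the assumed repeat rule."""
    n_patches = batch_size * test_every
    # How many times each training image is reused within one epoch.
    repeat = max(n_patches // (n_train_sets * n_images), 1)
    samples_per_epoch = n_images * repeat
    return repeat, math.ceil(samples_per_epoch / batch_size)


repeat, iters = iterations_per_epoch(n_images=800, batch_size=16, test_every=1000)
```

Under these assumptions the number of iterations per epoch stays close to test_every, so halving the batch size halves the samples seen per epoch unless test_every is doubled, and 1000 epochs would correspond to roughly 1000 * test_every iterations.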

Training log

Hi, do you have a log of your training? I'd like to refer to it. Thanks.

Some problem about testing

Hi
I changed all parameters to the default parameters. When running main.py, there are some problems:

Making model...
Loading model from ../experiment/test/model/model_x2.pt
Total params: 44.16M
Evaluation:
0it [00:00, ?it/s] [Set5 x4] PSNR: nan (Best: nan @epoch 1)
0it [00:00, ?it/s] [Set14 x4] PSNR: nan (Best: nan @epoch 1)
0it [00:00, ?it/s] [B100 x4] PSNR: nan (Best: nan @epoch 1)
0it [00:00, ?it/s] [Urban100 x4] PSNR: nan (Best: nan @epoch 1)

How to solve this problem?
Thank you.

Reason behind using optimizer.get_lr() over optimizer.get_last_lr()?

Hi there,

I have been able to reproduce both of your experiments and have read your paper as well. I do not understand why, at the start of the train step, you do not use the last learning rate from the previous epoch, i.e. get_last_lr(), and instead opt for get_lr(), which returns a value that is only scaled by some gamma factor (the implementation of this function is here for your reference: https://github.com/pytorch/pytorch/blob/fde94e75568b527b424b108c272793e096e8e471/torch/optim/lr_scheduler.py#L344-L352).

The associated PyTorch warning is the following: lib/python3.8/site-packages/torch/optim/lr_scheduler.py:416: UserWarning: To get the last learning rate computed by the scheduler, please use get_last_lr().
warnings.warn("To get the last learning rate computed by the scheduler, "

Looking forward to hearing back from you soon and all the best,

Parsa Riahi

Computational complexity

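Hello, I have a question about the computational complexity of NLSA that I would like to ask you.
1. For the input feature X ∈ R^{n×c}, do n and c refer to the length and width of the input image?
2. "The sorting operation of a sequence with length n and m distinct numbers (bucket number) adds an additional O(nm) with quick sort." What does the length n refer to in this sentence?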

Computational complexity

I have some questions about the computational complexity.
1. For the input feature X ∈ R^{n×c}, do n and c refer to the length and width of the input image?
2. "The sorting operation of a sequence with length n and m distinct numbers (bucket number) adds an additional O(nm) with quick sort." What is the length n in this sentence?
Thank you @HarukiYqM

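Number of training iterations

Hello!
Under the experimental settings in the paper, what is the model's total number of iterations once training reaches 1000 epochs?
Looking forward to your reply!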

TypeError: conv2d() received an invalid combination of arguments

Traceback (most recent call last):
File "/home/sypark/mlvu_project/Non-Local-Sparse-Attention/src/main.py", line 36, in <module>
main()
File "/home/sypark/mlvu_project/Non-Local-Sparse-Attention/src/main.py", line 30, in main
t.train()
File "/home/sypark/mlvu_project/Non-Local-Sparse-Attention/src/trainer.py", line 48, in train
sr = self.model(lr, 0)
File "/home/sypark/anaconda/envs/mlvu_torch/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/sypark/mlvu_project/Non-Local-Sparse-Attention/src/model/__init__.py", line 54, in forward
return self.model(x)
File "/home/sypark/anaconda/envs/mlvu_torch/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/sypark/mlvu_project/Non-Local-Sparse-Attention/src/model/nlsn.py", line 63, in forward
x = self.tail(res)
File "/home/sypark/anaconda/envs/mlvu_torch/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/sypark/anaconda/envs/mlvu_torch/lib/python3.9/site-packages/torch/nn/modules/container.py", line 141, in forward
input = module(input)
File "/home/sypark/anaconda/envs/mlvu_torch/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/sypark/anaconda/envs/mlvu_torch/lib/python3.9/site-packages/torch/nn/modules/container.py", line 141, in forward
input = module(input)
File "/home/sypark/anaconda/envs/mlvu_torch/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/sypark/anaconda/envs/mlvu_torch/lib/python3.9/site-packages/torch/nn/modules/conv.py", line 446, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/home/sypark/anaconda/envs/mlvu_torch/lib/python3.9/site-packages/torch/nn/modules/conv.py", line 442, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
TypeError: conv2d() received an invalid combination of arguments - got (Tensor, Parameter, Parameter, tuple, tuple, tuple, int), but expected one of:

  • (Tensor input, Tensor weight, Tensor bias, tuple of ints stride, tuple of ints padding, tuple of ints dilation, int groups)
    didn't match because some of the arguments have invalid types: (Tensor, !Parameter!, !Parameter!, !tuple!, !tuple!, !tuple!, int)
  • (Tensor input, Tensor weight, Tensor bias, tuple of ints stride, str padding, tuple of ints dilation, int groups)
    didn't match because some of the arguments have invalid types: (Tensor, !Parameter!, !Parameter!, !tuple!, !tuple!, !tuple!, int)

I got this conv2d() "invalid combination of arguments" error while training on the DIV2K dataset. Can you help me with this? :'(

Parameters

Because of my GPU's memory limit, I want to set --batch_size to 8. How should I set --test_every and --epochs accordingly?
Thank you.

Cannot load dataset

run the code:
python main.py --dir_data ../ --model NLSN --chunk_size 144 --data_test Set5 --n_hashes 4 --chop --save_results --rgb_range 1 --data_range 0-6 --scale 2 --n_feats 256 --n_resblocks 32 --res_scale 0.1 --pre_train ./pre_model/model_x2.pt --test_only

Large deviation in test results on Urban100

Due to GPU memory limits, I set batch_size to 8 for training. Below are the test results at epoch 158:
Evaluation:
100%|██████████| 5/5 [00:04<00:00, 1.04it/s]
[Set5 x2] PSNR: 38.051 (Best: 38.051 @epoch 1)
100%|██████████| 14/14 [00:18<00:00, 1.30s/it]
[Set14 x2] PSNR: 33.845 (Best: 33.845 @epoch 1)
100%|██████████| 100/100 [01:33<00:00, 1.07it/s]
[B100 x2] PSNR: 32.246 (Best: 32.246 @epoch 1)
100%|██████████| 100/100 [06:51<00:00, 4.11s/it]
[Urban100 x2] PSNR: 32.461 (Best: 32.461 @epoch 1)

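The results on Set5, Set14, and B100 are as expected, but the result on Urban100 is nearly 1 dB lower than the paper's (33.42). All training parameters other than batch_size (set to 8) were kept at their defaults. What could cause such a sudden drop on Urban100?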

How to test my own images?

Hi, I want to test my own images, but it always fails. Can you tell me how to set the parameters? Thanks~

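x3 model

Hello.
To train the x3 and x4 models, do I only need to change the three parameters --scale, --patch_size, and --pre_train? Why does this problem occur after I set --pre_train to the fully trained x2 model, as shown in the screenshot?
[Screenshot from 2021-09-27 11-03-17]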

The number of NLSN operations?

Hi,
I set res_block=32, so the model should contain 5 NLSA modules. When I send an image to the model, it should go through 5 NLSA runs, but when I counted them, I found that NLSA runs 20 times. Why?
Is the model run 4 times for each image to calculate the average PSNR?

Thank you.
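
One possible explanation, offered as an assumption and not as a confirmed answer: if the model is run with `--chop`, an EDSR-style chopped forward pass splits each image into four overlapping quadrants and runs the network on each, so 5 NLSA modules would be called 4 * 5 = 20 times per image. The sketch below only illustrates the splitting; `chop_quadrants` and the `shave` value are hypothetical, and the repository's own chopping code may differ (for example, it may recurse on large patches).

```python
def chop_quadrants(h, w, shave=10):
    """Return (top, bottom, left, right) bounds of four overlapping quadrants."""
    h_size = h // 2 + shave  # each quadrant extends `shave` pixels past the middle
    w_size = w // 2 + shave
    return [
        (0, h_size, 0, w_size),          # top-left
        (0, h_size, w - w_size, w),      # top-right
        (h - h_size, h, 0, w_size),      # bottom-left
        (h - h_size, h, w - w_size, w),  # bottom-right
    ]


n_nlsa_modules = 5  # value stated in the question for res_block=32
quadrants = chop_quadrants(100, 80)
nlsa_calls_per_image = len(quadrants) * n_nlsa_modules
```

Running the same test without `--chop` and counting again would be a quick way to confirm or rule out this explanation.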

How do I run this .sh script in PyCharm?

How do I run this .sh script in PyCharm?
python main.py --dir_data ../../ --n_GPUs 4 --rgb_range 1 --chunk_size 144 --n_hashes 4 --save_models --lr 1e-4 --decay 200-400-600-800 --epochs 1000 --chop --save_results --n_resblocks 32 --n_feats 256 --res_scale 0.1 --batch_size 16 --model NLSN --scale 2 --patch_size 96 --save NLSN_x2 --data_train DIV2K

model code error

In src/model/common.py, the Upsampler should be changed as shown below.
In the conv calls, bias must be passed as a keyword argument (bias=bias); otherwise an error occurs.

m = []
if (scale & (scale - 1)) == 0:    # Is scale = 2^n?
    for _ in range(int(math.log(scale, 2))):
        m.append(conv(n_feats, 4 * n_feats, 3, bias=bias))  # should assign bias!
        m.append(nn.PixelShuffle(2))
        if bn:
            m.append(nn.BatchNorm2d(n_feats))
        if act == 'relu':
            m.append(nn.ReLU(True))
        elif act == 'prelu':
            m.append(nn.PReLU(n_feats))

elif scale == 3:
    m.append(conv(n_feats, 9 * n_feats, 3, bias=bias))  # should assign bias!
    m.append(nn.PixelShuffle(3))
    if bn:
        m.append(nn.BatchNorm2d(n_feats))
    if act == 'relu':
        m.append(nn.ReLU(True))
    elif act == 'prelu':
        m.append(nn.PReLU(n_feats))
else:
    raise NotImplementedError
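
A small sketch of why the keyword matters, assuming the repository's conv helper takes `stride` before `bias` (a hypothetical signature; the source here does not show it). The stand-in `default_conv` below only records how its arguments bind and does not build a layer.

```python
def default_conv(in_channels, out_channels, kernel_size, stride=1, bias=True):
    # Hypothetical stand-in: returns the arguments a conv layer would receive.
    return {
        "in_channels": in_channels,
        "out_channels": out_channels,
        "kernel_size": kernel_size,
        "padding": kernel_size // 2,
        "stride": stride,
        "bias": bias,
    }


# Positional call: the fourth argument binds to `stride`, not `bias`.
positional = default_conv(64, 256, 3, True)

# Keyword call: `stride` keeps its default and `bias` is set as intended.
keyword = default_conv(64, 256, 3, bias=True)
```

If the real helper is declared this way, the positional call would hand a boolean to the convolution as its stride, which may be what produces the conv2d() TypeError reported in the earlier issue.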

Cannot train

RuntimeError: CUDA out of memory. Tried to allocate 348.00 MiB (GPU 1; 31.72 GiB total capacity; 1008.44 MiB already allocated; 349.62 MiB free; 59.56 MiB cached)

Issue about the Evaluation Metrics

First of all, thank you for providing the code.
I noticed that PSNR and SSIM were used as metrics in the experiments of the paper. However, only the PSNR calculation is provided in the code (the calc_psnr function in utility.py). Calculating SSIM involves setting some parameters, which are not explained in detail in the paper or in this git repository. To stay consistent with the calculation used in your paper, could you provide the function that calculates SSIM (the cal_ssim function)?
Thanks for your kind attention. I look forward to your prompt reply.

Help with codes

Hi! Thank you for sharing this interesting work. Would you share the implementation of the non-local block using the local window? (i.e. the local window strategy mentioned in the ablation study)
