Comments (3)
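By the way, I have one more detail to ask about. The code contains the line hash_codes = hash_codes.detach()
Why is this line needed? detach() cuts off gradient propagation, so is the intent to stop gradients at this point?
I tried commenting this line out and the code still trains normally. Could you explain?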
from non-local-sparse-attention.
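Hi, the motivation for using softmax here comes from the randomness of the hash operation. This piece of code measures the affinity among the elements assigned to the same bucket in each round: if the elements in a bucket are more related, that round gets a larger weight. There is a visualization in the supplementary material that may help in understanding this. A 1x1 conv is learnable, and I think it should also work. The detach can probably be removed, since the hash codes have no gradient flowing back through them. Thanks for pointing this out.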
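To make the round-weighting idea above concrete, here is a minimal pure-Python sketch. It is not the repository's code: the function names combine_rounds and softmax, the per-round outputs, and the per-round affinity scores are all hypothetical values chosen for illustration. It only assumes that each hash round produces one output vector and one scalar affinity score for a given position.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_rounds(round_outputs, round_scores):
    """Weight each hash round's output by the softmax of its affinity score.

    round_outputs: one feature vector per round (hypothetical values).
    round_scores:  one scalar affinity score per round (hypothetical values).
    """
    weights = softmax(round_scores)
    dim = len(round_outputs[0])
    combined = [
        sum(w * out[d] for w, out in zip(weights, round_outputs))
        for d in range(dim)
    ]
    return combined, weights


# Three hash rounds for a single position. Round 0 grouped this position
# with more related elements, so its affinity score is the highest.
round_outputs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
round_scores = [2.0, 0.0, 0.0]

combined, weights = combine_rounds(round_outputs, round_scores)
print(weights)
print(combined)
```

In this sketch the round with the highest score dominates the combined output, while the other rounds still contribute a little, which softens the effect of an unlucky random hash in any single round.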
OK, thanks for the reply.