
psa's Introduction

Polarized Self-Attention: Towards High-quality Pixel-wise Regression

This is an official implementation of:

Huajun Liu, Fuqiang Liu, Xinyi Fan and Dong Huang. Polarized Self-Attention: Towards High-quality Pixel-wise Regression. arXiv version


Citation:

@article{Liu2021PSA,
  title={Polarized Self-Attention: Towards High-quality Pixel-wise Regression},
  author={Huajun Liu and Fuqiang Liu and Xinyi Fan and Dong Huang},
  journal={arXiv preprint arXiv:2107.00782},
  year={2021}
}

Code and pre-trained models will be uploaded soon.

Top-down 2D pose estimation models pre-trained on the MS-COCO keypoint task (Table 4 in the arXiv version).

| Model Name | Backbone | Input Size | AP | pth file |
| --- | --- | --- | --- | --- |
| UDP-Pose-PSA(p) | HRNet-W48 | 256x192 | 78.9 | to be uploaded |
| UDP-Pose-PSA(p) | HRNet-W48 | 384x288 | 79.5 | to be uploaded |
| UDP-Pose-PSA(s) | HRNet-W48 | 384x288 | 79.4 | to be uploaded |

Setup and inference:

Semantic segmentation models pre-trained on Cityscapes (Table 5 in the arXiv version).

| Model Name | Backbone | val mIoU | pth file |
| --- | --- | --- | --- |
| HRNetV2-OCR+PSA(p) | HRNetV2-W48 | 86.95 | download |
| HRNetV2-OCR+PSA(s) | HRNetV2-W48 | 86.72 | download |

Setup and inference:

psa's People

Contributors

dghuanggh


psa's Issues

It seems that the implementations of Channel-only self-attention and Spatial-only self-attention are swapped.

Thanks for sharing your work!

```python
def spatial_pool(self, x):
    input_x = self.conv_v_right(x)
    batch, channel, height, width = input_x.size()
    # [N, IC, H*W]
    input_x = input_x.view(batch, channel, height * width)
    # [N, 1, H, W]
    context_mask = self.conv_q_right(x)
    # [N, 1, H*W]
    context_mask = context_mask.view(batch, 1, height * width)
    # [N, 1, H*W]
    context_mask = self.softmax_right(context_mask)
    # [N, IC, 1]
    # context = torch.einsum('ndw,new->nde', input_x, context_mask)
    context = torch.matmul(input_x, context_mask.transpose(1, 2))
    # [N, IC, 1, 1]
    context = context.unsqueeze(-1)
    # [N, OC, 1, 1]
    context = self.conv_up(context)
    # [N, OC, 1, 1]
    mask_ch = self.sigmoid(context)
    out = x * mask_ch
    return out
```


It seems that the spatial_pool function actually implements the Channel-only self-attention module.
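For comparison, here is a minimal sketch of the two branches as the paper describes them: the channel-only branch softmaxes a 1-channel map over H*W and pools values into a per-channel gate, while the spatial-only branch softmaxes a globally pooled query over channels and produces a 1-channel spatial gate. Class and variable names here are illustrative, not the repo's.

```python
import torch
import torch.nn as nn

class ChannelOnlyAttention(nn.Module):
    """Channel-only branch: a 1-channel spatial map, softmaxed over H*W,
    pools the value features into a per-channel descriptor."""
    def __init__(self, channels):
        super().__init__()
        inner = channels // 2
        self.conv_q = nn.Conv2d(channels, 1, kernel_size=1)      # -> [N, 1, H, W]
        self.conv_v = nn.Conv2d(channels, inner, kernel_size=1)  # -> [N, C/2, H, W]
        self.conv_up = nn.Conv2d(inner, channels, kernel_size=1)
        self.softmax = nn.Softmax(dim=2)

    def forward(self, x):
        n, c, h, w = x.shape
        v = self.conv_v(x).view(n, c // 2, h * w)                # [N, C/2, HW]
        q = self.softmax(self.conv_q(x).view(n, 1, h * w))       # [N, 1, HW]
        ctx = torch.matmul(v, q.transpose(1, 2)).unsqueeze(-1)   # [N, C/2, 1, 1]
        mask = torch.sigmoid(self.conv_up(ctx))                  # [N, C, 1, 1]
        return x * mask                                          # channel-wise gate

class SpatialOnlyAttention(nn.Module):
    """Spatial-only branch: a globally pooled query, softmaxed over channels,
    weights the value features into a 1-channel spatial map."""
    def __init__(self, channels):
        super().__init__()
        inner = channels // 2
        self.conv_q = nn.Conv2d(channels, inner, kernel_size=1)
        self.conv_v = nn.Conv2d(channels, inner, kernel_size=1)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.softmax = nn.Softmax(dim=2)

    def forward(self, x):
        n, c, h, w = x.shape
        v = self.conv_v(x).view(n, c // 2, h * w)                 # [N, C/2, HW]
        q = self.softmax(self.pool(self.conv_q(x)).view(n, 1, c // 2))
        mask = torch.sigmoid(torch.matmul(q, v)).view(n, 1, h, w) # [N, 1, H, W]
        return x * mask                                           # pixel-wise gate
```

Under this reading, the quoted spatial_pool matches the ChannelOnlyAttention pattern (its mask has shape [N, OC, 1, 1], not [N, 1, H, W]).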

The result of HRNet+OCR

I can't find the 84.9% mIoU val result in the OCR paper. Is this a result you reproduced yourselves? If so, can you describe your implementation details?

How to use PSA in human pose estimation?

It's an honor to see such excellent work.

I used hrnetv2.py from the network to train on the MPII dataset, but the results did not change. How do I use the PSA module?

Looking forward to your human pose estimation code. Thank you!

PSA module missing in bottleneck block

The PSA module that is defined here is only used in the basic block of HRNet, not in the bottleneck block. The paper, however, says:

For any baseline networks with the bottleneck or basic residual blocks, such as ResNet and HRnet, we add PSAs after the first 3x3 convolution in every residual blocks, respectively.
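Following that quoted sentence, the placement would look roughly like the sketch below: the attention module sits directly on the output of the first 3x3 convolution inside the bottleneck. The class name and the `attn_layer` factory argument are illustrative placeholders, not the repo's code.

```python
import torch
import torch.nn as nn

class BottleneckWithAttn(nn.Module):
    """ResNet-style bottleneck with an attention module (e.g. PSA)
    inserted right after the first 3x3 convolution, as the paper describes."""
    expansion = 4

    def __init__(self, in_ch, mid_ch, attn_layer):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, mid_ch, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(mid_ch)
        self.conv2 = nn.Conv2d(mid_ch, mid_ch, 3, padding=1, bias=False)  # the 3x3
        self.bn2 = nn.BatchNorm2d(mid_ch)
        self.attn = attn_layer(mid_ch)   # <-- PSA would go here
        self.conv3 = nn.Conv2d(mid_ch, mid_ch * self.expansion, 1, bias=False)
        self.bn3 = nn.BatchNorm2d(mid_ch * self.expansion)
        self.relu = nn.ReLU(inplace=True)
        self.down = (nn.Conv2d(in_ch, mid_ch * self.expansion, 1, bias=False)
                     if in_ch != mid_ch * self.expansion else nn.Identity())

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.relu(self.bn2(self.conv2(out)))
        out = self.attn(out)             # attention on the 3x3 output
        out = self.bn3(self.conv3(out))
        return self.relu(out + self.down(x))
```

Passing `lambda c: nn.Identity()` as `attn_layer` recovers a plain bottleneck, which makes it easy to A/B the module's effect.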

How to export the model as ONNX

Hello,

I'm interested in testing the HRNetV2-OCR+PSA(p) model with OpenCV, but I believe there is no way to load a .pth file directly. I'm looking into using PyTorch to load the network and then export it to ONNX, but it seems that I need the model definition to load the weights properly. Am I going in the right direction? Sorry if the question seems naive; I don't have much experience with ML, especially with PyTorch.

Thanks in advance
