
imdn's Introduction

IMDN

Lightweight Image Super-Resolution with Information Multi-distillation Network (ACM MM 2019)

[arXiv] [Poster] [ACM DL]

✨ News

  • Nov 26, 2021: Added the IMDN_RTC TFLite model.

IMDN+ won second runner-up in the NTIRE 2022 Efficient SR Challenge (Sub-Track 2, Overall Performance Track).


IMDN+

Structural re-parameterization.

Model complexity:

  • Number of parameters: 275,844
  • FLOPs: 17.9848G (input size: 3×256×256)
  • GPU memory consumption: 2893M (DIV2K test)
  • Number of activations: 92.7990M (input size: 3×256×256)
  • Runtime: 0.026783s (RTX 2080Ti, DIV2K test)

PSNR / SSIM (Y channel) on 5 benchmark datasets:

Metrics | Set5   | Set14  | B100   | Urban100 | Manga109
PSNR    | 32.11  | 28.63  | 27.58  | 26.10    | 30.55
SSIM    | 0.8934 | 0.7823 | 0.7358 | 0.7846   | 0.9072
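For context, structural re-parameterization merges a multi-branch training-time block into a single inference-time convolution. The sketch below is a generic RepVGG-style fusion (3×3 conv + 1×1 conv + identity, no batch norm), not the exact IMDN+ block, whose definition is not shown here.

import torch
import torch.nn as nn
import torch.nn.functional as F

def fuse_branches(conv3, conv1, channels):
    # Fold a 1x1 conv and an identity branch into a single 3x3 conv.
    w = conv3.weight.data.clone()
    b = conv3.bias.data.clone()
    # Pad the 1x1 kernel to 3x3 and add it to the 3x3 kernel.
    w += F.pad(conv1.weight.data, [1, 1, 1, 1])
    b += conv1.bias.data
    # The identity branch is a 3x3 kernel with a 1 at the center of
    # each channel's own filter.
    for c in range(channels):
        w[c, c, 1, 1] += 1.0
    fused = nn.Conv2d(channels, channels, 3, padding=1)
    fused.weight.data, fused.bias.data = w, b
    return fused

# Sanity check: the fused conv matches the three-branch sum.
C = 8
conv3 = nn.Conv2d(C, C, 3, padding=1)
conv1 = nn.Conv2d(C, C, 1)
x = torch.randn(1, C, 16, 16)
assert torch.allclose(conv3(x) + conv1(x) + x,
                      fuse_branches(conv3, conv1, C)(x), atol=1e-5)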

The simplified version of IMDN won first place in the AIM 2019 Constrained Super-Resolution Challenge (Track 1 & Track 2). The test code is available on Google Drive.

The ultra-lightweight version of IMDN won first place in the Super Resolution Algorithm Performance Comparison Challenge (see class IMDN_RTC(nn.Module)).

Degradation type: Bicubic

PyTorch Checkpoint

TensorFlow Lite Checkpoint

input_shape = (1, 720, 480, 3); benchmarked with AI Benchmark (OPPO Find X3, Qualcomm Snapdragon 870, FP16, TFLite GPU delegate).
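A minimal sketch of running the TFLite checkpoint with the standard tf.lite.Interpreter API; the model file name below is a placeholder, and a float16 model may still expose a float32 input/output interface.

import numpy as np
import tensorflow as tf

# Load the TFLite model (the file name is a placeholder).
interpreter = tf.lite.Interpreter(model_path="IMDN_RTC.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# NHWC input matching the benchmark shape (1, 720, 480, 3).
lr = np.random.rand(1, 720, 480, 3).astype(inp["dtype"])
interpreter.set_tensor(inp["index"], lr)
interpreter.invoke()
sr = interpreter.get_tensor(out["index"])
print(sr.shape)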

The down-up version of IMDN won second place in the Super Resolution Algorithm Performance Comparison Challenge (see class IMDN_RTE(nn.Module)).

Degradation type: Downsampling + noise

Checkpoint

Highlights

  1. Our information multi-distillation block (IMDB) with a contrast-aware channel attention (CCA) layer (a sketch of such a layer follows this list).

  2. The adaptive cropping strategy (ACS), which lets one model process images of arbitrary size and implement any upscaling factor.

  3. The exploration of factors affecting actual inference time.
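A minimal sketch of a contrast-aware channel attention layer in the spirit of the paper: the channel descriptor is the per-channel contrast (standard deviation plus mean) rather than plain global average pooling, followed by the usual squeeze-and-excitation style gate. Layer sizes here are illustrative.

import torch
import torch.nn as nn

class CCALayer(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # Contrast descriptor: per-channel std + mean over the spatial dims.
        mean = x.mean(dim=(2, 3), keepdim=True)
        std = x.std(dim=(2, 3), keepdim=True)
        y = self.gate(std + mean)
        # Multiplicative gating: re-weight each channel of x.
        return x * y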

Testing

PyTorch 1.1

  • Run testing:
# Set5 x2 IMDN
python test_IMDN.py --test_hr_folder Test_Datasets/Set5/ --test_lr_folder Test_Datasets/Set5_LR/x2/ --output_folder results/Set5/x2 --checkpoint checkpoints/IMDN_x2.pth --upscale_factor 2
# RealSR IMDN_AS
python test_IMDN_AS.py --test_hr_folder Test_Datasets/RealSR/ValidationGT --test_lr_folder Test_Datasets/RealSR/ValidationLR/ --output_folder results/RealSR --checkpoint checkpoints/IMDN_AS.pth
  • Calculate IMDN_RTC's FLOPs and parameters, input size 240×360 (a parameter-counting sketch follows the command):
python calc_FLOPs.py
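calc_FLOPs.py is the repo's own tool; as a rough cross-check, the parameter count can also be read off the model directly. The import path and constructor below assume the repo's layout and are illustrative.

from model import architecture  # assumed layout: model/architecture.py

net = architecture.IMDN_RTC()  # constructor arguments, if any, omitted
print("parameters:", sum(p.numel() for p in net.parameters()))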

Training

  • Convert the DIV2K PNGs to .npy files (a sketch of this step appears after the training commands):
python scripts/png2npy.py --pathFrom /path/to/DIV2K/ --pathTo /path/to/DIV2K_decoded/
  • Run training for the x2, x3, and x4 models:
python train_IMDN.py --root /path/to/DIV2K_decoded/ --scale 2 --pretrained checkpoints/IMDN_x2.pth
python train_IMDN.py --root /path/to/DIV2K_decoded/ --scale 3 --pretrained checkpoints/IMDN_x3.pth
python train_IMDN.py --root /path/to/DIV2K_decoded/ --scale 4 --pretrained checkpoints/IMDN_x4.pth
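png2npy.py decodes the DIV2K PNGs into .npy arrays once, so the training loader avoids repeated PNG decoding. A minimal sketch of such a conversion, assuming imageio and a flat directory of PNGs (the real script's options and directory handling may differ):

import glob, os
import imageio.v2 as imageio
import numpy as np

def png_dir_to_npy(path_from, path_to):
    os.makedirs(path_to, exist_ok=True)
    for png in sorted(glob.glob(os.path.join(path_from, "*.png"))):
        img = imageio.imread(png)  # H x W x 3, uint8
        name = os.path.splitext(os.path.basename(png))[0]
        np.save(os.path.join(path_to, name + ".npy"), img)

png_dir_to_npy("/path/to/DIV2K/", "/path/to/DIV2K_decoded/")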

Results

Baidu Netdisk (extraction code: 8yqj) or Google Drive

The following PSNR/SSIM values are evaluated with MATLAB R2017a; see Evaluate_PSNR_SSIM.m for the code.
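The official numbers come from the MATLAB script above; for a quick Python cross-check, here is a minimal Y-channel PSNR sketch using the BT.601 conversion (the same convention as MATLAB's rgb2ycbcr) and the usual SR border shave.

import numpy as np

def rgb_to_y(img):
    # BT.601 luma from an 8-bit RGB image, in [16, 235].
    img = img.astype(np.float64)
    return 16.0 + (65.481 * img[..., 0]
                   + 128.553 * img[..., 1]
                   + 24.966 * img[..., 2]) / 255.0

def psnr_y(sr, hr, scale):
    y_sr, y_hr = rgb_to_y(sr), rgb_to_y(hr)
    # Shave a border of `scale` pixels, as in the MATLAB evaluation.
    y_sr = y_sr[scale:-scale, scale:-scale]
    y_hr = y_hr[scale:-scale, scale:-scale]
    mse = np.mean((y_sr - y_hr) ** 2)
    return 20.0 * np.log10(255.0 / np.sqrt(mse))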

Pressure Test


Pressure test for ×4 SR model.

*Note: torch.cuda.Event() is used to record inference times.
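For reference, a minimal sketch of that timing pattern; the network here is a stand-in for IMDN.

import torch
import torch.nn as nn

model = nn.Conv2d(3, 3, 3, padding=1).cuda()  # stand-in network
lr = torch.randn(1, 3, 256, 256, device="cuda")

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)

with torch.no_grad():
    for _ in range(10):   # warm-up, so CUDA initialization is not timed
        model(lr)
    start.record()
    sr = model(lr)
    end.record()
    torch.cuda.synchronize()  # wait until both events have completed

print(start.elapsed_time(end), "ms")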

PSNR & SSIM


Average PSNR/SSIM on datasets Set5, Set14, BSD100, Urban100, and Manga109.

Memory consumption


Memory Consumption (MB) and average inference time (second).

Model parameters


Trade-off between performance and number of parameters on Set5 ×4 dataset.

Running time


Trade-off between performance and running time on the Set5 ×4 dataset. VDSR, DRCN, and LapSRN were implemented in MatConvNet, while DRRN and IDN used the Caffe package. The rest (EDSR-baseline, CARN, and our IMDN) use PyTorch.

Adaptive Cropping


Diagrammatic sketch of the adaptive cropping strategy (ACS). The cropped image patches are shown in the green dotted boxes.
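A minimal sketch of the idea: crop four overlapping corner patches whose sides are rounded up to a size the network accepts, super-resolve each, and write each patch's own quadrant into the output. This illustrates the concept only; it is not the repo's crop_forward implementation.

import torch

def adaptive_crop_sr(model, img, scale, multiple=4):
    # img: 1 x C x H x W low-resolution tensor.
    _, c, h, w = img.shape
    # Corner patch sides: just over half the image, rounded up to a
    # multiple so the network's size constraint is met.
    ph = min(h, ((h // 2 + multiple) // multiple) * multiple)
    pw = min(w, ((w // 2 + multiple) // multiple) * multiple)

    out = img.new_zeros(1, c, h * scale, w * scale)
    for top, rows in ((0, slice(0, (h // 2) * scale)),
                      (h - ph, slice((h // 2) * scale, h * scale))):
        for left, cols in ((0, slice(0, (w // 2) * scale)),
                           (w - pw, slice((w // 2) * scale, w * scale))):
            sr = model(img[:, :, top:top + ph, left:left + pw])
            # Keep only this corner's non-overlapping quadrant.
            r0 = rows.start - top * scale
            c0 = cols.start - left * scale
            out[:, :, rows, cols] = sr[:, :, r0:r0 + rows.stop - rows.start,
                                       c0:c0 + cols.stop - cols.start]
    return out

# e.g. sr = adaptive_crop_sr(imdn_x4, lr_image, scale=4)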

Visualization of feature maps


Visualization of the output feature maps of the 6th progressive refinement module (PRM).
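A generic way to obtain such feature maps is a forward hook on the module of interest. The toy model below is a stand-in; in the real network the hook would be registered on the 6th PRM.

import torch
import torch.nn as nn

# Toy stand-in; `block6` plays the role of the 6th PRM.
model = nn.Sequential()
model.add_module("block6", nn.Conv2d(3, 16, 3, padding=1))

feature_maps = {}

def save_output(name):
    def hook(module, inputs, output):
        feature_maps[name] = output.detach().cpu()
    return hook

handle = model.block6.register_forward_hook(save_output("prm6"))
with torch.no_grad():
    model(torch.randn(1, 3, 64, 64))
handle.remove()

fmap = feature_maps["prm6"][0]  # C x H x W
# Normalize each channel to [0, 1] for display.
lo = fmap.amin(dim=(1, 2), keepdim=True)
hi = fmap.amax(dim=(1, 2), keepdim=True)
fmap = (fmap - lo) / (hi - lo + 1e-8)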

Citation

If you find IMDN useful in your research, please consider citing:

@inproceedings{Hui-IMDN-2019,
  title={Lightweight Image Super-Resolution with Information Multi-distillation Network},
  author={Hui, Zheng and Gao, Xinbo and Yang, Yunchu and Wang, Xiumei},
  booktitle={Proceedings of the 27th ACM International Conference on Multimedia (ACM MM)},
  pages={2024--2032},
  year={2019}
}

@inproceedings{AIM19constrainedSR,
  title={AIM 2019 Challenge on Constrained Super-Resolution: Methods and Results},
  author={Kai Zhang and Shuhang Gu and Radu Timofte and others},
  booktitle={The IEEE International Conference on Computer Vision (ICCV) Workshops},
  year={2019}
}

imdn's People

Contributors

zheng222

imdn's Issues

Training and Testing

Hi Zheng,

This is a great repo; thanks for letting us use it. I have two questions regarding training and testing.

1) Why do we use the testing data (test folders) as validation data during training? (See the data setup in train_IMDN.py.)

2) Can we test our LR images without corresponding HR images? Ideally, we want to generate HR images from LR images alone. Do we have to provide --test_hr_folder in order to test LR images?

python test_IMDN.py --test_hr_folder Test_Datasets/Set5/ --test_lr_folder Test_Datasets/Set5_LR/x2/ --output_folder results/Set5/x2 --checkpoint checkpoints/IMDN_x2.pth --upscale_factor 2

Memnet resulting image

Hi, thank you so much for the source code. Also, could you please provide the MemNet result images for the degradation model?

About parameters

Hello! I want to ask how you calculated the number of parameters of IDN. I used torchsummaryX to measure the model size, but I get about 677K parameters.

How to convert .pth to TFLite

Hello, could you tell me how you converted the .pth model to the TFLite format? Is there any code or a tutorial link? Thanks.

About test_IMDN.py

Dear Zheng,
Hello! When running test_IMDN.py, I ran into two problems:
1) I can successfully run the .bmp dataset in the Set5 folder you provide, but when I test Set14 myself, both the .bmp and .png versions of the dataset fail with [TypeError: 'NoneType' object is not subscriptable]. After debugging, it seems the images are not being read at all. Could you explain why? Thank you very much.
2) I tested test_IMDN.py with the .pth files (x2/x3/x4) from your checkpoints folder, but the resulting PSNR and SSIM are both lower than the values in your paper. I also computed PSNR and SSIM in MATLAB between the generated SR images and the original HR images, and they are likewise lower than the paper's values. In addition, I ran train_IMDN.py from start to finish and generated the corresponding .pth files.

Looking forward to your reply and clarification. Thanks!

about the experiment in IMDN

Hi, Zheng!
I trained IMDN in the regular manner with an initial learning rate of 2e-4, halved every 200 epochs for 1000 epochs in total. However, when I tested the network, the results I got were lower than yours. After that, I extended the schedule to 300 epochs each at lr 2e-4 and 1e-4, for 1200 epochs in total, but the results were still lower than yours. The PSNR and SSIM were computed in MATLAB on the Y channel. My results are as follows:

Params (K) | Mult-Adds (G) | Set5         | Set14        | B100         | Urban100     | Manga109
715        | 40.9          | 32.14/0.8942 | 28.53/0.7806 | 27.53/0.7351 | 26.00/0.7828 | 30.37/0.9067

So, could you tell me what I should do to further improve the final results?

Request about an ERROR

OSError: Failed to interpret file '/media/npu/Data/mdx/DIV2K_train_LR_bicubic/X4/0710x4.png' as a pickle
Why is that?

Adaptive Cropping Strategy

Hello, first of all, thank you very much for open-sourcing this work. While reading the paper, I didn't fully understand the adaptive cropping strategy. What exactly does ACS do, and why does it allow processing arbitrary sizes and any scale factor after using it? If you have time, I'd appreciate an explanation. Thanks.

Model training

Thank you for the nice work and for sharing the code.

python train_IMDN.py --root /path/to/DIV2K_decoded/ --scale 4 --pretrained checkpoints/IMDN_x4.pth

Why do we need a pretrained model when we train the x4 SR model?

How can I train a model with performance similar to the paper? Thanks.

test time

Hello, I have an issue with the test time.
I tested the code on Set5 with an i5 CPU, and the time is about 4.4s.
I know the CPU time will be larger, but I think this is too large, right?
The FLOPs are smaller than FSRCNN and most other models, and the GPU time is the same as other models', even though the depth is different.

Is the depth the reason? If it really is, why is the GPU time still small?

adaptive cropping

Thanks for the wonderful concept.
Could you please tell me exactly where in the code this adaptive cropping strategy is implemented?

Question about CCA

Hello, Zheng. I want to ask why the last line of the CCA function is "return x * y" rather than "x + y".

The running time of model

In your paper, Figure 9 shows that IDN is slower than CARN and IMDN on Set5 for x4 SR.
However, I used the official code of IDN, CARN, and IMDN to evaluate the average inference time on the Set5 x4 dataset. The average running times of these methods are 0.007s, 0.028s, and 0.029s, respectively.
I find that IDN is faster than CARN and IMDN, and that the running time of IMDN is close to CARN's. I am confused about this result. Could you tell me the reason?
My operation environment is as follows:

GPU: GTX 1080Ti
OS: Ubuntu 18.04 LTS
CUDA: 10.0
cuDNN: 7.4
Python version: 3.6
PyTorch version: 1.0

Thank you!

PSNR

Is the PSNR in these figures based on the Y channel or the RGB channels?

about train and checkpoint

Hello, nice to meet you.
I want to ask: is the checkpoint a pre-trained model, and how can I train it? Thanks.
Looking forward to your reply.

PSNR is slightly lower in my experiments

I tested IMDN_X4 using the provided checkpoint and the Set5 dataset; the PSNR is 32.19. I also trained IMDN_X4 from scratch, strictly following your code; the best PSNR is 32.14. These two results are slightly lower than the 32.21 reported in Table 5.

What's the possible reason for these two differences? Do I need to train a model starting from your pre-trained checkpoint (i.e., --pretrained checkpoints/IMDN_x4.pth)? If so, how was that checkpoint trained?

REALSR training code

Is there any plan to release the training code for the RealSR dataset using the IMDN_AS architecture?

Thanks.

How to find the best Epoch

I'm using your framework to train my own network. It seems that the training process just saves every checkpoint, so how did you find the best one?

Cannot run train_IMDN.py

Shouldn't line 8 in train_IMDN.py be "import utils" instead of "from utils import utils"? And where is the function "utils.adjust_learning_rate()", used at line 123 in train_IMDN.py, first defined? Thank you.

ycbcr and rgb

Hi, thanks for your work.
Could you tell me why you use RGB in the 'AS' model but YCbCr in the x2/x3/x4 models? Are there any tricks?

About the performance of the x2 model

Thanks for your excellent work.
But I have a doubt about the results of the x2 model.
I can't reproduce the results reported in the article; should I use L2 loss after 1000 epochs?
Thanks!

How to use train_AS.py

Hello, how do I use your code to train on real-scene images (where the high-resolution and low-resolution images are the same size)?

About the memory.

Hi, thank you so much for the source code. Also, could you please share the method you used to measure memory consumption?

IMDN_AS with arbitrary scales

Hello, when I use IMDN_AS, changing the upscale factor causes an error. Is this parameter not meant to be changed?
size mismatch for upsampler.0.weight: copying a param with shape torch.Size([48, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([12, 64, 3, 3]).
size mismatch for upsampler.0.bias: copying a param with shape torch.Size([48]) from checkpoint

EDSR's forward_chop vs. crop_forward

Thanks for your excellent work.
I have a question: when I tested my images (3040×3040 or 6080×3040) with EDSR, there was little change in GPU memory, but when I tested your IMDN with forward_chop, why does the memory change so much?

Average Inference Time

Great job!
I have a question about the average inference time on BSD100, Urban100, and Manga109. In Table 6, for EDSR-baseline, EDSR, RDN, and IMDN, the average inference time decreases from BSD100 to Manga109, yet the image size increases from BSD100 to Manga109. That confuses me. Did I miss something about this average inference time?

In addition, I used the test_IMDN.py script and tested the released IMDN_x4.pth on BSD100, Urban100, and Manga109 on a 2080 Ti. The average inference times are 9.14ms, 14.37ms, and 17.00ms. I think a 2080 Ti should be faster than the Titan Xp mentioned in your paper. Did I take any wrong steps?
