jixiaozhong / realsr
Real-World Super-Resolution via Kernel Estimation and Noise Injection
License: Apache License 2.0
Hello,
MSU Video Group has recently launched Video Super Resolution Benchmark and evaluated this algorithm.
It takes 7th place by subjective score, 9th place by PSNR, and 5th by our metric ERQAv1.0. You can see the results here.
If you have any other VSR method you want to see in our benchmark, we kindly invite you to participate.
You can submit it for the benchmark, following the submission steps.
Hi~
Thanks for your great work on real-image super-resolution.
I wonder if you could provide the pretrained model (Track 2) to help me run some evaluations on my own dataset?
It's important for me! So, if possible, please release it on GitHub~
Thank you very much!
Hi,
@jixiaozhong can you share details about how the DF2K-JPEG model was trained and what was the data preparation strategy?
I see training code in this directory. Can you please tell me how to run it?
I want to train this model on another dataset, but I don't know how to prepare the training dataset. Can you release the training code?
Hello,
Downloading the training set (the DPEDiphone training dataset) is extremely difficult! Would it be convenient for you to share the already-downloaded training set, for example via a Baidu Netdisk link? That would greatly speed up my access to the dataset. Thank you very much!
Best regards,
Ma Haichuan
Hello, I am very interested in your work, especially the kernel estimation and noise injection part. I couldn't find that in your GitHub release. Could you kindly provide us the code for constructing the training data?
Is a model for human faces possible? E.g. one that focuses on deblurring and denoising faces, and also regenerates pixelated faces?
The DF2K-JPEG model works really well,
but sometimes it is not enough for human faces.
Many thanks, and keep going. 👍 )
Hi, thanks for your wonderful work.
I am confused about some details of the evaluation.
I will appreciate it a lot if you can help me with these questions.
Thanks.
python3.8 util.py
Traceback (most recent call last):
File "util.py", line 461, in
img = img * 1.0 / 255
TypeError: unsupported operand type(s) for *: 'NoneType' and 'float'
So the generator network here is EXACTLY ESRGAN, just with different layer naming.
I made a state_dict name converting func like this:
import os
from collections import OrderedDict

import torch
import rsrlayers  # my mapping table from RealSR layer names to ESRGAN ones

def renamelayerz(state_dict):
    new_state_dict = OrderedDict()
    for key, value in state_dict.items():
        new_key = rsrlayers.rsr2esr[key]
        new_state_dict[new_key] = value
    return new_state_dict

def mkESRGAN(model_path, scale, isRSR=False):
    if not os.path.isfile(model_path):
        model_path = '/content/drive/My Drive/TFMLz/ESRGAN_oldarch/models/' + model_path + '.pth'
    model = arch.RRDB_Net(3, 3, 64, 23, gc=32, upscale=scale, norm_type=None,
                          act_type='leakyrelu', mode='CNA', res_scale=1,
                          upsample_mode='upconv')
    state_dict = torch.load(model_path)
    if isRSR:
        state_dict = renamelayerz(state_dict)
    model.load_state_dict(state_dict, strict=True)
    model.eval()
    for _, v in model.named_parameters():
        v.requires_grad = False
    return model
And it works (with original ESRGAN scripts).
Can you revert those names back to the original ESRGAN ones? It would make this easier to adopt in other ESRGAN applications.
Hello,
MSU Graphics & Media Lab Video Group has recently launched two new Super-Resolution Benchmarks.
Your method achieved 4th place in Video Upscalers Benchmark: Quality Enhancement in 'Animation 2x' category and 1st place in Super-Resolution for Video Compression Benchmark in 'x264 compression' category. We congratulate you on your result and look forward to your future work!
We would be grateful for your feedback on our work.
Hi~
Can I apply this method to arbitrary real-world images directly, such as face images collected from the web? I have tried it these days and observed that your method may over-sharpen. Can you give me some suggestions about this?
Here are a couple of demos; I used DPED.pth as the pretrained model.
LR:
Hello All
I am trying to run test.py on some images captured by my DSLR,
CUDA_VISIBLE_DEVICES=1 python3 test.py -opt options/df2k/test_df2k.yml
and I am always getting a CUDA out of memory error. However, when I run it on the FLICKR dataset of DIV2K, I do not face any such problem. Can anybody please help me resolve it?
I think reducing the batch size should resolve this issue; can anybody tell me where the batch-size option is? If you think it could be because of something else, please let me know.
This is the error I am getting always
RuntimeError: CUDA out of memory. Tried to allocate 10.23 GiB (GPU 0; 23.65 GiB total capacity; 13.52 GiB already allocated; 5.71 GiB free; 14.81 GiB reserved in total by PyTorch)
Thank you in advance for your help.
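For context on the batch-size question: at test time there is only one image per forward pass, so memory usage is driven by input resolution, not batch size, which is why large DSLR images blow up while DIV2K crops do not. The usual workaround is to tile the input. A rough sketch of tiled inference, assuming a `model_fn` that upscales an HxWxC array by `scale` (the names are mine, and this naive version ignores seam overlap):

```python
import numpy as np

def tiled_upscale(img, model_fn, tile=192, scale=4):
    """Run model_fn on tile x tile crops and stitch the outputs,
    so GPU memory scales with the tile size, not the full image."""
    h, w, c = img.shape
    out = np.zeros((h * scale, w * scale, c), dtype=img.dtype)
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            patch = img[y:y + tile, x:x + tile]
            ph, pw = patch.shape[:2]
            out[y * scale:(y + ph) * scale,
                x * scale:(x + pw) * scale] = model_fn(patch)
    return out
```

Production implementations typically overlap adjacent tiles and blend the seams; this sketch only shows the memory idea.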
Hi. I'm just wondering, is there any possibility of using RealSR if I do not have NVIDIA GPU + CUDA?
I have some old DV footage I'm testing RealSR on. The source size is 1024x576 pixels.
There are lots of artifacts and noise in it.
I first tried to upscale it to 4096px with RealSR, but there was nearly no improvement in the quality. It's almost a carbon copy of the original.
Then I tried something else: I downscaled the original to 512x288 pixels, and upscaled THAT to 2048px. And the result was amazing. No more noise, everything looks quite sharp, most halos are even gone, ... (The result in the headlight alone, wow.)
I uploaded the 4 images here, you can compare the 2 inputs to the 2 outputs:
http://www.framecompare.com/image-compare/screenshotcomparison/7776WNNX
Now the question is: why is that? Why did I first have to downscale the image in order to get a better result? It's kind of strange, because some of the details did get lost in the conversion from the 1024px to the 512px image. Could I somehow get an even better result with the original 1024px source?
https://competitions.codalab.org/competitions/22220#participate
https://data.vision.ee.ethz.ch/alugmayr/NTIRE2019/public/Corrupted-tr-x.zip
https://data.vision.ee.ethz.ch/alugmayr/NTIRE2019/public/Corrupted-tr-y.zip
https://data.vision.ee.ethz.ch/alugmayr/NTIRE2019/public/Corrupted-va-x.zip
https://data.vision.ee.ethz.ch/alugmayr/NTIRE2019/public/Corrupted-te-x.zip
https://data.vision.ee.ethz.ch/alugmayr/NTIRE2019/public/DPEDiphone-tr-x.zip
https://data.vision.ee.ethz.ch/alugmayr/NTIRE2019/public/DPEDiphone-tr-y.zip
https://data.vision.ee.ethz.ch/alugmayr/NTIRE2019/public/DPEDiphone-va.zip
https://data.vision.ee.ethz.ch/alugmayr/NTIRE2019/public/DPEDiphone-crop-te-x.zip
Can someone re-upload?
I skimmed your code, but I'm a bit lost: I couldn't find the part I was looking for (perhaps because there is quite a lot of it...) and I don't understand the project's logic. Also, is train.py in the root directory the training code?
I have not found 'Noise Injection' in the data code; is it not included in the released code so far?
@jixiaozhong After fine-tuning the network, the output color of the generator model has changed. How can I address this issue?
Looking forward to your help!
Thank you!
I have not found 'Kernel Estimation' and 'Noise Collection' in the data code; are they not included in the released code so far?
System: Windows 10 1909
GPU: 1050TI
Drive: NVIDIA 446.14
When I enable TTA mode, the program crashes...
And the program shows
E:\Program Files\realsr>realsr-ncnn-vulkan.exe -i [image path] -o 123.png -x
[0 GeForce GTX 1050 Ti] queueC=2[8] queueG=0[16] queueT=1[2]
[0 GeForce GTX 1050 Ti] buglssc=0 bugsbn1=0 bugihfa=0
[0 GeForce GTX 1050 Ti] fp16p=1 fp16s=1 fp16a=0 int8s=1 int8a=1
decode 708 764 3
vkWaitForFences failed -4
0.00%
vkQueueSubmit failed -4
6.25%
vkQueueSubmit failed -4
12.50%
vkQueueSubmit failed -4
18.75%
vkAllocateMemory failed -2
After "decode 708 764 3" is shown, one of my monitors turns black (I have 2 monitors; the other still works) and I have to re-plug the HDMI cable to make it work again.
Is the reason that my VRAM is too small? Can I resolve it other than by upgrading my hardware?
SOLVED: the reason is that the default tile size is too big; after I added -t 32, it worked.
Another question: when I input a big photo, the program doesn't seem to work as well as with a small picture. I have to crop the big photo or zoom it out to get better results.
Thanks a lot. I am not good at English, so please forgive me if there is anything wrong in this issue :)
As the readme says, the output images are saved in '../results/'. But when I checked the folder, there was a 'results' folder with a 'Track2' folder in it, and a 'DPED' folder inside 'Track2', but no images in 'DPED'.
Where did the images go?
Did I miss something?
plz........
Hello, I have the same problem. I used the pre-trained DPED.pth to test the Corrupted-te-x dataset, but I cannot reproduce your results, such as 0913.png and 0935.png. When testing the DPEDiphone-crop-te-x dataset, however, the results are okay. What is the reason for this? Is it a parameter-setting problem?
Originally posted by @usstdqq in #29 (comment)
Could you kindly provide benchmark data related to frames per second (fps) and signal-to-noise ratio (SNR)?
Q1:
I notice that the data preparation process of the NTIRE2020 Real-World SR challenge is very complex.
How can I reproduce the work on other datasets?
Q2:
How did you add the JPEG artifacts to the training data for the DF2K-JPEG model? (What kind of artifact, and what quality range, for the input low-resolution images?)
Another question:
In my experience with PyTorch, running a single image through a script incurs a long initial .cuda() step to load the model into GPU memory. If it is wrapped as an HTTP service, the model stays resident in GPU memory, and each new image only costs one forward pass, with no initial loading cost.
The ncnn-vulkan build is driven from the command line with input/output image paths. If it is invoked per image, does every image pay this initialization cost? Is there an HTTP-service form, or a way to batch-process a set of images, to avoid the startup time? Or is the main time cost of the ncnn-vulkan build actually the forward pass itself, so I shouldn't worry about this and can just call it from the command line? @nihui
Is target domain dataset used in training?
I would like to see a model which has been trained to upscale movies.
Currently DF2K-JPEG works really well on pictures with compression artifacts (it's like magic, wow). But for older real-life footage (like DVDs or old movies) it doesn't make that big of a difference compared to the non-upscaled footage.
So is there a possibility we could get a movie model in the future?
Please provide the link to download the pretrained models from another platform as Baidu is not supported in a few countries.
Hi, I have a question about clean-up section.
realSR "adopt bicubic downsampling on the real image in the source domain to remove noise and make the image sharper".
But I think bicubic downsampling preserves as much information as possible, including both noise and content, rather than "removing noise". I tested on DIV2K training image 0048.png, which has severe noise, comparing area-, bicubic-, and bilinear-downsampled 720*720 images:
bicubic downsampled [bottom left]: same visual quality as HR;
area downsampled [top right]: eliminates most background noise;
bilinear downsampled [bottom right]: slightly denoised;
Did I miss something?
@jixiaozhong during inference we split the input into patches when the LR resolution is too large to fit in the GPU. So how should we set up the patching strategy and the JPEG-degraded LR inputs to train a model that handles both JPEG noise and resolution enhancement? Can you explain this?
Hi,
Thanks for your paper and code, I really liked the way you built your LR/HR pairs.
In the paper, you use 4 different losses to estimate the kernel based on what was done in KernelGAN. However, you slightly modify the original KernelGAN loss by removing the sparsity loss and by adding the following loss:
Can you explain to me the advantages of such changes in comparison to the classical KernelGAN method?
Thanks,
Charles
We compare RealSR with Baidu image_quality_enhance.
Here is an image from their test examples.
Interesting work. Thanks for sharing.
However, I have tried to run inference on some samples with the provided pretrained weights, but for some reason I couldn't get the expected results. I tried both the executable files and the source files. For example, while using the source files, I checked both weights (DF2K.pth and DPED.pth). Please see below:
Now, given the visual results demonstrated here, I expected the same or comparable output. Any catch?
Apart from this, here is another issue: at test time, isn't it possible to process multiple low-resolution images at once? For example, as demonstrated here, if I place (say) 5 images in 'dataroot_LR' (the test images dir), it throws a CUDA out of memory RuntimeError.
Hello, I've got this error when trying to run test.py.
What am I doing wrong?
22-09-05 05:02:28.757 - INFO: Loading model for G [D:\Bots\RealSR\model_new.pth] ...
Traceback (most recent call last):
File "D:\Bots\RealSR\test.py", line 35, in <module>
model = create_model(opt)
File "D:\Bots\RealSR\models\__init__.py", line 18, in create_model
m = M(opt)
File "D:\Bots\RealSR\models\SRGAN_model.py", line 127, in __init__
self.load() # load G and D if needed
File "D:\Bots\RealSR\models\SRGAN_model.py", line 352, in load
self.load_network(load_path_G, self.netG, self.opt['path']['strict_load'])
File "D:\Bots\RealSR\models\base_model.py", line 90, in load_network
load_net = torch.load(load_path)
File "C:\Users\Wolh\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\serialization.py", line 713, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "C:\Users\Wolh\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\serialization.py", line 930, in _legacy_load
result = unpickler.load()
File "C:\Users\Wolh\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\serialization.py", line 746, in find_class
return super().find_class(mod_name, name)
ModuleNotFoundError: No module named 'RRDBNet_arch'
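This error means the checkpoint was saved by pickling an object that references a module named `RRDBNet_arch`, so the unpickler inside `torch.load` tries to import that name. One workaround (an assumption about how the file was saved, not something from this repo) is to alias the repo's architecture module under that name in `sys.modules` before calling `torch.load`. Since torch's legacy loader is plain pickle underneath, the mechanism can be shown with the stdlib alone:

```python
import pickle
import sys
import types

# Build a throwaway module so we can pickle an object "from" it,
# simulating a checkpoint saved against a module named 'RRDBNet_arch'.
legacy = types.ModuleType('RRDBNet_arch')

class Net:  # stand-in for the real RRDBNet class
    pass

Net.__module__ = 'RRDBNet_arch'
legacy.Net = Net
sys.modules['RRDBNet_arch'] = legacy
blob = pickle.dumps(Net())

# Remove the module: loading now fails exactly like torch.load did.
del sys.modules['RRDBNet_arch']
try:
    pickle.loads(blob)
except ModuleNotFoundError as e:
    print(e)

# Fix: alias a module that defines the class under the legacy name,
# then load again -- the same trick works right before torch.load().
sys.modules['RRDBNet_arch'] = legacy
obj = pickle.loads(blob)
print(type(obj).__name__)
```

The cleaner long-term fix is to re-save checkpoints as plain state dicts (`torch.save(model.state_dict(), path)`), which avoids pickling module names entirely.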
Hello, I have the same problem. I used the pre-trained DPED.pth to test the Corrupted-te-x dataset, but I cannot reproduce your results, such as 0913.png and 0935.png. When testing the DPEDiphone-crop-te-x dataset, however, the results are okay. What is the reason for this? Is it a parameter-setting problem?
Originally posted by @innat in #29 (comment)
Where can I find the code to convert the PyTorch model to ncnn?
Thank you
Can you please answer 5 questions? I could not find the details in your paper:
Thanks!