jixiaozhong / realsr
Real-World Super-Resolution via Kernel Estimation and Noise Injection
License: Apache License 2.0
Hello,
MSU Video Group has recently launched Video Super Resolution Benchmark and evaluated this algorithm.
It takes 7th place by subjective score, 9th place by PSNR, and 5th by our metric ERQAv1.0. You can see the results here.
If you have any other VSR method you want to see in our benchmark, we kindly invite you to participate.
You can submit it for the benchmark, following the submission steps.
Hi~
Thanks for your great work on real-image super-resolution.
I wonder if you could provide the pretrained model (Track 2) to help me run some evaluations on my own dataset?
It's important for me! So, if possible, please release it on GitHub~
Thank you very much!
Hi,
@jixiaozhong can you share details about how the DF2K-JPEG model was trained and what was the data preparation strategy?
I see training code in this directory. Can you please tell me how to run it?
I want to train this model on another dataset, but I don't know how to prepare the training dataset. Can you release the training code?
Hello,
Downloading the training set (the DPEDiphone training dataset) is extremely difficult! Would it be convenient for you to share the already-downloaded training set, for example via a Baidu Netdisk link? That would greatly speed up my access to the dataset. Thank you very much!
Best regards,
Ma Haichuan
Hello, I am very interested in your work, especially the kernel estimation and noise injection part. I couldn't find that in your GitHub release. Could you kindly provide us the code for constructing the training data?
Is a model for human faces possible? E.g. one that focuses on deblurring and denoising faces, and also regenerates pixelated faces?
The DF2K-JPEG model works really well,
but sometimes it is not enough for human faces.
Many thanks, and keep going. 👍 )
Hi, thanks for your wonderful work.
I am confused about some details of the evaluation.
I will appreciate it a lot if you can help me with these questions.
Thanks.
python3.8 util.py
Traceback (most recent call last):
File "util.py", line 461, in
img = img * 1.0 / 255
TypeError: unsupported operand type(s) for *: 'NoneType' and 'float'
So the generator network here is EXACTLY ESRGAN, just with different layer naming.
I made a state_dict name converting func like this:
import os
from collections import OrderedDict

import torch
import rsrlayers  # my mapping table from RealSR layer names to ESRGAN ones

def renamelayerz(state_dict):
    new_state_dict = OrderedDict()
    for key, value in state_dict.items():
        new_key = rsrlayers.rsr2esr[key]
        new_state_dict[new_key] = value
    return new_state_dict

def mkESRGAN(model_path, scale, isRSR=False):
    if not os.path.isfile(model_path):
        model_path = '/content/drive/My Drive/TFMLz/ESRGAN_oldarch/models/' + model_path + '.pth'
    model = arch.RRDB_Net(3, 3, 64, 23, gc=32, upscale=scale, norm_type=None,
                          act_type='leakyrelu', mode='CNA', res_scale=1,
                          upsample_mode='upconv')
    state_dict = torch.load(model_path)
    if isRSR:
        state_dict = renamelayerz(state_dict)
    model.load_state_dict(state_dict, strict=True)
    model.eval()
    for _, v in model.named_parameters():
        v.requires_grad = False
    return model
And it works (with original ESRGAN scripts).
Can you revert those names back to the original ESRGAN ones? It would make this easier to adopt in other ESRGAN applications.
Hello,
MSU Graphics & Media Lab Video Group has recently launched two new Super-Resolution Benchmarks.
Your method achieved 4th place in Video Upscalers Benchmark: Quality Enhancement in 'Animation 2x' category and 1st place in Super-Resolution for Video Compression Benchmark in 'x264 compression' category. We congratulate you on your result and look forward to your future work!
We would be grateful for your feedback on our work.
Hi~
Can I apply this method to arbitrary real-world images directly, such as face images collected from the web? I have tried it these days and observed that your method may over-sharpen. Can you give me some suggestions about this?
Here are a couple of demos; I used DPED.pth as the pretrained model.
LR:
Hello All
I am trying to run test.py on some images captured by my DSLR,
CUDA_VISIBLE_DEVICES=1 python3 test.py -opt options/df2k/test_df2k.yml
and I am always getting a CUDA out of memory error. However, when I run it on the FLICKR dataset of DIV2K, I do not face any such problem. Can anybody please help me resolve it?
I think reducing the batch size should resolve this issue; can anybody tell me where the batch-size option is? If you think it could be because of something else, please let me know.
This is the error I am getting always
RuntimeError: CUDA out of memory. Tried to allocate 10.23 GiB (GPU 0; 23.65 GiB total capacity; 13.52 GiB already allocated; 5.71 GiB free; 14.81 GiB reserved in total by PyTorch)
Thank you in advance for your help.
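For context on the batch-size question: at test time there is only one image per forward pass, so memory usage is driven by input resolution, not batch size, which is why large DSLR images blow up while DIV2K crops do not. The usual workaround is to tile the input. A rough sketch of tiled inference, assuming a `model_fn` that upscales an HxWxC array by `scale` (the names are mine, and this naive version ignores seam overlap):

```python
import numpy as np

def tiled_upscale(img, model_fn, tile=192, scale=4):
    """Run model_fn on tile x tile crops and stitch the outputs,
    so GPU memory scales with the tile size, not the full image."""
    h, w, c = img.shape
    out = np.zeros((h * scale, w * scale, c), dtype=img.dtype)
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            patch = img[y:y + tile, x:x + tile]
            ph, pw = patch.shape[:2]
            out[y * scale:(y + ph) * scale,
                x * scale:(x + pw) * scale] = model_fn(patch)
    return out
```

Production implementations typically overlap adjacent tiles and blend the seams; this sketch only shows the memory idea.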
Hi. I'm just wondering, is there any possibility of using RealSR if I do not have NVIDIA GPU + CUDA?
I have some old DV footage I'm testing RealSR on. The source size is 1024x576 pixels.
There are lots of artifacts and noise in it.
I first tried to upscale it to 4096px with RealSR, but there was nearly no improvement in the quality. It's almost a carbon copy of the original.
Then I tried something else: I downscaled the original to 512x288 pixels, and upscaled THAT to 2048px. And the result was amazing. No more noise, everything looks quite sharp, most halos are even gone, ... (The result in the headlight alone, wow.)
I uploaded the 4 images here, you can compare the 2 inputs to the 2 outputs:
http://www.framecompare.com/image-compare/screenshotcomparison/7776WNNX
Now the question is: why is that? Why did I first have to downscale the image in order to get a better result? It's kind of strange, because some of the details did get lost in the conversion from the 1024px to the 512px image. Could I somehow get an even better result with the original 1024px source?
https://competitions.codalab.org/competitions/22220#participate
https://data.vision.ee.ethz.ch/alugmayr/NTIRE2019/public/Corrupted-tr-x.zip
https://data.vision.ee.ethz.ch/alugmayr/NTIRE2019/public/Corrupted-tr-y.zip
https://data.vision.ee.ethz.ch/alugmayr/NTIRE2019/public/Corrupted-va-x.zip
https://data.vision.ee.ethz.ch/alugmayr/NTIRE2019/public/Corrupted-te-x.zip
https://data.vision.ee.ethz.ch/alugmayr/NTIRE2019/public/DPEDiphone-tr-x.zip
https://data.vision.ee.ethz.ch/alugmayr/NTIRE2019/public/DPEDiphone-tr-y.zip
https://data.vision.ee.ethz.ch/alugmayr/NTIRE2019/public/DPEDiphone-va.zip
https://data.vision.ee.ethz.ch/alugmayr/NTIRE2019/public/DPEDiphone-crop-te-x.zip
Can someone re-upload?
I skimmed your code, but I'm a bit lost: I couldn't find the part I was looking for (perhaps because there is quite a lot of it...) and I don't understand the project's logic. Also, is train.py in the root directory the training code?
I have not found 'Noise Injection' in the data code; is it not included in the released code so far?
@jixiaozhong After fine-tuning the network, the output color of the generator model has changed. How can I address this issue?
Looking forward to your help!
Thank you!
I have not found 'Kernel Estimation' and 'Noise Collection' in the data code; are they not included in the released code so far?
System: Windows 10 1909
GPU: 1050TI
Drive: NVIDIA 446.14
When I enable TTA mode, the program crashes...
And the program shows
E:\Program Files\realsr>realsr-ncnn-vulkan.exe -i [image path] -o 123.png -x
[0 GeForce GTX 1050 Ti] queueC=2[8] queueG=0[16] queueT=1[2]
[0 GeForce GTX 1050 Ti] buglssc=0 bugsbn1=0 bugihfa=0
[0 GeForce GTX 1050 Ti] fp16p=1 fp16s=1 fp16a=0 int8s=1 int8a=1
decode 708 764 3
vkWaitForFences failed -4
0.00%
vkQueueSubmit failed -4
6.25%
vkQueueSubmit failed -4
12.50%
vkQueueSubmit failed -4
18.75%
vkAllocateMemory failed -2
After "decode 708 764 3" is shown, one of my monitors turns black (I have 2 monitors; the other still works) and I have to re-plug the HDMI cable to make it work again.
Is the reason that my VRAM is too small? Can I resolve it other than by upgrading my hardware?
SOLVED: the reason is that the default tile size is too big; after I added -t 32, it worked.
Another question: when I input a big photo, the program doesn't seem to work as well as with a small picture. I have to crop the big photo or zoom it out to get better results.
Thanks a lot. I am not good at English, so please forgive me if there is anything wrong in this issue :)
As the readme says, the output images are saved in '../results/'. But when I checked the folder, there was a 'results' folder with a 'Track2' folder in it, and a 'DPED' folder inside 'Track2', but no images in 'DPED'.
Where did the images go?
Did I miss something?
plz........
Hello, I have the same problem. I used the pre-trained DPED.pth to test the Corrupted-te-x dataset, but I cannot reproduce your results, such as 0913.png and 0935.png. When testing the DPEDiphone-crop-te-x dataset, however, the results are okay. What is the reason for this? Is it a parameter-setting problem?
Originally posted by @usstdqq in #29 (comment)
Could you kindly provide benchmark data related to frames per second (fps) and signal-to-noise ratio (SNR)?
Q1:
I notice that the data preparation process of the NTIRE2020 Real-World SR challenge is very complex.
How can I reproduce the work on other datasets?
Q2:
How did you add the JPEG artifacts to the training data for the DF2K-JPEG model? (What kind of artifact, and what quality range, for the input low-resolution images?)
Another question:
In my experience with PyTorch, running a single image through a script incurs a long initial .cuda() step to load the model into GPU memory. If it is wrapped as an HTTP service, the model stays resident in GPU memory, and each new image only costs one forward pass, with no initial loading cost.
The ncnn-vulkan build is driven from the command line with input/output image paths. If it is invoked per image, does every image pay this initialization cost? Is there an HTTP-service form, or a way to batch-process a set of images, to avoid the startup time? Or is the main time cost of the ncnn-vulkan build actually the forward pass itself, so I shouldn't worry about this and can just call it from the command line? @nihui
Is target domain dataset used in training?
I would like to see a model which has been trained to upscale movies.
Currently DF2K-JPEG works really well on pictures with compression artifacts (it's like magic, wow). But for older real-life footage (like DVDs or old movies) it doesn't make that big of a difference compared to the non-upscaled footage.
So is there a possibility we could get a movie model in the future?
Please provide the link to download the pretrained models from another platform as Baidu is not supported in a few countries.
Hi, I have a question about clean-up section.
realSR "adopt bicubic downsampling on the real image in the source domain to remove noise and make the image sharper".
But I think bicubic downsampling preserves as much information as possible, including both noise and content, rather than "removing noise". I tested on DIV2K training image 0048.png, which has severe noise, comparing area-, bicubic-, and bilinear-downsampled 720*720 images:
bicubic downsampled [bottom left]: same visual quality as HR;
area downsampled [top right]: eliminates most background noise;
bilinear downsampled [bottom right]: slightly denoised;
Did I miss something?
@jixiaozhong during inference we split the input into patches when the LR resolution is too large to fit in the GPU. So how should we set up the patching strategy and the JPEG-degraded LR inputs to train a model that handles both JPEG noise and resolution enhancement? Can you explain this?
Hi,
Thanks for your paper and code, I really liked the way you built your LR/HR pairs.
In the paper, you use 4 different losses to estimate the kernel based on what was done in KernelGAN. However, you slightly modify the original KernelGAN loss by removing the sparsity loss and by adding the following loss:
Can you explain to me the advantages of such changes in comparison to the classical KernelGAN method?
Thanks,
Charles
We compare RealSR with Baidu image_quality_enhance.
Here is an image from their test examples.
Interesting work. Thanks for sharing.
However, I have tried to run inference on some samples with the provided pretrained weights, but for some reason I couldn't get the expected results. I tried both the executable files and the source files. For example, while using the source files, I checked both weights (DF2K.pth and DPED.pth). Please see below:
Now, given the visual results demonstrated here, I expected the same or comparable output. Any catch?
Apart from this, here is another issue: at test time, isn't it possible to process multiple low-resolution images at once? For example, as demonstrated here, if I place (say) 5 images in 'dataroot_LR' (the test images dir), it throws a CUDA out of memory RuntimeError.
Hello, I've got this error when trying to run test.py.
What am I doing wrong?
22-09-05 05:02:28.757 - INFO: Loading model for G [D:\Bots\RealSR\model_new.pth] ...
Traceback (most recent call last):
File "D:\Bots\RealSR\test.py", line 35, in <module>
model = create_model(opt)
File "D:\Bots\RealSR\models\__init__.py", line 18, in create_model
m = M(opt)
File "D:\Bots\RealSR\models\SRGAN_model.py", line 127, in __init__
self.load() # load G and D if needed
File "D:\Bots\RealSR\models\SRGAN_model.py", line 352, in load
self.load_network(load_path_G, self.netG, self.opt['path']['strict_load'])
File "D:\Bots\RealSR\models\base_model.py", line 90, in load_network
load_net = torch.load(load_path)
File "C:\Users\Wolh\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\serialization.py", line 713, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "C:\Users\Wolh\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\serialization.py", line 930, in _legacy_load
result = unpickler.load()
File "C:\Users\Wolh\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\serialization.py", line 746, in find_class
return super().find_class(mod_name, name)
ModuleNotFoundError: No module named 'RRDBNet_arch'
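This error means the checkpoint was saved by pickling an object that references a module named `RRDBNet_arch`, so the unpickler inside `torch.load` tries to import that name. One workaround (an assumption about how the file was saved, not something from this repo) is to alias the repo's architecture module under that name in `sys.modules` before calling `torch.load`. Since torch's legacy loader is plain pickle underneath, the mechanism can be shown with the stdlib alone:

```python
import pickle
import sys
import types

# Build a throwaway module so we can pickle an object "from" it,
# simulating a checkpoint saved against a module named 'RRDBNet_arch'.
legacy = types.ModuleType('RRDBNet_arch')

class Net:  # stand-in for the real RRDBNet class
    pass

Net.__module__ = 'RRDBNet_arch'
legacy.Net = Net
sys.modules['RRDBNet_arch'] = legacy
blob = pickle.dumps(Net())

# Remove the module: loading now fails exactly like torch.load did.
del sys.modules['RRDBNet_arch']
try:
    pickle.loads(blob)
except ModuleNotFoundError as e:
    print(e)

# Fix: alias a module that defines the class under the legacy name,
# then load again -- the same trick works right before torch.load().
sys.modules['RRDBNet_arch'] = legacy
obj = pickle.loads(blob)
print(type(obj).__name__)
```

The cleaner long-term fix is to re-save checkpoints as plain state dicts (`torch.save(model.state_dict(), path)`), which avoids pickling module names entirely.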
Hello, I have the same problem. I used the pre-trained DPED.pth to test the Corrupted-te-x dataset, but I cannot reproduce your results, such as 0913.png and 0935.png. When testing the DPEDiphone-crop-te-x dataset, however, the results are okay. What is the reason for this? Is it a parameter-setting problem?
Originally posted by @innat in #29 (comment)
Where can I find the code to convert the PyTorch model to ncnn?
Thank you
Can you please answer 5 questions? I could not find the details in your paper:
Thanks!