esrgan's Introduction

ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks

Pipeline for the image super-resolution task, based on the frequently cited paper ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks (Wang Xintao et al.), published in 2018.

In short, image super-resolution (SR) techniques reconstruct a higher-resolution (HR) image or sequence from observed lower-resolution (LR) images, e.g. upscaling a 720p image to 1080p.

One common approach to this task is to use deep convolutional neural networks capable of recovering HR images from LR ones, and ESRGAN (Enhanced SRGAN) is one such network. Key points of ESRGAN:

  • SRResNet-based architecture with residual-in-residual blocks;
  • Mixture of context, perceptual, and adversarial losses. Context and perceptual losses are used for proper image upscaling, while the adversarial loss pushes the neural network toward the natural image manifold using a discriminator network that is trained to differentiate between super-resolved images and original photo-realistic images.
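
The way the three loss terms combine can be sketched in plain Python. This is a minimal illustration and not the repository's implementation: the function names and the weights are hypothetical, the perceptual term is assumed to compare precomputed feature vectors, and the adversarial term uses a plain binary cross-entropy GAN loss as a simplification (the paper describes a relativistic variant).

```python
import math


def l1_loss(pred, target):
    """Mean absolute error between two equally sized flat sequences."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def generator_adversarial_loss(disc_probs_on_sr):
    """Cross-entropy term that is small when the discriminator rates SR outputs as real."""
    eps = 1e-12  # guards against log(0)
    return -sum(math.log(p + eps) for p in disc_probs_on_sr) / len(disc_probs_on_sr)


def generator_loss(sr_pixels, hr_pixels, sr_features, hr_features, disc_probs_on_sr,
                   w_context=1.0, w_perceptual=1.0, w_adversarial=0.005):
    """Weighted sum of the three terms; the default weights are illustrative only."""
    context = l1_loss(sr_pixels, hr_pixels)          # pixel-space fidelity
    perceptual = l1_loss(sr_features, hr_features)   # feature-space fidelity
    adversarial = generator_adversarial_loss(disc_probs_on_sr)
    return w_context * context + w_perceptual * perceptual + w_adversarial * adversarial
```

Raising the hypothetical `w_adversarial` weight trades pixel fidelity for sharper, more natural-looking textures; lowering it moves the result toward a smoother reconstruction.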

ESRGAN architecture

Technologies

  • Catalyst as the pipeline runner for deep learning tasks. This new and rapidly developing library can significantly reduce the amount of boilerplate code. If you are familiar with the TensorFlow ecosystem, you can think of Catalyst as Keras for PyTorch. The framework is integrated with logging systems such as the well-known TensorBoard;
  • PyTorch and torchvision as the main frameworks for deep learning;
  • Albumentations and PIQ for data processing.

Quick Start

Setup environment

pip install git+https://github.com/leverxgroup/esrgan.git

Run an experiment

catalyst-dl run -C esrgan/config.yml --benchmark

where esrgan/config.yml is a path to the config file.

Results

Some example outputs of an ESRGAN model trained on the DIV2K dataset:

LR (low resolution) | ESRGAN (original) | ESRGAN (ours) | HR (high resolution)

Documentation

Full documentation for the project is available at https://esrgan.readthedocs.io/

License

esrgan is released under a CC BY-NC-ND 4.0 license. See LICENSE for additional details about it.

esrgan's People

Contributors

bagxi, lx-ykachan

esrgan's Issues

OOM in forward pass of ESREncoder

I get an OOM error when forwarding an input of size 16×3×128×128 (as used by the paper) through ESREncoder with 16 RRDB blocks (as used by the paper). To reproduce:

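import torch
from esrgan.model.module.esrnet import ESREncoder

# Encoder with 16 RRDB blocks, moved to the GPU
esr = ESREncoder(num_basic_blocks=16).to('cuda')
# Batch of 16 RGB inputs of size 128x128
input_ = torch.randn((16,3,128,128)).to('cuda')
# Forward pass with autograd enabled; this call raises the OOM error
esr(input_)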

I'm running on a 1080 Ti with 11 GB of memory, which is very similar in terms of memory to the Titan Xp.
Environment-wise, I cloned the repo into a Conda environment with torch 1.9.0+cu111 and CUDA 11.2.

What am I missing?
Thanks!
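
A rough back-of-the-envelope estimate may help frame the question. The sketch below counts only the convolution outputs that autograd would keep for the backward pass. It assumes the layout described in the paper (64 feature channels, growth of 32 channels, 3 dense blocks per RRDB, 5 convolutions per dense block); whether ESREncoder uses these defaults is not confirmed here, and the helper names are hypothetical. Concatenation copies, activation outputs, weights, and cuDNN workspace would come on top of this figure.

```python
def activation_mib(batch, channels, height, width, bytes_per_element=4):
    """Size of one float32 activation tensor in MiB."""
    return batch * channels * height * width * bytes_per_element / 2**20


def conv_output_mib(batch, height, width, num_rrdb=16, rdb_per_rrdb=3,
                    num_features=64, growth=32):
    """Lower bound on stored activations: conv outputs only (assumed layout)."""
    # Assumed dense block: four convs emit `growth` channels, the fifth emits `num_features`.
    channels_per_rdb = 4 * growth + num_features
    total_channels = num_rrdb * rdb_per_rrdb * channels_per_rdb
    return activation_mib(batch, total_channels, height, width)


print(conv_output_mib(16, 128, 128))  # 9216.0 MiB, about 9 GiB
print(conv_output_mib(16, 32, 32))    # 576.0 MiB
```

Under these assumptions, a 16×3×128×128 input already needs about 9 GiB for convolution outputs alone, which would leave little headroom on an 11 GB card. It may also be worth checking whether 128×128 in the paper refers to the HR crop size; if so, a ×4 model would receive 32×32 LR inputs, which cuts the estimate 16-fold.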

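Learning rate scheduler

It looks like you used StepLR as the learning rate scheduler for pretraining, while the configuration file in the source code seems to use CosineAnnealingRestartLR. How did you choose the learning rate scheduler? Also, how many patches did the training and validation sets each contain for the 10000 epochs of training?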
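
For context, the two schedule shapes mentioned in this question can be compared with a small sketch. These are the generic textbook formulas for step decay and for cosine annealing with restarts, written in plain Python; they are not the repository's scheduler classes, and the function names and parameter values are illustrative assumptions.

```python
import math


def step_lr(base_lr, epoch, step_size, gamma=0.5):
    """Step decay: multiply the rate by `gamma` every `step_size` epochs."""
    return base_lr * gamma ** (epoch // step_size)


def cosine_restart_lr(base_lr, epoch, period, eta_min=0.0):
    """Cosine annealing that restarts from `base_lr` every `period` epochs."""
    t_cur = epoch % period  # position inside the current cycle
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * t_cur / period))


for epoch in (0, 5, 10, 15, 20):
    print(epoch, step_lr(1e-4, epoch, step_size=10), cosine_restart_lr(1e-4, epoch, period=10))
```

Step decay lowers the rate monotonically, whereas the cosine schedule decays smoothly within each cycle and then jumps back to the base rate at every restart.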