
Wavelet Attention Embedding Networks for Video Super-Resolution

Young-Ju Choi, Young-Woon Lee, and Byung-Gyu Kim

Intelligent Vision Processing Lab. (IVPL), Sookmyung Women's University, Seoul, Republic of Korea


This repository is the official PyTorch implementation of the paper published at the 2020 25th International Conference on Pattern Recognition (ICPR).

paper


Summary of paper

Abstract

Video super-resolution (VSR) has become more important as display resolutions have grown. Most deep learning-based VSR methods combine convolutional neural networks (CNNs) with a motion compensation or alignment module to estimate a high-resolution (HR) frame from low-resolution (LR) frames. However, most previous methods treat all spatial features equally, and their pixel-based motion compensation and alignment modules may produce misaligned temporal features, which can degrade the accuracy of the estimated HR feature. In this paper, we propose a wavelet attention embedding network (WAEN), including a wavelet embedding network (WENet) and an attention embedding network (AENet), to fully exploit informative spatio-temporal features. The WENet operates as a spatial feature extractor of individual low- and high-frequency information, based on the 2-D Haar discrete wavelet transform. The AENet extracts meaningful temporal features by using a weighted attention map between frames. Experimental results verify that the proposed method achieves superior performance compared with state-of-the-art methods.
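
The abstract says that WENet builds on the 2-D Haar discrete wavelet transform. As a rough illustration of that transform only (not the repository's WENet code), the sketch below computes a single-level orthonormal 2-D Haar DWT on a plain nested list. The function name `haar_dwt2` and the sub-band labels are assumptions made for this example; labelling conventions for the LH and HL bands vary between libraries.

```python
def haar_dwt2(image):
    """Single-level orthonormal 2-D Haar DWT (illustrative sketch).

    `image` is a list of rows with even height and width.
    Returns four sub-bands (ll, lh, hl, hh), each half the input size.
    """
    height, width = len(image), len(image[0])
    if height % 2 or width % 2:
        raise ValueError("height and width must both be even")

    ll, lh, hl, hh = [], [], [], []
    for r in range(0, height, 2):
        row_ll, row_lh, row_hl, row_hh = [], [], [], []
        for c in range(0, width, 2):
            # The four pixels of one 2x2 block.
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            row_ll.append((a + b + d + e) / 2)  # low-frequency approximation
            row_lh.append((a - b + d - e) / 2)  # change along the width
            row_hl.append((a + b - d - e) / 2)  # change along the height
            row_hh.append((a - b - d + e) / 2)  # diagonal change
        ll.append(row_ll)
        lh.append(row_lh)
        hl.append(row_hl)
        hh.append(row_hh)
    return ll, lh, hl, hh


# A 2x2 patch with a vertical edge: only LL and the width-direction band respond.
ll, lh, hl, hh = haar_dwt2([[1, 3], [1, 3]])
print(ll, lh, hl, hh)
```

The low-frequency band carries the smoothed content and the three high-frequency bands carry edges and texture, which is the separation of low- and high-frequency information the abstract refers to.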

Network Architecture

Experimental Results


Getting Started

Dependencies and Installation

  • Anaconda3

  • Python == 3.6

    conda create --name waen python=3.6
  • PyTorch (NVIDIA GPU + CUDA)

    Trained on PyTorch 1.4.0 CUDA 10.0

    conda install pytorch==1.4.0 torchvision cudatoolkit=10.0 -c pytorch

    More recently (2022-03-29), we built the virtual environment as shown below (PyTorch 1.8.1, CUDA 10.2). Note, however, that when we tested the pre-trained model in this environment, the results did not match those of the original paper. For reference, we attached the testing log files in the Model Zoo section.

    conda install pytorch==1.8.1 torchvision==0.9.1 torchaudio==0.8.1 cudatoolkit=10.2 -c pytorch
  • tqdm, pyyaml, tensorboard, opencv-python, lmdb

    conda install -c conda-forge tqdm pyyaml tensorboard
    pip install opencv-python
    pip install lmdb
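
Because results reportedly differ between the two environments above, it may help to confirm which versions are active before testing. The following is a minimal sketch, not part of the repository: the helper names `find_mismatches` and `installed_versions` are hypothetical, and the expected versions are simply the ones listed above for the original training environment.

```python
import sys

# Versions listed above for the original training environment.
EXPECTED = {"python": "3.6", "torch": "1.4.0", "cuda": "10.0"}


def find_mismatches(installed, expected=EXPECTED):
    """Return {name: (installed, expected)} for every entry that differs.

    A prefix match is used so that build suffixes after the version
    number (for example a local build tag) do not count as a mismatch.
    """
    mismatches = {}
    for name, want in expected.items():
        have = installed.get(name)
        if have is None or not str(have).startswith(want):
            mismatches[name] = (have, want)
    return mismatches


def installed_versions():
    """Collect the Python, PyTorch, and CUDA versions of this environment."""
    versions = {"python": "{}.{}".format(*sys.version_info[:2])}
    try:
        import torch  # imported lazily so the check also runs without PyTorch
        versions["torch"] = torch.__version__
        versions["cuda"] = torch.version.cuda  # None for CPU-only builds
    except ImportError:
        pass
    return versions
```

Calling `find_mismatches(installed_versions())` inside the activated `waen` environment returns an empty dictionary when the versions agree with the original setup.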

Dataset Preparation

We used the Vimeo90K dataset for training and the Vid4 dataset for testing.

  • Download

    Please refer to Dataset.md in our Deep-Video-Super-Resolution repository for more details.

    Put the datasets in ./datasets/

  • Prepare for Vimeo90K

    Run in ./codes/data_processing_scripts/

    Generate LR data

    python generate_LR_Vimeo90K.py

    Generate LMDB

    python generate_lmdb_Vimeo90K.py
  • Prepare for Vid4

    Run in ./codes/data_processing_scripts/

    Generate LR data

    python generate_LR_Vid4.py

Model Zoo

Pre-trained models and testing log files are available at the link below.

google-drive


Training

Run in ./codes/

  • WAEN P

    Using single GPU

    python train.py -opt options/train/train_WAEN_P.yml

    Using multiple GPUs (nproc_per_node is the number of GPUs), with CUDA_VISIBLE_DEVICES set in the .yml file

    For example, set 'gpu_ids: [0,1,2,3,4,5,6,7]' in the .yml file for 8 GPUs

    python -m torch.distributed.launch --nproc_per_node=8 --master_port=4321 train.py -opt options/train/train_WAEN_P.yml --launcher pytorch
  • WAEN S

    Using single GPU

    python train.py -opt options/train/train_WAEN_S.yml

    Using multiple GPUs (nproc_per_node is the number of GPUs), with CUDA_VISIBLE_DEVICES set in the .yml file

    For example, set 'gpu_ids: [0,1,2,3,4,5,6,7]' in the .yml file for 8 GPUs

    python -m torch.distributed.launch --nproc_per_node=8 --master_port=4321 train.py -opt options/train/train_WAEN_S.yml --launcher pytorch
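
The multi-GPU commands above need `nproc_per_node` to agree with the number of entries in `gpu_ids`. One way to keep the two in sync is to build the command from the GPU list, as in the sketch below. The helper `build_train_command` is hypothetical and not part of this repository; it only reassembles the flags shown above.

```python
def build_train_command(option_file, gpu_ids, master_port=4321):
    """Build the training command for a given option file and GPU list.

    One GPU yields the plain single-GPU command; several GPUs yield the
    torch.distributed.launch form with nproc_per_node = len(gpu_ids).
    """
    if not gpu_ids:
        raise ValueError("gpu_ids must contain at least one GPU index")
    if len(gpu_ids) == 1:
        return ["python", "train.py", "-opt", option_file]
    return [
        "python", "-m", "torch.distributed.launch",
        "--nproc_per_node={}".format(len(gpu_ids)),
        "--master_port={}".format(master_port),
        "train.py", "-opt", option_file,
        "--launcher", "pytorch",
    ]


command = build_train_command("options/train/train_WAEN_P.yml", list(range(8)))
print(" ".join(command))
```

The returned list can be passed to `subprocess.run(command, cwd="./codes", check=True)`. The `gpu_ids` entry in the .yml file still has to list the same GPUs.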

Testing

Run in ./codes/

python test_Vid4.py

Citation

@inproceedings{choi2021wavelet,
  title={Wavelet attention embedding networks for video super-resolution},
  author={Choi, Young-Ju and Lee, Young-Woon and Kim, Byung-Gyu},
  booktitle={2020 25th International Conference on Pattern Recognition (ICPR)},
  pages={7314--7320},
  year={2021},
  organization={IEEE}
}

Acknowledgement

The code is heavily based on EDVR. We thank the authors for their excellent work.

EDVR: Wang, Xintao, et al. "EDVR: Video restoration with enhanced deformable convolutional networks." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. 2019.
