
adair's Introduction

AdaIR: Adaptive All-in-One Image Restoration via Frequency Mining and Modulation

Yuning Cui, Syed Waqas Zamir, Salman Khan, Alois Knoll, Mubarak Shah, and Fahad Shahbaz Khan

paper


Abstract: In the image acquisition process, various forms of degradation, including noise, blur, haze, and rain, are frequently introduced. These degradations typically arise from the inherent limitations of cameras or unfavorable ambient conditions. To recover clean images from their degraded versions, numerous specialized restoration methods have been developed, each targeting a specific type of degradation. Recently, all-in-one algorithms have garnered significant attention by addressing different types of degradations within a single model without requiring the prior information of the input degradation type. However, these methods purely operate in the spatial domain and do not delve into the distinct frequency variations inherent to different degradation types. To address this gap, we propose an adaptive all-in-one image restoration network based on frequency mining and modulation. Our approach is motivated by the observation that different degradation types impact the image content on different frequency subbands, thereby requiring different treatments for each restoration task. Specifically, we first mine low- and high-frequency information from the input features, guided by the adaptively decoupled spectra of the degraded image. The extracted features are then modulated by a bidirectional operator to facilitate interactions between different frequency components. Finally, the modulated features are merged into the original input for a progressively guided restoration. With this approach, the model achieves adaptive reconstruction by accentuating the informative frequency subbands according to different input degradations. Extensive experiments demonstrate that the proposed method, named AdaIR, achieves state-of-the-art performance on different image restoration tasks, including image denoising, dehazing, deraining, motion deblurring, and low-light image enhancement.
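
To make the frequency-mining idea concrete, below is a minimal PyTorch sketch of splitting a feature map into low- and high-frequency parts with a centered Fourier-domain mask. The fixed cutoff and square mask are illustrative assumptions only; AdaIR learns the spectral decoupling adaptively from the degraded input.

import torch
import torch.fft as fft

def split_frequencies(x, cutoff=0.25):
    # Split x (B, C, H, W) into low- and high-frequency components using a
    # centered square mask in the Fourier domain. The fixed cutoff is an
    # illustrative assumption; AdaIR decouples the spectra adaptively.
    _, _, h, w = x.shape
    spec = fft.fftshift(fft.fft2(x, norm='ortho'))
    mask = torch.zeros(1, 1, h, w, device=x.device)
    ch, cw = int(h * cutoff), int(w * cutoff)
    mask[..., h // 2 - ch:h // 2 + ch, w // 2 - cw:w // 2 + cw] = 1.0
    low = fft.ifft2(fft.ifftshift(spec * mask), norm='ortho').real
    high = x - low
    return low, high

low, high = split_frequencies(torch.randn(1, 3, 64, 64))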


Network Architecture

Installation and Data Preparation

See INSTALL.md for the installation of dependencies and the dataset preparation required to run this codebase.

Training

After preparing the training data in the data/ directory, use

python train.py

to start training the model. Use the --de_type argument to choose the combination of degradation types to train on. By default, it is set to all five degradation tasks (denoising, deraining, dehazing, deblurring, enhancement).

Example Usage: If we only want to train on deraining and dehazing:

python train.py --de_type derain dehaze
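
For reference, a multi-valued option like --de_type is typically declared with argparse's nargs='+'. The sketch below only illustrates that pattern; the default list and the exact task strings here are assumptions, and the strings actually accepted are defined in the repository's own option parser.

import argparse

parser = argparse.ArgumentParser()
# One or more task names; the default list and exact strings are illustrative
# assumptions -- consult train.py for the real option set.
parser.add_argument('--de_type', nargs='+',
                    default=['denoise', 'derain', 'dehaze', 'deblur', 'enhance'])
args = parser.parse_args(['--de_type', 'derain', 'dehaze'])
print(args.de_type)  # ['derain', 'dehaze']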

Testing

After preparing the testing data in the test/ directory, place the model checkpoint file in the ckpt directory. The pre-trained model can be downloaded here. To perform the evaluation, use

python test.py --mode {n}

n selects the task(s) to evaluate: 0 for denoising, 1 for deraining, 2 for dehazing, 3 for deblurring, 4 for enhancement, 5 for the three-degradation all-in-one setting, and 6 for the five-degradation all-in-one setting.

Example Usage: To test on all the degradation types at once, run:

python test.py --mode 6
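
To evaluate a single task instead, pass the corresponding number from the list above, for example dehazing only:

python test.py --mode 2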

Results

Performance results of the AdaIR framework trained under the all-in-one setting.

Three Distinct Degradations
Five Distinct Degradations

The visual results can be downloaded here.

Citation

If you use our work, please consider citing:

@misc{cui2024adair,
      title={AdaIR: Adaptive All-in-One Image Restoration via Frequency Mining and Modulation}, 
      author={Yuning Cui and Syed Waqas Zamir and Salman Khan and Alois Knoll and Mubarak Shah and Fahad Shahbaz Khan},
      year={2024},
      eprint={2403.14614},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Contact

Should you have any questions, please contact [email protected]

Acknowledgment: This code is based on the PromptIR repository.


adair's Issues

May I ask for a demo.py file?

Sorry to bother you during your busy schedule! Do you have a demo.py file, similar to the one provided by PromptIR, that I can use to test other images? Thanks a lot!

FMiM

Where is FMiM implemented in the code?

Untrained parameter problem

AdaIR/net/model.py

Lines 339 to 347 in 69e13fb

x = self.conv1(x)
mask = torch.zeros(x.shape).to(x.device)
h, w = x.shape[-2:]
threshold = F.adaptive_avg_pool2d(x, 1)
threshold = self.rate_conv(threshold).sigmoid()
for i in range(mask.shape[0]):
    h_ = (h//n * threshold[i,0,:,:]).int()
    w_ = (w//n * threshold[i,1,:,:]).int()

I found a problem: since I was training with DDP, I got an error indicating the presence of parameters that were not involved in the training. Through my investigation, self.score_gen and self.conv are unnecessary, and those issues are not serious. Most importantly, though, self.rate_conv does not participate in the gradient calculation, because the operation of generating the mask from the threshold is not differentiable.
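
A minimal sketch of the reported behaviour, assuming standard PyTorch autograd semantics: casting the thresholded value to an integer produces a tensor with no grad_fn, so no gradient can flow back through it to rate_conv.

import torch

threshold = torch.rand(1, requires_grad=True)  # stands in for the rate_conv output
h = 64
h_ = (h * threshold).int()   # the integer cast detaches the value from the autograd graph
print(h_.requires_grad)      # False: no gradient reaches the threshold (or rate_conv)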

Training cost and GPU requirements

Hi @c-yn ,

Thanks for your interesting explorations in IR from the frequency perspective.

May I know how many GPUs (and which type) are needed for training the 3-task setting and the 5-task setting?

In particular, how long does it take to train a full model with your GPU setting?

Thanks in advance; this would help me a lot as a reference for following this nice work.

Best and have a nice day,

About the five-degradation table in paper

In the top super-row of the five-task table in the paper, were the single-task models trained with your training settings? I could not find the accuracies reported in your paper in the original papers.

BSD400 Denoising Dataset.

Hello,

Thank you for sharing this interesting study.

I have one question about the denoising dataset. In the denoise.txt file, the BSD400 file names appear to differ from the actual image names (e.g., 'a117025.jpg' in denoise.txt vs. '117025.jpg' for the actual image). Could you please clarify whether the BSD400 images are used for training in the all-in-one IR setting?

Thank you for your help.

Code for inference

Hello,

Terrific work! Is the code for running inference available? Are the model weights available? Would it be possible to run it without training?

Thank you for your help
