
Comments (15)

JingyunLiang commented on May 8, 2024

It depends on what you want. If you only care about PSNR, training with a pixel loss is enough (the first stage). PSNR is a good quantitative metric for comparing different methods, but models trained with a pixel loss alone often lack good visual quality.

If you want better visual quality, you should fine-tune the model from the first stage with a combination of pixel loss, perceptual loss, and GAN loss, although this will decrease the PSNR.

By the way, I might know why you see a sudden large drop in PSNR. GAN training is very unstable. Generally, you should fine-tune the model from the first stage instead of training it from scratch. An EMA (exponential moving average) strategy can also help stabilize convergence. Note that PSNR is not a good metric when you are training for visual quality.
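
For illustration, a minimal PyTorch sketch of the EMA update and the combined second-stage objective (the decay value and loss weights below are placeholders, not the authors' settings):

    import copy
    import torch

    def ema_update(ema_model, model, decay=0.999):
        # Exponential moving average of the generator weights; call this after
        # each optimizer step and evaluate with ema_model for smoother results.
        with torch.no_grad():
            for p_ema, p in zip(ema_model.parameters(), model.parameters()):
                p_ema.mul_(decay).add_(p, alpha=1.0 - decay)

    # ema_model = copy.deepcopy(model)  # start from the first-stage (pixel-loss) weights
    # total_loss = l1_loss + 0.1 * perceptual_loss + 5e-3 * gan_loss  # weights are assumptions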


JingyunLiang commented on May 8, 2024

Do you mean that the loss sometimes doubles when training SwinIR (middle size, dim=180)? I don't think it is a problem. When some images in a training batch are hard to reconstruct, the loss for that batch will be large. If you look at the PSNR on the validation set, the model converges smoothly.
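
As an aside, a hypothetical logging sketch (not the authors' code) that smooths the noisy per-batch loss so the underlying trend is visible; the window size of 100 is arbitrary:

    from collections import deque

    window = deque(maxlen=100)

    def smoothed(loss_value):
        # Moving average of the most recent per-batch losses; individual hard
        # batches spike the raw loss, but the average reveals convergence.
        window.append(loss_value)
        return sum(window) / len(window)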

In our implementation, all settings are similar to CNN-based SR models and we did not use any special tricks. Detailed settings can be found in the supplementary material. The training code will be released in KAIR in a few days.


shengkelong commented on May 8, 2024

I don't mean that the loss of a single batch suddenly doubles; rather, the average loss over 100 batches doubles, PSNR also drops by close to 1 dB, and both recover after a few epochs.


JingyunLiang commented on May 8, 2024

We trained SwinIR (middle size, dim=180) for 500K iterations with batch_size=32. The learning rate is initialized to 2e-4 and halved at [250K, 400K, 450K, 475K] iterations. We use the Adam optimizer (betas=[0.9, 0.99]) without weight decay. The loss is the mean L1 pixel loss. The training loss and PSNR on the validation set (Set5) are attached below.

We did not notice any sudden large PSNR drop on the validation set.
[Figure: swinir_div2k_x2_loss_psnr — training L1 loss (top) and validation PSNR on Set5 (bottom)]
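
A minimal PyTorch sketch of these settings (the placeholder model stands in for SwinIR, whose construction is elided; stepping the scheduler once per training iteration makes the milestones iteration counts):

    import torch
    from torch.optim import Adam
    from torch.optim.lr_scheduler import MultiStepLR

    # Placeholder network standing in for SwinIR (middle size, dim=180).
    model = torch.nn.Conv2d(3, 3, kernel_size=3, padding=1)

    optimizer = Adam(model.parameters(), lr=2e-4, betas=(0.9, 0.99), weight_decay=0)
    # gamma=0.5 halves the learning rate at each milestone (in iterations).
    scheduler = MultiStepLR(optimizer,
                            milestones=[250_000, 400_000, 450_000, 475_000],
                            gamma=0.5)
    criterion = torch.nn.L1Loss()  # mean L1 pixel loss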


shengkelong commented on May 8, 2024

Thank you, I will check my code and explore whether the slight parameter differences have a big impact.


hcleung3325 commented on May 8, 2024


Hi, thank you for your work.
May I ask whether SwinIR (for image SR) needs to be trained as a GAN with a discriminator?
If I just want image SR, can I train it without the GAN?
Thanks.


yzcv commented on May 8, 2024


Hi, @JingyunLiang

Thanks for the loss plot. May I ask, based on your experience, whether it is normal for a transformer framework that the training loss oscillates severely? I am currently training a transformer and the loss just fluctuates repeatedly with no trend of convergence. Do you think this is a normal phenomenon for most transformers?

Thanks very much.


JingyunLiang commented on May 8, 2024

I don't think so. There is no such problem, as you can see in Fig. 3(f) of the paper. By the way, our training code will be released in 1-2 days. Please use that for training. Thank you.


yzcv commented on May 8, 2024

Yes, I see. Fig. 3(f) is the PSNR plot. As shown in your earlier reply, the PSNR is stable, but the L1 loss oscillates. I am confused about that fluctuation of the L1 loss. Thanks a lot. @JingyunLiang


JingyunLiang commented on May 8, 2024

PSNR and L1 loss on the validation set are highly correlated because PSNR contains an MSE(pred, gt) term. If the validation PSNR is stable, the validation loss should also be stable. The training loss is shown in the top figure of my previous answer; it may fluctuate a bit because each batch contains different images (some of them are hard to super-resolve).
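
Concretely, a minimal PSNR computation showing the MSE term (assuming images scaled to [0, 1]):

    import torch

    def psnr(pred, gt, max_val=1.0):
        # PSNR = 10 * log10(max_val**2 / MSE(pred, gt)); a stable validation
        # PSNR therefore implies a stable validation MSE (and, in practice, L1) loss.
        mse = torch.mean((pred - gt) ** 2)
        return 10 * torch.log10(max_val ** 2 / mse)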



yzcv commented on May 8, 2024

Thanks so much for your explanation! In that case, have you tried increasing the batch size to reduce the training loss fluctuation? Does a larger batch size alleviate this instability?


JingyunLiang commented on May 8, 2024

No, I always use batch_size=32 (so we only need 500K iterations). You can try it later with our training code.


yzcv commented on May 8, 2024

I see. Thank you very much. I think 32 is large enough for an image-to-image task, given the high memory cost of transformers.


JingyunLiang commented on May 8, 2024

@z625715875 @shengkelong @hcleung3325 We have released the SwinIR training code in KAIR. We have also added an interactive online Colab demo for real-world image SR.


JingyunLiang commented on May 8, 2024

Feel free to reopen the issue if you have more questions.

