
Comments (7)

FriedRonaldo commented on August 25, 2024
  1. Because StarGAN-v2 uses the L1 norm for its reconstruction loss. I think other works that use L1 or L2 norms could be cited in the same line.

  2. Well... I think it is not an issue related to the training scheme or objectives if the results at the very last iteration look good. I mean, the L1 loss is not the problem; the issue is probably related to inference.

It might be:
i) The checkpoint is not loaded properly (especially the batch normalization layers).
ii) The images are not preprocessed the same way as in the training phase (e.g. resize dimensions, normalization ...); a quick sanity-check sketch follows below.
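
A minimal sketch of how I would check both points, assuming a standard PyTorch/torchvision pipeline; the image size, normalization constants, checkpoint key names, and the tiny stand-in network are placeholders, not TUNIT's actual code:

```python
import torch
import torch.nn as nn
from torchvision import transforms

# (ii) Inference preprocessing must match training exactly.
# IMG_SIZE / MEAN / STD are placeholders; use your training config's values.
IMG_SIZE = 128
MEAN, STD = (0.5, 0.5, 0.5), (0.5, 0.5, 0.5)

infer_transform = transforms.Compose([
    transforms.Resize((IMG_SIZE, IMG_SIZE)),  # same resize as training
    transforms.ToTensor(),
    transforms.Normalize(MEAN, STD),          # same normalization as training
])

# (i) Load the checkpoint and inspect which keys failed to match.
# "generator" and the checkpoint key are stand-ins for your own objects.
generator = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.BatchNorm2d(64))
ckpt = torch.load("model.ckpt", map_location="cpu")
missing, unexpected = generator.load_state_dict(ckpt["generator"], strict=False)
print("missing keys:", missing)        # BN running stats appearing here would
print("unexpected keys:", unexpected)  # explain broken inference
```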


ammar-deep commented on August 25, 2024
  1. Because StarGAN-v2 uses the L1 norm for its reconstruction loss. I think other works that use L1 or L2 norms could be cited in the same line.
    Are you talking about the style reconstruction loss used in StarGAN-v2? If yes, what behavior would you expect if we used the style reconstruction loss in TUNIT as a replacement for G's style contrastive loss? Will it have the same problem I am facing? As you mentioned in the paper: "By doing so, we avoid the degenerated solution where the encoder maps all the images to the same style code of the reconstruction loss [5] based on L1 or L2 norm."

  2. Actually, my inference pipeline looks fine:
    a) the style encoder gives around 99% test accuracy on the classification task when loaded from the checkpoint;
    b) the images are processed exactly as in training. :)


ammar-deep commented on August 25, 2024

@FriedRonaldo BTW, the quality of the generated samples at inference is great; they only lack the style 😄


FriedRonaldo commented on August 25, 2024
  1. Yes, the style reconstruction loss. In many cases the results will be similar, because the optimization does not always fall into the degenerate solution; the contrastive loss, however, avoids the degenerate solution by construction (see the sketch after this list).

  2. I might not have understood the issue exactly before. You mean the outputs reflect the style image very well (if you use a black cat as the style image, the output is a black cat), but the style codes from different images collapse to the same point. Is that right? That scenario sounds very strange.
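
To illustrate point 1, here is a rough sketch of the two objectives, not TUNIT's actual implementation; `s_fake` and `s_ref` are assumed style-code batches:

```python
import torch
import torch.nn.functional as F

# s_fake: style codes re-extracted from generated images, shape [B, D]
# s_ref:  style codes of the reference images, shape [B, D]

def style_recon_l1(s_fake, s_ref):
    # L1 reconstruction: this is minimized even if the encoder maps every
    # image to the same constant code (the degenerate solution).
    return F.l1_loss(s_fake, s_ref)

def style_contrastive(s_fake, s_ref, tau=0.07):
    # InfoNCE-style loss: each generated image must match its own reference
    # code and mismatch every other code in the batch, so a collapsed
    # encoder pays a large penalty on the negatives.
    s_fake = F.normalize(s_fake, dim=1)
    s_ref = F.normalize(s_ref, dim=1)
    logits = s_fake @ s_ref.t() / tau      # [B, B] similarity matrix
    labels = torch.arange(s_fake.size(0))  # positives on the diagonal
    return F.cross_entropy(logits, labels)
```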


Regarding "BTW, the quality of the generated samples at inference is great; they only lack the style":

The meaning of "if the results at the very last iteration look good" is not just about the quality but about how well the samples reflect the style images.

If you use a black cat as the style image, is the output a black cat?

If yes during training but not during inference, more information is needed to diagnose the issue. Blind diagnosis is somewhat difficult...


ammar-deep commented on August 25, 2024

Sorry for causing some confusion.

If you use a black cat as the style image, is the output a black cat?
Yes (only in training, not in inference). As you mentioned, during the later iterations of the training phase the output reflects the black cat. However, during inference the style codes from different images map to the same point.

I did an experiment where I deliberately tested on seen data, i.e. gave the model a content image and a style image that it had generated perfectly during training. Surprisingly, it failed during inference as well: the style code does not match the style image, although the content is fine.
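
One hypothetical check that might narrow this down: run the trained style encoder over a batch of visually distinct style images and measure whether the codes really collapse. `style_encoder` and `style_images` below are stand-ins for your own objects:

```python
import torch
import torch.nn.functional as F

# style_images: a batch of visually distinct style images, shape [N, 3, H, W]
with torch.no_grad():
    codes = F.normalize(style_encoder(style_images), dim=1)  # [N, D]
    sim = codes @ codes.t()                # pairwise cosine similarity
    off_diag = sim[~torch.eye(sim.size(0), dtype=torch.bool)]
    print("mean off-diagonal cosine:", off_diag.mean().item())

# A mean near 1.0 means the codes really have collapsed to one point.
# If the codes differ here but the outputs still ignore style, the bug is
# more likely in the generator path (e.g. how the style code is injected).
```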

If yes during training but not during inference, more information is needed to diagnose the issue. Blind diagnosis is somewhat difficult
Could you please let me know what information I should provide?


FriedRonaldo commented on August 25, 2024

In this case the problem is confined to inference, so using the style contrastive loss would not solve it.

In my experience, there can be an issue with the batch normalization layers: if I call "eval()", the model does not work, but without it the model works well, so it might be related to the normalization layers' running statistics. The EMA might also be raising the problem; I recommend using the non-EMA version of the generator. A sketch of both workarounds is below.
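
A hedged sketch of the two workarounds, assuming a PyTorch generator; the checkpoint key names ("G", "G_EMA") and the tiny stand-in network are placeholders, not TUNIT's actual code:

```python
import torch
import torch.nn as nn

# Stand-in generator; replace with your own network definition.
generator = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.BatchNorm2d(64))

ckpt = torch.load("model.ckpt", map_location="cpu")

# 1) Try the non-EMA weights instead of the EMA copy.
generator.load_state_dict(ckpt["G"])  # rather than ckpt["G_EMA"]

# 2) If eval() breaks the outputs, the BN running statistics are suspect.
generator.eval()  # eval() makes BN use running_mean / running_var
for m in generator.modules():
    if isinstance(m, nn.BatchNorm2d):
        m.train()  # keep BN on per-batch statistics, as during training
```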

Because it is not related to our source code, I am sorry that I cannot help you more.


ammar-deep commented on August 25, 2024

@FriedRonaldo Thank you for helping me this far and giving me some ideas. I will try your suggestions; hopefully I can resolve it.

