Hello - I've been getting this issue consistently while running the code as is, with t

Getting NAN tensor from encoder about probabilistic-unet-pytorch HOT 8 CLOSED

stefanknegt commented on June 5, 2024

Getting NAN tensor from encoder

from probabilistic-unet-pytorch.

Comments (8)

JasperLinmans commented on June 5, 2024

Hi Zabboud,

I'm working on my own implementation, so I can't comment on this exact codebase. But what I found is that changing the initialisation (in this case from nn.init.kaiming_normal_(m.weight, mode='fan_in', nonlinearity='relu') to nn.init.normal_(m.weight, std=0.001)) solves the problem for me.

Curious if you found some more information on this in the meantime!
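For reference, a minimal sketch of the initialisation change described above. The helper name `init_weights_small_normal` and the toy two-layer encoder are hypothetical and only stand in for the real model; the choice to touch only convolution weights (and leave biases at their defaults) is also an assumption.

```python
import torch
import torch.nn as nn


def init_weights_small_normal(m: nn.Module) -> None:
    """Hypothetical helper: draw conv weights from N(0, 0.001^2)."""
    if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d)):
        # Replaces nn.init.kaiming_normal_(m.weight, mode='fan_in', nonlinearity='relu')
        nn.init.normal_(m.weight, std=0.001)


torch.manual_seed(0)

# Toy stand-in for an encoder; not the architecture from the repository.
encoder = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(32, 32, kernel_size=3, padding=1),
)

# Module.apply visits every submodule recursively.
encoder.apply(init_weights_small_normal)

max_abs_weight = max(
    p.abs().max().item()
    for name, p in encoder.named_parameters()
    if name.endswith("weight")
)
print(f"largest |weight| after init: {max_abs_weight:.5f}")
```

With such a small standard deviation the initial activations stay close to zero, which may be why the early forward passes no longer overflow.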

zabboud commented on June 5, 2024

Actually, decreasing the learning rate fixed the problem for me. I'm still unsure why it happens; do you have an idea? I'd be interested to test out the different initialization!

stefanknegt commented on June 5, 2024

The model is quite sensitive: too high a learning rate and some initialization methods can cause the loss to go to NaN.
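A minimal sketch of how one might guard a training step against a non-finite loss while experimenting with a lower learning rate. The `safe_step` helper, the toy `nn.Linear` model, and the value `lr=1e-5` are all hypothetical. Skipping the update on a non-finite loss is an addition for illustration, not something this repository is known to do.

```python
import torch
import torch.nn as nn


def safe_step(loss: torch.Tensor, optimizer: torch.optim.Optimizer) -> bool:
    """Hypothetical helper: step only if the loss is finite; return whether it stepped."""
    optimizer.zero_grad()
    if not torch.isfinite(loss).all():
        # NaN or inf: skip the update so the weights are not corrupted.
        return False
    loss.backward()
    optimizer.step()
    return True


torch.manual_seed(0)
model = nn.Linear(4, 1)  # toy stand-in for the real network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)  # assumed, deliberately small

x = torch.randn(8, 4)

good_loss = model(x).pow(2).mean()
stepped_good = safe_step(good_loss, optimizer)

bad_loss = model(x).sum() * float("nan")  # simulate a diverged loss
stepped_bad = safe_step(bad_loss, optimizer)

print(stepped_good, stepped_bad)
```

Logging how often the guard triggers gives a rough signal of whether the chosen learning rate or initialisation is still too aggressive.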

zabboud commented on June 5, 2024

Thank you, I figured out that part. I was wondering if you have some insight into why, with other datasets, the loss does not seem to decrease even though visual inspection shows the predictions improving. Any insight on this issue?

stefanknegt commented on June 5, 2024

Hmm, I think you should look at the two components of the loss function and how they evolve over time. Maybe this can give you some insight into why the loss is not decreasing while the predictions seem to improve.
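A minimal sketch of tracking the two components separately. It assumes the loss has the form reconstruction + beta * KL, with a binary cross-entropy reconstruction term and diagonal Gaussian prior and posterior. The function name `loss_components`, the tensor shapes, and `beta=10.0` are hypothetical.

```python
import torch
import torch.nn.functional as F
from torch.distributions import Independent, Normal, kl_divergence


def loss_components(logits, target, posterior, prior, beta=1.0):
    """Hypothetical helper: return the total loss and its two parts separately."""
    # Reconstruction term: summed over pixels, averaged over the batch.
    recon = F.binary_cross_entropy_with_logits(
        logits, target, reduction="sum"
    ) / logits.shape[0]
    # KL(posterior || prior), one value per sample, averaged over the batch.
    kl = kl_divergence(posterior, prior).mean()
    return {"total": recon + beta * kl, "reconstruction": recon, "kl": kl}


torch.manual_seed(0)
batch, latent_dim = 2, 6

# Random stand-ins for network outputs and ground-truth masks.
logits = torch.randn(batch, 1, 8, 8)
target = (torch.rand(batch, 1, 8, 8) > 0.5).float()
prior = Independent(
    Normal(torch.zeros(batch, latent_dim), torch.ones(batch, latent_dim)), 1
)
posterior = Independent(
    Normal(0.5 * torch.ones(batch, latent_dim), torch.ones(batch, latent_dim)), 1
)

parts = loss_components(logits, target, posterior, prior, beta=10.0)

# Append one record per iteration, then plot each key over time.
history = [{name: value.item() for name, value in parts.items()}]
print(history[-1])
```

If the reconstruction term is orders of magnitude larger than the KL term (or the reverse), the total can look flat even while one component is changing.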

zabboud commented on June 5, 2024

Both the total ELBO loss and the KL loss are just stagnant; there's little to no change. Do you have any suggestions on which parameters to tune (latent dimension, gamma, beta, num_convs_fcomb)? I've been playing around with the preprocessing of the data (liver dataset), but with no luck in getting the model to learn to predict lesion location.

I've tested the model on the lung dataset and it works: I have some diversity in the predictions, and there's a progression in the loss. Unfortunately, there is no progress on the liver dataset, whether in predicting liver or lesion.

stefanknegt commented on June 5, 2024

I am not sure why that happens, and I guess that changing things like the latent dimension and num_convs_fcomb is not going to help. I've only tested it on LIDC, and although I sometimes had issues with the loss, it never remained stagnant. Good luck!

zabboud commented on June 5, 2024

Thank you. I realized that the KL divergence term often goes to 0. What would be the cause of that? It is probably an indicator of why the model is not training properly.
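One fact that may help interpret this observation: the KL divergence between two distributions is zero exactly when they coincide. A KL term at 0 would therefore mean the posterior and prior outputs are indistinguishable. The sketch below checks this with the closed-form KL for diagonal Gaussians, assuming that is the family used for the latent space. The function name `kl_diag_gaussians` and the example values are hypothetical.

```python
import math


def kl_diag_gaussians(mu_q, sigma_q, mu_p, sigma_p):
    """Closed-form KL(q || p) for diagonal Gaussians, summed over dimensions."""
    return sum(
        math.log(sp / sq) + (sq ** 2 + (mq - mp) ** 2) / (2 * sp ** 2) - 0.5
        for mq, sq, mp, sp in zip(mu_q, sigma_q, mu_p, sigma_p)
    )


# Posterior identical to the prior: every term cancels.
kl_same = kl_diag_gaussians([0.3, -1.0], [0.5, 2.0], [0.3, -1.0], [0.5, 2.0])

# Posterior mean shifted by 0.5 in one dimension, unit variances.
kl_diff = kl_diag_gaussians([0.5, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0])

print(kl_same, kl_diff)
```

Comparing the means and standard deviations produced by the two networks on the same batch would show directly whether they have become identical.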
