
Comments (5)

czq142857 commented on July 24, 2024

Hi,

I am not familiar with your experimental setup and have never experienced such collapses, so my comments might not be very helpful.

I suspect the collapses have something to do with the gradients. There are many ReLU and clipping operations in the shallow MLP, so once the network outputs zeros it receives no gradients and stays stuck there forever. My guess is that, occasionally, a specific batch brings abnormally strong gradients and makes the network stuck instantly, and this is essentially random. I am not sure how this could be fixed.
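The dead-gradient behaviour described above can be reproduced in a minimal sketch (an illustrative toy example, not the actual BSP-Net layers):

```python
import torch

# Toy demonstration of the dead-gradient problem: once a value falls
# below the clamp threshold, it receives zero gradient and can never
# recover. (Illustrative only, not the actual BSP-Net code.)
x = torch.tensor([-1.0, 2.0], requires_grad=True)
y = torch.clamp(x, min=0.0)  # mimics the ReLU/clipping in the shallow MLP
y.sum().backward()

print(x.grad)  # the clamped (negative) entry gets gradient 0
```

If every output of the network is clamped to zero at once, every gradient is zero and training halts permanently, which matches the observed collapse.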

from bsp-net-pytorch.

yeshwanth95 commented on July 24, 2024

Hi @czq142857. Thanks for the reply!

Actually, you are right about the network being stuck after it outputs zeros; gradient propagation stops entirely at that point. However, this happens at roughly the same epoch in each run, so it could be caused by a specific 'difficult' batch, given that the batch ordering is not randomized.
I also think the many clamping operations could cause this behaviour. I'll have to take a closer look, though.
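One way to confirm where propagation stops is to log per-parameter gradient norms after each `backward()`. A minimal sketch (the tiny model here is a stand-in, not the actual network):

```python
import torch
import torch.nn as nn

# Hypothetical diagnostic: dump per-parameter gradient norms after
# backward() to spot when (and in which layer) gradients die.
# The tiny model below is a stand-in for the shallow MLP.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
out = model(torch.randn(16, 4)).clamp(min=0.0, max=1.0)
out.mean().backward()

grad_norms = {name: p.grad.norm().item()
              for name, p in model.named_parameters() if p.grad is not None}
for name, norm in grad_norms.items():
    print(f"{name}: {norm:.6f}")  # an all-zero dump signals the collapse
```

Logging this once per epoch (or per batch around the epoch where the collapse occurs) should pinpoint the offending batch.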


yeshwanth95 commented on July 24, 2024

Hi @czq142857. A few quick questions.

  1. Is there a reason you used a learning rate of 0.00002 (quite a low value) in the 2D experiments with the toy dataset? The collapse described above only seems to occur when I increase the learning rate beyond this value on the toy dataset.

  2. Also, with more complex binary masks the learning rate needs to be lowered even further. Lowering it lets me avoid the outputs collapsing to zero, but at the cost of extremely slow convergence. Are there any limitations of BSP-Net I'm overlooking that need to be addressed for it to generalise to more complex shapes?

I've attached some examples of the binary masks I'm trying to train BSPNet on. Would love to hear your thoughts!
[Attached: five example binary masks, 0_gt through 4_gt]


czq142857 commented on July 24, 2024
  1. I cannot remember why I set the learning rate to 0.00002. I tried 0.0001 just now and it also worked, but the loss fluctuated between 0.001 and 0.005 at the end, so I suppose I used the small learning rate to avoid that fluctuation. In any case, it should not cause the collapse. Please try the original implementation and see if something differs in your PyTorch implementation.
  2. BSP-Net requires the shapes in the dataset to have some sort of correspondences (see Figure 4 in the paper). Each part (convex) is shared by different shapes, and ideally it should represent the same or corresponding parts in those different shapes. I do not see such correspondences in your dataset, therefore I do not think BSP-Net would work, unless you try to overfit on a handful of samples.
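Regarding the abnormally strong gradients suspected earlier in the thread, one common mitigation worth trying is clipping the global gradient norm before each optimizer step. A minimal sketch, assuming a standard PyTorch training loop (the model, data, and `max_norm=1.0` below are placeholders, not values from BSP-Net):

```python
import torch
import torch.nn as nn

# Hypothetical mitigation: clip the global gradient norm so a single
# "difficult" batch cannot push the network into the all-zero regime.
# (Sketch only -- model, data, and max_norm are placeholders.)
model = nn.Linear(4, 1)
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

x, target = torch.randn(16, 4), torch.randn(16, 1)
loss = nn.functional.mse_loss(model(x), target)

opt.zero_grad()
loss.backward()
# Returns the total norm before clipping; grads are rescaled in place.
clipped = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
opt.step()
```

With clipping in place it may be possible to raise the learning rate above 0.00002 without triggering the collapse, though this is untested on BSP-Net itself.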


yeshwanth95 commented on July 24, 2024

Hi @czq142857! Thanks for the clarifications. I'll check again to see if there are any differences between my implementation and the original.

That said, I think you are right about the dataset. I'm running some experiments with a simpler version of it (smaller patches, basically), and BSP-Net seems to perform better than in the earlier runs. However, I still feel my dataset has correspondences between parts across different objects, since it consists of ground-truth masks of mostly rectangular buildings in airborne images. I'll run a few more experiments on the same dataset with the original TF1 implementation to make sure this is not an implementation issue.

