Comments (5)
Hi,
I am not familiar with your experimental setup and have never experienced such collapses myself, so my comments may not be very helpful.
I suspect the collapses have something to do with the gradients. The shallow MLP contains many ReLU and clipping operations, so once the network outputs zeros it receives no gradients and stays stuck there forever. My guess is that a specific batch can occasionally produce abnormally strong gradients and push the network into this stuck state instantly, which would make the collapses essentially random. I am not sure how this could be fixed.
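A tiny sketch of how this can happen (a hypothetical minimal repro, not code from the repo): a linear layer followed by ReLU and a hard clamp, with weights pushed to where the output is exactly zero.

```python
import torch

# Hypothetical minimal repro (not the BSP-Net code): a linear layer whose
# output passes through ReLU and a hard clamp, like the shallow MLP.
layer = torch.nn.Linear(4, 1)
x = torch.randn(8, 4)

# Force the pre-activation to be negative so ReLU outputs exactly zero.
with torch.no_grad():
    layer.weight.fill_(-5.0)
    layer.bias.fill_(-5.0)

out = torch.clamp(torch.relu(layer(x.abs())), 0.0, 1.0)
loss = (out - 1.0).pow(2).mean()
loss.backward()

# Every parameter gradient is zero, so no optimizer step can move the
# network out of this state.
print(layer.weight.grad.abs().sum())  # tensor(0.)
```

Once the output hits zero, the ReLU (and the lower clamp bound) zero out the backward pass, so the state is absorbing.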
from bsp-net-pytorch.
Hi @czq142857. Thanks for the reply!
Actually, you are right about the network being stuck after it outputs zeros; gradient propagation stops entirely at that point. However, this happens at roughly the same epoch in every run, so it could be caused by a specific 'difficult' batch, given that the batch ordering is not randomized.
I also think the many clamping operations could cause this behaviour. I'll have to take a closer look, though.
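One small experiment along these lines (just a sketch, not a proposed fix for the repo): comparing the gradient through a hard clamp with a leaky variant shows why a small slope would let the network recover after collapsing to zero.

```python
import torch
import torch.nn.functional as F

x = torch.full((1,), -2.0, requires_grad=True)

# Hard clamp: the gradient dies completely outside [0, 1].
torch.clamp(x, 0.0, 1.0).sum().backward()
print(x.grad)  # tensor([0.])

# Leaky alternative: a small negative slope keeps a nonzero gradient,
# so an optimizer step can still move x back into the active range.
x.grad = None
F.leaky_relu(x, negative_slope=0.01).clamp(max=1.0).sum().backward()
print(x.grad)  # tensor([0.0100])
```

Whether this helps in practice would depend on how much BSP-Net's losses rely on the hard clamp semantics.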
Hi @czq142857. A few quick questions.
- I've been wondering whether there is any reason you used a learning rate of 0.00002 (quite a low value) in the 2D experiments with the toy dataset. The above-mentioned behaviour only seems to occur when I increase the learning rate above this value on the toy dataset.
- Also, when I use more complex binary masks, the learning rate needs to be lowered even further. Lowering it lets me bypass the collapse of the outputs to zero, but at the cost of the model converging extremely slowly. Are there any limitations to BSP-Net I'm overlooking that need to be addressed so that it generalises well to more complex shapes?
I've attached some examples of the binary masks I'm trying to train BSPNet on. Would love to hear your thoughts!
- I cannot remember why I set the learning rate to 0.00002. I tried 0.0001 just now and it also worked, but the loss fluctuated between 0.001 and 0.005 at the end, so I guess I used the smaller learning rate to avoid that fluctuation. Nevertheless, it should not cause the collapse. Please try the original implementation and see whether something is different in your PyTorch implementation.
- BSP-Net requires the shapes in the dataset to have some sort of correspondences (see Figure 4 in the paper). Each part (convex) is shared by different shapes, and ideally it should represent the same or corresponding parts across those shapes. I do not see such correspondences in your dataset, so I do not think BSP-Net would work, unless you try to overfit on a handful of samples.
Hi @czq142857! Thanks for the clarifications. I'll check again to see if there are any differences between my implementation and the original.
That said, I think you are right about the dataset. I'm running some experiments with a simpler version of it (smaller patches, basically), and BSP-Net seems to perform better than in the earlier runs. However, I still feel my dataset does have correspondences between parts across different objects, since it's just ground-truth masks of mostly rectangular buildings in airborne images. I'll run a few more experiments with the same dataset on the original TF1 implementation to make sure this is not an implementation issue.