Comments (2)
Hi,
yes, num_non_zero_triplets is the number of triplets with non-zero loss value, which we call active triplets in the paper.
I cannot answer the question about the differences in loss value. But the loss value can be misleading, because the loss is calculated (in the BatchHardTripletLossWithMasksHelper class) by averaging the loss over active triplets only (triplets with non-zero loss value). So when the number of active triplets goes down during training, the value of the training loss may increase, because only the harder active triplets remain.
Because of this, we argued in the paper that to monitor training progress and overfitting behavior, it's better to observe the number of active triplets, not the loss value. In the later phases of training, the number of active triplets can still decrease while the loss stagnates (because in our implementation the loss is the mean of the loss over active triplets only).
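To make the effect concrete, here is a toy sketch in plain Python. It is not the code from BatchHardTripletLossWithMasksHelper; the function name `triplet_stats` and the per-triplet loss values are made up for illustration. It only shows how a mean taken over active triplets can go up while the number of active triplets goes down.

```python
def triplet_stats(losses):
    """Return (mean loss over active triplets, number of active triplets).

    A triplet is "active" when its loss is greater than zero.
    Hypothetical helper for illustration only.
    """
    active = [value for value in losses if value > 0.0]
    if not active:
        return 0.0, 0
    return sum(active) / len(active), len(active)


# Made-up per-triplet losses early in training: many easy active triplets.
early = [0.1, 0.1, 0.1, 0.1, 0.5, 0.0]
# Made-up losses later on: the easy triplets are satisfied, one hard one is left.
late = [0.0, 0.0, 0.0, 0.0, 0.4, 0.0]

early_loss, early_active = triplet_stats(early)
late_loss, late_active = triplet_stats(late)

print(f"early: loss={early_loss:.3f}, active triplets={early_active}")
print(f"late:  loss={late_loss:.3f}, active triplets={late_active}")
```

In this toy case the reported loss goes up even though four of the five active triplets have been resolved, which is why the active triplet count is the more informative signal.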
When auxiliary uni-modal losses are disabled (alpha=0 and beta=0 - see Table IV in the paper), there are:
- significantly fewer active triplets for images than for point clouds on the training set
- but more active triplets for images than for point clouds on the validation set
We concluded that the reason is that the image-processing sub-network was more prone to overfitting than the point cloud-processing sub-network. When constructing the final descriptor, the entire network learns to focus only on input images and mostly ignore the point clouds - because such a strategy gives good results during the training. But due to overfitting, the results on the test set are much worse.
To counteract this problem, we can increase the weight of the point cloud modality (parameter alpha). Now, during the training, the network will focus on both modalities.
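As a rough sketch of how the weights come into play, the snippet below assumes the total loss is the loss on the final descriptor plus the weighted uni-modal losses, with alpha weighting the point cloud loss and beta weighting the image loss. The additive form, the role of beta, and all names here (`total_loss`, `fused_loss`, `cloud_loss`, `image_loss`) are assumptions for illustration; please check the paper and the training code for the exact definition.

```python
def total_loss(fused_loss, cloud_loss, image_loss, alpha=0.0, beta=0.0):
    """Combine the final-descriptor loss with weighted uni-modal losses.

    Assumed form: fused + alpha * point cloud + beta * image.
    Hypothetical helper, not taken from the repository.
    """
    return fused_loss + alpha * cloud_loss + beta * image_loss


# Made-up loss values for one batch.
fused, cloud, image = 1.0, 2.0, 3.0

# alpha = beta = 0: only the final descriptor is supervised directly.
print(total_loss(fused, cloud, image))

# alpha > 0: the point cloud branch now receives its own training signal.
print(total_loss(fused, cloud, image, alpha=0.5))
```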
I'm not sure what's the reason.
Even if alpha or beta (or both) are zero, both uni-modal sub-networks will learn something, and the uni-modal loss (and the number of active triplets) will usually go down. This is because we minimize the final loss function, and to minimize the final loss, each uni-modal network should generate discriminative descriptors. This indirectly decreases the uni-modal loss.
from minklocmultimodal.
Sorry for the late reply, and thanks for your answer. It gives me some new perspectives on the question.