Comments (9)
@Dref360 Did you change anything around step 7?
The main losses to pay attention to are the individual losses like rpn_class_loss, mrcnn_bbox_loss, etc. You'd want to see nice graphs on those like the ones posted by @Dref360 above.
The total loss is the sum of the individual losses plus the L2 weight regularization loss. The L2 regularization loss is summed across all trainable weights, so it can change drastically if you change which layers are included in training. If you train the heads only and then switch to training all the layers, you'd see a big jump in the total loss because you're including more layers, and therefore the sum of the L2 norms of the weights is larger. This is okay.
It might be a good idea to divide the L2 regularization by the number of weights to get a mean rather than a sum, which should remove that unexpected behavior. I'll look into doing that this weekend.
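To make the jump concrete, here is a small NumPy sketch (the layer shapes and the weight-decay value are made up for illustration) comparing the sum-style regularizer with a version that divides each tensor's sum of squares by its size:

```python
import numpy as np

WEIGHT_DECAY = 0.0001  # illustrative value, not the repo's config

# Hypothetical trainable weights: "heads only" vs. "all layers".
heads_only = [np.full((256, 81), 0.01), np.full((1024,), 0.01)]
all_layers = heads_only + [np.full((3, 3, 256, 256), 0.01)]  # adds a big conv kernel

def l2_sum(weights):
    # Sum of squares across all weights: grows as more layers become trainable.
    return WEIGHT_DECAY * sum(np.sum(w ** 2) for w in weights)

def l2_mean(weights):
    # Divide each tensor's sum of squares by its size: stays on a stable scale.
    return WEIGHT_DECAY * sum(np.sum(w ** 2) / w.size for w in weights)

print("sum :", l2_sum(heads_only), "->", l2_sum(all_layers))    # large jump
print("mean:", l2_mean(heads_only), "->", l2_mean(all_layers))  # comparable scale
```

Switching from `heads_only` to `all_layers` inflates the sum version by the squared magnitude of every newly trainable weight, while the per-tensor mean only adds one small term per new tensor.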
from mask_rcnn.
Doesn't seem normal. Does it go down afterwards? You can try a smaller learning rate and see if that improves the training.
The rpn_loss and mrcnn_loss are normal, while the total loss (the regularization loss) jumps to a high value (e.g. epoch 40: loss = 1.9, epoch 41: loss = 13.1; the other losses are normal). I tried smaller learning rates (lr = 0.001, 0.0001), but the situation is the same.
Yeah, I have a similar problem. All the losses are small except this one.
# Add L2 Regularization
reg_losses = [
    keras.regularizers.l2(self.config.WEIGHT_DECAY)(w)
    for w in self.keras_model.trainable_weights
    if 'gamma' not in w.name and 'beta' not in w.name
]
Gamma and beta parameters shouldn't be included in the regularization loss (batch norm isn't updated by backprop).
@leicaand Good catch. I pushed the fix. Thanks.
I also pushed an update to divide the weight regularization by the number of weights so the loss is the mean of the L2 rather than the sum. This removes the confusing jump in the total loss in the graphs.
I am confused about this issue.
First, in a batchnorm layer, setting trainable=False stops the updates to the running mean and variance, but not to beta and gamma; those remain trainable, because beta and gamma are updated via gradients rather than by that update op. As further evidence, beta and gamma in a trained model are not 1 and 0, indicating they have been updated during training.
Second, does it make sense to divide the L2 loss by its size? Since its gradient is also divided by this factor, the bigger a weight matrix is, the less each of its weights is pushed per step by the regularization loss. I don't think it is a good idea.
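That scaling effect is easy to check with analytic gradients; a minimal NumPy sketch with made-up shapes and weight-decay value:

```python
import numpy as np

wd = 0.0001                       # illustrative weight decay
small = np.full((10, 10), 0.5)    # 100 weights
large = np.full((100, 100), 0.5)  # 10,000 weights

# d/dw [ wd * sum(w^2) ] = 2 * wd * w  -> same per-weight pull regardless of size
grad_sum_small = 2 * wd * small
grad_sum_large = 2 * wd * large

# d/dw [ wd * sum(w^2) / n ] = 2 * wd * w / n  -> 100x weaker for the large tensor
grad_mean_small = 2 * wd * small / small.size
grad_mean_large = 2 * wd * large / large.size
```

So with the per-size division, the regularizer decays the weights of a 100x100 matrix 100 times more weakly than those of a 10x10 matrix, which is the concern raised here.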
Batchnorm has 4 different weights: the running mean and variance are updated by a moving-average operation, while beta and gamma are updated via gradients. If you want to skip the weights that aren't updated during backprop, you should exclude 'moving_mean' and 'moving_variance', not 'beta' and 'gamma'.
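A sketch of that filtering, using hypothetical weight names in the pattern Keras typically produces (the exact names depend on the model):

```python
# Hypothetical weight names for one conv layer and its batchnorm layer.
names = [
    "conv1/kernel:0",
    "bn_conv1/gamma:0",
    "bn_conv1/beta:0",
    "bn_conv1/moving_mean:0",
    "bn_conv1/moving_variance:0",
]

# Exclude only the running statistics, which are not updated by backprop;
# gamma and beta are gradient-trained, so they stay eligible for regularization.
regularized = [
    n for n in names
    if "moving_mean" not in n and "moving_variance" not in n
]
```

With this filter, `conv1/kernel:0`, `bn_conv1/gamma:0`, and `bn_conv1/beta:0` remain in the regularized set, matching the comment's suggestion. (Whether gamma and beta *should* be weight-decayed is a separate modeling choice; the point here is only that the "not updated by backprop" rationale applies to the moving statistics.)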