Comments (11)
@dereyly I ran into the same problem. Looking at your training log, the loss becomes very large around 500 iterations, and at the end loss_rpn_cls is still about 0.3, which suggests the model is not training correctly. My case is similar. Do you know the reason?
from gcnet.
Sorry for the late reply.
As mentioned in #1 (comment), SyncBN has some stability issues. We suggest training the SyncBN model with a batch size of at least 16.
Also, in my opinion this is not a trivial thing to fix. We suggest first training a model without GC to see whether the mAP also drops.
Hi, thanks for your work. I'm dealing with this issue too. Can GN be used to replace SyncBN?
Hi, we haven't tried it. We may try it in the future. You are welcome to try it yourself.
Because I only have access to one GPU (P40), I can't use SyncBN. I tried two setups: GN with a batch size of 1 and 1024×1024 input images, and the default normalization setting with a batch size of 6 and 512×512 input images (smaller input due to memory limits). The latter worked fine, but the former gave no bbox predictions at all. By printing intermediate values, I found that with GN, after some training the feature map coming out of the backbone collapses to a single repeated value (like [-3.1547, -3.1547, -3.1547 .....]).
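For anyone who wants to check for the same symptom, here is a minimal sketch of a collapse test on a flattened feature map. The function name `is_collapsed` and the tolerance are hypothetical and chosen only for illustration; this is not part of GCNet or mmdetection.

```python
from typing import Iterable


def is_collapsed(values: Iterable[float], tol: float = 1e-6) -> bool:
    """Return True if every value lies within `tol` of every other value.

    `values` is a flattened feature map (hypothetical helper; the name and
    the default tolerance are assumptions, not taken from the thread).
    """
    vals = list(values)
    if not vals:
        raise ValueError("feature map is empty")
    # A collapsed map has (almost) no spread between its min and max.
    return max(vals) - min(vals) <= tol


# Values like the ones reported above: every position is identical.
print(is_collapsed([-3.1547, -3.1547, -3.1547]))
# A healthy feature map has some spread.
print(is_collapsed([0.12, -0.30, 2.05]))
```

With a PyTorch tensor, the input could be obtained with something like `feat.flatten().tolist()`. Running the check every few hundred iterations should show roughly when the collapse begins.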
By the way the model is cascade_mask_rcnn_r16_gcb_dconv_c3-c5_x101_32x4d_fpn_syncbn_1x
I want to train a SyncBN model on a single P40. Any suggestions are welcome. Thank you.
There is no need for SyncBN when only one GPU is available. You could use BN instead. The learning rate should be scaled linearly according to your effective batch size.
I don't think a non-SyncBN model would have the zero mAP issue. The fixBN results can be found in the updated readme.
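The linear scaling strategy mentioned above can be sketched as follows. The reference values (learning rate 0.02 for a total batch size of 16) are an assumption for illustration and are not stated in this thread; use the values from your own base config. The function name `scale_learning_rate` is hypothetical.

```python
def scale_learning_rate(base_lr: float, base_batch_size: int,
                        num_gpus: int, imgs_per_gpu: int) -> float:
    """Scale a reference learning rate linearly with the effective batch size."""
    if base_batch_size <= 0 or num_gpus <= 0 or imgs_per_gpu <= 0:
        raise ValueError("batch sizes and GPU count must be positive")
    # Effective batch size = images processed per optimizer step.
    effective_batch_size = num_gpus * imgs_per_gpu
    return base_lr * effective_batch_size / base_batch_size


# ASSUMED reference setting: lr 0.02 at total batch size 16.
BASE_LR = 0.02
BASE_BATCH_SIZE = 16

# Single GPU with 6 images per step, then with 1 image per step.
print(scale_learning_rate(BASE_LR, BASE_BATCH_SIZE, num_gpus=1, imgs_per_gpu=6))
print(scale_learning_rate(BASE_LR, BASE_BATCH_SIZE, num_gpus=1, imgs_per_gpu=1))
```

Under these assumed reference values, a single-GPU batch of 6 gives a learning rate of about 0.0075 and a batch of 1 gives about 0.00125, both well below the multi-GPU default.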
It turns out that a batch size of 1 works if I use a smaller learning rate, at the cost of longer training time. So I think this issue may have two causes: a learning rate that is too large, or some data input error (as mentioned in tensorflow/models#6273).
In my machine learning course, I learned that small learning rates may lead to local optima, but is producing no predictions a local optimum for object detection tasks like this?
By the way, the paper https://arxiv.org/pdf/1803.08494.pdf shows the shortcomings of small batches without GN, and my machine learning course also taught me that a smaller batch gives worse results if the dataset is not large enough, so I want to find some way to deal with the small-batch problem. Thanks for your reply.
By the way, the batch size of 1 didn't work very well during the first third of training. I'm still looking for better settings.