Comments (5)
Based on the reruns, enabling training mode noticeably affects the behaviour. I reverted the change but added a flag to enable training mode. The flag is disabled by default so that the default behaviour reproduces the reported results.
Regarding why the results are affected, we only have guesses. Results for VGG19, which has only dropout, are affected more than those for ResNet, which has only batch norm. In general, dropout is not necessarily needed for all models. In particular, when learning joint embeddings it might make the optimization more difficult. The amount of data may also matter. In any case, we will investigate further.
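For readers who want to picture the opt-in flag, here is a minimal sketch. The flag name `--reset_train` and the helper `maybe_enable_train_mode` are assumptions made for illustration and may not match the actual option in the repository; the only thing taken from the comment above is that the flag exists and is off by default.

```python
import argparse


def build_parser():
    """Build a parser with an opt-in switch for training mode."""
    parser = argparse.ArgumentParser()
    # Hypothetical flag name. `store_true` makes the default False, so the
    # default run keeps the original (reproducible) behaviour.
    parser.add_argument(
        "--reset_train",
        action="store_true",
        help="Put the model in training mode before every training step.",
    )
    return parser


def maybe_enable_train_mode(model, opt):
    """Call model.train() only when the user opted in via the flag."""
    if opt.reset_train:
        model.train()
```

With no arguments, `build_parser().parse_args([])` leaves `reset_train` as `False`, so nothing changes unless the flag is passed explicitly.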
Thanks again for reporting.
from vsepp.
Revisiting this issue, the test loss seems to be lower when the training mode is reset correctly than when it is only reset at the beginning of the epoch. There is a difference in performance as measured by R@K, but neither solution is better or worse across all values of K. The master branch stays on the PyTorch 0.2 compatible code to keep the results reproducible. PyTorch 0.4 and later force us to fix this issue, so there is a new branch for compatibility with PyTorch 0.4.
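To make the difference between the two reset strategies concrete, here is a small dependency-free sketch. `ToyModel` is a stand-in that only mimics the `train()`/`eval()` switch of `torch.nn.Module`, and the assumption that a validation pass runs in the middle of the epoch and leaves the model in eval mode is mine, not something stated above.

```python
class ToyModel:
    """Stand-in that mimics the train()/eval() mode switch of a PyTorch module."""

    def __init__(self):
        self.training = True

    def train(self):
        self.training = True

    def eval(self):
        self.training = False


def run_epoch(model, num_steps, val_step, reset_each_step):
    """Return the mode ('train' or 'eval') that each training step ran in."""
    modes = []
    model.train()  # reset once at the beginning of the epoch
    for step in range(num_steps):
        if reset_each_step:
            model.train()  # the "reset correctly" variant
        modes.append("train" if model.training else "eval")
        if step == val_step:
            # Assumed mid-epoch validation: it switches to eval mode and
            # does not switch back.
            model.eval()
    return modes


# Reset only at the start of the epoch: steps after validation run in eval mode.
print(run_epoch(ToyModel(), num_steps=4, val_step=1, reset_each_step=False))
# Reset before every step: all steps run in train mode.
print(run_epoch(ToyModel(), num_steps=4, val_step=1, reset_each_step=True))
```

Under these assumptions, the first call shows dropout and batch norm silently running in eval mode for the rest of the epoch, which is one way the two variants could produce different results.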
from vsepp.
Good catch. Thank you. I'm checking whether it has any effect on the results, but I don't expect much. The image models used are pre-trained and kept fixed unless finetuning is enabled, so batch norm should be mostly fine. Dropout, however, could help by regularizing the model.
from vsepp.
Hi,
I also ran into this problem. In my experiments, fixing this bug causes a performance drop when CNN finetuning is enabled, which is strange.
from vsepp.
Thanks @xh-liu for reporting. I'm reopening this issue as I'm observing non-negligible changes in the results.
from vsepp.