
Comments (4)

yjxiong commented on May 29, 2024

Generally, you should initialize from the provided init models rather than from the models trained on UCF101/HMDB51. The high gradient norm is usually caused by an increased iter_size, which is fine. You can raise the value of clip_gradients in your solver to accommodate it.

from temporal-segment-networks.

yjxiong commented on May 29, 2024

test_iter is computed by total_sample_num / batch_size / iter_size / num_gpu.
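
As a rough illustration, the formula above can be written as a small helper. This is a minimal sketch: the function name `compute_test_iter` is hypothetical, and rounding up so that a partial final batch is still evaluated is an assumption that the thread does not state.

```python
def compute_test_iter(total_sample_num, batch_size, iter_size=1, num_gpu=1):
    """Return total_sample_num / batch_size / iter_size / num_gpu, rounded up.

    Rounding up is an assumption of this sketch, not something stated in the thread.
    """
    samples_per_iter = batch_size * iter_size * num_gpu
    if samples_per_iter <= 0:
        raise ValueError("batch_size, iter_size and num_gpu must be positive")
    # Integer ceiling division avoids floating-point rounding issues.
    return -(-total_sample_num // samples_per_iter)


if __name__ == "__main__":
    # Hypothetical numbers: 1000 validation samples, batch_size 8, iter_size 2, 2 GPUs.
    print(compute_test_iter(1000, 8, iter_size=2, num_gpu=2))
```

The resulting value would then go into the test_iter field of the solver.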

However, the error you list is not related to test_iter. If you look at your log, you will find that it comes from the Reshape layer. So you probably have a shape mismatch rather than a problem with test_iter.
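
A shape mismatch of this kind can be reproduced without Caffe. The sketch below assumes the Reshape target (-1, 1, 3, 101) quoted later in this thread and an fc output with 2 scores per segment. The helper `infer_reshape` is hypothetical and only mimics the usual rule for a -1 dimension: the element count must be divisible by the product of the explicit dimensions.

```python
from math import prod


def infer_reshape(count, dims):
    """Resolve the single -1 in dims so that the product of dims equals count.

    Raises ValueError when the element count cannot fit the requested shape.
    """
    if list(dims).count(-1) != 1:
        raise ValueError("this sketch expects exactly one -1 dimension")
    explicit = prod(d for d in dims if d != -1)
    if count % explicit != 0:
        raise ValueError(
            f"count {count} is not divisible by explicit dims product {explicit}"
        )
    inferred = count // explicit
    return [inferred if d == -1 else d for d in dims]


if __name__ == "__main__":
    # Hypothetical batch of 4 videos, 3 segments each, 2 classes: 24 scores in total.
    count = 4 * 3 * 2
    print(infer_reshape(count, [-1, 1, 3, 2]))  # last dim matches the 2 classes
    try:
        infer_reshape(count, [-1, 1, 3, 101])   # last dim left at 101
    except ValueError as err:
        print("shape mismatch:", err)
```

With the last dimension left at 101, the element count only fits when the batch size happens to be a multiple of 101, so typical batch sizes fail.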


yjxiong commented on May 29, 2024

One thing to add: this on-the-fly validation is just for monitoring the training. All experimental results should be reported with the provided testing scripts.


Xiao-G1023 commented on May 29, 2024

Thanks for your reply! Now I know how to set the test_iter parameter.
As for the errors above, I have found the problem. In the layer
layer { name: "reshape_fc" type: "Reshape" bottom: "fc" top: "reshape_fc" reshape_param { shape { dim: [-1, 1, 3, 101] } } }
the last dimension is 101, and I had not changed it to match my 2 classes.
The fine-tuned TSN now runs, but it does not work well when I use ucf101_split_1_tsn_rgb_reference_bn_inception.caffemodel as the pretrained model. The log shows the following:
I0225 09:38:46.088831 16686 solver.cpp:631] Iteration 0, lr = 0.0001
I0225 09:38:46.588022 16686 solver.cpp:616] Gradient clipping: scaling down gradients (L2 norm 100.16 > 40) by scale factor 0.399362
I0225 09:38:47.023706 16686 solver.cpp:616] Gradient clipping: scaling down gradients (L2 norm 80.797 > 40) by scale factor 0.495068
I0225 09:38:47.458351 16686 solver.cpp:616] Gradient clipping: scaling down gradients (L2 norm 46.5655 > 40) by scale factor 0.859006
I0225 09:38:47.894577 16686 solver.cpp:616] Gradient clipping: scaling down gradients (L2 norm 51.9353 > 40) by scale factor 0.770189
I0225 09:38:48.331284 16686 solver.cpp:616] Gradient clipping: scaling down gradients (L2 norm 75.4552 > 40) by scale factor 0.530116
I0225 09:38:48.769134 16686 solver.cpp:616] Gradient clipping: scaling down gradients (L2 norm 63.1562 > 40) by scale factor 0.63335
......
What's wrong with it? Should I reduce base_lr or some other parameters? Could you give me some advice?
Thank you!
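
For readers following the log above, the reported scale factor is the clipping threshold divided by the measured L2 norm of the gradients. A minimal sketch of that arithmetic follows. The function name `clip_scale` is hypothetical, and the code only reproduces the numbers in the log rather than Caffe's actual implementation.

```python
def clip_scale(l2_norm, clip_gradients):
    """Return the factor the gradients are multiplied by under norm clipping.

    A negative threshold is treated here as "clipping disabled" (an assumption
    of this sketch); norms at or below the threshold are left unscaled.
    """
    if clip_gradients < 0 or l2_norm <= clip_gradients:
        return 1.0
    return clip_gradients / l2_norm


if __name__ == "__main__":
    # L2 norms taken from the log above, with the threshold of 40 it reports.
    for norm in (100.16, 80.797, 46.5655):
        print(f"L2 norm {norm} > 40, scale factor {clip_scale(norm, 40.0):.6f}")
```

Raising the threshold, as suggested elsewhere in this thread, moves the scale factor closer to 1, so less of each gradient step is scaled away.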
