Comments (3)

bhancock8 commented on August 29, 2024

Hmm, hard to say without more information. It's possible that your batches are somehow not being randomly shuffled (by default they should be). If your model were seeing all batches of one task followed by all batches of the other, that could explain the oscillating effect you described. Have you also tried lowering the learning rate to see if that calms things down?
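To make the shuffling point concrete, here is a minimal, framework-free sketch of the difference between an unshuffled and a shuffled batch order for two tasks. It does not use MeTaL's own batching code; `make_batch_schedule`, `longest_run`, and the task names are hypothetical and exist only for illustration.

```python
import random


def make_batch_schedule(batches_per_task, shuffle=True, seed=None):
    """Return a list of task names, one entry per batch, in training order.

    Hypothetical helper: it only models the ORDER in which batches from
    different tasks reach the model, not the batches themselves.
    """
    schedule = [
        task
        for task, n_batches in batches_per_task.items()
        for _ in range(n_batches)
    ]
    if shuffle:
        # Interleave the tasks so no task is seen in one long block.
        random.Random(seed).shuffle(schedule)
    return schedule


def longest_run(schedule):
    """Length of the longest streak of consecutive batches from one task."""
    longest = current = 0
    previous = None
    for task in schedule:
        current = current + 1 if task == previous else 1
        longest = max(longest, current)
        previous = task
    return longest


counts = {"task_a": 50, "task_b": 50}
unshuffled = make_batch_schedule(counts, shuffle=False)
shuffled = make_batch_schedule(counts, shuffle=True, seed=0)

print(longest_run(unshuffled))  # 50: each task arrives as one uninterrupted block
print(longest_run(shuffled))    # a much shorter streak once tasks are interleaved
```

A long streak of batches from a single task lets the shared parameters drift toward that task before the other one pulls them back, which is one plausible source of oscillating dev accuracy.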

from metal.

Peter-Devine commented on August 29, 2024

So I managed to fix this problem by changing the optimiser config to this:

optimizer_config = {
    "optimizer": "sgd",
    "optimizer_common": {"lr": 0.005},
    "sgd_config": {"momentum": 0.01},
}

With this change, the dev accuracies on both of my datasets peak within a few epochs at the level I would expect.
I am currently assuming that this was a problem with my datasets rather than with the implementation of the default Adam optimiser. I would recommend this fix to anyone else who hits this problem on obscure datasets, as it has worked well for me in this case.
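For readers wondering what these settings amount to numerically, the following is a rough sketch of a single SGD-with-momentum update using the `lr` and `momentum` values from the config above. It assumes the common PyTorch-style formulation (velocity = momentum * velocity + grad, then param -= lr * velocity); how MeTaL actually forwards the config keys to its optimiser is not shown in this thread, and `sgd_momentum_step` is a hypothetical name.

```python
def sgd_momentum_step(param, grad, velocity, lr=0.005, momentum=0.01):
    """One scalar SGD-with-momentum update (assumed PyTorch-style form)."""
    velocity = momentum * velocity + grad  # accumulate a decaying sum of gradients
    param = param - lr * velocity          # step against the accumulated gradient
    return param, velocity


# Toy problem (an assumption for illustration): minimise f(w) = w**2,
# whose gradient is 2 * w.
w, v = 1.0, 0.0
for _ in range(1000):
    w, v = sgd_momentum_step(w, 2.0 * w, v)

print(w)  # close to the minimum at 0 after 1000 small steps
```

With momentum as low as 0.01 the velocity is almost exactly the current gradient, so this behaves very much like plain SGD with a small, fixed step size.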


bhancock8 commented on August 29, 2024

Thanks for sharing! Glad you got it working for your problem.

