
Comments (3)

Jungyhuk commented on July 24, 2024

From my understanding of your colab and your email, at least some of the model checkpoints from the later stages of your training should perform well when you evaluate them.

During training, the rewriting traces are randomly sampled from the learned distribution, so because of this randomness they may be suboptimal and occasionally even poor. Also, I manually increase the magnitude of the training losses in the dumped log file, so the raw values by themselves may not look stable.

To evaluate the model, you may adjust your command in the following way:

run_jsp.py --eval --load_model /path/to/ckpt --max_reduce_steps 100

The README describes some of the flags above. For larger jsp problems, allowing more rewriting steps during evaluation should help achieve better performance.
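As a rough sketch of trying several rewriting-step budgets, the snippet below only builds and prints the evaluation command with the flags mentioned above. The `python` launcher, the helper name `build_eval_command`, and the budget of 200 are assumptions made for illustration, and `/path/to/ckpt` is the same placeholder as in the command above.

```python
import shlex


def build_eval_command(ckpt_path, max_reduce_steps=100):
    """Return the evaluation command as an argument list (hypothetical helper)."""
    return [
        "python",  # assumption: the script is launched with the Python interpreter
        "run_jsp.py",
        "--eval",
        "--load_model", ckpt_path,
        "--max_reduce_steps", str(max_reduce_steps),
    ]


# Print one command per step budget; 200 is an illustrative value only.
for steps in (100, 200):
    print(shlex.join(build_eval_command("/path/to/ckpt", steps)))
```

Each printed line can be pasted into a shell, or the argument list can be passed to `subprocess.run` from the repository directory.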

If you have more detailed questions, feel free to reply to our email thread for further clarification.

from neural-rewriter.

LucasBoTang commented on July 24, 2024

Thank you for the clarification. I think I now know what I missed: the reward you use for the jsp task is the slowdown (in fact, the negative slowdown, so that maximizing the reward minimizes the average slowdown). This confused me a lot because the logs and printouts show the value without the minus sign.

It might also confuse others, since a "reward" is usually something we want to maximize.

Thank you again for your time. I really appreciate it!
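To make the sign convention concrete, here is a minimal sketch with made-up slowdown values (not results from the repository) and a hypothetical helper `reward_from_slowdown`. It shows that maximizing the negative slowdown picks the same schedule as minimizing the slowdown.

```python
def reward_from_slowdown(slowdown):
    """Reward is the negative slowdown, so a higher reward means a lower slowdown."""
    return -slowdown


# Illustrative slowdowns for two hypothetical schedules.
slowdowns = {"schedule_a": 3.2, "schedule_b": 1.5}
rewards = {name: reward_from_slowdown(s) for name, s in slowdowns.items()}

best_by_reward = max(rewards, key=rewards.get)        # maximize the reward
best_by_slowdown = min(slowdowns, key=slowdowns.get)  # minimize the slowdown

# Both criteria select the same schedule.
assert best_by_reward == best_by_slowdown
```

A log that prints the slowdown itself, rather than its negation, will therefore show values that should go down as the model improves.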


Jungyhuk commented on July 24, 2024

Oh I see. Yes, your understanding is correct. Sorry for the confusion!
