Comments (7)

akhilsanand avatar akhilsanand commented on May 15, 2024

hello,
Interestingly, the action bound in your code strongly affects learning. My custom environment needs only 2 actions, both within the range [0, 1]. But when I set the action bound to [-1, 1] in your code, training no longer gets stuck in a local minimum, unlike with [0, 1]. Could you please explain this phenomenon?

Regards,
Akhil

from reinforcement-learning-with-tensorflow.

MorvanZhou avatar MorvanZhou commented on May 15, 2024

Hi Akhil,

It may be related to the activation function you selected. For example, to map actions to (-1, 1) you would choose tanh as the mapping function, and sigmoid for (0, 1). These two mappings have different derivatives, which may affect your training.
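To make the difference in derivatives concrete, here is a minimal sketch in plain Python (not taken from the repository; the helper names `sigmoid_grad` and `tanh_grad` are made up for illustration). It only evaluates the two closed-form derivatives at a few points. At the origin tanh has slope 1 while sigmoid has slope 0.25, and both flatten out as the input grows.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of sigmoid: s(x) * (1 - s(x)), at most 0.25."""
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)^2, at most 1."""
    t = math.tanh(x)
    return 1.0 - t * t


if __name__ == "__main__":
    # Compare the two slopes at a few pre-activation values.
    for x in (0.0, 1.0, 2.0):
        print(f"x={x:.1f}  sigmoid'={sigmoid_grad(x):.4f}  tanh'={tanh_grad(x):.4f}")
```

Whether this difference explains a particular training run depends on the network and the environment, so treat it as one possible factor rather than a diagnosis.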

akhilsanand avatar akhilsanand commented on May 15, 2024

hello Zhou,

Thank you very much for the reply. I have tried the sigmoid activation function with a [0, 1] action bound, but it still gets stuck in a local minimum. With the sigmoid activation function and [-1, 1] as the action bound, however, it starts learning really well again. Do you have any idea why?

regards,
akhil

MorvanZhou avatar MorvanZhou commented on May 15, 2024

Then I think it is likely that backprop works better with tanh than with sigmoid. This might be one of the reasons.

akhilsanand avatar akhilsanand commented on May 15, 2024

hello Zhou,

But I am getting good results with the sigmoid activation function and an action bound of [-1, 1]. This leaves me confused about how the action bound still has an effect even when a sigmoid activation function is used.

MorvanZhou avatar MorvanZhou commented on May 15, 2024

The action bound is applied as tf.clip_by_value(output, lower_bound, upper_bound). Because the action is sampled from a normal distribution, the output can sometimes still be less than 0 even when the mean comes from a sigmoid. Therefore, using an action bound of (-1, 1) still affects the final result.
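The effect can be sketched without TensorFlow. The snippet below is an illustration only: `clip` is a scalar stand-in for `tf.clip_by_value`, and the mean (0.1, as if produced by a sigmoid) and standard deviation (0.3) are assumed values, not ones taken from the repository. It counts how often a sampled action is changed by clipping under each bound.

```python
import random


def clip(value, lower, upper):
    """Scalar stand-in for tf.clip_by_value (illustrative helper)."""
    return max(lower, min(upper, value))


def clipped_fraction(mean, std, lower, upper, n=10_000, seed=0):
    """Fraction of normal samples that clipping to [lower, upper] modifies."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    changed = 0
    for _ in range(n):
        sample = rng.gauss(mean, std)
        if clip(sample, lower, upper) != sample:
            changed += 1
    return changed / n


if __name__ == "__main__":
    mean, std = 0.1, 0.3  # assumed: a sigmoid mean near 0 and a moderate std
    print("bound [0, 1]: ", clipped_fraction(mean, std, 0.0, 1.0))
    print("bound [-1, 1]:", clipped_fraction(mean, std, -1.0, 1.0))
```

Under these assumptions, a sizeable share of samples falls below 0 and is clipped with the [0, 1] bound, while almost none are touched with [-1, 1]. So the two bounds pass different actions to the environment even though the mean itself stays inside (0, 1).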

akhilsanand avatar akhilsanand commented on May 15, 2024

thanks zhou
