Comments (3)
Hi, thinking about it the following way may help. If q_target is positive, every update adds a positive amount to the Q-table entry and nothing is ever subtracted. So the values in the Q-table will keep growing and eventually explode.
from reinforcement-learning-with-tensorflow.
Do you mean large positive values or just any positive values? If the values are between 0 and 1, then it may not explode.
But later I saw that you conformed to Qdash(S,A) - Q(S,A).
In this very example, q_table.ix[S, A] += ALPHA * (q_target) works well and converges faster. It would be interesting to understand when you could end up with a blow-up.
Thanks for responding to the question though.
It will show the right behaviour in this example, but it will never converge. Actually, no matter the sign of the values, any value will give you the right behaviour without convergence.
If you keep running the script your way, you will find that the values in your Q-table eventually exceed what a floating-point number can hold, and the table may show NaN at the end.
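To make the difference concrete, here is a minimal sketch, not the repository's code. It assumes a hypothetical one-state, one-action problem that loops back to itself with reward 1 on every step, and the names `run_updates`, `ALPHA`, `GAMMA`, and `REWARD` are illustrative. It compares the update that subtracts the current estimate with the one that adds `ALPHA * q_target` directly.

```python
ALPHA = 0.1   # learning rate (assumed value)
GAMMA = 0.9   # discount factor (assumed value)
REWARD = 1.0  # constant reward on every step (assumed toy setting)


def run_updates(steps, subtract_current):
    """Repeat a Q update for a single state-action pair that loops to itself."""
    q = 0.0
    for _ in range(steps):
        # Next state equals the current state, so max Q(S', :) is just q.
        q_target = REWARD + GAMMA * q
        if subtract_current:
            # Standard update: move q toward q_target, so the error shrinks.
            q += ALPHA * (q_target - q)
        else:
            # Variant from this thread: nothing pulls q back toward the target.
            q += ALPHA * q_target
    return q


if __name__ == "__main__":
    print("with subtraction:   ", run_updates(2000, subtract_current=True))
    print("without subtraction:", run_updates(2000, subtract_current=False))
    print("without, more steps:", run_updates(10000, subtract_current=False))
```

In this toy setting the standard update settles near REWARD / (1 - GAMMA) = 10, while the variant multiplies the entry by roughly 1 + ALPHA * GAMMA on every step, so it grows geometrically and overflows to inf if run long enough. Once inf is in the table, later arithmetic such as inf - inf can produce NaN. The greedy action can still look right along the way because it depends only on the ordering of the values, not on their size.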