What do you want to improve? In the images depicting the Q-table i

See https://github.com/huggingface/deep-rl-class/pull/454

[UPDATE] Misleading characterization of state in the Q-Table about deep-rl-class HOT 3 CLOSED

lutzvdb commented on June 20, 2024

from deep-rl-class.

Comments (3)

lutzvdb commented on June 20, 2024

Thank you for weighing in. I think you're right that in these simple environments, the position alone is sufficient to describe the state of the environment. A clarification in the text explaining this would nonetheless make it much easier to understand. I'll create a PR adding a clarifying statement!

Ivan-267 commented on June 20, 2024

Hello, I am relatively new to RL terminology, so parts of my comment below are guesses.

I think in this case, the state is only the current position of the agent. In the hands-on, this environment is used: https://www.gymlibrary.dev/environments/toy_text/frozen_lake/

If we take a look at the observations:

Observation Space
The observation is a value representing the agent’s current position as current_row * nrows + current_col (where both the row and col start at 0). For example, the goal position in the 4x4 map can be calculated as follows: 3 * 4 + 3 = 15. The number of possible observations is dependent on the size of the map. For example, the 4x4 map has 16 possible observations.

So for an environment such as this, we can calculate the best action for the agent to take based on its position in the grid.
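To make the quoted formula concrete, here is a minimal sketch of the position-to-observation mapping. The function names are made up for illustration and are not part of the environment's API. The sketch multiplies by the number of columns, which equals the number of rows on the square 4x4 map, so it matches the quoted formula there.

```python
def position_to_state(row: int, col: int, ncols: int = 4) -> int:
    """Flatten a zero-based (row, col) grid position into one integer.

    Hypothetical helper: mirrors the formula quoted from the docs,
    assuming a square map where nrows == ncols.
    """
    return row * ncols + col


def state_to_position(state: int, ncols: int = 4) -> tuple[int, int]:
    """Recover (row, col) from the integer observation."""
    return divmod(state, ncols)


# The goal sits in the bottom-right corner of the 4x4 map.
goal_state = position_to_state(3, 3)
print(goal_state)                     # 3 * 4 + 3 = 15
print(state_to_position(goal_state))  # (3, 3)
```

Since every grid cell maps to exactly one integer, the 4x4 map yields 16 distinct observations, which is the number of rows a Q-table for it would need.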

While the full state of the environment itself may include various additional information (position of everything on the grid, image data for rendering graphical elements, additional internal game state), the agent receives only the observations that are necessary for learning, and in a simple environment, that can be only the current position of the agent.
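As a rough sketch of that distinction: the class and fields below are invented for illustration and do not reflect how Frozen Lake is actually implemented. The point is only that the environment can track more than it hands to the agent.

```python
from dataclasses import dataclass


@dataclass
class FullState:
    """Hypothetical full environment state (illustrative fields only)."""
    agent_row: int
    agent_col: int
    hole_positions: frozenset  # where the holes are on the grid
    goal_position: tuple       # (row, col) of the goal
    steps_taken: int           # internal bookkeeping the agent never sees


def observe(state: FullState, ncols: int = 4) -> int:
    """Return only what the agent receives: its flattened position."""
    return state.agent_row * ncols + state.agent_col


holes = frozenset({(1, 1), (1, 3), (2, 3), (3, 0)})  # layout chosen for illustration
early = FullState(2, 1, holes, (3, 3), steps_taken=3)
late = FullState(2, 1, holes, (3, 3), steps_taken=40)

# Different full states, same observation: 2 * 4 + 1 = 9
print(observe(early), observe(late))
```

Both full states collapse to the same observation, so a tabular agent would treat them identically. That is fine here because the best action depends only on where the agent stands.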

However, the inputs to the Q function are called states (the function gives you the value for any state and action that you input).
From: https://huggingface.co/learn/deep-rl-course/unit2/q-learning

Given a state and action, our Q-function will search its Q-table for the corresponding value.
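A minimal sketch of that lookup, assuming the 4x4 map (16 states) and Frozen Lake's four actions. The table is a plain list of lists and the helper names are hypothetical; the course's own code may structure this differently.

```python
N_STATES = 16   # one row per grid position on the 4x4 map
N_ACTIONS = 4   # 0: left, 1: down, 2: right, 3: up

# Q-table initialised to zeros: q_table[state][action]
q_table = [[0.0] * N_ACTIONS for _ in range(N_STATES)]


def q_value(table, state, action):
    """Look up the stored value for a (state, action) pair."""
    return table[state][action]


def greedy_action(table, state):
    """Pick the action with the highest value in this state's row.

    Ties go to the lowest action index.
    """
    row = table[state]
    return max(range(len(row)), key=row.__getitem__)


# Illustrative value only: from state 14, moving right reaches the goal.
q_table[14][2] = 1.0

print(q_value(q_table, 14, 2))     # 1.0
print(greedy_action(q_table, 14))  # 2
```

The state index here is just the agent's position, which is why each row of the Q-table in the course images corresponds to one grid cell.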

There is a note here that clarifies the terminology used:
https://huggingface.co/learn/deep-rl-course/unit1/rl-framework

In this course, we use the term "state" to denote both state and observation, but we will make the distinction in implementations.

Of course, for a more complex environment, we may have to provide the agent with more information about the current state of the environment.

lutzvdb commented on June 20, 2024

See PR 454. Closing this issue!
