Comments (6)
When using the gym.vector.VectorEnv API (which is what you're using in this case since batch_size=2048), you don't need to reset the individual envs because they already auto-reset when the episode in any given env is done.
Also just FYI:
- When you have done=True for the environment at a given index env_idx, then obs[env_idx] is the first observation of the next episode, not the final observation of the previous episode.
- Likewise, the reward for env env_idx when done[env_idx]==True is the reward associated with the last action you sent. You may still see non-zero rewards at that index after that step, since those will be the rewards of the next episode!
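To make the two points above concrete, here is a minimal sketch of auto-reset semantics using a toy vectorized env written in plain NumPy. The class ToyAutoResetVecEnv and its episode_lens parameter are hypothetical and exist only for illustration; this is not the gym or brax implementation, just the behavior described above under simplified assumptions (the observation is the step counter within the current episode).

```python
import numpy as np


class ToyAutoResetVecEnv:
    """Hypothetical batch of counter envs; env i ends after episode_lens[i] steps."""

    def __init__(self, episode_lens):
        self.episode_lens = np.asarray(episode_lens, dtype=np.int64)
        self.t = np.zeros(len(self.episode_lens), dtype=np.int64)

    def reset(self):
        self.t[:] = 0
        return self.t.copy()  # observation = step counter within the episode

    def step(self, actions):
        self.t += 1
        # The reward belongs to the action that was just sent.
        reward = np.asarray(actions, dtype=np.float64)
        done = self.t >= self.episode_lens
        # Auto-reset: finished envs restart immediately, so the observation
        # returned at those indices is already the first one of the next episode.
        self.t[done] = 0
        return self.t.copy(), reward, done


env = ToyAutoResetVecEnv(episode_lens=[2, 3])
obs = env.reset()
history = []
for _ in range(4):
    obs, reward, done = env.step(np.ones(2))
    history.append((obs, reward, done))

# On the second step env 0 finishes: done[0] is True and obs[0] is already 0,
# i.e. the first observation of the next episode.
print(history[1][0], history[1][2])
```

In this toy model, done[0] goes back to False on the step right after the reset, because a new episode has started at that index.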
Hope this helps :)
from brax.
We have some new reset logic plumbing that should resolve this issue in the next couple of days. Originally, we found that we didn't really need to reset during rollouts--we'd just run a rollout for a fixed episode length and then mask out frames after the point at which a done was triggered. Admittedly, this is not what folks are used to, so we'll introduce a wrapper that does the actual resetting logic as you'd usually expect.
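The masking approach described above could look roughly like the following NumPy sketch. The helper name mask_after_first_done and the (T, num_envs) array layout are assumptions made for illustration, not brax code; the sketch also assumes the frame on which done fires is kept and only later frames are masked out.

```python
import numpy as np


def mask_after_first_done(done):
    """Return a float mask that is 1.0 up to and including the first done step.

    done: array of shape (T, num_envs) from a fixed-length rollout.
    """
    done = np.asarray(done, dtype=bool).astype(np.int64)
    # Count how many done flags were seen strictly before each step.
    dones_before = np.cumsum(done, axis=0) - done
    return (dones_before == 0).astype(np.float32)


# Rollout of 4 steps over 2 envs: env 0 is done at step 1 (and stays done),
# env 1 is done only at the last step.
done = np.array([[False, False],
                 [True, False],
                 [True, False],
                 [False, True]])
rewards = np.ones((4, 2), dtype=np.float32)

mask = mask_after_first_done(done)
masked_returns = (rewards * mask).sum(axis=0)
print(mask)
print(masked_returns)
```

With this kind of mask, a done flag that stays True (or flips arbitrarily) after the first termination does not affect the computed returns, which is why no reset is needed during the rollout.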
OK! This should be addressed. Envs by default now reset after done=True. You can still get the old behavior if you wish to control auto-resetting yourself, by calling envs.create(..., auto_reset=False)
@lebrice Thank you very much for the help!
But if they're in auto-reset mode, done[env_idx] of next episode should be False.
I have checked the next value for the same env_idx, but it was still True. Is that a bug?
> done[env_idx] of next episode should be False.
No! (Edit: maybe I'm misunderstanding your problem though)
Are you saying that you get done[env_idx]==True multiple steps in a row for the same env_idx?
> Are you saying that you get done[env_idx]==True multiple steps in a row for the same env_idx?
@lebrice Yes.