The super-mario-bros-dqn from roclark

BrokenPipeError

I get a BrokenPipeError whenever env.step() is called during testing. Any suggestions would help.

Traceback (most recent call last):
  File "train.py", line 109, in <module>
    main()
  File "train.py", line 106, in main
    train(env, model, target_model, optimizer, replay_buffer, args, device)
  File "train.py", line 90, in train
    device, info, episode)
  File "train.py", line 81, in run_episode
    episode, epsilon, stats, args.action_space)
  File "train.py", line 46, in complete_episode
    test_new_model(model, environment, info, action_space)
  File "train.py", line 33, in test_new_model
    test(environment, action_space, info.new_best_counter)
  File "/home/xiaosongwen0313_gmail_com/RL_final/super-mario-bros-dqn/test.py", line 26, in test
    state, reward, done, info = env.step(action)
  File "/home/xiaosongwen0313_gmail_com/RL_final/super-mario-bros-dqn/core/wrappers.py", line 116, in step
    state, reward, done, info = self.env.step(action)
  File "/opt/conda/envs/RL/lib/python3.7/site-packages/gym/core.py", line 261, in step
    observation, reward, done, info = self.env.step(action)
  File "/opt/conda/envs/RL/lib/python3.7/site-packages/gym/core.py", line 261, in step
    observation, reward, done, info = self.env.step(action)
  File "/opt/conda/envs/RL/lib/python3.7/site-packages/gym/core.py", line 261, in step
    observation, reward, done, info = self.env.step(action)
  [Previous line repeated 1 more time]
  File "/home/RL_final/super-mario-bros-dqn/core/wrappers.py", line 37, in step
    obs, reward, done, info = self.env.step(action)
  File "/opt/conda/envs/RL/lib/python3.7/site-packages/nes_py/wrappers/joypad_space.py", line 74, in step
    return self.env.step(self._action_map[action])
  File "/opt/conda/envs/RL/lib/python3.7/site-packages/gym/wrappers/monitor.py", line 32, in step
    done = self._after_step(observation, reward, done, info)
  File "/opt/conda/envs/RL/lib/python3.7/site-packages/gym/wrappers/monitor.py", line 174, in _after_step
    self.video_recorder.capture_frame()
  File "/opt/conda/envs/RL/lib/python3.7/site-packages/gym/wrappers/monitoring/video_recorder.py", line 116, in capture_frame
    self._encode_image_frame(frame)
  File "/opt/conda/envs/RL/lib/python3.7/site-packages/gym/wrappers/monitoring/video_recorder.py", line 166, in _encode_image_frame
    self.encoder.capture_frame(frame)
  File "/opt/conda/envs/RL/lib/python3.7/site-packages/gym/wrappers/monitoring/video_recorder.py", line 303, in capture_frame
    self.proc.stdin.write(frame.tobytes())
**BrokenPipeError: [Errno 32] Broken pipe**
ERROR: VideoRecorder encoder exited with status 1
Exception ignored in: <function Monitor.__del__ at 0x7f9ee4e7a9e0>
Traceback (most recent call last):
  File "/opt/conda/envs/RL/lib/python3.7/site-packages/gym/wrappers/monitor.py", line 229, in __del__
    self.close()
  File "/opt/conda/envs/RL/lib/python3.7/site-packages/gym/wrappers/monitor.py", line 134, in close
    super(Monitor, self).close()
  File "/opt/conda/envs/RL/lib/python3.7/site-packages/gym/core.py", line 236, in close
    return self.env.close()
  File "/opt/conda/envs/RL/lib/python3.7/site-packages/gym/core.py", line 236, in close
    return self.env.close()
  File "/opt/conda/envs/RL/lib/python3.7/site-packages/nes_py/nes_env.py", line 338, in close
    raise ValueError('env has already been closed.')
**ValueError: env has already been closed.**
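
The traceback ends inside gym's video recorder, where a frame is written to the stdin of an external encoder process, and the next line reports that the encoder exited with status 1. One plausible cause (an assumption, not confirmed by this log) is that the encoder binary is missing or was built without the codec the recorder requests. Below is a minimal diagnostic sketch in Python. The helper names `find_video_encoder` and `encoder_supports` are hypothetical, and the candidate binaries (`avconv`, `ffmpeg`) and the `libx264` codec are assumptions about what this gym version looks for.

```python
import shutil
import subprocess


def find_video_encoder(candidates=("avconv", "ffmpeg")):
    """Return the path of the first candidate binary found on PATH, or None.

    The candidate names are an assumption about which encoders the
    recorder tries; adjust them to match your gym version.
    """
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


def encoder_supports(encoder_path, codec="libx264"):
    """Best-effort check that `<encoder> -encoders` lists the given codec.

    Returns False if the binary cannot be run at all.
    """
    try:
        result = subprocess.run(
            [encoder_path, "-hide_banner", "-encoders"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,  # text mode; works on Python 3.7
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return codec in result.stdout


if __name__ == "__main__":
    encoder = find_video_encoder()
    if encoder is None:
        print("No avconv/ffmpeg binary found on PATH.")
    elif not encoder_supports(encoder):
        print(encoder + " was found but does not appear to list libx264.")
    else:
        print(encoder + " looks usable for video recording.")
```

If the script reports a missing binary or codec, installing a full ffmpeg build in the same conda environment is a reasonable first thing to try. If the encoder looks fine, the cause lies elsewhere and the encoder's own stderr output is the next place to look.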

Unable to reproduce the results shown in the demo using the .dat files provided.

I'm running the following command in my terminal:

python train.py --action-space complex --environment SuperMarioBros-1-1-v0 --transfer

I am using the SuperMarioBros-1-1-v0.dat file you provide in the pretrained_models folder, so that I can view the results produced by the parameters stored in that file. However, the results are not at all similar to the ones shown in your demo under the 'Progress' section.

Could you please double-check this? I'm currently seeing this problem with your super-mario-bros-dqn project. I haven't yet tested the super-mario-bros-a3c one.

Little/no progress in agent's performance

No offence, but have you faked your results? I've been training for over 24 hours and the agent is nowhere near the 'optimal' performance demonstrated in the description. For how many episodes did you train your agent to reach that performance?

CustomReward class's idea

Hi.

Could you please explain how you came up with the changes you applied to the reward signal?
Specifically, I mean the CustomReward class.

Thanks in advance.
