
pokemonredexperiments's Introduction

Train RL agents to play Pokemon Red

New 1-29-24! - Multiplayer Live Training Broadcast 🎦 🔴 View Here

Stream your training session to a shared global game map using the Broadcast Wrapper

See how in Training Broadcast section

Watch the Video on Youtube!

Join the Discord server!

Running the Pretrained Model Interactively 🎮

๐Ÿ Python 3.10+ is recommended. Other versions may work but have not been tested.
You also need to install ffmpeg and have it available in the command line.

Windows Setup

Refer to this Windows Setup Guide

Linux / MacOS

  1. Copy your legally obtained Pokemon Red ROM into the base directory. You can find this using Google; the file should be about 1 MB. Rename it to PokemonRed.gb if it is not already. The SHA-1 sum should be ea9bcae617fdf159b045185467ae58b2e4a48b9a, which you can verify by running shasum PokemonRed.gb.
  2. Move into the baselines/ directory:
    cd baselines
  3. Install dependencies:
    pip install -r requirements.txt
    It may be necessary in some cases to separately install the SDL libraries.
  4. Run:
    python run_pretrained_interactive.py

Interact with the emulator using the arrow keys and the a and s keys (A and B buttons).
You can pause the AI's input during the game by editing agent_enabled.txt

Note: the PokemonRed.gb file MUST be in the main directory and your current directory MUST be the baselines/ directory in order for this to work.
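The agent_enabled.txt toggle can be sketched roughly like this (a hypothetical reading of the mechanism; the script's actual file format may differ):

```python
from pathlib import Path

def agent_enabled(path="agent_enabled.txt"):
    # Hypothetical sketch: treat the file's first word as a yes/no toggle.
    # The real script's parsing may differ.
    try:
        return Path(path).read_text().strip().lower().startswith("yes")
    except FileNotFoundError:
        return True  # default to enabled if the file is missing
```

Editing the file to anything other than "yes" would then pause the AI's input while the emulator keeps running.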

Training the Model 🏋️

10-21-23: Updated Version!

This version still needs some tuning, but it can clear the first gym in a small fraction of the time and compute. It can run with as few as 16 cores and ~20 GB of RAM. This is the place for active development and updates!

  1. Complete steps 1-3 above
  2. Run:
    python run_baseline_parallel_fast.py

Tracking Training Progress 📈

Training Broadcast

Stream your training session to a shared global game map using the Broadcast Wrapper on your environment like this:

env = StreamWrapper(
    env,
    stream_metadata = {  # all of this is optional
        "user": "pw",         # choose your own username
        "env_id": id,         # environment identifier
        "color": "#0033ff",   # choose your color :)
        "extra": "",          # any extra text you put here will be displayed
    }
)

Hack on the broadcast viewing client or set up your own local stream with this repo:

https://github.com/pwhiddy/pokerl-map-viz/

Local Metrics

The current state of each game is rendered to images in the session directory.
You can track the progress in tensorboard by moving into the session directory and running:
tensorboard --logdir .
You can then navigate to localhost:6006 in your browser to view metrics.
To enable wandb integration, change use_wandb_logging in the training script to True.

Static Visualization 🐜

Map visualization code can be found in visualization/ directory.

Supporting Libraries

Check out these awesome projects!

pokemonredexperiments's People

Contributors

baekalfen, eltociear, lindelgitgut, luiscosio, pwhiddy, suravshresth, taylora-mitre, turamarth14


pokemonredexperiments's Issues

Discord Server

My issue is that there isn't a discord to discuss this project :(

Add Pokémon Red hash to readme for ROM validity

I would suggest adding an expected hash for the copy of the game that you're expecting the AI to run; something like the output of md5sum PokemonRed.gb. That way, it'd be far easier to validate that this is the correct ROM to be running in this environment, rather than one that might be slightly tweaked, and thus might have memory values in unexpected locations. Simply saying it's "about 1MB" unfortunately isn't that helpful.
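A minimal Python version of the suggested check might look like this (the SHA-1 value is the one the README documents for PokemonRed.gb; the function name is made up for illustration):

```python
import hashlib

EXPECTED_SHA1 = "ea9bcae617fdf159b045185467ae58b2e4a48b9a"  # from the README

def rom_is_valid(path="PokemonRed.gb"):
    # Compare the ROM's SHA-1 digest against the documented value, so a
    # tweaked ROM with shifted memory locations is caught before training.
    digest = hashlib.sha1(open(path, "rb").read()).hexdigest()
    return digest == EXPECTED_SHA1
```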

how to start such a project ?

Hi,
I'm French (sorry for the bad English) and I have a school project to train autonomous model cars for real races. Here is the link: https://ajuton-ens.github.io/CourseVoituresAutonomesSaclay/

What I want to know is how to approach the task. I'm planning to use gymnasium and stable-baselines3 for training. I find that you manage to show the progress of the AI really well, something I hope to achieve.

So the big questions: how many hours have you put into it? From which angle did you start the project?

Your project already inspired mine, so I will start working in the hope of having a car that makes at least one lap.
Thank you for your amazing work!

.

@PWhiddy delete this issue please. thank you

run_baseline_parallel.py does not seem to restart from checkpoint?

Might it be line 50?
In run_baseline_parallel.py it is defined as
file_name = 'session_e41c9eff/poke_38207488_steps' #'session_e41c9eff/poke_250871808_steps'
while in run_baseline.py it is defined as
file_name = 'poke_' #'best_12-7/poke_12_b'

I have no folder 'session_e41c9eff', so this seems to be misconfigured.

Can this be fixed?
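One way to avoid the hard-coded session id would be to glob for the newest checkpoint at startup; the pattern below is an assumption based on the file names quoted in this issue, not the repo's actual loading code:

```python
from pathlib import Path

def latest_checkpoint(pattern="session_*/poke_*_steps.zip"):
    # Pick the most recently modified checkpoint instead of hard-coding
    # a session id; returns None when no checkpoint exists yet.
    candidates = sorted(Path(".").glob(pattern), key=lambda p: p.stat().st_mtime)
    return candidates[-1] if candidates else None
```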

Visualization Tutorials

Would someone be able to do a quick tutorial on some of the visualization options using a Jupyter notebook for this project? I'm not super familiar with notebooks and I keep getting errors. I'd like to be able to see the map and the charts from my training runs that show where I have been moving and what actions were taken. Also, is the program CPU-only? I have an Nvidia P40 I use for other AI programs, and it would be really cool to use that to speed this up.

ArrayMemoryError

Traceback (most recent call last):
  File "C:\Users\x\Desktop\PokemonRedExperiments\baselines\run_pretrained_interactive.py", line 50, in <module>
    model.rollout_buffer.reset()
  File "C:\Python311\Lib\site-packages\stable_baselines3\common\buffers.py", line 364, in reset
    self.observations = np.zeros((self.buffer_size, self.n_envs, *self.obs_shape), dtype=np.float32)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 960. GiB for an array with shape (16777216, 1, 3, 128, 40) and data type float32

Windows 11, Python 3.11.5.
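The 960 GiB figure checks out arithmetically; this sketch just multiplies the buffer shape from the traceback by the size of a float32:

```python
import math

# Rollout buffer shape reported in the traceback: (n_steps, n_envs, C, H, W)
shape = (16777216, 1, 3, 128, 40)
gib = math.prod(shape) * 4 / 2**30  # 4 bytes per float32 element
print(gib)  # 960.0 -- n_steps is evidently 2**24, far too large for an interactive run
```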

Crashes with IndexError Out of Bounds

Crashes after ~40min - 1 hour

File "/home/gingi/kick/PokemonRedExperiments/baselines/run_pretrained_interactive.py", line 70, in
env.render()
File "/home/gingi/kick/PokemonRedExperiments/baselines/red_gym_env.py", line 165, in render
self.create_exploration_memory(),
File "/home/gingi/kick/PokemonRedExperiments/baselines/red_gym_env.py", line 314, in create_exploration_memory
make_reward_channel(explore)
File "/home/gingi/kick/PokemonRedExperiments/baselines/red_gym_env.py", line 304, in make_reward_channel
memory[:col, row] = 255
IndexError: index 40 is out of bounds for axis 1 with size 40
start.sh: line 3: 845058 Segmentation fault (core dumped) python run_pretrained_interactive.py
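A defensive fix would be to clamp the indices before writing; this is a hypothetical stand-in for the repo's make_reward_channel logic, not the actual code:

```python
import numpy as np

def fill_reward_bar(memory, col, row):
    # Clamp both indices before writing; the crash above came from an
    # unclamped index landing exactly on the array bound (index 40 on an
    # axis of size 40).
    row = min(row, memory.shape[1] - 1)
    col = min(col, memory.shape[0])
    memory[:col, row] = 255
    return memory
```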

[SUGGESTION]Use numba to visualize heavy numpy programs

https://numba.pydata.org
https://numba.pydata.org/numba-doc/dev/reference/numpysupported.html

Numba is a JIT compiler built to handle mathematical and statistical tasks faster. If you're examining data after the run, it could crunch numbers faster than running numpy through regular CPython, and could serve as a band-aid for the performance issue mentioned in #79.

No idea if it could help any kind of live visualization, since you'll still need to run the same methods every frame.

I'm slow and need help

I have no clue what I'm doing; I'm trying to understand the code and instructions and need help.

Can someone please add a video or dumb it down for me so I can install this program?

Please help,
TheRichRicky

Create a shared database to post training results and beating the game collaboratively?

Yo Peter,

Congrats for this project and thanks for bringing back the fun of beating this game again.

What if we all pooled our shared resources to allocate CPU power and training results until we beat this game together?

I was wondering if it would be useful to you (and eventually this community of PokeAiNerds) to create a DB with an endpoint to store and retrieve training results and connect it to the repo. I could build it and let it open for donations and we as a community could share the costs and the training results to feed the model.

We could also create tasks groups with specific configs to tackle different areas of the game with different problems to solve until we find the great solution.

Well, let me know if this sounds interesting to you and I'll get my hands to work.

I am experiencing the following error. pip install -r requirements.txt

I am getting the following error when trying with Python 3.10:
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

Error installing dependencies (pip install -r requirements.txt)

Hello,
Firstly thank you for reading and helping with my problem.
I am trying to install dependencies (pip install -r requirements.txt)
but keep bumping into the same errors - this being the first -

AttributeError: module 'pkgutil' has no attribute 'ImpImporter'. Did you mean: 'zipimporter'?
[end of output]

and then also -

[screenshot of the second error, 2023-10-20]

Can someone point me to the right direction? I am pretty new to this and cant seem to figure this out despite googling around.

Thank you again.

changing novelty reward to encourage return to pokemon center to heal

Watched the video last night and wow, loved the challenge and the topic.

One thing that stood out to me is how the agent just goes forward till it gets wiped out and repeats. I assume that the current strategy of the agent is to go forward till it wipes out, and repeat until the team is strong enough to go through the next area.

Was wondering if you had thought of any changes to the novelty reward to make it play more like a human does in the sense that we seek novelty when the team is high hp, but seek familiarity (a pokemon center) when the team is low hp.

Had an idea of making a formula that weighs the novelty reward with the current % of HP (maybe something else to prevent a bias against getting a team full of snorlax/chansey). So that when you are healthy, you seek novelty, but as your hp begins to get lower, the novelty rewards flips and you reject new frames and seek old ones in an attempt to heal your pokemon back up to prevent being wiped out and losing $.

Looking forward to working on this project when I get the chance, but again, curious whether you have thought of a solution to get the agent back to the Pokemon center before it is wiped out.
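The HP-weighted idea above can be sketched as a single scaling function (purely illustrative; the pivot and the linear shape are assumptions, not the repo's reward code):

```python
def novelty_weight(hp_fraction, pivot=0.5):
    # Above the pivot HP fraction the agent is paid for novelty; below it
    # the sign flips, so familiar frames (e.g. the route back to a Pokemon
    # center) become the rewarding ones. Returns a value in [-1, 1].
    return (hp_fraction - pivot) / (1.0 - pivot)
```

Multiplying the existing novelty reward by this weight would make a healthy team explore and a battered team retrace its steps.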

Migrate metrics tracking to tensorboard

The current notebook setup is very shabby. Reward data, model stats, and even gameplay clips can all be sent to tensorboard automatically for easy visualization.

pyproject.toml error

I'm having issues installing the requirements. I'm trying to get this going on windows 10 but I keep getting this error.

Preparing metadata (pyproject.toml) ... error
error: subprocess-exited-with-error

× Preparing metadata (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [12 lines of output]
+ meson setup C:\Users\Taylor\AppData\Local\Temp\pip-install-ywd_jsyp\contourpy_234d85d7ae30461d8bba25ed36f0cad5 C:\Users\Taylor\AppData\Local\Temp\pip-install-ywd_jsyp\contourpy_234d85d7ae30461d8bba25ed36f0cad5.mesonpy-b5w3yq6i -Dbuildtype=release -Db_ndebug=if-release -Db_vscrt=md --vsenv --native-file=C:\Users\Taylor\AppData\Local\Temp\pip-install-ywd_jsyp\contourpy_234d85d7ae30461d8bba25ed36f0cad5.mesonpy-b5w3yq6i\meson-python-native-file.ini
The Meson build system
Version: 1.2.2
Source dir: C:\Users\Taylor\AppData\Local\Temp\pip-install-ywd_jsyp\contourpy_234d85d7ae30461d8bba25ed36f0cad5
Build dir: C:\Users\Taylor\AppData\Local\Temp\pip-install-ywd_jsyp\contourpy_234d85d7ae30461d8bba25ed36f0cad5.mesonpy-b5w3yq6i
Build type: native build
Project name: contourpy
Project version: 1.1.0

  ..\meson.build:1:0: ERROR: Could not parse vswhere.exe output

  A full log can be found at C:\Users\Taylor\AppData\Local\Temp\pip-install-ywd_jsyp\contourpy_234d85d7ae30461d8bba25ed36f0cad5\.mesonpy-b5w3yq6i\meson-logs\meson-log.txt
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

Is there something I'm missing or doing wrong?

VisualizeProgress with errors

There are some error messages in the Jupyter notebook that I think could be fixed (maybe some update of Jupyter broke the old notebook). I don't know if it matters much for the purpose of the project...

[screenshots attached in the original issue, 2023-10-17]

WebGL map visualization

Currently map visualization is a slow numpy program which is used after a training run. This could be made to run in realtime using webgl so that different views could be explored interactively, and progress could be visualized in realtime.

Unable to install pyboy

Has anyone run into this issue? Any help will be appreciated:

(pokey) fabriziomendez@Fabrizios-MacBook-Pro baselines % python3.10 -m pip install pyboy     
Collecting pyboy
  Using cached pyboy-1.5.6.tar.gz (5.7 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error
  
  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [50 lines of output]
      <string>:40: _DeprecatedInstaller: setuptools.installer and fetch_build_eggs are deprecated.
      !!

Wrong explore value in the score

Hi, and thanks for sharing the code; the project is cool! :)

I tried the code and noticed that the 'explore' field in the score was not correct, or I didn't understand what it was supposed to do.
It seemed to me that it should allow the agent to "progress", and therefore increment only when the agent sees a new, never-before-reached area. However, when you go back and forth over a piece of the map, you see that exploration always climbs, which shouldn't be the case. I think this may cause the agent to get stuck (perhaps on Mt. Moon).
Do you agree, or have I misunderstood?

I'm analyzing the code to understand it and do otherwise, but it's complicated without any comments. If anyone has already analyzed it, don't hesitate. I saw that the 'post' variable in the get_knn_reward() function was useless because it was always 0.
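The first-visit-only behavior this issue expects could be sketched like this (not the repo's actual KNN-based reward, just the semantics described above):

```python
def explore_reward(visited, area_id):
    # Pay the exploration bonus only on the first visit; backtracking over
    # known ground earns nothing, so the score cannot climb from pacing
    # back and forth on the same patch of map.
    if area_id in visited:
        return 0.0
    visited.add(area_id)
    return 1.0
```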

FileNotFoundError even though I have the ROM saved in the specified path.

I'm getting this error:

Traceback (most recent call last):
  File "/Users/fabriziomendez/Documents/AI and ML/PokemonRedExperiments/baselines/run_pretrained_interactive.py", line 42, in <module>
    env = make_env(0, env_config)() #SubprocVecEnv([make_env(i, env_config) for i in range(num_cpu)])
  File "/Users/fabriziomendez/Documents/AI and ML/PokemonRedExperiments/baselines/run_pretrained_interactive.py", line 21, in _init
    env = RedGymEnv(env_conf)
  File "/Users/fabriziomendez/Documents/AI and ML/PokemonRedExperiments/baselines/red_gym_env.py", line 92, in __init__
    self.pyboy = PyBoy(
  File "/Users/fabriziomendez/Library/Python/3.10/lib/python/site-packages/pyboy/pyboy.py", line 79, in __init__
    raise FileNotFoundError(f"ROM file {gamerom_file} was not found!")
FileNotFoundError: ROM file ../PokemonRed.gb was not found!

However, you may see that I have the PokemonRed.gb file where it's supposed to be:

Any tips on how to solve this issue?

No Sound on Script?

It runs like a charm, but I have no sound when I let the pretrained script run.

And one more question: how can I run more than 10 emulations visualized? I would like to let 10 emulations go on and watch them :D Good job on the nice work!

Visualizations Troubles

I'm unable to get the visualizations to work after a few attempts to debug. I'm able to successfully open up the Jupyter Notebook and run the cells, but my first issue appears when I get to cell 7 or so:
"plt.figure(figsize = (32, 32))
plt.imshow(get_latest_grid('baselines/session_24b49e0b'))"

I've tried changing the session file name and I get the same sorts of errors like so. Has anyone seen this before/know how to get the visualizations up and running? I'm brand new to all of this.

Errors output in Jupyter Notebook when running the two lines in cell 7:

EinopsError Traceback (most recent call last)
File ~\anaconda3\Lib\site-packages\einops\einops.py:412, in reduce(tensor, pattern, reduction, **axes_lengths)
411 recipe = _prepare_transformation_recipe(pattern, reduction, axes_lengths=hashable_axes_lengths)
--> 412 return _apply_recipe(recipe, tensor, reduction_type=reduction)
413 except EinopsError as e:

File ~\anaconda3\Lib\site-packages\einops\einops.py:235, in _apply_recipe(recipe, tensor, reduction_type)
233 backend = get_backend(tensor)
234 init_shapes, reduced_axes, axes_reordering, added_axes, final_shapes =
--> 235 _reconstruct_from_shape(recipe, backend.shape(tensor))
236 tensor = backend.reshape(tensor, init_shapes)

File ~\anaconda3\Lib\site-packages\einops\einops.py:200, in _reconstruct_from_shape_uncached(self, shape)
199 if isinstance(length, int) and isinstance(known_product, int) and length % known_product != 0:
--> 200 raise EinopsError("Shape mismatch, can't divide axis of length {} in chunks of {}".format(
201 length, known_product))
203 unknown_axis = unknown_axes[0]

EinopsError: Shape mismatch, can't divide axis of length 6 in chunks of 4

During handling of the above exception, another exception occurred:

EinopsError Traceback (most recent call last)
Cell In[8], line 2
1 plt.figure(figsize = (32, 32))
----> 2 plt.imshow(get_latest_grid('baselines/session_24b49e0b'))

Cell In[5], line 3, in get_latest_grid(pth)
1 def get_latest_grid(pth):
2 imgs = np.array([np.array(Image.open(p)) for p in Path(pth).glob('curframe*.jpeg')])
----> 3 grid = rearrange(imgs, '(h2 w2) h w c -> (h2 h) (w2 w) c', w2=4)
4 return grid

File ~\anaconda3\Lib\site-packages\einops\einops.py:483, in rearrange(tensor, pattern, **axes_lengths)
481 raise TypeError("Rearrange can't be applied to an empty list")
482 tensor = get_backend(tensor[0]).stack_on_zeroth_dimension(tensor)
--> 483 return reduce(cast(Tensor, tensor), pattern, reduction='rearrange', **axes_lengths)

File ~\anaconda3\Lib\site-packages\einops\einops.py:420, in reduce(tensor, pattern, reduction, **axes_lengths)
418 message += '\n Input is list. '
419 message += 'Additional info: {}.'.format(axes_lengths)
--> 420 raise EinopsError(message + '\n {}'.format(e))

EinopsError: Error while processing rearrange-reduction pattern "(h2 w2) h w c -> (h2 h) (w2 w) c".
Input tensor shape: (6, 144, 160, 3). Additional info: {'w2': 4}.
Shape mismatch, can't divide axis of length 6 in chunks of 4

Can't get anything running on Windows.

I downloaded the files and have the game in the PokemonRedExperiments-master folder, but bringing up a Python window and trying to run anything always gives me 'invalid syntax'. What am I missing?

Disable agent from being able to use start/B buttons

It makes sense in my mind that further restricting what actions an agent can take will allow for faster learning and mastery of what it can do

I believe the start button will eventually be required to use HMs, but I believe that isn't required until Vermilion City?

Has anyone experimented with this? When I finish my current experiment I am happy to give it a try and see what happens.
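A reduced action space could be sketched as a thin wrapper (action names and the wrapper interface are hypothetical, not the repo's actual API):

```python
class FilteredActions:
    # Expose only a whitelist of buttons to the agent, so it cannot waste
    # exploration on start/B presses until those buttons become useful.
    def __init__(self, env, allowed=("up", "down", "left", "right", "a")):
        self.env = env
        self.allowed = list(allowed)  # start and b are deliberately excluded

    @property
    def n_actions(self):
        return len(self.allowed)

    def step(self, action_index):
        # Translate the agent's reduced action index into a real button press.
        return self.env.step(self.allowed[action_index])
```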

Error: CPU is stuck -> wrong rom!

when launching the run_pretrained_interactive.py

I will get the following:

  • window with pyboy is opening
  • terminal output:
step:     16 event:  0.00 level:  0.00 heal:  0.00 op_lvl:  0.00 dead: -0.00 badge:  0.00 explore:  0.00 sum:  0.00
4116     pyboy.core.cpu                 ERROR    CPU is stuck: 
A: 72, F: 10, B: 00, C: BA, D: 00, E: 04, HL: C0C0, SP: C0E4, PC: 801A ()
Opcode: []  Interrupts - IME: False, IE: 00001101, IF: 00010000
LCD Intr.: 62, LY:102, LYC:0
Timer Intr.: 65536
halted:False, interrupt_queued:False, stopped:False

I installed python version 3.10.13
I am on an M1 Macbook

Issues running on Ubuntu LXC - What am I doing wrong...

LXC Container: Ubuntu 22.04
vCores: 24
Ram: 128GB
GPU: GTX 1080ti
Access via SSH

I followed all the instructions and (I hope) have done it correctly.

When running run_baseline_parallel.py I'm getting the following error
Running the single thread baseline run_baseline.py seems to work though.

Is this an issue due to the errors from pyboy about the emulation_speed?
or is this an NNPACK issue due to my CPU not having AVX2?

root@pokeAI:~/PokemonRedExperiments/baselines# python3 run_baseline_parallel.py
UserWarning: Using SDL2 binaries from pysdl2-dll 2.28.0
5387     pyboy.plugins.window_headless  WARNING  This window type does not support frame-limiting. `pyboy.set_emulation_speed(...)` will have no effect, as it's always running at full speed.
[the same warning repeated for each of the remaining parallel environments]
Using cuda device
Wrapping the env in a VecTransposeImage.
[W NNPACK.cpp:64] Could not initialize NNPACK! Reason: Unsupported hardware.
step:     31 event:  0.00 level:  0.00 heal:  0.00 op_lvl:  0.00 dead: -0.00 badge:  0.00 explore:  0.03 sum:  0.03corrupted size vs. prev_size
step:     32 event:  0.00 level:  0.00 heal:  0.00 op_lvl:  0.00 dead: -0.00 badge:  0.00 explore:  0.05 sum:  0.05Traceback (most recent call last):
  File "/root/PokemonRedExperiments/baselines/run_baseline_parallel.py", line 65, in <module>
    model.learn(total_timesteps=(ep_length)*num_cpu*1000, callback=checkpoint_callback)
  File "/usr/local/lib/python3.10/dist-packages/stable_baselines3/ppo/ppo.py", line 308, in learn
    return super().learn(
  File "/usr/local/lib/python3.10/dist-packages/stable_baselines3/common/on_policy_algorithm.py", line 259, in learn
    continue_training = self.collect_rollouts(self.env, callback, self.rollout_buffer, n_rollout_steps=self.n_steps)
  File "/usr/local/lib/python3.10/dist-packages/stable_baselines3/common/on_policy_algorithm.py", line 178, in collect_rollouts
    new_obs, rewards, dones, infos = env.step(clipped_actions)
  File "/usr/local/lib/python3.10/dist-packages/stable_baselines3/common/vec_env/base_vec_env.py", line 197, in step
    return self.step_wait()
  File "/usr/local/lib/python3.10/dist-packages/stable_baselines3/common/vec_env/vec_transpose.py", line 95, in step_wait
    observations, rewards, dones, infos = self.venv.step_wait()
  File "/usr/local/lib/python3.10/dist-packages/stable_baselines3/common/vec_env/subproc_vec_env.py", line 130, in step_wait
    results = [remote.recv() for remote in self.remotes]
  File "/usr/local/lib/python3.10/dist-packages/stable_baselines3/common/vec_env/subproc_vec_env.py", line 130, in <listcomp>
    results = [remote.recv() for remote in self.remotes]
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 250, in recv
step:     32 event:  0.00 level:  0.00 heal:  0.00 op_lvl:  0.00 dead: -0.00 badge:  0.00 explore:  0.03 sum:  0.03    buf = self._recv_bytes()
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 414, in _recv_bytes
    buf = self._recv(4)
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 383, in _recv
    raise EOFError
EOFError

Incompatibility with WSL

Hi there,

I'm running the package with Python 3.11 under Debian in WSL. I keep getting an error which I can't find much about online; the stack trace is in the screenshot attached to the original issue.

run_baseline.py headless

For fun I'm messing around with some of the reward variables. However, when I run run_baseline.py with headless = False, it does not show the emulator UI.

It works perfectly fine when I run the interactive script. Am I missing something to make it appear elsewhere?

reward ai for choosing an effective move

I think a good way to improve the AI would be to add a reward for using an effective move, so it hopefully learns to use move types to its advantage.

Problems starting maybe with the SDL libraries?

Hi! I was trying to set up everything and I'm unsure how to proceed. After installing everything and running python run_pretrained_interactive.py I get the following output:

(pokelearn) User: ~/PokemonRedExperiments/baselines$ python run_pretrained_interactive.py
UserWarning: Using SDL2 binaries from pysdl2-dll 2.28.4
X Error of failed request: BadValue (integer parameter out of range for operation)
Major opcode of failed request: 148 (GLX)
Minor opcode of failed request: 3 (X_GLXCreateContext)
Value in failed request: 0x0
Serial number of failed request: 145
Current serial number in output stream: 146

Problem installing GitHub packages: "Cannot find command 'git' - do you have 'git' installed and in your PATH?"

C:\Users\Erick\Downloads\PokemonRedExperiments-master\PokemonRedExperiments-master\baselines>pip install -r requirements.txt
Collecting mediapy@ git+https://github.com/PWhiddy/mediapy.git@45101800d4f6adeffe814cad93de1db67c1bd614 (from -r requirements.txt (line 94))
Cloning https://github.com/PWhiddy/mediapy.git (to revision 45101800d4f6adeffe814cad93de1db67c1bd614) to c:\users\erick\appdata\local\temp\pip-install-n111e2vg\mediapy_764d51226f3847c0aa82f633e4c029ce
ERROR: Error [WinError 2] The system cannot find the file specified while executing command git version
ERROR: Cannot find command 'git' - do you have 'git' installed and in your PATH?

I have already checked that 'git' is in the PATH and I've also uninstalled git and reinstalled it but it's not working

Having More than one session

When I start python run_baseline_parallel.py
it always just does one game at a time. How do I start 10 or 20 at the same time?
Thank you

Issue in the requirements specifically "hnswlib"

Building wheels for collected packages: hnswlib
Building wheel for hnswlib (pyproject.toml) ... error
error: subprocess-exited-with-error

× Building wheel for hnswlib (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [5 lines of output]
running bdist_wheel
running build
running build_ext
building 'hnswlib' extension
error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for hnswlib
Failed to build hnswlib
ERROR: Could not build wheels for hnswlib, which is required to install pyproject.toml-based projects

I already installed VS Build Tools v17.7.5, which is higher than the required v14.0, and restarted hoping that would solve the issue, but I'm still stuck. I also asked ChatGPT, and its suggestion was the same: install the VS Build Tools.
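One thing worth verifying (a hedged sketch, not a guaranteed fix): installing Build Tools v17.x only helps if the "Desktop development with C++" workload was selected during install. A small check for which C++ compiler, if any, the current environment exposes on PATH:

```python
import shutil

def find_cpp_compiler():
    """Look for a C++ compiler on PATH. Caveat: on Windows, setuptools
    can also locate MSVC through the Visual Studio installer metadata,
    so None here is only a hint that the toolchain may be missing,
    while finding 'cl' (e.g. inside a Developer Command Prompt) is a
    good sign the build should work."""
    for name in ("cl", "g++", "clang++"):
        path = shutil.which(name)
        if path:
            return (name, path)
    return None

print(find_cpp_compiler())
```

If nothing is found, re-run the Build Tools installer and confirm the C++ workload is checked, then retry pip from a fresh terminal.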

Finding the training very slow

I wanted to start from the init state with no prior training. I ran it overnight for millions of steps across 24 CPUs and lots of iterations, and the AI still only makes it out of the starting room sometimes. It's confusing to me because the reward sum when it leaves the room is often higher than ~0.27, and when it doesn't, the most it reaches is about 0.17.

What it accomplished in those 8 hours was figuring out how to grab the potion from the PC, which is kinda cool, but I'm not even sure why it does that since there's no reward mechanism for picking up heal items.

Is there something I can change to speed up learning from optimal paths? I made plenty of changes to the code but I can't pinpoint the issue. All I can think of is that I decreased the multiplier in model.learn() from 1000 to 10, because it never saved the model at all since it was taking so many hours to get through one learn step. But there's gotta be some kind of tuning that could help it learn from rewarding paths more.

# Imports added so the script is self-contained
import glob
import multiprocessing
import os
import re
import signal
import sys
import time
import uuid
from pathlib import Path

from stable_baselines3 import PPO
from stable_baselines3.common.utils import set_random_seed
from stable_baselines3.common.vec_env import SubprocVecEnv

# Project-local modules (names assumed from the repo's baselines/ layout)
from red_gym_env import RedGymEnv
from argparse_pokemon import get_args, change_env

FILE_PATH = "output_for_simulations.txt"
num_cpu = 24  # Also sets the number of episodes per training iteration
env = None
model = None  # 'global' is a no-op at module scope; plain assignment suffices
currentLearnStep = 0

# Get all saved models, sorted newest (highest step count) first
saved_models = glob.glob('session_*/pokemans_*.zip')
saved_models.sort(key=lambda x: int(Path(x).stem.split("_")[1]), reverse=True)


# Initialize the output text file with placeholders
with open(FILE_PATH, 'w') as file:
    for rank in range(num_cpu):
        file.write(f"Rank {rank} - initializing...\n")

lock = multiprocessing.Lock()  # You're using multiprocessing, so use its Lock


def signal_handler(signum, frame):
    print("Interrupted! Closing environments...")
    save_model()
    if env is not None:
        env.close()
    sys.exit(0)

# Set the signal handler
signal.signal(signal.SIGINT, signal_handler)


def make_env(rank, env_conf, should_render = False, seed=0, pretraining_steps=0):
    def _init():
        env = RedGymEnv(env_conf, should_render, rank=rank, lock=lock, file_path=FILE_PATH, original_steps=pretraining_steps, ep_length=ep_length, num_ranks=num_cpu)  # Passing lock and file path
        env.reset(seed=(seed + rank))
        return env
    set_random_seed(seed)
    return _init


def save_model():
    if model is not None:
        print("current learn step +1: ", currentLearnStep + 1)
        print("pretraining: ", pretraining_steps)
        total_steps_so_far = pretraining_steps + ep_length*num_cpu*(currentLearnStep+1)
        print("total steps so far: ", total_steps_so_far)
        model_path = os.path.join(sess_path, f"pokemans_{total_steps_so_far}_steps")
        model.save(model_path)
        print("Saving Most recent model at: " ,total_steps_so_far, " steps.")


if __name__ == '__main__':

    ep_length = 512
    if saved_models:
        pretraining_steps = int(re.search(r'pokemans_(\d+)_steps', saved_models[0]).group(1))
    else:
        pretraining_steps = 0

    sess_path = f'session_{str(uuid.uuid4())[:8]}'
    if not os.path.exists(sess_path):
        os.makedirs(sess_path)
    args = get_args('run_baseline_parallel.py', ep_length=ep_length, sess_path=sess_path)

    env_config = {
                'headless': False, 'save_final_state': True, 'early_stop': False,
                'action_freq': 24, 'init_state': '../init.state', 'max_steps': ep_length, 
                'print_rewards': True, 'save_video': False, 'fast_video': True, 'session_path': sess_path,
                'gb_path': '../PokemonRed.gb', 'debug': False, 'sim_frame_dist': 2_000_000.0
            }
    
    env_config = change_env(env_config, args)
    
    env = SubprocVecEnv([make_env(i, env_config, should_render=(i == 0), pretraining_steps=pretraining_steps) for i in range(num_cpu)])
    
    #checkpoint_callback = CheckpointCallback(save_freq=ep_length, save_path=sess_path,
    #                                 name_prefix='poke')
    #env_checker.check_env(env)
    learn_steps = 20

    print("checking if saved models exist: ", saved_models)
    if saved_models:
        
        print(f'\nloading model: {saved_models[0]}')
        model = PPO.load(saved_models[0], env=env)
        print("Starting at Model with ",pretraining_steps, " steps.")
        model.n_steps = ep_length
        model.n_envs = num_cpu
        model.rollout_buffer.buffer_size = ep_length
        model.rollout_buffer.n_envs = num_cpu
        model.rollout_buffer.reset()
    else:
        print('\n Loading untrained Model: ')
        model = PPO('CnnPolicy', env, verbose=1, n_steps=ep_length, batch_size=512, n_epochs=1, gamma=0.999)
        pretraining_steps = 0
    
    for currentLearnStep in range(learn_steps):
        print("entering currentLearnStep: ", currentLearnStep)
        model.learn(total_timesteps=(ep_length)*num_cpu*10) #, callback=checkpoint_callback

        # Save the model
        save_model()
        
        
        # Render steps after learning
        obs = env.reset()
        for _ in range(100):  # Render 100 steps, adjust as needed
            env.render()
            action, _ = model.predict(obs)
            obs, _, _, _ = env.step(action)
        time.sleep(1)  # Pause for a second between iterations, adjust as needed

Don't get any video output

It seems to be running for me, although I needed to install some of the dependencies manually. I don't get a video screen, just text with updates on steps etc.

Not sure if this is because I'm using WSL, but I have ffmpeg installed and I'm using Python 3.
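WSL is the likely culprit: PyBoy opens an SDL2 window, and on Linux SDL needs an X11 or Wayland display to do so. Without one, only the console step/reward text appears (WSLg on Windows 11 provides a display; plain WSL does not). A quick heuristic check, assuming the standard DISPLAY/WAYLAND_DISPLAY variables:

```python
import os

def display_available():
    """Heuristic: SDL2 on Linux needs X11 (DISPLAY) or Wayland
    (WAYLAND_DISPLAY) to open a window; with neither set, the
    emulator can only run headless, so you see text output only."""
    return bool(os.environ.get("DISPLAY") or os.environ.get("WAYLAND_DISPLAY"))

print(display_available())
```

If this prints False inside WSL, installing an X server on Windows (or upgrading to WSLg) should make the game window appear.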


I'm using fish as my shell there, but I did try it with bash too.

I can help, DM me

Hi,

Awesome work! Have wanted Pokemon as an RL env for years. I dev pufferlib (pufferai.github.io), a library for aggregating RL environments and making them easier to work with. I've done an initial integration for your Pokemon Red Gym env and would like to help you with the project. I have a box capable of running experiments as well. My Discord is jsuarez5341, let's chat
