farama-foundation / arcade-learning-environment
The Arcade Learning Environment (ALE) -- a platform for AI research.
License: GNU General Public License v2.0
in ale_interface.hpp and main.cpp
Not sure if we can consolidate the two (it's not possible to include main.cpp in the ale_interface.hpp file b/c of multiple definitions of main). The other option would be to use a #define to define the version number. There are other #defines located in src/common/Constants.h.
I'm using the ALE C lib, and I'm trying to run multiple separate games from different threads. Up until about 3 threads this is stable and seems to work fine, but at thread 4 I start getting this error:
File: [...]/Arcade-Learning-Environment/src/emucore/TIA.cxx, Line 1147
Expression: (s1 % 4 == 0) && (s2 % 8 == 0)
An unknown error occurred
Each thread locally invokes ALE_new and loads its own ROM, so they're not sharing instances. This problem does not occur when running multiple programs that individually link with the ALE C shared library with &.
This should probably be reproduced by someone else first to see if this is a local anomaly.
Document the FIFO protocol implemented in fifo_controller.cpp.
In manual.pdf, the description of run-length encoding in section 6.2.2 specifies that lengths are encoded as 1 less than the actual run length. This doesn't seem to be correct.
makefile.mac links against the directory /opt/local/lib. This directory doesn't exist on my machine and results in an error when building. Is this line necessary?:
https://github.com/mgbellemare/Arcade-Learning-Environment/blob/master/makefile.mac#L16
Found an odd segfault in Montezuma's Revenge. A specific sequence of actions causes the emulator to segfault: take noop actions until frame 49126, at which point the agent takes action 17. The segfault follows promptly. All of these actions are legal. The bug is most easily reproduced by replacing some of the code in the random agent and then running it on Montezuma's Revenge.
Random Agent:
#include "RandomAgent.hpp"
#include "random_tools.h"

RandomAgent::RandomAgent(OSystem* _osystem, RomSettings* _settings) :
    PlayerAgent(_osystem, _settings) {
}

Action RandomAgent::act() {
    static int frameNum = 0;
    frameNum++;
    // Take noop actions on every frame except the one that triggers the crash.
    Action a = Action(0); //legal_actions[rand() % legal_actions.size()];
    if (frameNum == 49126)
        a = Action(17);
    //return choice(&available_actions);
    return a;
}
Output from run:
Arcade-Learning-Environment$ ./ale -player_agent random_agent -max_num_frames 50000 ~/projects/ale-assets/roms/montezuma_revenge.bin
A.L.E: Arcade Learning Environment (version 0.3)
[Powered by Stella]
Use -help for help screen.
Game console created:
ROM file: /home/matthew/projects/ale-assets/roms/montezuma_revenge.bin
Cart Name: Montezuma's Revenge - Starring Panama Joe (1983) (Parker Bros)
Cart MD5: 3347a6dd59049b15a38394aa2dafa585
Display Format: AUTO-DETECT ==> NTSC
ROM Size: 8192
Bankswitch Type: AUTO-DETECT ==> E0
Running ROM file...
Random Seed: Time
Game will be controlled internally.
Segmentation fault
Backtrace of the segfault shows that it's segfaulting in the nitty gritty emulator code:
Program received signal SIGSEGV, Segmentation fault.
0x000000000043452b in TIA::poke (this=0x7d3a00, addr=17, value=193 '\301') at src/emucore/TIA.cxx:2431
2431 Int8 when = ourPlayerPositionResetWhenTable[myNUSIZ1 & 7][myPOSP1][newx];
(gdb) bt
#0 0x000000000043452b in TIA::poke (this=0x7d3a00, addr=17, value=193 '\301') at src/emucore/TIA.cxx:2431
#1 0x000000000045ece0 in System::poke (this=0x7d2630, addr=273, value=193 '\301')
at src/emucore/m6502/src/System.cxx:341
#2 0x000000000044cb2d in M6502Low::poke (this=0x7d35c0, address=273, value=193 '\301')
at src/emucore/m6502/src/M6502Low.cxx:72
#3 0x000000000044b8d3 in M6502Low::execute (this=0x7d35c0, number=19580) at src/emucore/m6502/src/M6502Low.ins:4205
#4 0x000000000042f7ec in TIA::update (this=0x7d3a00) at src/emucore/TIA.cxx:516
#5 0x0000000000423941 in OSystem::mainLoop (this=0x7cea70) at src/emucore/OSystem.cxx:748
#6 0x000000000040809a in main (argc=6, argv=0x7fffffffdff8) at src/main.cpp:171
I feel like some crucial information is missing from the help display:
In some cases it seems that cmake is unable to detect the version of the SDL library that is installed, leading to issues in the build process as described here:
https://groups.google.com/forum/#!topic/arcade-learning-environment/-PnnqTg35HM
CMakeLists.txt (https://github.com/mgbellemare/Arcade-Learning-Environment/blob/master/CMakeLists.txt#L21) poses issues when SDL version is not detected.
Hi, thanks for this project! I have the following issue:
I installed the most current version of ALE on a box running Ubuntu 14.04.
I am using Python 2.7.6.
I built with cmake as described in the manual (with the use SDL flag set to true) and ran pip install .
in the root directory of the project.
cding to doc/examples and executing python python_example.py breakout.bin
works as advertised.
However, changing line 23 to USE_SDL = True
yields the following runtime error:
Screen display requires directive __USE_SDL to be defined. Please recompile with flag '-D__USE_SDL'. See makefile for more information.
This is being thrown from ./src/emucore/OSystem.cxx, according to grep.
It comes from the following block:
`
std::cerr << "Screen display requires directive __USE_SDL to be defined."
<< " Please recompile with flag '-D__USE_SDL'."
<< " See makefile for more information."
<< std::endl;
exit(1);
`
I thought this might have something to do with cmake not playing nicely with SDL, but sharedLibraryInterfaceExample runs with no issues (so I think I can rule that out?).
I will keep looking at this, and I'm sure I'll be able to find some sort of resolution by messing with the ale_python interface to get it to define __USE_SDL, but I raise this issue because I expect that you guys would like this to work out of the box.
If we try to set a nonexistent flag, the ALE does not stop us. E.g.: ale.setFloat("stochasticity", 0.00)
works just fine (no exception is thrown, no error message is given, it just ignores the command).
I don't think we should allow it. I think we should give at least a warning. While nonexistent flags may be unlikely, typos are not.
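One way to surface typos like this is to check each key against the set of declared settings before storing a value. The sketch below is a hypothetical stand-alone class: StrictSettings, declareFloat and the key names in any usage are illustrative assumptions, not ALE's actual Settings API. It throws on an unknown key instead of silently ignoring it.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical illustration, not ALE's Settings class.
class StrictSettings {
public:
    // Declare a float setting together with its default value.
    void declareFloat(const std::string& key, float default_value) {
        floats_[key] = default_value;
    }

    // Set a float setting; throws if the key was never declared.
    void setFloat(const std::string& key, float value) {
        std::map<std::string, float>::iterator it = floats_.find(key);
        if (it == floats_.end()) {
            throw std::invalid_argument("Unknown setting: " + key);
        }
        it->second = value;
    }

    // Read a float setting; throws if the key was never declared.
    float getFloat(const std::string& key) const {
        std::map<std::string, float>::const_iterator it = floats_.find(key);
        if (it == floats_.end()) {
            throw std::invalid_argument("Unknown setting: " + key);
        }
        return it->second;
    }

private:
    std::map<std::string, float> floats_;
};
```

A softer variant could print a warning to std::cerr and return instead of throwing, which would match the "at least a warning" suggestion.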
MacPorts installs SDL in /opt/local/include/SDL. CMake correctly finds the library, and adds
-I/opt/local/include/SDL
to the include path used to compile ALE. However, a number of files include "SDL/SDL.h" (rather than SDL), and so building with SDL enabled fails on my machine.
Suggested fix: directly include "SDL.h" and fix the include paths in makefile.mac/unix accordingly.
We should prepare for a move to C++11. Among suggested changes are #81's move from auto_ptr to unique_ptr, and the use of the official MT RNG instead of TinyMT.
It's not too hard to loop the score in Space Invaders. When that happens the reward will be a high negative number.
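To make the failure mode concrete: if the reward is computed as the difference between consecutive score readings, a wrap from a large score back to a small one produces a large negative delta. The sketch below is a hypothetical helper (wrappedReward is not part of ALE) and assumes the score only counts upward and wraps to 0 after reaching modulus - 1. The modulus for any particular game is an assumption the caller has to supply.

```cpp
#include <cassert>

// Hypothetical helper, not part of ALE: converts two consecutive score
// readings into a reward, assuming the score never legitimately decreases
// and wraps back to 0 after reaching `modulus - 1`.
int wrappedReward(int previous_score, int current_score, int modulus) {
    assert(modulus > 0);
    int delta = current_score - previous_score;
    // A naive delta becomes strongly negative when the score wraps around,
    // so add one full cycle back in that case.
    if (delta < 0) {
        delta += modulus;
    }
    return delta;
}
```

This correction is only valid for games where the score cannot go down. A game with penalties would need a different rule.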
installing.pdf, java_tutorial.pdf and ale_java_agent are all missing from the GitHub repo. Depending on how we intend to package the release, this may not be an issue.
This is necessary to enable some degree of "stochasticity" when initializing the game. See M6532.cxx.
Two issues with the RLGlueController:
Solution to problem 1: By default set action_spec="ACTIONS INTS (0 legal_action_set_size)". However do we want to include player B's actions?
Possible solutions to problem 2:
We have implemented the stochasticity in stella_environment.cpp, and the stochasticity parameter is stored in the variable repeat_prob. However, it cannot be set using flags: it is hard-coded, so changing it means editing the default value and recompiling the ALE. We should probably add a flag for it.
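As a rough illustration of what a configurable repeat probability could look like, the sketch below passes the probability in through a constructor rather than hard-coding it. StickyActionFilter and its members are hypothetical names, and the logic is an assumption about the general idea (repeat the previous action with probability repeat_prob), not a copy of what stella_environment.cpp does.

```cpp
#include <cassert>
#include <random>

// Hypothetical illustration, not ALE code: with probability repeat_prob the
// previously executed action is repeated instead of the requested one.
class StickyActionFilter {
public:
    StickyActionFilter(double repeat_prob, unsigned int seed)
        : repeat_prob_(repeat_prob), rng_(seed), uniform_(0.0, 1.0), last_action_(0) {
        assert(repeat_prob >= 0.0 && repeat_prob <= 1.0);
    }

    // Returns the action that would actually be sent to the emulator.
    int filter(int requested_action) {
        // Draw u in [0, 1); accept the new action unless u falls below repeat_prob.
        if (uniform_(rng_) >= repeat_prob_) {
            last_action_ = requested_action;
        }
        return last_action_;
    }

private:
    double repeat_prob_;
    std::mt19937 rng_;
    std::uniform_real_distribution<double> uniform_;
    int last_action_;  // starts as action 0 (assumed here to be a noop)
};
```

With this shape, the value read from a command-line flag or settings file would simply be forwarded to the constructor, so no recompilation is needed to change it.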
Hello Marc,
Thanks much for putting out this great source code. I know this is an emulator platform.
I'd like to ask if it is possible to modify the games (e.g. size/shape of characters, frequency of appearance of enemy objects) slightly in some way using this source code?
I greatly appreciate your answer.
Best,
Anh
At the moment, the C interface allows saving and loading of the game state onto an internal, opaque stack. It would be useful to be able to explicitly tell it to save/load a state from disk.
I've been running UCT from an FFI wrapper, which can take several days, and it would be nice to be able to halt my program and restart my computer (or make it robust against power outages or whatever).
I'd guess this probably isn't much more complicated than saving the RAM query and a few small values like lives and score from the game-specific glue, but the wrapper lacks any real way to load this even if you can grab the values.
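A minimal sketch of the disk side of such a feature, assuming the emulator state can be exported as an opaque byte buffer. saveSnapshot and loadSnapshot are hypothetical names, and how the buffer would be obtained from or fed back into the ALE is deliberately left out, since the wrapper currently has no such entry point.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helpers, not part of the ALE C interface.

// Writes an opaque state buffer to disk; returns false on any I/O failure.
bool saveSnapshot(const std::string& path, const std::vector<char>& bytes) {
    std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
    if (!out) {
        return false;
    }
    if (!bytes.empty()) {
        out.write(&bytes[0], static_cast<std::streamsize>(bytes.size()));
    }
    out.flush();
    return out.good();
}

// Reads a previously saved buffer back; returns false if the file cannot be opened.
bool loadSnapshot(const std::string& path, std::vector<char>& bytes) {
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in) {
        return false;
    }
    bytes.assign(std::istreambuf_iterator<char>(in),
                 std::istreambuf_iterator<char>());
    return true;
}
```

For robustness against power outages, a real implementation would probably write to a temporary file and rename it over the old snapshot, so that a crash mid-write cannot corrupt the only copy.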
Dear all,
I modified sharedLibraryInterfaceExample.cpp in docs/examples by adding the line
ale.set("display_screen", "true"); // or ale.set("display_screen", true);
directly before calling loadRom
in order to display the emulator screen using SDL. However, the program crashes with the following error:
A.L.E: Arcade Learning Environment (version 0.4.4)
[Powered by Stella]
Use -help for help screen.
Warning: couldn't load settings file: ./stellarc
Game console created:
ROM file: ../../atari/roms/pong.bin
Cart Name: Video Olympics (1978) (Atari)
Cart MD5: 60e0ea3cbe0913d39803477945e9e5ec
Display Format: AUTO-DETECT ==> NTSC
ROM Size: 2048
Bankswitch Type: AUTO-DETECT ==> 2K
2015-02-06 09:55:02.372 sharedLibraryInterfaceExample[5124:507] *** Terminating app due to uncaught exception 'NSInternalInconsistencyException', reason: 'Error (1000) creating CGSWindow on line 263'
*** First throw call stack:
(
0 CoreFoundation 0x00007fff8b32125c __exceptionPreprocess + 172
1 libobjc.A.dylib 0x00007fff8956de75 objc_exception_throw + 43
2 CoreFoundation 0x00007fff8b32110c +[NSException raise:format:] + 204
3 AppKit 0x00007fff839f4e95 _NSCreateWindowWithOpaqueShape2 + 1403
4 AppKit 0x00007fff839f3a21 -[NSWindow _commonAwake] + 3720
5 AppKit 0x00007fff838cf400 -[NSWindow _commonInitFrame:styleMask:backing:defer:] + 882
6 AppKit 0x00007fff838ce882 -[NSWindow _initContent:styleMask:backing:defer:contentView:] + 1054
7 AppKit 0x00007fff838ce458 -[NSWindow initWithContentRect:styleMask:backing:defer:] + 45
8 libSDL-1.2.0.dylib 0x00000001093f1ced -[SDL_QuartzWindow initWithContentRect:styleMask:backing:defer:] + 285
9 libSDL-1.2.0.dylib 0x00000001093ef1c4 QZ_SetVideoMode + 1076
10 libSDL-1.2.0.dylib 0x00000001093e633f SDL_SetVideoMode + 527
11 libale.so 0x0000000109244894 _ZN13DisplayScreenC2EP12ExportScreenii + 148
)
libc++abi.dylib: terminating with uncaught exception of type NSException
Abort trap: 6
Running the ALE-binary with -display_screen true works fine. Any ideas what might be going on here? Is displaying the screen through ALEInterface not yet supported?
Thanks
Michael
The Java examples should include an RL-Glue integration example using the codec, not only the FIFO interface. I am personally having problems integrating ALE and RL-Glue from Java.
Sound is often output with a delay with respect to the displayed screen. Cause unknown.
PR #93 introduced Tetris support. Currently the SDL visualization jitters in Tetris, even though the recorded screens (e.g. via the video recording example) are stable.
On some games it is possible to loop the score. This has always been the case, but it is a greater issue on games where it is possible to learn policies that play forever, e.g. on Atlantis. There should be a unifying scheme for dealing with score wrapping in games where this can occur. Either:
Note that there are games, e.g. Krull, where a simple agent can loop the score without achieving anything meaningful. This needs to be taken into consideration when evaluating agents that loop the score.
I'm following the manual instructions for using cmake
I get all the libraries built but at the end of the make I get
Linking CXX executable ale
Undefined symbols for architecture x86_64:
"_main", referenced from:
implicit entry/start for main executable
(maybe you meant: _SDL_main)
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
EventStreamer.cxx is deprecated and contains code which doesn't compile. The Makefiles handle this but under Visual Studio the default build will not compile.
We should move the common portions of these two methods into new methods, for clarity. In particular the copying of the image/RAM.
I guess a useful feature we can look at adding is a way to save (and restore) the internal emulator state. We do not even have to support it during an episode, but it would be nice if someone could save the TinyMT state and later restore their job. Right now, anyone who wants to checkpoint their code has to restart TinyMT.
One-liner patch:
diff --git a/src/emucore/Settings.cxx b/src/emucore/Settings.cxx
index a3ac06a..4c502b3 100644
--- a/src/emucore/Settings.cxx
+++ b/src/emucore/Settings.cxx
@@ -300,7 +300,7 @@ void Settings::usage() {
 " Environment arguments:\n"
 " -max_num_episodes n\n"
 " The program will quit after this number of episodes. 0 means never.\n"
-" default: 0\n"
+" default: 10\n"
 " -max_num_frames m\n"
 " The program will quit after this number of frames. 0 means never.\n"
 " default: 0\n"
CMake would be a good candidate.
Current loadROM sets player-agent, display-screen, etc. in a very static manner. It would be nice to have a more flexible way to parametrize calls to loadROM (rather than rely on arguments to the function). One possibility is to use a struct containing relevant parameters, or directly provide a command-line.
Rewards are not handled correctly when frame_skip is greater than 1. StellaEnvironment::act in stella_environment.cpp correctly returns the sum of rewards for the skipped frames, but the controllers in /src/controllers ignore that return value. Instead they directly access m_settings->getReward(), which returns the reward for the latest frame only. You can observe this by running, for example:
./ale -frame_skip 5 -display_screen true roms/asterix.bin
The final rewards on the game screen will be greater than the rewards reported in the terminal.
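The discrepancy can be illustrated without the emulator. In the sketch below, per_frame_rewards stands for the rewards produced by each emulated frame within a single agent step. Both function names are hypothetical and only mirror the two behaviours described above (sum over skipped frames versus latest frame only).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration, not ALE code.

// What the agent should receive: the sum over all frames in the step.
int summedReward(const std::vector<int>& per_frame_rewards) {
    int total = 0;
    for (std::size_t i = 0; i < per_frame_rewards.size(); ++i) {
        total += per_frame_rewards[i];
    }
    return total;
}

// What reading only the latest frame's reward yields.
int latestReward(const std::vector<int>& per_frame_rewards) {
    return per_frame_rewards.empty() ? 0 : per_frame_rewards.back();
}
```

With a frame skip of 5 and rewards of 0, 10, 0, 20, 0 on the individual frames, the summed value is 30 while the latest-frame value is 0, which is the kind of gap between the on-screen score and the reported rewards described above.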
On Ubuntu 15.04, SDL.h is under /usr/include/SDL. cmake doesn't add that to the include path so compiling just dies:
g++ -Wall -Wno-multichar -Wunused -fno-rtti -fPIC -O3 -fomit-frame-pointer -DRLGENV_NOMAINLOOP -DUNIX -DHAS_ALTIVEC -DUSE_NASM -DBSPF_UNIX -DHAVE_INTTYPES -DWINDOWED_SUPPORT -DHAVE_GETTIMEOFDAY -DSNAPSHOT_SUPPORT -D__USE_SDL -D__USE_RLGLUE -Isrc/controllers -Isrc/os_dependent -I/usr/include -Isrc/environment -Isrc/games -Isrc/emucore -Isrc/emucore/m6502/src -Isrc/emucore/m6502/src/bspf/src -Isrc/common -Isrc/controllers -Isrc/agents -Isrc/environment -c -o src/main.o src/main.cpp
In file included from src/emucore/OSystem.hxx:40:0,
from src/main.cpp:23:
src/emucore/../common/display_screen.h:28:17: fatal error: SDL.h: No such file or directory
#include "SDL.h"
^
compilation terminated.
CMake does not always include the parent path of the SDL directory. The fix is to #include <SDL.h> instead.
None of the arguments specified after the rom file seem to be read by the command line parser.
Try:
./ale -max_num_episodes 4 ~/projects/ale-assets/roms/asterix.bin -player_agent random_agent
or some variation. Not sure if this is intended?
The following is a C++ example program showing the use of ale_interface. I think it would be useful to include it with the documentation.
This is the compile command:
g++ -I /home/matthew/projects/Arcade-Learning-Environment/src/ -L /home/matthew/projects/Arcade-Learning-Environment interfaceExample.cpp -lale -lz
And here is the program interfaceExample.cpp:
#include <cstdlib>
#include <iostream>
#include <ale_interface.hpp>
using namespace std;

int main(int argc, char** argv) {
    if (argc < 2) {
        std::cerr << "Usage: " << argv[0] << " rom_file" << std::endl;
        return 1;
    }
    ALEInterface ale;
    // Load the ROM file
    ale.loadROM(argv[1], false);
    // Get the vector of legal actions
    ActionVect legal_actions = ale.getLegalActionSet();
    // Play 10 episodes
    for (int episode=0; episode<10; episode++) {
        float totalReward = 0;
        while (!ale.game_over()) {
            // Pick a uniformly random legal action
            Action a = legal_actions[rand() % legal_actions.size()];
            // Apply the action and get the resulting reward
            float reward = ale.act(a);
            totalReward += reward;
        }
        cout << "Episode " << episode << " ended with score: " << totalReward << endl;
        ale.reset_game();
    }
    return 0;
}
This is more of a long-term issue, but I feel like the phosphor blending should be available in a common class rather than exclusive to fifo_controller.
This is probably low priority, but I feel like the handling of the player_agent flag could be done better. If no player agent is specified or an invalid one is specified the console does not tell you anything or exit with an error. I think it would be helpful to new users if it listed possible player agents when none was specified and told the user if an invalid agent was requested.
Relevant code in bottom of PlayerAgent.cpp
Currently updating the version number is messy business. We should use a #define or other mechanism to centralize the version number. Best would be to first find out how other projects do it.
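A minimal sketch of the #define approach, assuming a single shared header that both ale_interface.hpp and main.cpp would include. The macro name ALE_VERSION_STRING, the function welcomeBanner and the "X.Y.Z" placeholder are all hypothetical. The banner text follows the format seen in the ALE startup output.

```cpp
#include <cassert>
#include <string>

// Contents of a hypothetical shared header (for example alongside the other
// #defines in src/common/Constants.h). "X.Y.Z" is a placeholder, not a real
// release number.
#ifndef ALE_VERSION_STRING
#define ALE_VERSION_STRING "X.Y.Z"
#endif

// Builds the startup banner from the single version definition, so the
// number only has to be updated in one place.
inline std::string welcomeBanner() {
    return std::string("A.L.E: Arcade Learning Environment (version ")
           + ALE_VERSION_STRING + ")";
}
```

An alternative some projects use is to have the build system generate this header from a version variable, which would tie in with the CMake build.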
The current save/load functionality of the StellaEnvironment class does not correctly preserve m_frame_number and m_episode_frame_number.
Stellarc isn't being used for much anymore (save for the cpu=low flag, which I suspect most people aren't aware of). We should tidy up around this file, i.e. by doing one or all of the following:
Screen recording is synchronized with frame skip. This leads to low-quality movies when frame skip is used, because the skipped frames are dropped from the recording.
I have issues with SDL that I haven't been able to resolve.
When trying to display screen I get
./ale: symbol lookup error: /usr/local/lib/libSDL-1.2.so.0: undefined symbol: _XGetRequest
The problem seems related to:
PerlGameDev/SDL#228
But my installed version of SDL is the latest on ubuntu 14.04 and seems to be after they patched this problem.
Is this an ale problem?
I've been trying to build the ALE on msys. I fixed a few small compilation errors, which you can see by looking at the diffs on my fork:
https://github.com/Jragonmiris/Arcade-Learning-Environment
It mostly just involved explicitly including <time.h> in a couple places, and deleting an argument to mkdir.
However, I'm getting the following linker error for ale.exe
CMakeFiles/ale-bin.dir/objects.a(ale_interface.cpp.obj):ale_interface.cpp:(.text+0x5cf): undefined reference to `OSystemWin32::OSystemWin32()'
CMakeFiles/ale-bin.dir/objects.a(ale_interface.cpp.obj):ale_interface.cpp:(.text+0x5fd): undefined reference to `SettingsWin32::SettingsWin32(OSystem*)'
collect2.exe: error: ld returned 1 exit status
CMakeFiles/ale-bin.dir/build.make:3509: recipe for target '../ale.exe' failed
make[2]: *** [../ale.exe] Error 1
CMakeFiles/Makefile2:60: recipe for target 'CMakeFiles/ale-bin.dir/all' failed
make[1]: *** [CMakeFiles/ale-bin.dir/all] Error 2
Makefile:75: recipe for target 'all' failed
make: *** [all] Error 2
The makefile was generated with -G "MSYS Makefiles". I'm not sure where to proceed from here, perhaps there are some defines somewhere that are being wonky? Because it seems like the object files for the windows systems likely aren't being built.
The current colour averaging scheme is 4 years old and maps two NTSC frames to one NTSC frame. This implies that the resulting colour space is smaller than it should be. The actual quantization (going from NTSC -> RGB -> NTSC) is also crude.
Without deprecating the existing scheme (which serves the purpose, e.g., of providing frames drawn from a small colour space), it would be nice to bring it up to date, perhaps avoiding the re-conversion to NTSC and instead producing a new "average-NTSC" colour space. Alternatives include providing proper RGB averaging routines, but these should probably be done in agent space.
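For reference, an RGB averaging routine done in agent space can be very small. The sketch below is a hypothetical example (Rgb, averageChannel and averageRgb are not ALE types or functions) that averages two pixels channel by channel and stays in RGB, so no re-quantization to the NTSC palette takes place.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical pixel type, not an ALE structure.
struct Rgb {
    std::uint8_t r;
    std::uint8_t g;
    std::uint8_t b;
};

// Averages one channel; the sum is widened first so it cannot overflow.
// Integer division rounds down.
static std::uint8_t averageChannel(std::uint8_t x, std::uint8_t y) {
    return static_cast<std::uint8_t>(
        (static_cast<unsigned int>(x) + static_cast<unsigned int>(y)) / 2u);
}

// Averages the same pixel from two consecutive frames, channel by channel.
Rgb averageRgb(const Rgb& a, const Rgb& b) {
    Rgb out;
    out.r = averageChannel(a.r, b.r);
    out.g = averageChannel(a.g, b.g);
    out.b = averageChannel(a.b, b.b);
    return out;
}
```

An agent would apply this to each pixel pair of two consecutive frames after converting them from palette indices to RGB.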
In version 0.4 the Colour Averaging parameter was set to true by default. Currently it is set to false by default. This is bad because it breaks backward compatibility.
I will fix this as soon as you agree that it has to be changed (I messed it up). I have received complaints from people getting worse results in the new version, and I assume this is the cause.