
griddly's People

Contributors

aadharna, bam4d, cool-rr, daveey, davidslayback, dependabot[bot], maichmueller, thaigun, vkopio


griddly's Issues

A* search can cause objects to teleport

Describe the bug

When the A* search runs and no complete path exists, the "best" partial path found so far is still used to calculate the action vector from the final possible action. This best path is calculated from the destination object of the search, so when there is no path back to the source object, the resulting vector can point to a location right next to the destination object, causing the source object to teleport there.

To Reproduce
Create a level with two separate rooms (no path between them), place an object in each room, and set the A* algorithm for those objects to chase an object in the opposite room.

Expected behavior
The objects should probably just walk randomly instead.
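One way to get that behaviour, sketched below in Python (the engine code is C++, so this is only illustrative; `find_path` is a BFS stand-in for the A* search and all names are assumptions): when the search cannot produce a complete path, take a single random step instead of reusing the partial best path.

```python
import random
from collections import deque

def find_path(grid, src, dst):
    """BFS stand-in for the engine's A* search; returns a list of cells or None."""
    prev, frontier = {src: None}, deque([src])
    while frontier:
        cell = frontier.popleft()
        if cell == dst:
            path = []
            while cell is not None:
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cell[0] + dx, cell[1] + dy)
            if nxt in grid and nxt not in prev:
                prev[nxt] = cell
                frontier.append(nxt)
    return None  # no complete path: the caller must NOT use a partial path

def next_move(grid, src, dst):
    """Return a single-step (dx, dy) toward dst, or a random step if unreachable."""
    path = find_path(grid, src, dst)
    if path and len(path) > 1:
        step = path[1]
    else:
        # Fall back to a random walk over valid neighbouring cells.
        neighbours = [(src[0] + dx, src[1] + dy)
                      for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))]
        options = [n for n in neighbours if n in grid] or [src]
        step = random.choice(options)
    # The chosen cell is always adjacent to src (or src itself), so the
    # object can never teleport across the map.
    return (step[0] - src[0], step[1] - src[1])
```

The key property is that the returned move is always a single step, whether or not a path exists.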


Make the item list scrollable, so screens with lower resolution/DPI do not drop items

Describe the bug
When zoomed in or on smaller screens, map items disappear from the UI

To Reproduce
Load Grafter Escape Rooms environment, resize screen so the height is smaller than the items list

Expected behavior
The items should be scrollable so they are still accessible.


Desktop (please complete the following information):
ALL versions/OSes

Cross Platform reproducibility

Describe the bug
Stochastic environments may not behave identically across platforms, even when run with the same seed.

To Reproduce
Steps to reproduce the behavior:

  1. Run a stochastic environment on platform A (e.g. mac/windows/linux)
  2. Run the same environment with the same seed on platform B (different from platform A)

The environment may behave differently due to the error stated in the Reddit thread above

Expected behavior
The environment states should be identical throughout any actions

Where to place custom reset callback wrapper class?

So, I took a first pass at the custom reset environment. Where in Griddly would you like this placed? I was thinking: python/griddly/util/rllib/environment/custom_reset_callback_wrapper.py or python/griddly/util/wrappers/custom_reset_callback_wrapper.py

Also, at first I was thinking that we might want to override your custom GymWrapper class. I went back and forth a few times before deciding on gym.Wrapper. Do you have an opinion on that?

import gym


class SetLevelWithCallback(gym.Wrapper):
    """GymWrapper to set the level with a callback function in Griddly.

    The callback_fn should output a string representation of the level.
    """

    def __init__(self, env, callback_fn):
        super(SetLevelWithCallback, self).__init__(env=env)
        self.create_level_fn = callback_fn

    def reset(self, **kwargs):
        level_string = self.create_level_fn()
        assert(isinstance(level_string, str))
        kwargs['level_string'] = level_string
        return self.env.reset(**kwargs)

    def step(self, action):
        return self.env.step(action)


if __name__ == "__main__":
    import gym
    from griddly import GymWrapperFactory
    from griddly import gd

    game = 'zelda_partially_observable'

    fpath = f'./src/Griddly/resources/games/Single-Player/GVGAI/{game}.yaml'

    wrapper = GymWrapperFactory()

    wrapper.build_gym_from_yaml(
        'Zelda-Adv',
        fpath,
        level=1,
        global_observer_type=gd.ObserverType.SPRITE_2D,
        player_observer_type=gd.ObserverType.BLOCK_2D
    )

    env = gym.make("GDY-Zelda-Adv-v0")
    s = env.reset()
    f = lambda: """wwwwwwwwwwwww
w....+++.+g.w
w.+.w+g.+...w
w...++Ag+.3.w
w+..+...+...w
w....w+++...w
w.........g.w
w.3...+++.+ww
wwwwwwwwwwwww"""

    env = SetLevelWithCallback(env, f)
    _ = env.reset()
    env.render(observer='global')
    import time
    time.sleep(2)
    env.close()

Better support for multiple agents.

Feature
At the moment, multiple agents can be added to GriddlyJS environments, but selecting which one is controllable is currently missing from the user interface.

Possible Implementation
We should improve the user interface so that, when multiple agents are detected, the user can select which agent is being controlled. This would allow users to easily test multi-agent game scenarios.

Assign colours based on the player

The engine assigns a random colour to the margin of each agent (see below). It would be a good feature to be able to assign colours based on the player, so that it is clearer which agent is which when there are many agents in the environment.

Provide a tutorial for how to use custom assets.

We do not currently have an easy way to select assets from Griddly's internal asset library, or a way to add custom assets.

We would like to provide an interface to select assets from the internal library, or to upload custom assets, so users have more flexibility in environment design.

Feature: Allow "AND" for termination conditions

The docs say this about Win conditions: "If any of these conditions are met, the player associated with this condition will win the game."
It would be useful to be able to use AND, instead of OR. Something like

Termination:
  Win:
    - and: [eq: [players_alive, 1], eq: [plr_done, 0]]
    - gte: [_steps, 100]

or

Termination:
  Win:
    - and: 
      - eq: [players_alive, 1]
      - eq: [plr_done, 0]
    - gte: [_steps, 100]

would mean the player wins if [the global variable players_alive reaches 1 and this player's plr_done is 0] or the step count reaches 100.

Again, depending on the implementation details, a similar structure could be used for action preconditions, to let the user decide whether they need AND/OR.
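A minimal sketch of how such nested conditions could be evaluated (this is not Griddly's implementation; the node format simply mirrors the proposed YAML, and all names are illustrative):

```python
def evaluate(condition, variables):
    """Recursively evaluate a nested and/or condition tree.

    `condition` mirrors the proposed YAML, e.g.
    {"and": [{"eq": ["players_alive", 1]}, {"eq": ["plr_done", 0]}]}.
    Leaf operands are variable names looked up in `variables`, or literals.
    """
    (op, args), = condition.items()
    if op == "and":
        return all(evaluate(c, variables) for c in args)
    if op == "or":
        return any(evaluate(c, variables) for c in args)
    lhs, rhs = (variables.get(a, a) for a in args)
    return {"eq": lhs == rhs, "gte": lhs >= rhs, "lte": lhs <= rhs}[op]

def player_wins(win_conditions, variables):
    # The top-level list keeps the existing OR semantics between conditions.
    return any(evaluate(c, variables) for c in win_conditions)
```

With this shape, adding AND is purely additive: existing flat condition lists keep their current OR behaviour.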

Python 3.10 release

I have successfully built and used Griddly in a Python 3.10 environment. It would be nice to have packages for 3.10 available on PyPI. Unless something is broken that I haven't seen, of course.

State access

Thanks for creating this tool and its environments, they look great! I was wondering whether you are planning to provide access to the state of the game: some sort of "observer" where, for example, rows are objects and columns are their attributes (x, y, type, etc.). If not, what would be the easiest way to access this information right now? I'm guessing the Vector observer gives this information partially.
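Some Griddly versions expose an `env.get_state()` call returning a dict with an `Objects` list; I'm not certain the schema below matches every version, so treat the field names as assumptions and check your version's output. Flattening such a dict into rows of object attributes could look like:

```python
def state_to_rows(state):
    """Flatten a Griddly-style state dict into rows of object attributes.

    Assumes the shape {"Objects": [{"Name", "Location": [x, y],
    "PlayerId", "Variables": {...}}, ...]}; the exact schema may differ
    between Griddly versions.
    """
    rows = []
    for obj in state.get("Objects", []):
        x, y = obj["Location"]
        row = {"name": obj["Name"], "x": x, "y": y,
               "player_id": obj.get("PlayerId", 0)}
        row.update(obj.get("Variables", {}))  # per-object variables as columns
        rows.append(row)
    return rows
```

With a real environment this would be something like `rows = state_to_rows(env.get_state())`, assuming `get_state()` exists in your version.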

Use OpenMP or SIMD to optimize observers

The vector observer, and the observers that use Vulkan for rendering, have certain loops that could easily be parallelized, which would significantly speed these observers up.

RGBA-to-RGB conversion; optimize this:
https://github.com/Bam4d/Griddly/blob/develop/src/Griddly/Core/Observers/Vulkan/VulkanDevice.cpp#L541

Optimize these loops: we should probably iterate over the objects and convert their x,y coordinates, rather than scanning every grid cell, to make this significantly faster:
https://github.com/Bam4d/Griddly/blob/develop/src/Griddly/Core/Observers/VectorObserver.cpp#L58
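The suggested access-pattern change can be sketched in Python/NumPy (the real VectorObserver is C++; this only illustrates the idea, and the object tuple layout is an assumption): iterate over the objects and scatter them into the observation tensor, instead of scanning every cell and asking what occupies it.

```python
import numpy as np

def vector_obs_from_objects(objects, channels, width, height):
    """Build a one-hot vector observation by iterating objects, not cells.

    `objects` is an iterable of (name, x, y); `channels` maps object name
    to channel index. This is O(num_objects) instead of O(width * height),
    which is the optimization suggested above.
    """
    obs = np.zeros((len(channels), width, height), dtype=np.uint8)
    for name, x, y in objects:
        obs[channels[name], x, y] = 1  # scatter each object into its channel
    return obs
```

For sparse grids (few objects, many empty cells) the object-iteration version touches far less memory per frame.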

Vulkan 1.3 on macOS requires `VK_INSTANCE_CREATE_ENUMERATE_PORTABILITY_BIT_KHR` to be set.

Describe the bug
The bug is related to the Vulkan portability enumeration extension: with Vulkan 1.3 loaders on macOS, `VK_INSTANCE_CREATE_ENUMERATE_PORTABILITY_BIT_KHR` must be set when creating the instance.

To Reproduce
Try to run Griddly with vulkan 1.3.XXX

Expected behavior
Should not crash.

Instead, device enumeration fails and Griddly crashes.

Screenshots
Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0 libsystem_kernel.dylib 0x7ff80bd9e112 __pthread_kill + 10
1 libsystem_pthread.dylib 0x7ff80bdd4214 pthread_kill + 263
2 libsystem_c.dylib 0x7ff80bd20d10 abort + 123
3 libvulkan.1.3.216.dylib 0x103501c40 vkEnumeratePhysicalDevices.cold.1 + 32
4 libvulkan.1.3.216.dylib

Desktop (please complete the following information):

  • OS: MacOS
  • Vulkan Version 1.3.XXX

GDY game functions are triggering in an undesirable order

So, this started with me trying to figure out whether the RANGE_BOX_AREA was actually a box, because I was noticing some weird behavior: the spider was nearby (within a box of size 2 of) the gnome while the gnome's light was on, but the reward I was getting was as if the spider was not nearby (see picture).

I have three functions:

  • a: exposed to the user; flips a local agent boolean
  • b: internal; checks a TRIGGER call and flips a local agent boolean; called when the spider is nearby
  • c: internal; returns a reward to the user based on the above variables; called every frame

I eventually noticed that these functions were being called in a bad order. I need them to run a -> b -> c, but Griddly runs them as a -> c -> b.

If I call function c from b, the ordering is correct, but because the trigger only fires when the object in question is in range, I cannot provide a reward when the object is out of range (a necessary case with negative rewards). If c is only called from b, then while the spider is not nearby, function c never runs, which means the gnome can freely turn on/leave on its light without punishment.

I also tried calling the trigger function directly from function a, but that didn't do anything to change the behaviour.

Screenshots

spider_IS_nearby_noGriddly <-- this should never have been possible, as the spider is always within a box of size 2 in this gridworld.

bad_ordering_of_fns

Desktop (please complete the following information):

  • Windows Version 1.4.2

yaml:

Version: "0.1"
Environment:
  Name: Particle Sensor Game
  Description: Multiple bang-bang sensors track a particle
  Observers:
    Block2D:
      TileSize: 24
    Isometric:
      TileSize: [ 32, 48 ]
      IsoTileHeight: 16
      IsoTileDepth: 4
      BackgroundTile: oryx/oryx_iso_dungeon/grass-1.png
    Vector:
      IncludePlayerId: true
  Player:
    AvatarObject: gnome
    Count: 1
    Observer:
      TrackAvatar: true
      Height: 5
      Width: 5
      OffsetX: 0
      OffsetY: 0
  Levels:
    - |
      s  .  .   .  .  
      .  .  .   .  . 
      .  .  g   .  .  
      .  .  .   .  .  
      .  .  .   .  .

    - |
      .  .  .  .  .  .  .  .  .  . 
      .  .  .  .  .  .  .  .  .  . 
      .  .  .  .  .  s  .  g  .  . 
      .  .  .  .  .  .  .  .  .  . 
      .  .  .  .  .  .  .  .  .  .
  Termination:
    Win:
      - eq: [spider:count, 0]

Actions:

  - Name: spider_random_movement
    InputMapping:
      Internal: true
    Behaviours:
      - Src:
          Object: spider
          Commands:
            - mov: _dest
            - exec:
                Action: spider_random_movement
                Randomize: true
                Delay: 1
        Dst:
          Object: [_empty, gnome]
      - Src:
          Object: spider
          Commands:
            - exec:
                Action: spider_random_movement
                Randomize: true
                Delay: 1
        Dst:
          Object: _boundary
      - Src:
          Object: spider
          Commands:
           - remove: true
           - reward: 1
        Dst:
          Object: right_exit

#  - Name: move
#    Behaviours:
#      - Src:
#          Object: gnome
#          Commands:
#            - mov: _dest
#        Dst:
#          Object: _empty

  - Name: switch
    InputMapping:
      Inputs:
        1:
          Description: flip switch
          VectorToDest: [ 0, 0 ]
      Relative: true
    Behaviours:
      # turn on spotlight
      - Src:
          Object: gnome
          Preconditions:
            - eq: [spotlight, 0]
          Commands:
            - set: [spotlight, 1]
            - set_tile: 1
            - print: "a: spotlight on"
#             - exec:
#                 Action: count_nearby_spider
        Dst:
          Object: gnome

      # turn off spotlight
      - Src:
          Object: gnome
          Preconditions:
            - eq: [spotlight, 1]
          Commands:
            - set: [spotlight, 0]
            - set_tile: 0
            - print: "a: spotlight off"
#             - exec:
#                 Action: count_nearby_spider
        Dst:
          Object: gnome

    # this does not capture if the spider is not nearby which is important
  - Name: count_nearby_spider
    Probability: 1.0
    Trigger:
      Type: RANGE_BOX_AREA
      Range: 2
    Behaviours:
      # If the spider is within 2 of the gnome and the gnome is on, give point
      - Src:
          Object: gnome
#          Preconditions:
#            - eq: [spotlight, 1]
          Commands:
            - if:
                Conditions:
                  eq:
                    - spotlight
                    - 1
                OnTrue:
                  - set: [spider_counter, 1]
                  - print: "b: spider nearby"
                OnFalse:
                  - set: [spider_counter, 0]
                  - print: "b: nearby spider not seen"
#            - exec:
#                Action: give_feedback
#                ActionId: 1
        Dst:
          Object: spider

  - Name: give_feedback
    InputMapping:
      Inputs:
        '1':
          Description: provide feedback to the agent(s)
          VectorToDest:
            - 0
            - 0
      Internal: true
    Behaviours:
      - Src:
          Object: gnome
          Preconditions:
            - eq:
                - spotlight
                - 1
          Commands:
            - if:
                Conditions:
                  eq:
                    - spider_counter
                    - 1
                OnTrue:
                  - reward: 1
                  - print: "c: spotlight on and spider nearby"
                OnFalse:
                  - reward: -1
                  - print: "c: spotlight on and spider not nearby"
            - exec:
                Action: give_feedback
                ActionId: 1
                Delay: 1
        Dst:
          Object: gnome
      - Src:
          Object: gnome
          Preconditions:
            - eq:
                - spotlight
                - 0
          Commands:
            - if:
                Conditions:
                  eq:
                    - spider_counter
                    - 1
                OnTrue:
                  - reward: 0
                  - print: "c: spotlight off and spider nearby"
                OnFalse:
                  - reward: 0
                  - print: "c: spotlight off and spider not nearby"
            - exec:
                Action: give_feedback
                ActionId: 1
                Delay: 1
        Dst:
          Object: gnome
Objects:
  - Name: gnome
    Z: 2
    MapCharacter: g
    InitialActions:
      - Action: give_feedback
        ActionId: 1
        Delay: 2
    Variables:
      - Name: spotlight
        InitialValue: 0
      - Name: spider_counter
        InitialValue: 0
    Observers:
      Isometric:
        - Image: oryx/oryx_iso_dungeon/avatars/gnome-1.png
        - Image: oryx/oryx_iso_dungeon/avatars/spider-fire-1.png
      Block2D:
        - Shape: square
          Color: [ 0.0, 0.8, 0.0 ]
          Scale: 0.5
        - Shape: triangle
          Color: [0.0, 0.5, 0.2]
          Scale: 0.8

  - Name: spider
    Z: 1
    InitialActions:
      - Action: spider_random_movement
        Randomize: true
    MapCharacter: s
    Observers:
      Isometric:
        - Image: oryx/oryx_iso_dungeon/avatars/spider-1.png
      Block2D:
        - Shape: triangle
          Color: [ 0.2, 0.2, 0.9 ]
          Scale: 0.5

  - Name: water
    MapCharacter: w
    Observers:
      Isometric:
        - Image: oryx/oryx_iso_dungeon/water-1.png
          Offset: [0, 4]
          TilingMode: ISO_FLOOR
      Block2D:
        - Color: [ 0.0, 0.0, 0.8 ]
          Shape: square

  - Name: right_exit
    MapCharacter: e
    Observers:
      Isometric:
        - Image: oryx/oryx_iso_dungeon/water-1.png
          Offset: [0, 4]
          TilingMode: ISO_FLOOR
      Block2D:
        - Color: [ 0.0, 0.0, 0.8 ]
          Shape: square

runner script

import os

from griddly import GymWrapperFactory, gd, GymWrapper
from griddly.RenderTools import VideoRecorder

if __name__ == "__main__":
    wrapper = GymWrapperFactory()

    name = "random_spiders_env"

    current_path = os.path.dirname(os.path.realpath(__file__))

    env = GymWrapper(
        "sensor_game_single_agent.yaml",
        player_observer_type=gd.ObserverType.VECTOR,
        global_observer_type=gd.ObserverType.ISOMETRIC,
        level=1,
        max_steps=1000,
    )
    env.enable_history(True)

    env.reset()

    global_recorder = VideoRecorder()
    global_visualization = env.render(observer="global", mode="rgb_array")
    global_recorder.start("global_video_test.mp4", global_visualization.shape)

    for i in range(1000):
        a = env.action_space.sample()
        obs, reward, done, info = env.step(a)
        
        env.render(observer="global")
        frame = env.render(observer="global", mode="rgb_array")

        global_recorder.add_frame(frame)

        if done:
            env.reset()

    global_recorder.close()
    env.close()

Write and automate python tests

There are over 300 C++ test cases that run when Griddly builds, which is super nice.

However, there are currently no Python tests :(

Integration tests written in Python would be a huge benefit!

Observations of dead agents

Describe the bug
Dead agents receive non-zero observations.

To Reproduce

  1. Open any multi-agent environment
  2. observe the observations of agents that are dead

Expected behavior
I think it is common for dead agents to receive all zeros as observations.

Additional context

  • Also, restricting the action-space of dead agents to only no-op is common!
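Until this is supported in the engine, a user-side workaround could be a thin wrapper that zeroes out the observations of agents flagged done. The masking itself is just (an illustrative sketch, not Griddly API; `mask_dead_agents` is a made-up helper):

```python
import numpy as np

def mask_dead_agents(observations, dones):
    """Zero the observations of dead agents, leaving live ones untouched.

    `observations` is a list of per-agent arrays, `dones` a parallel list
    of booleans. A gym-style wrapper could apply this in step() before
    returning; restricting dead agents to a no-op action would be handled
    separately via the action space.
    """
    return [np.zeros_like(obs) if done else obs
            for obs, done in zip(observations, dones)]
```

Note that this returns new arrays rather than mutating the originals, so the underlying observations stay intact for logging.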

Feature: Move a player to an arbitrary location (with or without actions)

What

Possibility to give actions any arbitrary location as the destination without enumerating all possible coordinates.

Alternatively, ability to move a player object to arbitrary (unoccupied) coordinates. For the user, the API could look something like this, or whatever is possible given the current implementation:

env.move_object(source_coords, dest_coords)
#or
env.move_player(player_id, dest_coords)

Why

My use case is to build a heatmap in order to validate an algorithm. At the moment I can do it by moving the agent step by step and calculating the needed value at each point, but this becomes complicated and error-prone if I want to maintain the agent's orientation, avoid occupied locations, etc.

I could also imagine a game where a player can for example shoot at any coordinate in the environment. This kind of game mechanic would also benefit from more flexible action target/destination, without enumerating each possibility.

Default Observer should be VECTOR

If the YAML is missing specifications for sprites or blocks and the observers are not explicitly set, the game can crash when it tries to render using the missing assets.

By default Griddly should use the VECTOR observer as it will not have this issue.

If users want to create environments without sprite/block/iso observers, then they should have to set the observation mode explicitly.

KeyError: `render_modes` with Gym == 0.25.0

Describe the bug
Trying to create an environment with gym.make results in an exception

File "/Users/ntoxeg/Library/Caches/pypoetry/virtualenvs/narca-622vlgQS-py3.10/lib/python3.10/site-packages/gym/envs/registration.py", line 625, in make
    render_modes = env_creator.metadata["render_modes"]
KeyError: 'render_modes'

To Reproduce
Steps to reproduce the behavior:
Install latest Gym (0.25.0 in my case).
Invoke something like

env = gym.make(
        "GDY-Drunk-Dwarf-v0",
        player_observer_type=gd.ObserverType.VECTOR,
        level=0,
   )

Expected behavior
Getting an instance of an environment.

Desktop (please complete the following information):

  • OS: macOS
  • Version 12.5

Test and support RLlib

Griddly does not really have extensive tests for libraries like RLlib.

It would be great to test and document examples using these frameworks!

Windows Build Missing Shaders

Describe the bug
SPRITE, BLOCK, ISOMETRIC Observers broken on Windows

To Reproduce
On Windows:

  1. pip install griddly
  2. try to create any environment that uses any renderer other than VECTOR

Expected behavior
Environment should load and be usable

Screenshots
If applicable, add screenshots to help explain your problem.

Desktop (please complete the following information):

  • OS: windows
  • Version 0.1.5 -> 0.1.8

Additional context
So Azure updated something that causes the CMake script to be unable to find bash (even though CMake itself is called from bash). The script that CMake calls is what compiles the shaders.

Because the shaders never get built, they are not included in the Griddly binary, which breaks Griddly for Windows.

[FEATURE]: Using objects attributes as a reward

It would be cool to be able to use object attributes, such as count, as a reward. For example, in the Clusters environment the reward is proportional to the number of in-place boxes, i.e. we could define the reward as
- reward: blue_block:count

Segmentation fault in subprocess after reset

Creating and resetting an environment in the main process, then creating a new environment in a subprocess and resetting that one, results in a segmentation fault.

To Reproduce

import multiprocessing as mp
import time

import gym
from griddly import gd


def act():
    env = gym.make("GDY-Clusters-v0")
    env.reset()

    time.sleep(5)
    print("Success")


def main():
    env = gym.make("GDY-Clusters-v0")
    
    # Remove this reset and no SIGSEGV
    env.reset()

    actor = mp.Process(target=act)
    actor.start()
    actor.join()


if __name__ == "__main__":
    main()

Desktop (please complete the following information):

  • Linux (Fedora 34)
  • Griddly v1.2.6

Additional context

I need the environment to give me the observation and action spaces, but apparently they are not available until after a reset. I found I could replace the reset in the main process with the following as a workaround, and this does not fault.

env.game.reset()
env.initialize_spaces()

Here is the top of the backtrace from the coredump. I don't have debugging symbols enabled, but I thought this might be enough. If you need more let me know.

#0  0x00007fa3766347cd in vkGetDeviceQueue () from /lib64/libvulkan.so.1
#1  0x00007fa3767d2b66 in vk::VulkanDevice::initDevice(bool) ()
   from /home/spoonb/workspace/agents/agent/env/lib/python3.8/site-packages/griddly/libs/python_griddly.cpython-38-x86_64-linux-gnu.so
#2  0x00007fa3767d7af3 in griddly::VulkanObserver::lazyInit() ()
   from /home/spoonb/workspace/agents/agent/env/lib/python3.8/site-packages/griddly/libs/python_griddly.cpython-38-x86_64-linux-gnu.so
#3  0x00007fa3767c1f28 in griddly::SpriteObserver::lazyInit() ()
   from /home/spoonb/workspace/agents/agent/env/lib/python3.8/site-packages/griddly/libs/python_griddly.cpython-38-x86_64-linux-gnu.so
#4  0x00007fa3767d85d9 in griddly::VulkanObserver::update() ()
   from /home/spoonb/workspace/agents/agent/env/lib/python3.8/site-packages/griddly/libs/python_griddly.cpython-38-x86_64-linux-gnu.so
#5  0x00007fa3766f2bbd in ?? ()
   from /home/spoonb/workspace/agents/agent/env/lib/python3.8/site-packages/griddly/libs/python_griddly.cpython-38-x86_64-linux-gnu.so
#6  0x00007fa3767118fe in ?? ()
   from /home/spoonb/workspace/agents/agent/env/lib/python3.8/site-packages/griddly/libs/python_griddly.cpython-38-x86_64-linux-gnu.so
#7  0x00007fa3766eb277 in ?? ()
   from /home/spoonb/workspace/agents/agent/env/lib/python3.8/site-packages/griddly/libs/python_griddly.cpython-38-x86_64-linux-gnu.so
...

I have several GPUs. Mesa has an environment variable (MESA_VK_DEVICE_SELECT) that lets me select the Vulkan device; if I select the (typically default) NVIDIA device this occurs, but if I select the AMD device (the one my monitor is connected to) it does NOT occur.

GymWrapper action space creation bug

Found some breaking behaviour in GymWrapper::_create_action_space().

if self.action_space is not None:
    for old_space, space in zip(self.action_space, action_space):
        space._np_random = old_space._np_random

If self.action_space is e.g. Discrete(2) while action_space is [Discrete(2), Discrete(2), ...], then the above line will break.

Usually this is masked: when the multiple agents have MultiDiscrete action spaces, the zip will unzip a self.action_space of MultiDiscrete([2, 5]) into [Discrete(2), Discrete(5)] and iterate on that, so an action_space of [MultiDiscrete([2, 5]), MultiDiscrete([2, 5])] behaves well.

I fixed it by adding:

if isinstance(self.action_space, gym.spaces.Discrete):
    self.action_space = [self.action_space]

between lines 417 and 418.


This occurred when running using a Multi-agent Env where each action has a Discrete action space.
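The fix amounts to normalising a bare space into a single-element list before zipping. A generic sketch of that guard (stand-in objects are used here instead of gym spaces, and `reseed_action_spaces` is a made-up name; the real fix would check `gym.spaces.Discrete` as above):

```python
def reseed_action_spaces(old_space, new_spaces):
    """Copy RNG state from an old action space onto newly created ones.

    Mirrors the reseeding loop in GymWrapper._create_action_space, but
    first wraps a bare (non-list) old space into a one-element list so
    that zip() pairs whole spaces instead of unzipping a composite one.
    """
    if old_space is not None:
        if not isinstance(old_space, (list, tuple)):
            old_space = [old_space]  # e.g. a bare Discrete(2)
        for old, new in zip(old_space, new_spaces):
            new._np_random = old._np_random
    return new_spaces
```

With the wrap in place, a single Discrete old space reseeds only the first new space, and composite spaces are never accidentally unzipped into their components.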


import os
import griddly
from pprint import pprint
from griddly import gd
from griddly.util.rllib.environment.core import RLlibEnv, RLlibMultiAgentWrapper

if __name__ == "__main__":

    env_config = {
        'environment_name': 'sensor',
        "yaml_file": 'sensor_game_multi_agent.yaml',
        # 'yaml_file': "D:\miniconda\envs\sensor\Lib\site-packages\griddly\\resources\games\Multi-Agent\\foragers.yaml",
        "global_observer_type": gd.ObserverType.ISOMETRIC,
        'player_observer_type': gd.ObserverType.BLOCK_2D,
        'max_steps': 250,
        'level': 0,
    }

    env = RLlibEnv(env_config)
    env.enable_history(True)
    # pprint(env.unwrapped.action_input_mappings)
    # pprint(env.action_space)
    # env.on_episode_start(0, 0)
    env = RLlibMultiAgentWrapper(env, env_config)
    env.enable_history(True)

Provide a tutorial on adding TensorflowJS models to GriddlyJS

This documentation comes in two parts,

Firstly we need to educate people on how to convert models from any library into the ONNX format, then convert this to a tensorflowJS model.

Secondly, we need to show step-by-step how to add these models to GriddlyJS to evaluate levels in the user interface.

Allow renderer selection in user interface.

Currently we select the best renderer heuristically in the user interface in order of availability:

e.g [Sprite2D, Block2D]

We should allow the user to specify which renderer should be used so that they can debug different view modes.

Tooling for adding automatic evaluation of levels

Ideally, GriddlyJS should be able to automatically evaluate new levels when they are added to projects.

This would allow a quick turnaround of generating and evaluating datasets against specific policies.

Add Python RLLib tests

RLLib integration has no test coverage. This makes it quite flaky and sometimes breaks during updates.

We need some tests to make sure it is not broken when things like GymWrapper are updated.

RLlibEnv: `ModuleNotFoundError: No module named 'python_griddly'`

Describe the bug
When trying to train an RLlib Trainer on a Griddly environment, ray crashes.

To Reproduce

  1. run python learn.py

Expected behavior
I expected the model to train, but it instead complains that the python_griddly module cannot be found.

Screenshots
Not exactly screenshots, but here's the relevant code and output:

Code:

import os
import sys

import ray
from ray import tune
from ray.rllib.agents.ppo import PPOTrainer
from ray.rllib.models import ModelCatalog
from ray.tune.registry import register_env

from griddly import gd
from griddly.util.rllib.torch import GAPAgent
from griddly.util.rllib.environment.core import RLlibEnv

if __name__ == '__main__':

    ray.init(num_cpus=8)

    env_name = "ray-griddly-env"

    register_env(env_name, RLlibEnv)
    ModelCatalog.register_custom_model("GAP", GAPAgent)

    max_training_steps = 1000

    config = {
        'framework': 'torch',
        'num_workers': 2,
        'num_envs_per_worker': 4,

        'model': {
            'custom_model': 'GAP',
            'custom_model_config': {}
        },
        'env': env_name,
        'env_config': {
            'record_video_config': {
                'frequency': 100
            },

            'random_level_on_reset': True,
            'yaml_file': 'Single-Player/Mini-Grid/minigrid-doggo.yaml',
            'global_observer_type': gd.ObserverType.SPRITE_2D,
            'max_steps': 1000,
        },
        'entropy_coeff_schedule': [
            [0, 0.01],
            [max_training_steps, 0.0]
        ],
        'lr_schedule': [
            [0, 0.0005],
            [max_training_steps, 0.0]
        ]
    }

    stop = {
        "timesteps_total": max_training_steps,
    }

    result = tune.run(PPOTrainer, config=config, stop=stop)

Output:

$ python learn.py
/home/rule/inbox/venv38/lib/python3.8/site-packages/ray/_private/services.py:238: UserWarning: Not all Ray Dashboard dependencies were found. To use the dashboard please install Ray using `pip install ray[default]`. To disable this message, set RAY_DISABLE_IMPORT_WARNING env var to '1'.
  warnings.warn(warning_message)
== Status ==
Memory usage on this node: 3.7/31.1 GiB
Using FIFO scheduling algorithm.
Resources requested: 3.0/8 CPUs, 0/0 GPUs, 0.0/17.64 GiB heap, 0.0/8.82 GiB objects
Result logdir: /home/rule/ray_results/PPO_2021-08-31_13-19-23
Number of trials: 1/1 (1 RUNNING)
+---------------------------------+----------+-------+
| Trial name                      | status   | loc   |
|---------------------------------+----------+-------|
| PPO_ray-griddly-env_f812d_00000 | RUNNING  |       |
+---------------------------------+----------+-------+


2021-08-31 13:19:25,076	ERROR trial_runner.py:773 -- Trial PPO_ray-griddly-env_f812d_00000: Error processing event.
Traceback (most recent call last):
  File "/home/rule/inbox/venv38/lib/python3.8/site-packages/ray/tune/trial_runner.py", line 739, in _process_trial
    results = self.trial_executor.fetch_result(trial)
  File "/home/rule/inbox/venv38/lib/python3.8/site-packages/ray/tune/ray_trial_executor.py", line 746, in fetch_result
    result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
  File "/home/rule/inbox/venv38/lib/python3.8/site-packages/ray/_private/client_mode_hook.py", line 82, in wrapper
    return func(*args, **kwargs)
  File "/home/rule/inbox/venv38/lib/python3.8/site-packages/ray/worker.py", line 1623, in get
    raise value
ray.exceptions.RayActorError: The actor died because of an error raised in its creation task, ray::PPO.__init__() (pid=35587, ip=10.63.143.152)
  Some of the input arguments for this task could not be computed:
ray.exceptions.RaySystemError: System error: No module named 'python_griddly'
traceback: Traceback (most recent call last):
  File "/home/rule/inbox/venv38/lib/python3.8/site-packages/ray/serialization.py", line 254, in deserialize_objects
    obj = self._deserialize_object(data, metadata, object_ref)
  File "/home/rule/inbox/venv38/lib/python3.8/site-packages/ray/serialization.py", line 190, in _deserialize_object
    return self._deserialize_msgpack_data(data, metadata_fields)
  File "/home/rule/inbox/venv38/lib/python3.8/site-packages/ray/serialization.py", line 168, in _deserialize_msgpack_data
    python_objects = self._deserialize_pickle5_data(pickle5_data)
  File "/home/rule/inbox/venv38/lib/python3.8/site-packages/ray/serialization.py", line 158, in _deserialize_pickle5_data
    obj = pickle.loads(in_band)
ModuleNotFoundError: No module named 'python_griddly'
Result for PPO_ray-griddly-env_f812d_00000:
  {}

== Status ==
Memory usage on this node: 3.8/31.1 GiB
Using FIFO scheduling algorithm.
Resources requested: 0/8 CPUs, 0/0 GPUs, 0.0/17.64 GiB heap, 0.0/8.82 GiB objects
Result logdir: /home/rule/ray_results/PPO_2021-08-31_13-19-23
Number of trials: 1/1 (1 ERROR)
+---------------------------------+----------+-------+
| Trial name                      | status   | loc   |
|---------------------------------+----------+-------|
| PPO_ray-griddly-env_f812d_00000 | ERROR    |       |
+---------------------------------+----------+-------+
Number of errored trials: 1
+---------------------------------+--------------+----------------------------------------------------------------------------------------------------------------+
| Trial name                      |   # failures | error file                                                                                                     |
|---------------------------------+--------------+----------------------------------------------------------------------------------------------------------------|
| PPO_ray-griddly-env_f812d_00000 |            1 | /home/rule/ray_results/PPO_2021-08-31_13-19-23/PPO_ray-griddly-env_f812d_00000_0_2021-08-31_13-19-24/error.txt |
+---------------------------------+--------------+----------------------------------------------------------------------------------------------------------------+

(pid=35587) 2021-08-31 13:19:25,074	ERROR serialization.py:256 -- No module named 'python_griddly'
(pid=35587) Traceback (most recent call last):
(pid=35587)   File "/home/rule/inbox/venv38/lib/python3.8/site-packages/ray/serialization.py", line 254, in deserialize_objects
(pid=35587)     obj = self._deserialize_object(data, metadata, object_ref)
(pid=35587)   File "/home/rule/inbox/venv38/lib/python3.8/site-packages/ray/serialization.py", line 190, in _deserialize_object
(pid=35587)     return self._deserialize_msgpack_data(data, metadata_fields)
(pid=35587)   File "/home/rule/inbox/venv38/lib/python3.8/site-packages/ray/serialization.py", line 168, in _deserialize_msgpack_data
(pid=35587)     python_objects = self._deserialize_pickle5_data(pickle5_data)
(pid=35587)   File "/home/rule/inbox/venv38/lib/python3.8/site-packages/ray/serialization.py", line 158, in _deserialize_pickle5_data
(pid=35587)     obj = pickle.loads(in_band)
(pid=35587) ModuleNotFoundError: No module named 'python_griddly'
(pid=35587) 2021-08-31 13:19:25,074	ERROR worker.py:428 -- Exception raised in creation task: The actor died because of an error raised in its creation task, ray::PPO.__init__() (pid=35587, ip=10.63.143.152)
(pid=35587)   Some of the input arguments for this task could not be computed:
(pid=35587) ray.exceptions.RaySystemError: System error: No module named 'python_griddly'
(pid=35587) traceback: Traceback (most recent call last):
(pid=35587)   File "/home/rule/inbox/venv38/lib/python3.8/site-packages/ray/serialization.py", line 254, in deserialize_objects
(pid=35587)     obj = self._deserialize_object(data, metadata, object_ref)
(pid=35587)   File "/home/rule/inbox/venv38/lib/python3.8/site-packages/ray/serialization.py", line 190, in _deserialize_object
(pid=35587)     return self._deserialize_msgpack_data(data, metadata_fields)
(pid=35587)   File "/home/rule/inbox/venv38/lib/python3.8/site-packages/ray/serialization.py", line 168, in _deserialize_msgpack_data
(pid=35587)     python_objects = self._deserialize_pickle5_data(pickle5_data)
(pid=35587)   File "/home/rule/inbox/venv38/lib/python3.8/site-packages/ray/serialization.py", line 158, in _deserialize_pickle5_data
(pid=35587)     obj = pickle.loads(in_band)
(pid=35587) ModuleNotFoundError: No module named 'python_griddly'
Traceback (most recent call last):
  File "learn.py", line 59, in <module>
    result = tune.run(PPOTrainer, config=config, stop=stop)
  File "/home/rule/inbox/venv38/lib/python3.8/site-packages/ray/tune/tune.py", line 555, in run
    raise TuneError("Trials did not complete", incomplete_trials)
ray.tune.error.TuneError: ('Trials did not complete', [PPO_ray-griddly-env_f812d_00000])

Desktop (please complete the following information):

  • OS: linux (NixOS)
  • Version 1.2.6
  • Relevant package versions:
    Package                    Version
    -------------------------- ------------
    griddly                    1.2.6
    gym                        0.19.0
    ray                        1.6.0
    redis                      3.5.3
    torch                      1.8.1
    

Additional context
I'm running this inside a nix shell, which sometimes has problems in setting up the right environment variables, particularly PATH-like variables. Could this simply be an issue of setting the right paths?
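One common cause of this error is that a griddly object captured in the Tune config gets pickled on the driver and then fails to deserialize in the Ray workers. A sketch of a possible fix, not a confirmed one: defer the griddly import into the env-creator function, so each worker resolves `python_griddly` from its own environment instead of unpickling it. `make_env_creator` is a hypothetical helper name.

```python
def make_env_creator(environment_name):
    def env_creator(env_config):
        # Deferred import: this runs inside the worker process when the env
        # is built, so nothing griddly-related has to survive pickling.
        from griddly.util.rllib.environment.core import RLlibEnv
        return RLlibEnv({**env_config, "environment_name": environment_name})
    return env_creator

# Registered with Ray via something like:
#   from ray.tune.registry import register_env
#   register_env("ray-griddly-env", make_env_creator("ray-griddly-env"))
creator = make_env_creator("ray-griddly-env")
```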

Importing griddly prints an empty line

Importing griddly prints an empty line:

⋊> ~/s/enn-incubator on clemens/hyperstate ⨯ pip list | grep griddly
griddly             1.2.36
⋊> ~/s/enn-incubator on clemens/hyperstate ⨯ python
Python 3.7.3 (default, Mar 27 2019, 22:11:17)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import griddly

>>>

This causes a lot of empty lines to get printed in https://github.com/entity-neural-network/incubator when spawning multiple processes that all import griddly.
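Until the import itself is silenced, a stopgap is to redirect stdout around the import. A minimal sketch of the technique using only the standard library; `print()` stands in here for the import's stray blank line, since the real fix would wrap `import griddly`:

```python
import contextlib
import io

buf = io.StringIO()
with contextlib.redirect_stdout(buf):
    # import griddly   # the real import goes here
    print()            # stand-in for the stray blank line the import emits

# The newline ends up in buf instead of on the terminal.
```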

Segmentation fault when both observers as NONE

The following produces a segmentation fault. Ideally it would produce a warning. Using latest version available with pip (1.2.6).

env = gym.make(
    "GDY-Clusters-v0",
    player_observer_type=gd.ObserverType.NONE,
    global_observer_type=gd.ObserverType.NONE
)
Segmentation fault (core dumped)
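The warning the report asks for could look like the guard below. This is a sketch only, using plain strings in place of the `gd.ObserverType` enum values; `validate_observers` is a hypothetical helper, not part of Griddly.

```python
def validate_observers(player_observer_type, global_observer_type):
    # Raise a clear error instead of letting the native code segfault
    # when both observers are disabled.
    if player_observer_type == "NONE" and global_observer_type == "NONE":
        raise ValueError(
            "At least one of player_observer_type/global_observer_type "
            "must not be NONE (both NONE currently crashes the engine)."
        )
```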

OS:

  • Linux
  • Fedora 34

Compliance with Gym API

It seems like GymWrapper does not fully comply with the 0.22.0 Gym API (which is what is stated in requirements.txt).
There were lots of subtle changes in the Gym API that people probably don't immediately notice.

Some issues I noticed:

Seeding

  • Usually, when None is passed as a seed, some (unreproducible) source of entropy is used for seeding. In GymWrapper, a constant seed is used. This can lead to problems for people who just call env.seed(), expecting to get a randomized environment.
  • env.seed should return a list of seeds that were used. Currently, None is returned
  • GymWrapper.seed also seeds the action/observation spaces. This is usually not done in Gym and may lead to problems for people who first seed the action spaces and then the entire environment. IMHO this isn't a big issue, and I would actually argue it is the preferable approach, but it's not what Gym does
  • We usually seed an environment before we reset it. If you take the example from https://griddly.readthedocs.io/en/latest/getting-started/gym/index.html and call env.seed() before reset, you get an exception.
  • These issues are somewhat secondary because the seed method is being deprecated, see the next point
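For concreteness, a sketch of seeding that follows the Gym convention described above (entropy when no seed is given, and the list of seeds actually used returned to the caller). The class and attribute names are illustrative, not Griddly's:

```python
import random

class SeedSketch:
    def seed(self, seed=None):
        if seed is None:
            # Unreproducible entropy when no seed is supplied,
            # matching the usual Gym behaviour.
            seed = random.SystemRandom().randrange(2**31)
        self._rng = random.Random(seed)  # stand-in for gym's np_random
        return [seed]  # Gym expects the list of seeds that were used
```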

Reset

  • Reset should accept the keyword argument seed. If it is not None, the environment will be seeded and then reset. If it is None but the environment has never been seeded before, some source of entropy will be used to seed the environment anyways.
  • In the new API, we should be able to pass the keyword argument return_info to env.reset. If this is true, a tuple (obs, info) should be returned. Otherwise, just the observation.
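The two reset requirements above can be sketched as a tiny mixin. This is a shape illustration under the 0.22.0 API as described in this issue, with a placeholder observation; it is not GymWrapper's implementation:

```python
class ResetSketch:
    def __init__(self):
        self._seeded = False

    def seed(self, seed=None):
        self._seeded = True
        return [seed]

    def reset(self, *, seed=None, return_info=False):
        # Seed on reset when asked, or fall back to entropy if the
        # environment has never been seeded before.
        if seed is not None or not self._seeded:
            self.seed(seed)
        obs = 0  # placeholder observation
        return (obs, {}) if return_info else obs
```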

Metadata

  • A new format for the render-related metadata was introduced. The metadata should contain the keys "render_modes" and "render_fps". "render.modes" should be removed.

Rendering

I honestly can't say much about this because I'm not certain I completely understand the implementation in this repository yet.

  • It looks like the render-mode is fixed to some degree during initialization (ASCII vs pixels (?)), is that right? There is an upcoming API change in Gym that will also use this approach so this is not a big issue. However, one might want to start using the format that will be assumed by Gym (a single render_mode keyword iirc).
  • In human-rendering, environments usually return a boolean, not a numpy array
  • The metadata currently advertises that human-rendering and rgb_array-rendering are available. However, it seems like an ASCII representation is also supported. Is that true? That's also a valid way to render environments, but the metadata should be adjusted accordingly.

These are mostly non-breaking changes. The only exception may be the seeding stuff :/

Once these issues are sorted out, the API compliance can be checked via the check_env function from gym.utils.env_checker :)

ASCII render is missing 'dots' which breaks env.reset

When I use the levels generated by the EnvironmentGeneratorGenerator to reset a regular RLlibEnv environment with the newly created level, I get a ValueError: Invalid number of characters... error.

import gym

from griddly import GymWrapperFactory, gd
from griddly.util.environment_generator_generator import EnvironmentGeneratorGenerator
from griddly.util.rllib.environment.core import RLlibEnv

def build_generator(test_name, yaml_file):
    egg = EnvironmentGeneratorGenerator(yaml_file=yaml_file)
    generator_yaml = egg.generate_env_yaml((10,10))
    
    rllib_config_dict = {
        'environment_name': test_name,
        'yaml_string': generator_yaml,
        'global_observer_type': gd.ObserverType.ASCII,
        'player_observer_type': gd.ObserverType.ASCII,
    }

    env = RLlibEnv(rllib_config_dict)
    env.reset()
    return env

yaml_path = 'Single-Player/GVGAI/Zelda.yaml'

genv = build_generator('foo', yaml_path)

for i in range(0, 100):
    action = genv.action_space.sample()
    obs, reward, done, info = genv.step(action)

player_ascii_string = genv.render(observer=0)

rllib_config = {'environment_name': 'zelda',
                 'yaml_file': yaml_path,
                 'level': 0,
                 'max_steps': 250,
                 'global_observer_type': gd.ObserverType.SPRITE_2D,
                 'player_observer_type': gd.ObserverType.VECTOR,
                 'random_level_on_reset': False,
                 }

env = RLlibEnv(rllib_config)

state = env.reset(level_string=player_ascii_string)
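A hedged workaround, assuming the invalid-character error comes from rendered rows of unequal length: pad every row of the ASCII level to the same width with the floor character before passing it to reset. `pad_ascii_level` is a hypothetical helper, and '.' is assumed to be the floor glyph in the target GDY:

```python
def pad_ascii_level(level_string, floor_char="."):
    """Right-pad every row so the level string is rectangular."""
    rows = [row for row in level_string.splitlines() if row]
    width = max(len(row) for row in rows)
    return "\n".join(row.ljust(width, floor_char) for row in rows)

# Usage: state = env.reset(level_string=pad_ascii_level(player_ascii_string))
```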

Screenshots
image

image

Desktop (please complete the following information):

  • OS: windows
  • Version 1.2.1

Issue Running on HPC

Issue
My team and I have incorporated Griddly in our research and I would like to run some code on our HPC (Redhat cluster). Question: Can I sidestep Griddly's Vulkan dependency if I am running it headless?

Installing Griddly "as is" produces the following error:

Traceback (most recent call last):
  File "mypython.py", line 22, in <module>
    import griddly
  File "/path/myvenv/lib/python3.7/site-packages/griddly/__init__.py", line 16, in <module>
    gd = importlib.import_module('python_griddly')
  File "/afs/cad/linux/anaconda3.7/anaconda/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ImportError: libvulkan.so.1: cannot open shared object file: No such file or directory

Thank you!
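The traceback above shows the shared library is required at import time, before any rendering happens, so a headless install still needs the Vulkan loader on the library path. A small standard-library diagnostic to run before importing griddly; it only checks whether the loader can locate libvulkan:

```python
import ctypes.util

def vulkan_loader_present():
    # Returns the library name (e.g. "libvulkan.so.1") if the dynamic
    # loader can find it, or None if it cannot.
    return ctypes.util.find_library("vulkan")

if vulkan_loader_present() is None:
    print("libvulkan not found; install the Vulkan loader "
          "or add its location to LD_LIBRARY_PATH")
```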

Tooling for adding automatic evaluation of levels

Ideally, GriddlyJS should be able to automatically evaluate new levels when they are added to projects.

This would allow a quick turnaround of generating and evaluating datasets against specific policies.

Griddly rollout example

Hello,
I have tested most of Griddly's functions and I have trained the Clusters (10x10) example using GAAgent from griddly.util.rllib.torch. However, I was not able to roll out using the checkpoint. Here is my error:
RuntimeError: Given groups=1, weight of size [32, 10, 3, 3], expected input[1, 312, 240, 3] to have 10 channels, but got 312 channels instead
Do you have an example of code where you are able to use the checkpoint for agent.compute_action(state)?
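A guess at the cause: the error suggests the rollout env produced pixel observations (312x240x3) while the agent was trained on 10-channel VECTOR observations, i.e. the observer types diverged between training and rollout. A sketch, with string stand-ins for the `gd.ObserverType` values and a hypothetical `rollout_config` helper, of reusing the training config so they cannot diverge:

```python
def rollout_config(train_config, **overrides):
    # Start from the exact training config so observer types match,
    # then apply only rollout-specific overrides.
    cfg = dict(train_config)
    cfg.update(overrides)
    return cfg

train_cfg = {"player_observer_type": "VECTOR", "level": 0}
cfg = rollout_config(train_cfg, random_level_on_reset=False)
```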

Thanks

Fix Schema

Some schema items for GDY are:

  • Not well documented: vague descriptions of what the properties can be used for.
  • Incorrectly specified: Validation for command lists and terminations

No way to install on Apple Silicon (M1)?

Hi!

I switched to Apple silicon M1 macbook from intel macbook and I'm no longer able to install griddly.

I'm using python 3.8 and miniconda 4.10.0

(dl) btak@akhil-mac-air python % pip install griddly
ERROR: Could not find a version that satisfies the requirement griddly (from versions: none)
ERROR: No matching distribution found for griddly

I cloned the repo (master), went to python dir and ran python setup.py install. The package was installed, but I get the following error when I run my env:

Traceback (most recent call last):
  File "/Users/btak/code/reward_lab/rl_talk/env_demo_sokoban.py", line 2, in <module>
    from griddly import gd
  File "/opt/homebrew/Caskroom/miniforge/base/envs/dl/lib/python3.8/site-packages/griddly-1.2.1-py3.8-macosx-11.0-arm64.egg/griddly/__init__.py", line 16, in <module>
    gd = importlib.import_module('python_griddly')
  File "/opt/homebrew/Caskroom/miniforge/base/envs/dl/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named 'python_griddly'

It is in this line:

# Load the binary
gd = importlib.import_module('python_griddly')

From the lines above that line, it looks like maybe I'm missing 'libs' and 'Debug' folders? I checked and I didn't find those dirs.

# The python libs are found in the current directory
module_path = os.path.dirname(os.path.realpath(__file__))
libs_path = os.path.join(module_path, 'libs')

sys.path.extend([libs_path])

debug_path = os.path.join(module_path, '../../Debug/bin')
sys.path.extend([debug_path])

It's possible that I'm in a completely wrong path. Please help

IndexError when using proximity triggers with large trigger range values.

I am currently trying to create an environment where a team of agents (jellies) competes with enemy units (gnomes) to win a shootout game. The gnomes are part of the environment and use proximity triggers to fire on the jellies. They also use an A* search to home in on the jellies. It seems that when a large value of Range is set in this block:

 Trigger:
      Type: RANGE_BOX_AREA
      Range: 20  # This is a large value

... the environment fails immediately with the following error:

Traceback (most recent call last):
  File "/Users/tekier/uni/diss/smac_lite/aliens/aliens.py", line 29, in <module>
    obs, reward, done, info = env.step(env.action_space.sample())
  File "/opt/anaconda3/envs/smac_lite/lib/python3.8/site-packages/gym/wrappers/order_enforcing.py", line 37, in step
    return self.env.step(action)
  File "/opt/anaconda3/envs/smac_lite/lib/python3.8/site-packages/gym/wrappers/step_api_compatibility.py", line 52, in step
    step_returns = self.env.step(action)
  File "/opt/anaconda3/envs/smac_lite/lib/python3.8/site-packages/griddly/GymWrapper.py", line 197, in step
    reward, done, info = self.game.step_parallel(action_data)
IndexError: unordered_map::at: key not found

When a small value is used for Range, e.g. 1, it runs for longer but eventually fails with the same error once the gnomes approach the jellies. Basically, no matter what Range value is set, it eventually fails with the same error. This makes me think that something is going out of bounds of the map, since larger values potentially make it go out of bounds sooner, which is why it's failing more quickly.

The zip I have attached contains everything required to recreate the bug. You just need to run the Python file.

To recreate:
aliens.zip

  1. Run aliens.py.
  2. With any luck, it should just fail with the error I have described above.

My environment:
MacOS Monterey 12.2
Python 3.8.13
Griddly version 1.4.2

Block2D rendering failing with IndexError.

I am currently trying to create an environment where a team of agents (jellies) compete with enemy units (gnomes) to win a shootout game. The gnomes are part of the environment and make use of A* and proximity triggers to fire on the jellies.

So far, I have been rendering the environment only using gd.ObserverType.ISOMETRIC. I tried to switch to gd.ObserverType.BLOCK_2D by specifying the following sections in the GDY file.

# Specifying this block for each object under the Observers key.
Block2D:
  - Shape: circle  # e.g.
    Color: [ x, y, z ]
    Scale: 1.0

# Specifying in the Environment section also under Observers
Block2D:
   TileSize: 30

When I run the program though, what I get is a couple of time steps rendering and then the following error:

Traceback (most recent call last):
  File "/Users/tekier/uni/diss/smac_lite/aliens/aliens.py", line 31, in <module>
    obs, reward, done, info = env.step(env.action_space.sample())
  File "/opt/anaconda3/envs/smac_lite/lib/python3.8/site-packages/gym/wrappers/order_enforcing.py", line 37, in step
    return self.env.step(action)
  File "/opt/anaconda3/envs/smac_lite/lib/python3.8/site-packages/gym/wrappers/step_api_compatibility.py", line 52, in step
    step_returns = self.env.step(action)
  File "/opt/anaconda3/envs/smac_lite/lib/python3.8/site-packages/griddly/GymWrapper.py", line 363, in step
    self._player_last_observation[p] = self._get_observation(self._players[p].observe(),
IndexError: map::at:  key not found

Hopefully, this is case of me just missing something from the GDY file, but I've had a look at some examples and it seems like I've got everything I need.

To Recreate
aliens.zip

  1. Download and unzip the attached folder.
  2. Run aliens.py

My environment:
MacOS Monterey 12.2
Python 3.8.13
Griddly version 1.4.3
