
Analysis and implementation of a modified training environment from the paper "Imitation Learning over Heterogeneous Agents with Restraining Bolts." (De Giacomo et al, 2020)

License: GNU General Public License v3.0

Python 100.00%
restraining-bolts imitation-learning heterogeneous-agents reinforcement-learning python


Imitation-Learning-over-Heterogeneous-Agents-with-Restraining-Bolts

Analysis and implementation of a modified training environment from the paper "Imitation Learning over Heterogeneous Agents with Restraining Bolts" (De Giacomo et al., 2020). This project is a modification of the code from the reference paper.

"A common problem in Reinforcement Learning (RL) is that the reward function is hard to express. This can be overcome by resorting to Inverse Reinforcement Learning (IRL), which consists in first obtaining a reward function from a set of execution traces generated by an expert agent, and then making the learning agent learn the expert’s behavior –this is known as Imitation Learning (IL). Typical IRL solutions rely on a numerical representation of the reward function, which raises problems related to the adopted optimization procedures. We describe an IL method where the execution traces generated by the expert agent, possibly via planning, are used to produce a logical (as opposed to numerical) specification of the reward function, to be incorporated into a device known as Restraining Bolt (RB). The RB can be attached to the learning agent to drive the learning process and ultimately make it imitate the expert. We show that IL can be applied to heterogeneous agents, with the expert, the learner and the RB using different representations of the environment’s actions and states, without specifying mappings among their representations."
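As a rough illustration of the idea in the abstract, the sketch below shows a Restraining Bolt as a small automaton that watches symbolic observations and pays an extra reward when its specification is satisfied. It is not the code from this repository or from the paper: the class names `DFA` and `RestrainingBolt`, the symbols `col0` and `col1`, the bonus value, and the self-loop on unknown symbols are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class DFA:
    """Deterministic finite automaton over symbolic observations."""
    initial: int
    accepting: set
    transitions: dict  # maps (state, symbol) -> next state

    def step(self, state, symbol):
        # Assumption: a (state, symbol) pair with no transition is a self-loop.
        return self.transitions.get((state, symbol), state)


class RestrainingBolt:
    """Tracks the DFA state and pays a bonus once, on first acceptance."""

    def __init__(self, dfa, bonus=10.0):
        self.dfa = dfa
        self.bonus = bonus  # hypothetical reward magnitude
        self.reset()

    def reset(self):
        self.state = self.dfa.initial
        self.paid = False
        return self.state

    def observe(self, symbol):
        """Advance on one symbol and return (bolt_state, extra_reward)."""
        self.state = self.dfa.step(self.state, symbol)
        reward = 0.0
        if self.state in self.dfa.accepting and not self.paid:
            reward, self.paid = self.bonus, True
        return self.state, reward


# Hypothetical specification: column 0 must be cleared before column 1.
dfa = DFA(initial=0, accepting={2},
          transitions={(0, "col0"): 1, (1, "col1"): 2})
bolt = RestrainingBolt(dfa)
rewards = [bolt.observe(s)[1] for s in ["none", "col0", "col1", "col1"]]
```

In a setup like this, the learner would act on the pair (its own state, bolt state), so it never needs to know how the bolt or the expert represent the environment.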

Work done for the Reasoning Agents course, in the artificial intelligence branch of the MSc in Artificial Intelligence and Robotics, 2020, during the coronavirus pandemic.

  • Giulia Cassarà
  • Ivan Colantoni
  • Fabian Fonseca
  • Damiano Zappia

Modified environment: Breakout

The environment received a set of four modifications to stress the boundaries of the implementation: vertical movement of the bricks, horizontal movement of the bricks, movement of the bricks around the plane, and the inclusion of a front paddle.

Combining these modifications yields three new case studies in which the state-space gap increases and the robustness of the implementation is tested.
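The three brick-movement modes can be pictured with a toy model. This README does not show how the movement is implemented, so the sketch below is only an assumption: the brick block is reduced to a single integer offset that moves one cell per step and reverses direction at the boundary. The names `MODES`, `BrickGrid`, and `simulate`, and the default bounds, are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical per-step velocity (dx, dy) for each brick-movement mode.
MODES = {"vertical": (0, 1), "horizontal": (1, 0), "plane": (1, 1)}


@dataclass
class BrickGrid:
    """Offset of the whole brick block inside a bounded area (bounds >= 1)."""
    x: int
    y: int
    max_x: int
    max_y: int
    dx: int
    dy: int

    def step(self):
        # Reverse direction on an axis when the next move would leave the area.
        if not 0 <= self.x + self.dx <= self.max_x:
            self.dx = -self.dx
        if not 0 <= self.y + self.dy <= self.max_y:
            self.dy = -self.dy
        self.x += self.dx
        self.y += self.dy
        return self.x, self.y


def simulate(mode, steps, max_x=2, max_y=1):
    """Return the sequence of block offsets for the given mode."""
    dx, dy = MODES[mode]
    grid = BrickGrid(0, 0, max_x, max_y, dx, dy)
    return [grid.step() for _ in range(steps)]


plane_path = simulate("plane", 4)
```

Even in this toy model, the plane mode visits more distinct offsets than either single-axis mode, which hints at why the state space seen by the agents grows across the three cases.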

Vertical movement with frontal paddle

The vertical movement of the bricks and the addition of a front paddle have no influence on the functioning of the expert agent, but they considerably change the behavior of the learner agent, which can now cooperate with the front paddle to earn more reward.

Horizontal movement with frontal paddle

The horizontal movement of the bricks raises the difficulty for the expert agent to a medium level, but the learner agent remains stable in its cooperation with the front paddle.

Plane movement with frontal paddle

The plane movement of the bricks makes the task hardest for the expert agent, which can no longer finish it because of a resonance phenomenon (visible only in the video of the experiment). Nevertheless, the traces obtained from the failed expert still generate a DFA that guides the learner agent to successful behavior.
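To make the step from traces to a DFA concrete, here is a minimal sketch that builds a prefix tree acceptor from a set of traces. The README does not say which DFA learning algorithm the project uses, so this is a deliberately simple stand-in, not the authors' method: it accepts exactly the traces it was given, whereas a real learner would also generalize, for example by merging states. The function names and the symbols `col0`, `col1`, `col2` are hypothetical.

```python
def build_prefix_tree(positive_traces):
    """Build a prefix tree acceptor; state 0 is the initial state.

    Returns (transitions, accepting), where transitions maps
    (state, symbol) -> next state.
    """
    transitions = {}
    accepting = set()
    next_state = 1
    for trace in positive_traces:
        state = 0
        for symbol in trace:
            key = (state, symbol)
            if key not in transitions:
                # First time this prefix is seen: allocate a fresh state.
                transitions[key] = next_state
                next_state += 1
            state = transitions[key]
        accepting.add(state)  # the end of a trace is an accepting state
    return transitions, accepting


def accepts(transitions, accepting, trace):
    """Return True if the automaton ends in an accepting state on trace."""
    state = 0
    for symbol in trace:
        key = (state, symbol)
        if key not in transitions:
            return False  # no transition: reject
        state = transitions[key]
    return state in accepting


# Hypothetical expert traces: the order in which brick columns were cleared.
traces = [["col0", "col1", "col2"], ["col0", "col2", "col1"]]
transitions, accepting = build_prefix_tree(traces)
```

The resulting automaton could then play the role of the specification inside a Restraining Bolt, rewarding the learner when its own run reaches an accepting state.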

Setup

Set your virtual environment up:

pipenv --python=python3.7

Install the dependencies in the main directory:

pipenv install

How to run

Activate the virtual environment in the current shell:

pipenv shell

Then run the code:

python3 -m breakout --rows 3 --cols 3 \
    --output-dir experiments/breakout-output \
    --overwrite

References

  • [1] De Giacomo, Giuseppe, et al. “Imitation Learning over Heterogeneous Agents with Restraining Bolts.” Association for the Advancement of Artificial Intelligence. 2020.
  • [2] Gaon, Brafman. “Reinforcement Learning with Non-Markovian Rewards.” Association for the Advancement of Artificial Intelligence. 2020.
