
modelicagym's Introduction


ModelicaGym: Applying Reinforcement Learning to Modelica Models

This ModelicaGym toolbox was developed to employ Reinforcement Learning (RL) for solving optimization and control tasks in Modelica models. It connects models to the OpenAI Gym toolkit via the Functional Mock-up Interface (FMI), so that Modelica's equation-based modelling and co-simulation can be exploited together with RL algorithms written against the Gym interface. Thus, ModelicaGym facilitates fast and convenient development and comparison of RL algorithms for solving optimal control problems on Modelica dynamic models.

The inheritance structure of the ModelicaGym toolbox classes and the implemented methods are discussed in detail in the examples. The toolbox functionality was validated on the Cart-Pole balancing problem. This includes a description of the physical system model and its integration into the toolbox, as well as experiments on the selection and influence of model parameters (i.e. force magnitude, cart-pole mass ratio, reward ratio, and simulation time step) on the learning process of the Q-learning algorithm, supported by a discussion of the simulation results.

Paper

The arXiv preprint version can be found here.

Repository contains:

  • modelicagym.environments package for integrating an FMU as an environment in OpenAI Gym. An FMU is a Functional Mock-up Unit exported from one of the main Modelica tools, e.g. Dymola (proprietary) or JModelica (open source). Currently only FMUs exported in co-simulation mode are supported. A minimal interaction sketch follows this list.
  • gymalgs.rl package for Reinforcement Learning algorithms compatible with OpenAI Gym environments.
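The sketch below shows the standard Gym interaction loop with such an FMU-backed environment. The CartPoleEnv class name and import path are placeholders of mine (see examples/cart_pole_env.py for a concrete environment class), so treat this as an illustration rather than the exact toolbox API.

# CartPoleEnv is a placeholder for an environment class built on
# modelicagym.environments; see examples/cart_pole_env.py for a real one.
from examples.cart_pole_env import CartPoleEnv  # hypothetical import path

env = CartPoleEnv()      # loads and initializes the co-simulation FMU
state = env.reset()      # resets the FMU and returns the initial state
done = False
while not done:
    action = env.action_space.sample()            # random policy, for illustration
    state, reward, done, info = env.step(action)  # advances the FMU one time step
env.close()

The loop mirrors the classic Gym API: each step() call advances the FMU co-simulation by one communication step.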

Installation

Full installation guide is available here.

You can test your setup by running the ./test_setup.py script.

You can install the package itself by running pip install git+https://github.com/ucuapps/modelicagym.git (or pip3 install git+https://github.com/ucuapps/modelicagym.git if you have both Python versions installed).

Examples

Examples of usage of both packages can be found in examples folder.

  • Tutorial explains how to integrate an FMU using this toolbox in a step-wise manner. The CartPole problem is used as an illustrative example for the tutorial. Code from cart_pole_env.py is referenced and described in detail.

  • cart_pole_env.py is an example of how a specific FMU can be integrated into OpenAI Gym as an environment. The classic cart-pole environment is considered. The corresponding FMUs can be found in the resources folder.

  • cart_pole_q_learner.py is an example of applying a Q-learning algorithm. An agent is trained on the cart-pole environment simulated with an FMU; the environment's integration is described in the previous example. A condensed sketch of such a training loop appears after the command-line instructions below.

  • The examples are expected to run without installing the modelicagym package. To run cart_pole_q_learner.py, one just has to clone the repo. The advised way to run the examples is with the PyCharm IDE, which automatically adds the project root to the PYTHONPATH.

If one wants to run an example from the command line, the PYTHONPATH has to be updated with the project root:

:<work_dir>$ git clone https://github.com/ucuapps/modelicagym.git
:<work_dir>$ export PYTHONPATH=$PYTHONPATH:<work_dir>/modelicagym
:<work_dir>$ cd modelicagym/examples
:<work_dir>/modelicagym/examples $ python3 cart_pole_q_learner.py
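As promised above, here is a condensed, self-contained sketch of the kind of training loop cart_pole_q_learner.py implements: tabular Q-learning with epsilon-greedy exploration against a Gym-style environment. The discretization scheme, hyperparameters, and the use of the standard gym cart-pole in place of the FMU-backed one are simplifications of mine, not the example's exact code (it targets the classic Gym API where step() returns (state, reward, done, info)).

import numpy as np
import gym

env = gym.make("CartPole-v1")   # stand-in for the FMU-backed cart-pole env
n_bins = 6
low = np.clip(env.observation_space.low, -5, 5)    # clip the infinite bounds
high = np.clip(env.observation_space.high, -5, 5)

def discretize(state):
    # map a continuous state vector to a tuple of bin indices
    ratios = (np.clip(state, low, high) - low) / (high - low)
    return tuple((ratios * (n_bins - 1)).astype(int))

q_table = np.zeros((n_bins,) * env.observation_space.shape[0] + (env.action_space.n,))
alpha, gamma, eps = 0.1, 0.99, 0.1   # learning rate, discount, exploration rate

for episode in range(200):
    s = discretize(env.reset())
    done = False
    while not done:
        # epsilon-greedy action selection
        if np.random.rand() < eps:
            a = env.action_space.sample()
        else:
            a = int(np.argmax(q_table[s]))
        next_state, reward, done, _ = env.step(a)
        s_next = discretize(next_state)
        # one-step Q-learning update
        q_table[s + (a,)] += alpha * (reward + gamma * np.max(q_table[s_next]) - q_table[s + (a,)])
        s = s_next
env.close()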


modelicagym's Issues

Simulation with multiple FMUs

From the tutorial on how to integrate an FMU, I understand that only one FMU can be used for control in this framework. My aim is to control a battery model exported as an FMU, which is located in a large power network. For this purpose, I need my observation space to include quantities from outside of the battery FMU: for example, if my battery is located at a bus with a photovoltaic system connected (which is also another FMU), I want to be able to include its power in the observation space.

Is something like that possible? Would it be possible with some modifications?

Set options["result_handling"] = "memory" as default

Hi,
I would suggest setting options["result_handling"] = "memory" as the default in modelica_base_env/do_simulation.
Writing to disk is a rather expensive operation (my code was running 15x slower because of this), and changing the setting currently requires overriding do_simulation (I don't see any kwargs that pass such an option to the underlying PyFMI model, right?).

While a different default made sense for PyFMI, I think ModelicaGym users might not be interested in having a variable dump, especially if it comes at such a high performance cost.
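For reference, this is what the suggested setting looks like in plain PyFMI (the FMU file name and simulation horizon below are placeholders): results are kept in memory instead of being written to a result file on disk.

from pyfmi import load_fmu

model = load_fmu("CartPole.fmu")     # placeholder FMU path
opts = model.simulate_options()      # PyFMI's default options dict
opts["result_handling"] = "memory"   # keep results in RAM, skip the disk dump
res = model.simulate(start_time=0.0, final_time=1.0, options=opts)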

Is the new setup.py consistent with the examples?

When running from the examples dir I get ModuleNotFoundError: No module named 'gymalgs'.
test_setup.py works.
When I change the import back to ..gymalgs.rl, from ..gymalgs.rl import QLearner raises ValueError: attempted relative import beyond top-level package.
I can import modelicagym.gymalgs.rl instead, but then I get ModuleNotFoundError: No module named 'examples' somewhere down the line.
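A possible user-side workaround, assuming the repository layout described above: put the clone's root on sys.path and use absolute imports. The relative path below assumes the script lives in the examples directory.

import os
import sys

# assume this script sits in <repo>/examples; put <repo> itself on sys.path
sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), "..")))

from modelicagym.gymalgs.rl import QLearner  # absolute import instead of ..gymalgs.rl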

Installation

The installation procedure is not working.

Mainly, I guess, because svn co https://svn.jmodelica.org/trunk JModelica is offline.

Could you work around it? I'm trying to do that here too.

Another question: can this app run on Windows?

Inconsistent return type in `modelica_base_env`

Hi,
the step() function usually returns a tuple whose first element is the state.
In the default case, the state type is set by get_state(result) in the do_simulation() function, and it is a tuple (modelica_base_env@241).

However, when the step function catches a done condition, it returns np.array(self.state), which differs from the previous case.

Am I correct?
And: why return a tuple instead of an array, which is much more comfortable to use? Is it because the FMU might return integers instead of all reals?

Thanks!
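Until this is resolved, a small user-side shim can paper over the inconsistency; this is my own sketch, not part of the toolbox:

import numpy as np

def step_as_array(env, action):
    # call env.step() and coerce the state to a NumPy array,
    # whether the environment returned a tuple or an ndarray
    state, reward, done, info = env.step(action)
    return np.asarray(state, dtype=float), reward, done, info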
