pc-gym's Introduction

Max Bloor - PhD Student

Centre for Process Systems Engineering, Imperial College London 🧪💻

⚙ Currently working on the use of reinforcement learning to tune controllers for chemical processes.

pc-gym's People

Contributors

ilyaorson, josetorraca, mawbray, maximilianb2, trsav

Forkers

trsav

pc-gym's Issues

Problem tracks for demo day

Potentially prepare 2 problem tracks for the audience

  • For PSE backgrounds: a walkthrough/playground to set up a model as an RL problem and how to optimize it.
  • For CS backgrounds: a challenge to optimize an RL policy to achieve the same level of performance as an oracle.

These could be based on the same model to reduce the workload.

Organised collaboration workflow

Hey guys, I noticed there have been many recent commits that do not have a very clear purpose.
This makes it hard to understand the state of the code and the things that still need to be done.
I took the liberty of moving these to a specific branch and reverting main so we can organise better and plan ahead, hope you do not mind!

It would be good to move on from here with a more organised workflow so that we can collaborate better and the repo ends up in an attractive state when it gets released.

IMHO the GitHub Flow branching strategy is the best suited for research development, followed by GitLab Flow for maintenance after the library is released (this one might be overkill for a small library though).

Let me know your thoughts! 😄 This would imply that we work on branches different to main and only contribute back to it through pull requests that we can all understand.

Feature timeline

Before the first internal tests

  • Customisation Documentation
    • Params
    • Model
    • Constraints
  • Model description, including hard-to-operate params/setpoints
  • Example Notebooks
  • Constraint violation plots
  • Reproducibility Metric
  • Multi Timescale model
  • Jose pipeline model

Feature Ideas

  • Policy evaluation
    • Learning curve plot
    • cross-validation
    • Plot custom constraints
  • Customisation
    • Reward function
    • Update MPC to use the control/custom constraints, as it currently only handles state constraints
  • Oracle
    • IMC Tuned FB controller (i.e. if MPC fails to converge this could be
      used as a backup?)
    • Option to allow/disallow disturbance and setpoint foresight
  • Other
    • Ability to specify observable states
    • Leaderboard / Hackathon
    • Compatibility with JAX parallelisation/vectorisation
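
The IMC-tuned backup idea above could look roughly like the following sketch. Everything here is hypothetical: `imc_pi_gains`, `PIController`, `oracle_action`, and the assumption that a failed MPC solve raises `RuntimeError` are illustrative names and behaviour, not part of pc-gym. The tuning rule is the standard IMC result for a first-order process without dead time.

```python
from dataclasses import dataclass


def imc_pi_gains(K: float, tau: float, lam: float) -> tuple[float, float]:
    """IMC-based PI tuning for a first-order process K / (tau*s + 1).

    lam is the desired closed-loop time constant (the single tuning knob).
    Returns (Kc, tau_I).
    """
    return tau / (K * lam), tau


@dataclass
class PIController:
    Kc: float
    tau_I: float
    dt: float
    integral: float = 0.0

    def __call__(self, setpoint: float, measurement: float) -> float:
        error = setpoint - measurement
        self.integral += error * self.dt  # rectangular integration of the error
        return self.Kc * (error + self.integral / self.tau_I)


def oracle_action(mpc_solve, backup, setpoint, measurement):
    """Use the MPC action when the solve succeeds, otherwise the PI backup."""
    try:
        return mpc_solve(setpoint, measurement)
    except RuntimeError:  # assumption: a failed solve raises RuntimeError
        return backup(setpoint, measurement)


def failing_mpc(setpoint, measurement):
    # Stand-in for an MPC call that does not converge
    raise RuntimeError("solver did not converge")


Kc, tau_I = imc_pi_gains(K=2.0, tau=10.0, lam=5.0)
backup = PIController(Kc=Kc, tau_I=tau_I, dt=1.0)
u = oracle_action(failing_mpc, backup, setpoint=1.0, measurement=0.0)
```

A wrapper like this would keep the oracle usable as a baseline even on episodes where the optimiser fails, at the cost of the oracle no longer being strictly "optimal" on those steps.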

Done

  • Policy evaluation tool

    • Oracle MPC with perfect model?
    • Return distribution
    • Reproducibility Metric
    • Real plot axis naming
  • Customisation

    • Model parameters
    • Model Dynamics
    • Constraint Functions
  • Model Reformulation as Python classes

    • Allow disturbances for JAX models
    • Expose model details (i.e. m.info returns variable names for states, controls etc.)
    • Change SP, Constraints, and disturbances to use variable names instead of '0', '1' etc.
    • Allow for non-sequential definition of disturbances/constraints
    • First Order system and Multistage extraction reformulation
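
Addressing variables by name rather than by position, as in the items above, might be sketched as follows. The `info` dictionary layout, the variable names, and `to_vector` are hypothetical and only illustrate the idea; the actual `m.info` output may differ.

```python
# Hypothetical model metadata, similar in spirit to what m.info might return
info = {"states": ["Ca", "T"], "inputs": ["Tc"]}


def to_vector(named: dict, names: list, default=None) -> list:
    """Order a name -> value mapping according to `names`.

    Keys may be given in any order and may cover only a subset of the names.
    Unknown names raise early instead of being silently ignored.
    """
    unknown = set(named) - set(names)
    if unknown:
        raise KeyError(f"unknown variable(s): {sorted(unknown)}")
    return [named.get(name, default) for name in names]


# A setpoint given only for T, without the user needing to know its index
setpoints = to_vector({"T": 350.0}, info["states"])
```

Here states without an entry come back as `None`, which a caller could interpret as "no setpoint defined for this state".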

Handle the parameters of models with classes

Here is a proposal for defining a model whose parameters are set at initialisation and which can then be called with the expected signatures for other methods.

from diffrax import diffeqsolve, ODETerm, Dopri5
import jax.numpy as jnp

def f(t, y, args):
    return -y

term = ODETerm(f)
solver = Dopri5()
y0 = jnp.array([2., 3.])
solution = diffeqsolve(term, solver, t0=0, t1=1, dt0=0.1, y0=y0)

# Dataclass version

from dataclasses import dataclass

# frozen: makes the objects immutable after creation,
# so parameters cannot be modified at runtime.
# It also makes the class hashable, as required by Equinox:
# ValueError: Non-hashable static arguments are not supported.

# kw_only: parameters must be passed by keyword
# when the object is created (requires Python 3.10+)

@dataclass(frozen=True, kw_only=True)
class Model:
    a: float = 1.0

    def __call__(self, t, y, args):
        return -self.a * y

m = Model(a=2.0)
sol = diffeqsolve(ODETerm(m), solver, t0=0, t1=1, dt0=0.1, y0=y0)

# can also pass the complete or partial parameters from a dict
# params = {"a": 2.0}
# m = Model(**params)

# no performance difference
# jax.jit seems to have no effect

# term = ODETerm(f)
# term = ODETerm(Model())
# term = ODETerm(jax.jit(Model()))
# %timeit sol = diffeqsolve(term, solver, t0=0, t1=1, dt0=0.1, y0=y0)

Add more unique plots and optimality gap information

Our MPC oracle setting allows us to provide more information about the performance of RL policies:

  • Optimality gaps
    • Overall gap in reward
    • Gap in value function per state.
    • Gap in Q function per state-action pair.
  • Identify local optima by comparing control trajectories of MPC oracle and RL policy.
  • State and action distributions of trained policies (example).
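
A minimal sketch of the first and fourth items above, assuming episodic returns and control trajectories have already been collected for both the MPC oracle and the RL policy. The function names and the numbers are made up for illustration only.

```python
from statistics import mean


def optimality_gap(oracle_returns, policy_returns):
    """Absolute and relative gap between mean episodic returns."""
    j_oracle, j_policy = mean(oracle_returns), mean(policy_returns)
    gap = j_oracle - j_policy
    return gap, gap / abs(j_oracle)


def largest_control_deviation(u_oracle, u_policy):
    """Time index and size of the largest gap between two control trajectories."""
    diffs = [abs(a - b) for a, b in zip(u_oracle, u_policy)]
    k = max(range(len(diffs)), key=diffs.__getitem__)
    return k, diffs[k]


# Made-up numbers purely for illustration
gap, rel_gap = optimality_gap([-10.0, -12.0], [-15.0, -18.0])
k, dev = largest_control_deviation([0.0, 1.0, 2.0], [0.0, 1.5, 4.0])
```

A large deviation at a particular time step, combined with a non-zero return gap, would be one hint that the policy has settled into a different (possibly locally optimal) control strategy from the oracle.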
