SaBBLium: A Flexible and Simple Library for Learning Sequential Agents (including Reinforcement Learning)

TL;DR:

SaBBLium is a lightweight library extending PyTorch modules for developing sequential decision models. It can be used for Reinforcement Learning (including model-based RL with differentiable environments, multi-agent RL, etc.), but also in supervised/unsupervised learning settings (for instance for NLP, Computer Vision, etc.). It is derived from SaLinA and BBRL.

  • It lets you write very complex sequential models (or policies) in a few lines
  • The main difference from BBRL and SaLinA is that SaBBLium is compatible with Gymnasium:
    • No more NoAutoResetGymAgent or AutoResetGymAgent: a single GymAgent covers both cases, depending on whether the Gymnasium environment contains an AutoResetWrapper.
    • You should now use env/stopped instead of env/done as a stop variable
  • No multiprocessing and no remote agent or workspace yet
  • You can easily save and load your models with agent.save and Agent.load by making them inherit from SerializableAgent. If they are not serializable, you have to override the serialize method.
  • An ImageGymAgent has been added with adapted serialization
  • Many typos have been fixed and type hints have been added
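
The move from env/done to env/stopped mirrors the way Gymnasium splits the end of an episode into two flags, terminated and truncated. The sketch below is a minimal illustration, assuming that a stop variable is simply the logical OR of those two flags. ToyEnv and rollout are hypothetical names, not part of SaBBLium, and the environment only imitates the five-value return of a Gymnasium step.

```python
from typing import Tuple


class ToyEnv:
    """Hypothetical stand-in that mimics the Gymnasium reset/step signatures."""

    def __init__(self, goal: int = 3, max_steps: int = 5) -> None:
        self.goal = goal
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self) -> Tuple[int, dict]:
        self.position = 0
        self.steps = 0
        return self.position, {}

    def step(self, action: int) -> Tuple[int, float, bool, bool, dict]:
        self.position += action
        self.steps += 1
        terminated = self.position >= self.goal   # the task itself is over
        truncated = self.steps >= self.max_steps  # the time limit was hit
        reward = 1.0 if terminated else 0.0
        return self.position, reward, terminated, truncated, {}


def rollout(env: ToyEnv, action: int) -> Tuple[int, bool, bool]:
    """Step until the episode stops; return (steps, terminated, truncated)."""
    env.reset()
    stopped = False
    terminated = truncated = False
    while not stopped:
        _, _, terminated, truncated, _ = env.step(action)
        # Assumption: the stop variable is the OR of the two Gymnasium flags.
        stopped = terminated or truncated
    return env.steps, terminated, truncated
```

With this toy, always moving forward ends the episode by termination, while never moving ends it by truncation; in both cases the loop only needs to look at the single stopped flag.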

Citing SaBBLium

Since SaBBLium is inspired by SaLinA, please use this BibTeX entry if you want to cite SaBBLium or this repository in your publications:

Link to the paper: SaLinA: Sequential Learning of Agents

    @misc{salina,
        author = {Ludovic Denoyer, Alfredo de la Fuente, Song Duong, Jean-Baptiste Gaya, Pierre-Alexandre Kamienny, Daniel H. Thompson},
        title = {SaLinA: Sequential Learning of Agents},
        year = {2021},
        publisher = {Arxiv},
        howpublished = {\url{https://gitHub.com/facebookresearch/salina}},
    }

Quick Start

  • Clone the repo
  • With pip 21.3 or newer, run: pip install -e .

For development, set up pre-commit hooks:

  • Run pip install pre-commit
    • or conda install -c conda-forge pre-commit
    • or brew install pre-commit
  • In the top directory of the repo, run pre-commit install to set up the git hook scripts
  • Now pre-commit will run automatically on git commit!
  • Currently isort and black are run, in that order

Organization of the repo

Dependencies

SaBBLium utilizes PyTorch, Hydra for configuring experiments, and Gymnasium for reinforcement learning environments.

Note on the logger

We provide a simple Logger that logs in both TensorBoard format and wandb, but also as pickle files that can be re-read to make tables and figures. See logger. This logger can be easily replaced by any other logger.
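
As an illustration of the pickle side of such a logger, the sketch below records scalar values per name and iteration, writes them to a pickle file, and reads them back for later tables or figures. PickleLogger, its methods, and roundtrip_example are hypothetical names chosen for this example; they are not the interface of the Logger shipped with SaBBLium.

```python
import os
import pickle
import tempfile
from collections import defaultdict
from typing import Dict, List, Tuple

Series = Dict[str, List[Tuple[int, float]]]


class PickleLogger:
    """Hypothetical logger that keeps (iteration, value) pairs per metric name."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.values: Series = defaultdict(list)

    def add_scalar(self, name: str, value: float, iteration: int) -> None:
        self.values[name].append((iteration, value))

    def save(self) -> None:
        # Convert to a plain dict so the file does not depend on defaultdict.
        with open(self.path, "wb") as f:
            pickle.dump(dict(self.values), f)

    @staticmethod
    def load(path: str) -> Series:
        with open(path, "rb") as f:
            return pickle.load(f)


def roundtrip_example() -> Series:
    """Log two values, save them, and read the file back."""
    path = os.path.join(tempfile.mkdtemp(), "log.pkl")
    logger = PickleLogger(path)
    logger.add_scalar("reward", 1.0, 0)
    logger.add_scalar("reward", 2.5, 1)
    logger.save()
    return PickleLogger.load(path)
```

The reloaded dictionary can then be passed to any plotting or table tool without re-running the experiment.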

Description

Sequential Decision Making is much more than Reinforcement Learning

Sequential Decision Making is about interactions:

  • Interaction with data (e.g. attention models, decision trees, cascade models, active sensing, active learning, recommendation, etc.)
  • Interaction with an environment (e.g. games, control)
  • Interaction with humans (e.g. recommender systems, dialog systems, health systems, etc.)
  • Interaction with a model of the world (e.g. simulation)
  • Interaction between multiple entities (e.g. multi-agent RL)

What SaBBLium is

  • A sandbox for developing sequential models at scale.

  • A small (about 300 lines) 'core' code that defines everything you will use to implement agents involved in sequential decision learning systems.

    • It is easy to understand and use since it keeps the main principles of PyTorch, just extending nn.Module to Agent in order to handle the temporal dimension.

  • A set of agents that can be combined (like PyTorch modules) to obtain complex behaviors

  • A set of reference implementations and examples in different domains: Reinforcement Learning, Imitation Learning, Computer Vision, with more to come.
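
To make the idea of handling the temporal dimension more concrete, here is a dependency-free toy, not the SaBBLium API. It assumes a workspace that stores one value per variable name and time step, and agents that are called with a time index t, read variables from the workspace, and write new ones. In the library itself agents extend nn.Module and the stored values are tensors; ToyWorkspace, ToyAgent, Counter, Doubler and run_sequence are all hypothetical names used only for illustration.

```python
from collections import defaultdict
from typing import Any, Dict, List


class ToyWorkspace:
    """Hypothetical store mapping (variable name, time step) to a value."""

    def __init__(self) -> None:
        self._data: Dict[str, Dict[int, Any]] = defaultdict(dict)

    def set(self, name: str, t: int, value: Any) -> None:
        self._data[name][t] = value

    def get(self, name: str, t: int) -> Any:
        return self._data[name][t]


class ToyAgent:
    """Hypothetical base class: reads and writes workspace variables at time t."""

    def __call__(self, workspace: ToyWorkspace, t: int) -> None:
        raise NotImplementedError


class Counter(ToyAgent):
    """Writes 'obs': 0 at the first step, then the previous value plus one."""

    def __call__(self, workspace: ToyWorkspace, t: int) -> None:
        value = 0 if t == 0 else workspace.get("obs", t - 1) + 1
        workspace.set("obs", t, value)


class Doubler(ToyAgent):
    """Reads 'obs' at time t and writes twice its value as 'action'."""

    def __call__(self, workspace: ToyWorkspace, t: int) -> None:
        workspace.set("action", t, 2 * workspace.get("obs", t))


def run_sequence(agents: List[ToyAgent], n_steps: int) -> ToyWorkspace:
    """Run the agents one after the other at each time step."""
    workspace = ToyWorkspace()
    for t in range(n_steps):
        for agent in agents:
            agent(workspace, t)
    return workspace
```

Combining agents here is just a matter of listing them in order, for example run_sequence([Counter(), Doubler()], 3), which loosely mirrors how small modules are composed into larger ones.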

What SaBBLium is not

  • Yet another reinforcement learning framework: SaBBLium is focused on sequential decision-making in general. It can be used for RL (which is our main current use-case), but also for supervised learning, attention models, multi-agent learning, planning, control, cascade models, recommender systems, among other use cases.
  • A library: SaBBLium is just a small layer on top of PyTorch that encourages good practices for implementing sequential models. Accordingly, it is very simple to understand and use, while remaining very powerful.
  • A framework: SaBBLium is not a framework, it is just a set of tools that can be used to implement any kind of sequential decision-making system.

License

SaBBLium is released under the MIT license. See LICENSE for additional details about it.
