Giter VIP home page Giter VIP logo

bed's Introduction

Behavior cloning for Error Discovery

This repository contains the code for the paper "Self Supervised Detection of Incorrect Human Demonstrations: A Path Toward Safe Imitation Learning by Robots in the Wild" by Noushad Sojib & Momotaz Begum.

BED is a BC model with an additional parameter vector $w$ of length $|D|$. It utilize a loss function that penalize different kinds of inconsistency and help to learn $w_i\approx1$ for good demos and $w_i\approx0$ for bad demos. As bad demos add more loss the the total loss, discarding them (by assigning $w_i=0$) will reduce the total loss and help to detect the bad demos.

Installation

    def forward(self, **inputs):
        enc_outputs = self.nets["encoder"](**inputs) 
        self.last_latent = enc_outputs[-1,:]    #add this line
        self.latent=enc_outputs                 #add this line
        mlp_out = self.nets["mlp"](enc_outputs)
        return self.nets["decoder"](mlp_out)

Download the data

  • Layman V1.0 can data download
  • Full dataset download link will be available soon.

Training BED

Run the following command to train the BED model

python bed_training_path.py --config path/config.json --m 0.8 --accelerate 40 --gscale 5

Example: Train BED on can 80% data. You can press Ctrl+C for early stopping.

python bed_training_path.py --config path/configs/can/bed_can_510_p20b.json --m 0.8 --accelerate 40 --gscale 5

Explnation of the arguments:

  • --config: path to the configuration file
  • --m: percentage of demos we want to keep
  • --accelerate: use higher learning rate after this epoch for faster convergence
  • --gscale: importance of path loss

Expected w: As there are 150 demos total, we expect 30 of them will get $w\approx0$ and 120 of them will get $w\approx1$. Rounding will make them binary. Here is the expected w vector before rounding:

w:  [ 1.    1.    1.    1.    1.    1.    1.    1.    1.    1.    1.    1.
  1.    1.    1.    1.    1.    1.    1.    1.    0.49  1.    1.    1.
  1.    1.    1.    1.    1.    1.    1.    1.    1.    1.    1.    1.
  1.    1.    1.    1.    1.    1.    1.    1.    1.    1.    1.    1.
  1.    1.    1.    1.    1.    1.    1.    1.    1.    0.96  1.    1.
  1.    1.    1.    1.    1.    1.    1.    1.    1.    1.    1.    1.
  0.73  1.    1.    1.    1.    0.99  1.    1.    1.    1.    1.    1.
  1.    1.    0.99  1.    1.    0.99  0.91  1.    1.    1.    1.    1.
  1.    0.62  0.99  0.99  1.    1.    1.    0.48  0.08  0.72  1.    1.
  0.84  1.    1.    1.    1.    1.    0.99  0.99  1.    1.    1.    1.
 -0.   -0.   -0.    0.51 -0.   -0.   -0.   -0.   -0.   -0.   -0.   -0.
 -0.   -0.   -0.   -0.   -0.   -0.   -0.    0.   -0.    0.59 -0.    0.
  0.37 -0.   -0.   -0.    0.    1.  ]

View the training log located in logs to see how training looks like. It tooks 47 minutes to train on a Single NVIDIA A40 GPU.

Acknowledgement

bed's People

Contributors

nsojib avatar

Watchers

 avatar Madison Clark-Turner avatar Mostafa Hussein avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.